Bank Panic V1 Methodology

Scope

This phase models bank stress and panic events only. It does not claim to model all institutional crises or all forms of narrative contagion. Every replay, forecast, and score in this repo is tied to:

  • a specific bank stress case
  • an explicit time window
  • a defined forecast target
  • a source-backed resolution rule

Data policy

The project does not ship mock cases, synthetic timelines, fake leaderboard rows, or placeholder scorecards.

The historical corpus is built from:

  • official regulator and central-bank releases
  • bank statements where directly relevant
  • Reuters reporting for market-moving events and deposit-flight disclosures

Each case stores:

  • normalized event timeline
  • outcome labels
  • source provenance
  • narrative direction
  • trust impact annotations
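
The fields above can be sketched as a typed case record. This is an illustrative shape only; the field names and types are assumptions, not the repository's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class CaseRecord:
    """Illustrative sketch of one stored case; names are hypothetical."""
    case_id: str                       # e.g. a stable slug for the bank stress case
    timeline: list = field(default_factory=list)       # normalized events, each with timestamp + source
    outcome_labels: dict = field(default_factory=dict) # named binary outcomes
    provenance: list = field(default_factory=list)     # source citations backing each event
    narrative_direction: str = ""      # e.g. which way the stress narrative moved
    trust_impact: dict = field(default_factory=dict)   # annotations keyed by event id
```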

Timestamp normalization

Some source material is precise only to the publication date rather than to an intraday timestamp. In those cases the dataset normalizes the event timestamp to `00:00:00Z` on the publication date so that:

  • event ordering remains deterministic
  • timestamps are not falsely over-precise
  • the normalization rule is explicit and consistent
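
The rule can be sketched as a single normalization function (the function name and `date_only` flag are illustrative, not the repo's API):

```python
from datetime import datetime, timezone

def normalize_timestamp(raw: str, date_only: bool) -> datetime:
    """Sketch of the normalization rule described above.

    Sources precise only to a publication date are pinned to 00:00:00Z
    on that date, so event ordering stays deterministic without
    implying intraday precision the source never had.
    """
    if date_only:
        d = datetime.strptime(raw, "%Y-%m-%d")
        return d.replace(tzinfo=timezone.utc)   # 00:00:00Z on the publication date
    return datetime.fromisoformat(raw)          # already precise; keep as-is
```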

Forecast contract

Every forecasted question must specify:

  • a case
  • a target
  • an opening time
  • a closing time
  • a binary resolution rule
  • the source basis for that rule

Forecasts are scored only after the question resolves.
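
The contract above can be sketched as a validated record; all names here are hypothetical, not the project's actual types:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class ForecastQuestion:
    """Illustrative sketch of the forecast contract."""
    case_id: str
    target: str
    opens_at: datetime
    closes_at: datetime
    resolution_rule: str   # the binary rule, stated in full
    rule_sources: list     # source basis for that rule

    def validate(self) -> None:
        """Reject questions that violate the contract."""
        if self.closes_at <= self.opens_at:
            raise ValueError("closing time must be after opening time")
        if not self.resolution_rule:
            raise ValueError("a binary resolution rule is required")
        if not self.rule_sources:
            raise ValueError("the resolution rule must cite at least one source")
```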

Resolution policy

Binary outcomes resolve against the stored evidence bundle for each case. A result is considered valid only if the evidence is:

  • attributable to a named source
  • public and reviewable
  • consistent with the question wording

If source evidence conflicts, the case should not be used for public scoring until the conflict is reconciled and documented.
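
A minimal sketch of that validity check, assuming a bundle is a dict of evidence items with illustrative field names:

```python
def evidence_is_valid(bundle: dict) -> bool:
    """Sketch: a bundle supports resolution only if it is non-empty,
    conflict-free, and every item is attributable, public, and
    consistent with the question wording. Field names are assumptions."""
    items = bundle.get("items", [])
    if not items or bundle.get("has_conflict", False):
        return False   # empty or conflicting evidence never resolves a question
    return all(
        item.get("source_name")            # attributable to a named source
        and item.get("public_url")         # public and reviewable
        and item.get("matches_question")   # consistent with the question wording
        for item in items
    )
```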

Simulation philosophy

The engine is deterministic and replayable first. It uses structured state transitions, not free-form agent roleplay, as the source of truth. LLMs are treated as forecast producers and textual summarizers, not as the simulation engine itself.
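
The replay property can be sketched with a pure transition function folded over an ordered event log; event types and state fields here are illustrative, not the engine's actual schema:

```python
def apply_event(state: dict, event: dict) -> dict:
    """Pure transition: returns a new state, never mutates the input,
    so identical event logs always reproduce identical final states."""
    new_state = dict(state)
    if event["type"] == "deposit_outflow":
        new_state["deposits"] = state["deposits"] - event["amount"]
    elif event["type"] == "liquidity_injection":
        new_state["deposits"] = state["deposits"] + event["amount"]
    return new_state

def replay(initial: dict, events: list) -> dict:
    """Deterministically replay a case: sort by normalized timestamp,
    then fold the transitions left to right."""
    state = initial
    for event in sorted(events, key=lambda e: e["timestamp"]):
        state = apply_event(state, event)
    return state
```

Because `apply_event` is pure and the sort order is fixed, the final state is a function of the event log alone, which is what makes scores reproducible.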

Public posting standard

Anything exported for X should satisfy all of the following:

  • the underlying case is source-backed
  • the forecast question is explicit
  • the resolution rule is explicit
  • the score is reproducible
  • uncertainty is visible

The product should prefer a smaller number of defensible public claims over a larger number of flashy but weak claims.
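
As a sketch, the posting standard reads naturally as an all-or-nothing checklist; the check names below are illustrative, not an actual export format:

```python
# Hypothetical pre-posting checklist mirroring the standard above.
REQUIRED_CHECKS = (
    "case_source_backed",
    "question_explicit",
    "resolution_rule_explicit",
    "score_reproducible",
    "uncertainty_visible",
)

def ready_to_post(export: dict) -> bool:
    """An export qualifies for public posting only if every check passes."""
    return all(export.get(check) is True for check in REQUIRED_CHECKS)
```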