Bank Panic V1 Methodology
Scope
This phase models bank stress and panic events only. It does not claim to model all institutional crises or all forms of narrative contagion. Every replay, forecast, and score in this repo is tied to:
- a specific bank stress case
- an explicit time window
- a defined forecast target
- a source-backed resolution rule
Data policy
The project does not ship mock cases, synthetic timelines, fake leaderboard rows, or placeholder scorecards.
The historical corpus is built from:
- official regulator and central-bank releases
- bank statements where directly relevant
- Reuters reporting for market-moving events and deposit-flight disclosures
Each case stores:
- normalized event timeline
- outcome labels
- source provenance
- narrative direction
- trust impact annotations
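The stored fields listed above could be modeled roughly as follows. This is a minimal sketch, not the repo's actual schema: the class names `Event` and `BankStressCase`, every field name, and the choice to attach provenance, narrative direction, and trust impact to individual events are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    """One entry in a case's normalized event timeline (hypothetical shape)."""
    timestamp: datetime        # normalized UTC timestamp
    description: str
    source: str                # source provenance, e.g. a citation or URL
    narrative_direction: str   # label vocabulary is an assumption
    trust_impact: str          # annotation format is an assumption


@dataclass
class BankStressCase:
    """Container for one bank stress case (hypothetical shape)."""
    case_id: str
    timeline: List[Event] = field(default_factory=list)
    outcome_labels: Dict[str, bool] = field(default_factory=dict)

    def ordered_timeline(self) -> List[Event]:
        # sorted() is stable, so events that share a timestamp keep
        # their stored order and the result is the same on every run.
        return sorted(self.timeline, key=lambda e: e.timestamp)
```

Freezing `Event` keeps a stored timeline entry from being mutated after it has been loaded, which fits the emphasis on reviewable provenance.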
Timestamp normalization
Some source material is precise only to the publication date, not to an intraday timestamp. In those cases, the dataset normalizes the event timestamp to `00:00:00Z` on the publication date so that:
- event ordering remains deterministic
- timestamps are not falsely over-precise
- the normalization rule is explicit and consistent
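The normalization rule above can be sketched in a few lines. The function names `normalize_event_timestamp` and `to_z_string` are hypothetical, and rejecting timezone-naive intraday timestamps is an assumption of this sketch rather than a documented project rule.

```python
from datetime import date, datetime, time, timezone


def normalize_event_timestamp(value):
    """Return a timezone-aware UTC datetime for an event.

    A date-only value becomes 00:00:00Z on that date.
    A timezone-aware datetime is converted to UTC unchanged in meaning.
    """
    # datetime is a subclass of date, so it must be checked first.
    if isinstance(value, datetime):
        if value.tzinfo is None:
            # Assumption: naive intraday timestamps are rejected, not guessed.
            raise ValueError("intraday timestamps must carry a timezone")
        return value.astimezone(timezone.utc)
    if isinstance(value, date):
        return datetime.combine(value, time(0, 0, 0), tzinfo=timezone.utc)
    raise TypeError("expected a date or datetime")


def to_z_string(ts):
    """Format a UTC datetime with a trailing Z, to whole seconds."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Because every date-only source maps to the same instant on its publication date, sorting by the normalized value gives the same order on every run.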
Forecast contract
Every forecasted question must specify:
- a case
- a target
- an opening time
- a closing time
- a binary resolution rule
- the source basis for that rule
Forecasts are scored only after the question resolves.
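One way to enforce this contract is to validate it at construction time and refuse to score unresolved questions. The sketch below is illustrative only: `ForecastQuestion` and its field names are hypothetical, and the Brier score is an assumed scoring rule, since this document does not name the metric the project uses.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ForecastQuestion:
    """A forecast question with every field the contract requires."""
    case_id: str
    target: str
    opens_at: datetime
    closes_at: datetime
    resolution_rule: str            # wording of the binary rule
    rule_source: str                # source basis for the rule
    outcome: Optional[bool] = None  # None until the question resolves

    def __post_init__(self):
        for name in ("case_id", "target", "resolution_rule", "rule_source"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")
        if self.closes_at <= self.opens_at:
            raise ValueError("closing time must be after opening time")


def brier_score(question, probability):
    """Squared error of a probability forecast (assumed scoring rule)."""
    if question.outcome is None:
        raise ValueError("question has not resolved; it cannot be scored yet")
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return (probability - float(question.outcome)) ** 2
```

Raising on an unresolved question, instead of returning a placeholder value, keeps partial scores from leaking into any downstream table.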
Resolution policy
Binary outcomes resolve against the stored evidence bundle for each case. A result is considered valid only if the evidence is:
- attributable to a named source
- public and reviewable
- consistent with the question wording
If source evidence conflicts, the case should not be used for public scoring until the conflict is reconciled and documented.
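The resolution policy above might be expressed as a small check over the evidence bundle. This is a sketch under stated assumptions: the `Evidence` fields, the `resolve` function, and the three status strings are hypothetical names, and how a reviewer decides that an item matches the question wording is outside its scope.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Evidence:
    """One item in a case's evidence bundle (hypothetical shape)."""
    source_name: str                # empty string means unattributed
    is_public: bool                 # public and reviewable
    matches_question_wording: bool  # judged by a reviewer, not by this code
    supports_outcome: bool          # the binary outcome this item supports


def resolve(bundle: List[Evidence]) -> Tuple[str, Optional[bool]]:
    """Return (status, outcome); outcome is set only when status is 'resolved'."""
    valid = [
        e for e in bundle
        if e.source_name.strip() and e.is_public and e.matches_question_wording
    ]
    if not valid:
        return ("no_valid_evidence", None)
    outcomes = {e.supports_outcome for e in valid}
    if len(outcomes) > 1:
        # Conflicting valid evidence: withhold the case from public scoring.
        return ("conflict", None)
    return ("resolved", outcomes.pop())
```

Returning a status alongside the outcome lets callers distinguish a conflict that needs reconciling from a case that simply lacks usable evidence.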
Simulation philosophy
The engine prioritizes determinism and replayability: the same inputs always produce the same run. Structured state transitions, not free-form agent roleplay, are the source of truth. LLMs are treated as forecast producers and textual summarizers, not as the simulation engine itself.
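To illustrate what "structured state transitions" can mean in practice, here is a minimal sketch of a replayable engine. Everything in it is invented for illustration: the `BankState` fields, the event type names, and the numeric deltas in `TRANSITIONS` are assumptions, not the project's actual model.

```python
from dataclasses import dataclass, replace
from typing import Iterable

# Hypothetical transition table: event type -> change in stress level.
TRANSITIONS = {
    "deposit_outflow_disclosed": 2,
    "regulator_support_announced": -1,
}


@dataclass(frozen=True)
class BankState:
    stress_level: int = 0
    events_applied: int = 0


def apply_event(state: BankState, event_type: str) -> BankState:
    """Pure transition: returns a new state and never mutates the input."""
    if event_type not in TRANSITIONS:
        raise ValueError(f"unknown event type: {event_type}")
    new_level = max(0, state.stress_level + TRANSITIONS[event_type])
    return replace(
        state,
        stress_level=new_level,
        events_applied=state.events_applied + 1,
    )


def replay(event_types: Iterable[str]) -> BankState:
    """Fold the ordered events into a final state, starting from BankState()."""
    state = BankState()
    for event_type in event_types:
        state = apply_event(state, event_type)
    return state
```

Because `apply_event` uses no randomness, clock reads, or external calls, replaying the same ordered events always yields an equal `BankState`. An LLM-produced forecast would sit outside this loop and consume its output.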
Public posting standard
Anything exported for posting on X should satisfy all of the following:
- the underlying case is source-backed
- the forecast question is explicit
- the resolution rule is explicit
- the score is reproducible
- uncertainty is visible
The product should prefer a smaller number of defensible public claims over a larger number of flashy but weak claims.
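The checklist above lends itself to an explicit export gate. The sketch below assumes a hypothetical `ExportCandidate` record; the field names, and the idea that reproducibility and uncertainty are recorded as a flag and a note, are illustrative choices only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ExportCandidate:
    """A claim being considered for public posting (hypothetical shape)."""
    case_is_source_backed: bool
    question_text: str
    resolution_rule: str
    score_reproduced: bool   # score was recomputed from stored inputs
    uncertainty_note: str    # how uncertainty is shown in the post


def failed_checks(candidate: ExportCandidate) -> List[str]:
    """Return the names of the checklist items the candidate fails."""
    checks = {
        "case is source-backed": candidate.case_is_source_backed,
        "forecast question is explicit": bool(candidate.question_text.strip()),
        "resolution rule is explicit": bool(candidate.resolution_rule.strip()),
        "score is reproducible": candidate.score_reproduced,
        "uncertainty is visible": bool(candidate.uncertainty_note.strip()),
    }
    return [name for name, ok in checks.items() if not ok]


def may_export(candidate: ExportCandidate) -> bool:
    """A candidate is exportable only when every check passes."""
    return not failed_checks(candidate)
```

Listing the failed checks by name, instead of returning a bare boolean, makes it easy to see why a weak claim was held back.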