Bank Panic V1 Methodology

Scope

This phase models bank stress and panic events only. It does not claim to model all institutional crises or all forms of narrative contagion. Every replay, forecast, and score in this repo is tied to:

a specific bank stress case
an explicit time window
a defined forecast target
a source-backed resolution rule

Data policy

The project does not ship mock cases, synthetic timelines, fake leaderboard rows, or placeholder scorecards.

The historical corpus is built from:

official regulator and central-bank releases
bank statements where directly relevant
Reuters reporting for market-moving events and deposit-flight disclosures

Each case stores:

normalized event timeline
outcome labels
source provenance
narrative direction
trust impact annotations
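
The five stored fields could be laid out roughly as follows. This is a minimal sketch, not the repo's actual schema: the class names, field names, and the use of free-form strings for narrative direction and trust impact are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass(frozen=True)
class Source:
    """Provenance for a single event (hypothetical fields)."""
    publisher: str      # regulator, central bank, bank, or Reuters
    title: str
    published_on: str   # publication date as stated by the source


@dataclass(frozen=True)
class Event:
    """One entry in a case's normalized event timeline."""
    timestamp: datetime        # normalized, timezone-aware UTC
    description: str
    source: Source             # source provenance
    narrative_direction: str   # label vocabulary is not specified here
    trust_impact: str          # annotation vocabulary is not specified here


@dataclass
class BankStressCase:
    """A single bank stress case with its timeline and outcome labels."""
    case_id: str
    timeline: List[Event] = field(default_factory=list)
    outcome_labels: Dict[str, bool] = field(default_factory=dict)

    def sorted_timeline(self) -> List[Event]:
        # sorted() is stable, so events sharing a timestamp keep the
        # order in which they were stored.
        return sorted(self.timeline, key=lambda e: e.timestamp)
```

Keeping provenance on every event, rather than once per case, means each timeline entry can be traced back to its source on its own.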

Timestamp normalization

Some source material is precise only to the publication date rather than to an intraday timestamp. In those cases the dataset normalizes the event timestamp to `00:00:00Z` on the publication date so that:

event ordering remains deterministic
timestamps are not falsely over-precise
the normalization rule is explicit and consistent
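
The normalization rule above can be sketched in a few lines. The function names and the `date_only` flag are assumptions for illustration; the flag is one possible way to record that a midnight timestamp reflects date-level precision rather than a real intraday time.

```python
from datetime import date, datetime, time, timezone
from typing import Tuple


def normalize_timestamp(raw: str) -> Tuple[datetime, bool]:
    """Return (UTC timestamp, date_only) for an ISO 8601 date or datetime."""
    raw = raw.strip()
    if len(raw) == 10:
        # Date-only input ("YYYY-MM-DD"): pin to 00:00:00Z on that date.
        day = date.fromisoformat(raw)
        return datetime.combine(day, time.min, tzinfo=timezone.utc), True
    if raw.endswith("Z"):
        # Older Python versions do not parse a trailing "Z" directly.
        raw = raw[:-1] + "+00:00"
    parsed = datetime.fromisoformat(raw)
    if parsed.tzinfo is None:
        raise ValueError("intraday timestamps must carry a UTC offset")
    return parsed.astimezone(timezone.utc), False


def to_utc_string(ts: datetime) -> str:
    """Format a UTC timestamp in the `...T00:00:00Z` style."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Rejecting intraday timestamps that lack an offset, instead of guessing a time zone, keeps event ordering deterministic.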

Forecast contract

Every forecasted question must specify:

a case
a target
an opening time
a closing time
a binary resolution rule
the source basis for that rule

Forecasts are scored only after the question resolves.
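
As an illustration, the contract could be enforced at construction time, with scoring refused until an outcome is recorded. The class, its field names, and the choice of a squared-error (Brier) score are assumptions for this sketch; the methodology itself does not name a scoring rule.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ForecastQuestion:
    case_id: str
    target: str
    opens_at: datetime
    closes_at: datetime
    resolution_rule: str            # binary rule, stated in words
    source_basis: str               # where the rule's evidence comes from
    outcome: Optional[bool] = None  # None until the question resolves

    def __post_init__(self) -> None:
        # Every text field in the contract must be filled in.
        for name in ("case_id", "target", "resolution_rule", "source_basis"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")
        if self.closes_at <= self.opens_at:
            raise ValueError("closing time must be after opening time")


def brier_score(question: ForecastQuestion, probability: float) -> float:
    """Squared error of a probability forecast (assumed scoring rule)."""
    if question.outcome is None:
        raise ValueError("question has not resolved; it cannot be scored")
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return (probability - float(question.outcome)) ** 2
```

Under this rule a lower score is better, and a forecast of 0.5 always scores 0.25 regardless of the outcome.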

Resolution policy

Binary outcomes resolve against the stored evidence bundle for each case. A result is considered valid only if the evidence is:

attributable to a named source
public and reviewable
consistent with the question wording

If source evidence conflicts, the case should not be used for public scoring until the conflict is reconciled and documented.
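
A sketch of this policy follows. The `Evidence` fields are hypothetical, and the `matches_question_wording` flag is assumed to come from human review rather than automated text matching. The function returns `None` both when no valid evidence exists and when valid evidence conflicts, so a conflicted case cannot be scored by accident.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Evidence:
    source_name: str                # empty string means unattributable
    is_public: bool                 # public and reviewable
    matches_question_wording: bool  # assumed to be set by a reviewer
    indicates: bool                 # the binary outcome this item supports


def resolve(bundle: List[Evidence]) -> Optional[bool]:
    """Resolve a binary outcome from valid evidence, or return None."""
    valid = [
        e for e in bundle
        if e.source_name.strip() and e.is_public and e.matches_question_wording
    ]
    outcomes = {e.indicates for e in valid}
    if len(outcomes) != 1:
        # Either nothing valid was found or the valid items disagree.
        return None
    return outcomes.pop()
```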

Simulation philosophy

The engine prioritizes determinism and replayability. Its source of truth is a set of structured state transitions, not free-form agent roleplay. LLMs are treated as forecast producers and textual summarizers, not as the simulation engine itself.
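
To make the idea of structured state transitions concrete, here is a minimal sketch of a table-driven replay. The state names, event types, and transition table are all hypothetical and are not the engine's actual model. The point is only that the same event sequence always yields the same state history.

```python
from enum import Enum
from typing import Dict, List, Tuple


class BankState(Enum):
    # Hypothetical states, chosen for illustration only.
    STABLE = "stable"
    STRESSED = "stressed"
    RUN = "run"
    RESOLVED = "resolved"


# (current state, event type) -> next state. Hypothetical table.
TRANSITIONS: Dict[Tuple[BankState, str], BankState] = {
    (BankState.STABLE, "adverse_disclosure"): BankState.STRESSED,
    (BankState.STRESSED, "deposit_flight"): BankState.RUN,
    (BankState.STRESSED, "backstop_announced"): BankState.STABLE,
    (BankState.RUN, "regulator_intervention"): BankState.RESOLVED,
}


def replay(event_types: List[str],
           start: BankState = BankState.STABLE) -> List[BankState]:
    """Apply events in order and return the full state history."""
    state = start
    history = [state]
    for event_type in event_types:
        # Events with no matching transition leave the state unchanged.
        state = TRANSITIONS.get((state, event_type), state)
        history.append(state)
    return history
```

Because the transition table is plain data with no randomness or model calls, a replay can be rerun and compared state by state.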

Public posting standard

Anything exported for posting on X should satisfy all of the following:

the underlying case is source-backed
the forecast question is explicit
the resolution rule is explicit
the score is reproducible
uncertainty is visible
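
One way to apply this checklist is a gate that names every unmet condition before export. The sketch below assumes a hypothetical `ExportCandidate` record and treats "uncertainty is visible" as "a probability is stated", which is an interpretation rather than a rule taken from this document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ExportCandidate:
    case_is_source_backed: bool
    question_text: str
    resolution_rule: str
    score_is_reproducible: bool
    stated_probability: Optional[float]  # None means no uncertainty shown


def failed_checks(candidate: ExportCandidate) -> List[str]:
    """Return the names of the posting conditions that are not met."""
    checks = {
        "case is source-backed": candidate.case_is_source_backed,
        "forecast question is explicit": bool(candidate.question_text.strip()),
        "resolution rule is explicit": bool(candidate.resolution_rule.strip()),
        "score is reproducible": candidate.score_is_reproducible,
        "uncertainty is visible": candidate.stated_probability is not None,
    }
    return [name for name, passed in checks.items() if not passed]
```

A candidate is ready to export only when `failed_checks` returns an empty list.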

The product should prefer a few defensible public claims over many flashy but weak ones.