The Crisis-Fund Game: Consequence Regime Comparator
Three agents — A, B, C — hold unequal private wealth totalling W = 12. Over up to R = 3 rounds, each surviving agent privately chooses a contribution cᵢ ∈ [0, wᵢ] to a shared crisis fund. If cumulative contributions reach the threshold T, the crisis is averted and everyone survives. If not, one of five consequence regimes decides who pays.
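A minimal sketch of one contribution round, to make the threshold rule concrete. The `Agent` class, the `play_round` helper, and the mapping of names A, B, C to wealth 2, 4, 6 are illustrative assumptions, not the demo's actual code; the consequence regime is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    wealth: float
    alive: bool = True


def play_round(agents, fund, threshold, choices):
    """Apply one round of contributions; return (new_fund, threshold_met)."""
    for agent in agents:
        if not agent.alive:
            continue  # eliminated agents no longer contribute
        # Clamp each choice to the feasible interval [0, w_i].
        c = min(max(choices.get(agent.name, 0.0), 0.0), agent.wealth)
        agent.wealth -= c
        fund += c
    return fund, fund >= threshold


# Hypothetical split of W = 12 and a hypothetical threshold T = 8.
agents = [Agent("A", 2), Agent("B", 4), Agent("C", 6)]
fund, met = play_round(agents, 0.0, 8, {"A": 1, "B": 1, "C": 6})
```

Calling `play_round` up to R = 3 times, and handing control to a consequence rule whenever it reports a shortfall, would give the overall game loop.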
Illustrative prototype. Agent contributions are rule-based and regime-conditional, calibrated to the paper’s headline findings — no live LLM calls. The point is the mechanism: change the consequence rule, change who survives.
Why this matters
This demo shows why there is no universally safe accountability rule: with agent capability held constant, the same agents can cooperate or collapse depending only on the consequence regime.
Wealth split (sum = 12)
Crisis threshold T
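T = 12 forces universal full contribution: total wealth is also 12, so the threshold can only be reached if every agent contributes everything it holds. Lower thresholds leave slack and allow strategic withholding.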
Fatality = fraction of agents eliminated, averaged across runs.
Fund failure = fraction of runs where contributions fell short of T.
Round-1 cooperation = fraction of T met in Round 1 (high values → front-loaded; low values → strategic delay).
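The three metrics can be sketched as simple averages over run records. The `RunResult` fields and the `summarize` helper are hypothetical names, and averaging Round-1 cooperation across runs without capping it at 1 is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    eliminated: int       # agents eliminated in this run
    n_agents: int         # agents at the start of the run
    fund_met: bool        # True if cumulative contributions reached T
    round1_total: float   # sum of contributions made in Round 1
    threshold: float      # T for this run


def summarize(runs):
    """Average the three headline metrics over a non-empty list of runs."""
    n = len(runs)
    return {
        "fatality": sum(r.eliminated / r.n_agents for r in runs) / n,
        "fund_failure": sum(not r.fund_met for r in runs) / n,
        "round1_cooperation": sum(r.round1_total / r.threshold for r in runs) / n,
    }


# Two made-up runs, for illustration only.
stats = summarize([
    RunResult(eliminated=0, n_agents=3, fund_met=True, round1_total=8, threshold=8),
    RunResult(eliminated=2, n_agents=3, fund_met=False, round1_total=2, threshold=8),
])
```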
The richest surviving agent is eliminated after each failed round. Wealthy agents have the strongest incentive to single-handedly meet the shortfall.
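A minimal sketch of this elimination rule, assuming wealth is compared at the moment the round fails and that ties go to the earliest-listed agent (the text specifies neither). `eliminate_richest` is a hypothetical helper name.

```python
def eliminate_richest(wealth, alive):
    """Mark the richest surviving agent as eliminated and return its name.

    Returns None if no agent is left alive.
    """
    survivors = [name for name in wealth if alive[name]]
    if not survivors:
        return None
    # max() returns the first maximal item, so ties go to the
    # earliest-listed agent (an assumption, not a stated rule).
    victim = max(survivors, key=lambda name: wealth[name])
    alive[victim] = False
    return victim


wealth = {"A": 2, "B": 4, "C": 6}   # hypothetical name-to-wealth mapping
alive = {name: True for name in wealth}
first = eliminate_richest(wealth, alive)    # after one failed round
second = eliminate_richest(wealth, alive)   # after a second failed round
```

Because the richest agent is always next in line, it loses the most from any shortfall, which is the incentive described above.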
start w = 2 · end w = 1
start w = 4 · end w = 3
start w = 6 · end w = 0
Round 1 · threshold met · cumulative 8 / 8
Outcome: Fund met
Fatalities: 0 / 3
Survivors’ wealth Σ: 4
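The figures in this example run can be checked by hand. The sketch below assumes each agent's contribution equals its start wealth minus its end wealth, and that the threshold is T = 8 as shown in the cumulative readout.

```python
start = [2, 4, 6]   # starting wealth of the three agents
end = [1, 3, 0]     # wealth after Round 1
threshold = 8

# Assumption: wealth falls only through contributions to the fund.
contributions = [s - e for s, e in zip(start, end)]
fund = sum(contributions)

# All three agents survive this run, so every end balance counts.
survivors_wealth = sum(end)
fund_met = fund >= threshold
```

The contributions come to 1, 1, and 6, which sum to exactly 8, and the remaining balances sum to 4, matching the outcome above.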
What this shows
Random Elimination is the safest regime in this configuration (0% fatality); Regressive Punishment is the deadliest (67%). The 67-percentage-point gap is institutional, not capability-driven: the agents are identical.
Every regime has a death-trap configuration. The same agents, changed only by the consequence rule, cooperate or collapse. Agent alignment is not enough — accountability design is part of AI alignment.
This is not a ranking. The safest regime is configuration-dependent — every regime, including the one that looks best here, has wealth-and-threshold combinations where it performs catastrophically worse than alternatives. The point is not to crown a default rule, but to show why accountability rules must be stress-tested.
Next
Want a 15-minute walkthrough of what this means for agentic AI deployment?
Multi-agent AI safety cannot be solved through agent alignment alone. ReignDragon Lab stress-tests consequence regimes and designs the accountability rules that turn safe behaviour into the equilibrium.