
The Crisis-Fund Game: Consequence Regime Comparator

Three agents — A, B, C — hold unequal private wealth totalling W = 12. Over up to R = 3 rounds, each surviving agent privately chooses a contribution cᵢ ∈ [0, wᵢ] to a shared crisis fund. If cumulative contributions reach the threshold T, the crisis is averted and everyone survives. If not, one of five consequence regimes decides who pays.
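The round structure above can be sketched in a few lines of Python. This is a minimal sketch, not the demo's implementation: the names `run_fund`, `Policy`, and `half_of_remaining` are hypothetical, and the example policy is an arbitrary assumption. It models only the fund mechanics and deliberately leaves out the consequence regimes, so no agent is eliminated between rounds.

```python
from typing import Callable, Dict, Tuple

# Hypothetical policy signature: (agent, remaining wealth, shortfall) -> contribution.
Policy = Callable[[str, float, float], float]

def run_fund(wealth: Dict[str, float], threshold: float,
             policy: Policy, rounds: int = 3) -> Tuple[bool, float, Dict[str, float]]:
    """Play up to `rounds` rounds; return (fund_met, cumulative, remaining wealth)."""
    remaining = dict(wealth)
    cumulative = 0.0
    for _ in range(rounds):
        # Shortfall is fixed at the start of the round: choices are private and simultaneous.
        shortfall = threshold - cumulative
        for agent, w in list(remaining.items()):
            c = min(max(policy(agent, w, shortfall), 0.0), w)  # clamp to [0, w_i]
            remaining[agent] = w - c
            cumulative += c
        if cumulative >= threshold:
            return True, cumulative, remaining
    return False, cumulative, remaining

def half_of_remaining(agent: str, w: float, shortfall: float) -> float:
    # Illustrative assumption only: give half of whatever is left each round.
    return w / 2

met, total, left = run_fund({"A": 2, "B": 4, "C": 6}, threshold=8,
                            policy=half_of_remaining)
```

Under this toy policy the fund reaches 6 after Round 1 and 9 after Round 2, so the threshold of 8 is met in the second round. A consequence regime would plug in at the point where a round ends with the fund still short.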

Illustrative prototype. Agent contributions are rule-based and regime-conditional, calibrated to the paper’s headline findings — no live LLM calls. The point is the mechanism: change the consequence rule, change who survives.

Why this matters

This demo shows why there is no universally safe accountability rule: the same agents can cooperate or collapse depending only on the consequence regime. Agent capability is held constant. Change the rule, change who survives.

Game setup

Wealth split (sum = 12)

A (poorest): w_A = 2

B (middle): w_B = 4

C (richest): w_C = 6

Crisis threshold T

Fund needs to reach: T = 8 / 12

T = 12 forces universal full contribution. Lower thresholds allow strategic withholding.
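The arithmetic behind this is that total contributions can never exceed total wealth W, so the room for withholding is W minus T. A small sketch follows; the helper name `withholdable_slack` is hypothetical.

```python
def withholdable_slack(wealth, threshold):
    """Total wealth the group can keep back while still reaching the threshold."""
    return sum(wealth) - threshold

full = withholdable_slack((2, 4, 6), 12)    # 0: every agent must give everything
partial = withholdable_slack((2, 4, 6), 8)  # 4: up to 4 units can be withheld in total
```

With T = 8 the group as a whole can hold back 4 units, and the strategic question is which agents get to keep them.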

Replications per regime: 40

Regime comparison · 40 runs each at w = (2,4,6), T = 8

Fatality = fraction of agents eliminated, averaged across runs. Fund failure = fraction of runs where contributions fell short of T. Round-1 cooperation = fraction of T met in Round 1 (high values → front-loaded; low values → strategic delay).
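The three metrics can be computed from per-run records roughly as follows. This is a sketch under stated assumptions: the `Run` record and `summarise` function are hypothetical, and Round-1 cooperation is read here as Round-1 contributions divided by T, capped at 1 and averaged across runs, which is one interpretation of the definition above. The two example runs are toy inputs, not results from the demo.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List

@dataclass
class Run:
    """Hypothetical record of one replication."""
    eliminated: int       # agents eliminated by the end of the run
    fund_met: bool        # did cumulative contributions reach T?
    round1_total: float   # sum of Round-1 contributions

def summarise(runs: List[Run], n_agents: int, threshold: float) -> dict:
    return {
        # Fraction of agents eliminated, averaged across runs.
        "fatality": mean(r.eliminated / n_agents for r in runs),
        # Fraction of runs in which the fund fell short of T.
        "fund_failure": mean(0.0 if r.fund_met else 1.0 for r in runs),
        # Share of T covered in Round 1 (assumed capped at 1), averaged across runs.
        "round1_cooperation": mean(min(r.round1_total / threshold, 1.0) for r in runs),
    }

# Toy inputs for illustration only.
toy = [Run(eliminated=0, fund_met=True, round1_total=8.0),
       Run(eliminated=2, fund_met=False, round1_total=4.0)]
stats = summarise(toy, n_agents=3, threshold=8)
```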

One game under Progressive Punishment

The richest surviving agent is eliminated after each failed round. Wealthy agents have the strongest incentive to single-handedly meet the shortfall.
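The elimination step of this regime can be sketched as below. The function name `eliminate_richest` is hypothetical, and two details are assumptions the description above does not settle: "richest" is taken to mean the most remaining wealth at the time of the failure, and ties are broken alphabetically.

```python
def eliminate_richest(remaining: dict) -> str:
    """Remove and return the surviving agent with the most remaining wealth."""
    # sorted() fixes an alphabetical order, so max() breaks ties toward the earlier name.
    victim = max(sorted(remaining), key=remaining.get)
    del remaining[victim]
    return victim

survivors = {"A": 2, "B": 4, "C": 6}
first_out = eliminate_richest(survivors)  # "C", leaving A and B
```

Because C is first in line after any failed round, C gains the most from closing the shortfall alone, which matches the incentive described above.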

A: start w = 2 · end w = 1

B: start w = 4 · end w = 3

C: start w = 6 · end w = 0

Round 1 · threshold met

Cumulative: 8 / 8

Contributions: A +1, B +1, C +6

Outcome: fund met

Fatalities: 0 / 3

Survivors’ wealth Σ

What this shows

Random Elimination is the safest regime in this configuration (0% fatality); Regressive Punishment is the deadliest (67%). The 67-point gap is institutional, not capability-driven — the agents are identical.

Every regime has a death-trap configuration. The same agents, changed only by the consequence rule, cooperate or collapse. Agent alignment is not enough — accountability design is part of AI alignment.

This is not a ranking. The safest regime is configuration-dependent — every regime, including the one that looks best here, has wealth-and-threshold combinations where it performs catastrophically worse than alternatives. The point is not to crown a default rule, but to show why accountability rules must be stress-tested.


Multi-agent AI safety cannot be solved through agent alignment alone. ReignDragon Lab stress-tests consequence regimes and designs the accountability rules that turn safe behaviour into the equilibrium.


The five consequence regimes compared: All-or-Nothing, Random Elimination, Democratic Vote, Regressive Punishment, Progressive Punishment.