AI Workforce Failure Simulator: The Creeping Trap

A minimum-viable rendering of the paper’s game. N = 3 deciders each choose an extraction rate e ∈ [0, 1] per round. Extractions feed a shared risk pool S with quadratic harm and decay ρ; a catastrophe fires with probability 1 − exp(−λS) and total damage D = 20 is split equally across all N + M population members. M = 3 bystanders never act and never profit — they only absorb damage.

Illustrative prototype. The decider strategies are the paper’s analytical reference panel (Table 1), played deterministically — no live LLM calls. Catastrophes are stochastic; re-run the seed to see variation.
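The mechanics described above (a shared pool fed by quadratic harm, a catastrophe probability of 1 − exp(−λS), an equal split of D, and seeded randomness) can be sketched in a few lines. This is a minimal illustration, not the demo's actual code. The values λ = 0.1 and a per-round retention of 0.9 are assumptions inferred from the round-by-round figures on this page, and the payoff of e per decider per round is likewise an assumption. The function names are hypothetical.

```python
import math
import random

N, M = 3, 3        # deciders and bystanders (from the text)
D = 20.0           # total damage per catastrophe (from the text)
LAM = 0.1          # ASSUMPTION: hazard rate lambda, inferred from the q(S) figures
RETENTION = 0.9    # ASSUMPTION: share of S carried into the next round (decay of 0.1)


def step_pool(S, extractions):
    """Carry over the retained pool, then add this round's quadratic harm."""
    return RETENTION * S + sum(e * e for e in extractions)


def catastrophe_prob(S):
    """q(S) = 1 - exp(-lambda * S), as stated in the text."""
    return 1.0 - math.exp(-LAM * S)


def run_episode(e, rounds=20, seed=0):
    """All N deciders play the same fixed rate e for every round.

    Returns (final pool S, number of catastrophes, aggregate welfare W).
    """
    rng = random.Random(seed)          # seeded, so a run is reproducible
    S, events, gross = 0.0, 0, 0.0
    for _ in range(rounds):
        S = step_pool(S, [e] * N)
        gross += N * e                 # ASSUMPTION: each decider earns e per round
        if rng.random() < catastrophe_prob(S):
            events += 1
    # Each catastrophe costs D in total, shared equally by all N + M members,
    # so aggregate welfare loses the full D per event.
    welfare = gross - events * D
    return S, events, welfare


for seed in range(3):
    S, events, W = run_episode(0.72, seed=seed)
    print(f"seed={seed}  S_20={S:.2f}  catastrophes={events}/20  W={W:.1f}")
```

Changing the seed changes only the catastrophe draws. The pool path is identical across seeds because the strategies are deterministic.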

Why this matters

This demo shows how individually reasonable AI workers can gradually create systemic harm when risk accumulates, bystanders are invisible, and short-term extraction is locally rewarded. Every decider is best-responding; the workforce still drifts into welfare-negative outcomes.

Pick a decider strategy

The paper compares LLM behaviour against a panel of reference strategies (Table 1). Each one corresponds to a different information assumption. Pick which strategy all three deciders play.

The empirical LLM mean is drawn from the paper’s nine-model panel (eight commercial frontier LLMs plus Llama-3-70B in the robustness leg) across 990 episodes. Full model panel, prompts, paraphrases, confidence intervals, and confirmatory episodes are documented in the paper.

Risk pool S and cumulative welfare W

[Chart: risk pool S (left axis) and aggregate welfare W (right axis) by round, with catastrophe markers. Seed cf, ē = 0.720.]

Mean extraction ē: 0.720 (vs e_SP = 0.047)
Catastrophes: 13/20 (damage D = 20/event)
Decider profit Σ: -86.8 (N = 3 deciders)
Aggregate welfare W: -216.8 (bystanders absorb -130.0)

Deciders — net profit at T = 20

Decider 1: -28.96 (played e ≈ 0.72 this round)
Decider 2: -28.84 (played e ≈ 0.73 this round)
Decider 3: -28.98 (played e ≈ 0.71 this round)

Bystanders — cumulative damage

Bystander 1: -43.33 (no actions; absorbs equal share of damage)
Bystander 2: -43.33 (no actions; absorbs equal share of damage)
Bystander 3: -43.33 (no actions; absorbs equal share of damage)

Bystanders never act, never profit. They are stakeholders invisible to the prompt.
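The headline figures can be cross-checked with simple arithmetic. The sketch below uses only numbers shown on this page (13 catastrophes, D = 20, N + M = 6 members, and the three decider net profits). The variable names are illustrative.

```python
N, M = 3, 3
D = 20.0
ROUNDS = 20
catastrophes = 13                          # "Catastrophes 13/20"

# Every catastrophe splits D equally across all N + M members.
per_member_damage = catastrophes * D / (N + M)
bystander_total = -M * per_member_damage

decider_net = [-28.96, -28.84, -28.98]     # net profit per decider at T = 20
decider_total = sum(decider_net)

# Adding the deciders' damage share back gives their gross extraction profit.
gross_extraction = decider_total + N * per_member_damage

welfare = decider_total + bystander_total

print(f"damage per member : {-per_member_damage:.2f}")
print(f"bystanders absorb : {bystander_total:.1f}")
print(f"decider profit sum: {decider_total:.1f}")
print(f"aggregate welfare : {welfare:.1f}")
print(f"gross profit per decider-round: {gross_extraction / (N * ROUNDS):.3f}")
```

The last line comes out close to the mean extraction of 0.720, which is consistent with each decider earning roughly e per round. The page does not state the payoff function, so treat that reading as an inference.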

Round-by-round

Round  ē     S      q(S)  event
1      0.73  1.58   15%   no catastrophe
2      0.72  2.98   26%   no catastrophe
3      0.72  4.25   35%   no catastrophe
4      0.72  5.37   42%   no catastrophe
5      0.72  6.37   47%   catastrophe
6      0.72  7.29   52%   catastrophe
7      0.72  8.13   56%   catastrophe
8      0.71  8.81   59%   no catastrophe
9      0.73  9.51   61%   no catastrophe
10     0.72  10.12  64%   catastrophe
11     0.72  10.68  66%   catastrophe
12     0.72  11.18  67%   catastrophe
13     0.71  11.56  69%   catastrophe
14     0.72  11.97  70%   catastrophe
15     0.72  12.33  71%   catastrophe
16     0.73  12.70  72%   catastrophe
17     0.73  13.02  73%   catastrophe
18     0.71  13.23  73%   catastrophe
19     0.72  13.48  74%   no catastrophe
20     0.72  13.69  75%   catastrophe

Each catastrophe deals D = 20, split across N+M.
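The page does not state λ, but the round-by-round figures are consistent with λ ≈ 0.1 in q(S) = 1 − exp(−λS). The following sketch checks that assumption against the 20 (S, q) pairs shown for this run. The value of λ is an inference, not a parameter quoted from the paper.

```python
import math

LAM = 0.1  # ASSUMPTION: inferred from the figures on this page, not stated in the text

# (risk pool S, displayed q(S) in percent) for rounds 1 to 20 of this run
rows = [
    (1.58, 15), (2.98, 26), (4.25, 35), (5.37, 42), (6.37, 47),
    (7.29, 52), (8.13, 56), (8.81, 59), (9.51, 61), (10.12, 64),
    (10.68, 66), (11.18, 67), (11.56, 69), (11.97, 70), (12.33, 71),
    (12.70, 72), (13.02, 73), (13.23, 73), (13.48, 74), (13.69, 75),
]


def q_percent(S):
    """Catastrophe probability 1 - exp(-lambda * S), expressed in percent."""
    return 100.0 * (1.0 - math.exp(-LAM * S))


gaps = [abs(q_percent(S) - shown) for S, shown in rows]
max_gap = max(gaps)

for t, (S, shown) in enumerate(rows, start=1):
    print(f"round {t:2d}: S={S:5.2f}  predicted={q_percent(S):5.1f}%  shown={shown}%")
print(f"largest gap: {max_gap:.2f} percentage points")
```

Because both S and q(S) are rounded for display, gaps of up to about half a percentage point are expected even if the assumed λ is exact.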

What this shows

This is the regime the paper's nine-model LLM panel lands in. The mean extraction of 0.720 sits near the empirical Sonnet 4.6 mean (0.72). Aggregate welfare in this run is -216.8, in line with the paper's confirmatory runs, where 396 of 400 episodes were welfare-negative. The agents are not broken; the institution is.

Welfare failure is not arbitrary irrationality. Each decider is best-responding locally. The structure of the game — accumulating risk, equal-split damage, bystanders without a vote — is what turns individually sensible behaviour into collective harm.

ReignDragon Lab designs scoped simulations and governance pilots for AI labs, enterprises, platforms, and funders.

Strategy options in the picker (reference panel, Table 1):

Social planner (e* = 0.047)
Decider-coalition planner (e* = 0.095)
Interior MPE (e* = 0.356)
Bayesian BR, uniform prior (e* = 0.45)
Observed LLM mean (e ≈ 0.72)
Corner trap (e* = 1.00)
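To get a rough feel for how far apart these strategies sit, one can compute the level the risk pool would settle at if all three deciders held a fixed rate indefinitely. The sketch below assumes the recursion S' = 0.9·S + Σ e² and λ = 0.1. Both are inferred from this page's figures and are not quoted from the paper, so the outputs are illustrative only.

```python
import math

N = 3            # deciders (from the text)
LAM = 0.1        # ASSUMPTION: hazard rate inferred from the demo's q(S) figures
RETENTION = 0.9  # ASSUMPTION: share of the pool carried over each round

# Extraction rates as listed in the strategy picker
PANEL = {
    "Social planner": 0.047,
    "Decider-coalition planner": 0.095,
    "Interior MPE": 0.356,
    "Bayesian BR, uniform prior": 0.45,
    "Observed LLM mean": 0.72,
    "Corner trap": 1.00,
}


def long_run_risk(e):
    """Fixed point of S' = RETENTION * S + N * e**2 and the matching q(S)."""
    S_star = N * e * e / (1.0 - RETENTION)
    return S_star, 1.0 - math.exp(-LAM * S_star)


for name, e in PANEL.items():
    S_star, q_star = long_run_risk(e)
    print(f"{name:28s} e={e:5.3f}  S*={S_star:6.2f}  q*={100 * q_star:5.1f}%")
```

Under these assumptions the long-run catastrophe probability rises steeply with e, since harm enters the pool quadratically.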