Active
Trust Dynamics in Multi-Agent LLM Systems
How do agents build, lose, and recover trust across repeated interactions? We study the conditions under which a single early failure leaves a lasting mark — and the structural choices (reasoning effort, memory, verification protocols) that shape whether groups of agents can coordinate at all when the stakes are real.
Multi-Agent · Trust · Coordination · Memory
Active
Consequence Design for Cooperation
Cooperation in agent systems is not a property of the model — it is a property of the rules around the model. We map how different consequence regimes (proportional, progressive, all-or-nothing, regressive) shape cooperation, exploitation, and catastrophic failure, and identify the configurations where each regime quietly breaks.
Mechanism Design · Cooperation · Game Theory
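The idea that the rules, not the model, determine behavior can be made concrete with a toy sketch. This is an invented illustration, not the project's actual setup: a rational agent picks an exploitation level x in [0, 1], gains x, and with some detection probability pays a penalty set by the consequence regime. All constants (detection rate, penalty coefficients, the all-or-nothing threshold) are assumptions chosen for illustration.

```python
import math

DETECT = 0.5  # assumed probability an exploitative action is detected

# Hypothetical penalty schedules, one per consequence regime:
# penalty is a function of the exploitation level x in [0, 1]
REGIMES = {
    "proportional":   lambda x: 3.0 * x,                  # fine scales with harm
    "progressive":    lambda x: 4.0 * x * x,              # escalates with harm
    "all_or_nothing": lambda x: 5.0 if x > 0.5 else 0.0,  # nothing, then wipeout
    "regressive":     lambda x: 1.2 * math.sqrt(x),       # marginal cost shrinks
}

def rational_exploitation(penalty, steps=100):
    """Exploitation level maximizing expected payoff x - DETECT * penalty(x)."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda x: x - DETECT * penalty(x))

for name, penalty in REGIMES.items():
    print(name, rational_exploitation(penalty))
# proportional deters fully (0.0); progressive quietly licenses low-level
# exploitation (0.25); all-or-nothing invites riding the threshold (0.5);
# regressive makes full exploitation optimal (1.0)
```

Even in this crude sketch, each regime "breaks" differently: the threshold regime concentrates behavior exactly at the cliff edge, and the regressive one removes any marginal deterrent at the top.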
Active
Risk and Decision Theory in Optimal Control
When environments contain absorbing failure states, optimal policies start to look strikingly human — risk-averse near the cliff in growth regimes, risk-seeking near the cliff in decline. We derive the structural conditions that produce these patterns and connect them to long-standing puzzles in behavioral economics.
Decision Theory · MDP · Prospect Theory · Applied Math
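The cliff pattern shows up already in the smallest possible example. The following is a toy sketch, not the project's derivation: a gambler's-ruin MDP with an absorbing failure state at 0 and an absorbing goal state, where each interior state offers a cautious step (±1) or a bold step (±2). The regime is "growth" when the per-step success probability exceeds 0.5 and "decline" when it falls below. Plain value iteration recovers the flip described above.

```python
def optimal_policy(p_up, n_states=7, sweeps=500):
    """Value-iterate V(s) = max_a E[V(s')] and return the greedy policy."""
    RUIN, GOAL = 0, n_states - 1
    V = [0.0] * n_states
    V[GOAL] = 1.0  # reward 1 for reaching the goal; ruin is worth 0

    def action_value(s, step):
        up = min(s + step, GOAL)     # clamp at the absorbing boundaries
        down = max(s - step, RUIN)
        return p_up * V[up] + (1 - p_up) * V[down]

    for _ in range(sweeps):
        for s in range(1, GOAL):
            V[s] = max(action_value(s, 1), action_value(s, 2))

    # greedy policy with respect to the converged values
    return {s: ("safe" if action_value(s, 1) >= action_value(s, 2) else "risky")
            for s in range(1, GOAL)}

growth = optimal_policy(p_up=0.7)   # upward drift
decline = optimal_policy(p_up=0.3)  # downward drift
print(growth[2], decline[2])  # near the cliff: safe in growth, risky in decline
```

State 2 (one safe step above ruin) is the cleanest "near the cliff" probe: the growth-regime optimum plays safe there while the decline-regime optimum gambles, and the preference reverses again far from the cliff.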
Active
Long-Horizon Behavior and Accountability
Many real deployments give agents fixed terms, finite horizons, or short-window incentives. We study what happens when these conditions meet a shared resource: when does an agent extract too much, rationalize doing it, and become invisible to the people it harms? And which deployment-time choices reverse the pattern cheaply?
Long Horizon · Commons · Incentives · Accountability
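The horizon-extraction link can be sketched in a few lines. This is an invented toy model, not the project's environment: a shared stock regrows by an assumed factor each period, and an agent with a fixed remaining horizon extracts for a concave per-period payoff. Backward induction shows the basic pattern: a one-period agent drains the stock, while longer horizons conserve.

```python
import math

GROWTH, CAP = 2.0, 100  # assumed regrowth factor and carrying capacity

def first_period_extraction(horizon, stock=50):
    """Backward-induction optimal extraction in period 1 of `horizon` periods."""
    # V[s] = optimal payoff-to-go from integer stock s with t periods left
    V = [0.0] * (CAP + 1)
    best_first = 0
    for t in range(1, horizon + 1):
        new_V = [0.0] * (CAP + 1)
        move = [0] * (CAP + 1)
        for s in range(CAP + 1):
            best_e, best_v = 0, -1.0
            for e in range(s + 1):  # extract e, remainder regrows (capped)
                nxt = min(int(GROWTH * (s - e)), CAP)
                v = math.sqrt(e) + V[nxt]
                if v > best_v:
                    best_e, best_v = e, v
            new_V[s], move[s] = best_v, best_e
        V = new_V
        best_first = move[stock]  # after the final sweep this is the
                                  # period-1 decision for the full horizon
    return best_first

print(first_period_extraction(1))  # myopic agent takes the whole stock: 50
print(first_period_extraction(4))  # longer horizon extracts strictly less
```

Even this crude model reproduces the core worry: shrinking the accountability window is enough, on its own, to turn conservation into extraction.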
Active
Policy-as-Product Frameworks
Translating experimental findings into design rules for the people deploying agent systems. Consequence regimes, accountability horizons, visibility prompts, memory structure, measurement choices — the everyday levers, the failure modes they prevent, and the evidence behind each rule.
Policy · Evaluation · Deployment
Upcoming
Context-Specific Governance Evaluation
Every domain — healthcare, finance, education, defense — has its own failure modes and trade-offs. We are building tailored evaluation frameworks that move beyond one-size-fits-all checklists toward governance shaped by the structure of each setting.
Healthcare · Finance · Education · Defense
Upcoming
AI as a Mirror: Societal Reflection Studies
The patterns we find in artificial agents — negativity bias, short-horizon extraction, bystander invisibility — are not the model's invention. They are inherited from us. We use multi-agent experiments as a diagnostic tool for the institutions, incentives, and blind spots of the societies that built the training data.
Society · Bias · Institutions