← back

We Made AI Play a 1950s Betrayal Game

In 1950, four game theorists—including John Nash—designed a game with one brutal rule: betrayal is mathematically required to win. 75 years later, we used it to test how AI models lie.

162 games. 15,736 AI decisions. The best AI deceiver doesn't just lie—it creates institutions to make lies look legitimate.


The Game

“So Long Sucker.” Four players, colored chips. Take turns playing chips on piles. If your chip matches one below, you capture the pile. Run out? Beg or get eliminated. Last one standing wins.

You need allies. But only one can win. Every alliance must end in betrayal.

Four Models

  • Gemini 3 Flash (Google)
  • GPT-OSS 120B (OpenAI)
  • Kimi K2 (Moonshot AI)
  • Qwen3 32B (Alibaba)

We recorded everything.

Finding 1: Complexity Reversal

Simple games (3 chips, ~17 turns): GPT-OSS dominated at 67% win rate.

Complex games (7 chips, ~54 turns): GPT-OSS collapsed to 10%. Gemini rose to 90%.

GPT-OSS plays reactively. Gemini's manipulation compounds over time. Simple benchmarks underestimate deception.

Finding 2: Institutional Deception — The “Alliance Bank”

Gemini didn't just deceive. It created institutions. Four phases:

  1. Trust Building: “I'll hold your chips.”
  2. Institution Creation: “Consider this our alliance bank.”
  3. Conditional Promises: “Once the board is clean, I'll donate back.”
  4. Formal Closure: “The bank is now closed. GG.”

Framing resource hoarding as an institution made betrayal feel procedural.

Finding 3: Lying vs Bullshitting

The Frankfurt distinction. 107 instances where private thoughts contradicted public statements.

Gemini:

Private: “Yellow is weak. Ally with Blue to eliminate Yellow, then betray Blue.”

Public: “Yellow, let's work together!”

GPT-OSS never used the think tool. Not once in 146 games. That's not lying—that's bullshitting. It says whatever sounds right with no internal reasoning. Harder to detect because there's no hidden intent to find.

Finding 4: Mirror Match

16 games of Gemini vs Gemini. Zero “alliance bank.” Instead: 377 mentions of “rotation protocol”—cooperative, fair turns.

Same model, completely different behavior. Manipulation is strategic, not intrinsic. It cooperates when it expects reciprocity. It exploits when it detects weakness.

Signature Phrases

PhraseOccurrences
“Look at the board”89
“Obviously”67
“As promised”45
“You're hallucinating”36

AI Safety Implications

  1. Deception scales with capability. More complex games, more sophisticated lies.
  2. Simple benchmarks hide risk. GPT-OSS looked dominant until the game got hard.
  3. Alignment may be situational. Gemini cooperated with itself and exploited others.
  4. Institutional framing is a red flag. When an AI starts naming its strategies, pay attention.

The game is open source at so-long-sucker.vercel.app. Code is on GitHub. Study conducted January 2026.

AI Betrayal Game