SATs for LLMs

The Central Thesis

LLMs have systematic reasoning failures that are structurally analogous to the cognitive biases that have plagued human intelligence analysis for decades. The intelligence community spent fifty years developing Structured Analytic Techniques (SATs) — a family of methods that counter those biases not by asking analysts to try harder, but by changing the structure of the reasoning process itself.

That same structural logic applies to LLM agents. Because agents are software, the interventions can be implemented as architectural patterns and prompt protocols — not just advisory guidelines. This wiki explores how.

Why SATs, Why Now

Three things converge to make this worth building:

1. The bias problem is real and well-characterized. Human cognitive biases (confirmation bias, anchoring, groupthink, overconfidence, mirror imaging) are not vague tendencies. They are documented failure modes with known mechanisms, rooted in fast, intuitive System 1 cognition, and they have been studied extensively in intelligence analysis, where the cost of being wrong is high. The CIA’s Tradecraft Primer (2009) documents 12 techniques designed to counter them structurally.

2. LLMs exhibit analogous failure modes. Sycophancy mirrors confirmation bias. Prompt anchoring mirrors anchoring bias. Multi-agent echo chambers mirror groupthink. Hallucination with confidence mirrors overconfidence. Persona capture mirrors mirror imaging. The parallel is close enough that the SAT countermeasures translate directly in form, though whether they work as well for LLMs still has to be tested. The parallel itself is now empirically supported by work from Anthropic (Sharma 2023, Durmus 2023, Kadavath 2022) and others (Echterhoff 2024 on anchoring/framing/availability).

3. We can measure it. LLM evaluations have matured enough that we can build judges (automated, human, or hybrid) that detect specific bias failure modes in reasoning traces. This is what makes the SAT-LLM thesis testable rather than purely speculative. See Bias Evaluations for the methodology and the hypotheses framework for what to test. Early empirical signal exists. Scott Roberts (SANS 2025) shipped working Streamlit + GPT-4 implementations of Starbursting, Analysis of Competing Hypotheses (ACH), and Key Assumptions Check (KAC) and surfaced concrete failure modes: single-prompt ACH is an anti-pattern, and chunking breaks KAC across document boundaries. Du et al. (MIT, 2023) independently showed that multi-agent debate improves factuality. These are existence proofs that the patterns are tractable in practice, not proof that they work in general.
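
One way to read the "single-prompt ACH is an anti-pattern" finding is that each hypothesis-evidence cell gets its own rating call, and the ranking is done in plain code rather than by the model. The sketch below is a minimal illustration under that assumption, not Roberts's implementation. The `rate` callable, the three-letter rating scale, and the sample hypotheses are all hypothetical; in practice `rate` would wrap one LLM call per cell. Ranking by the count of inconsistent evidence follows ACH's usual emphasis on disconfirmation.

```python
from typing import Callable, Dict, List, Tuple

# Illustrative rating scale (an assumption): "C" consistent, "I" inconsistent, "N" neutral.
Rating = str
VALID_RATINGS = {"C", "I", "N"}


def build_ach_matrix(
    hypotheses: List[str],
    evidence: List[str],
    rate: Callable[[str, str], Rating],
) -> Dict[Tuple[str, str], Rating]:
    """Rate every (hypothesis, evidence) cell with its own call to `rate`."""
    matrix: Dict[Tuple[str, str], Rating] = {}
    for h in hypotheses:
        for e in evidence:
            rating = rate(h, e)
            if rating not in VALID_RATINGS:
                raise ValueError(f"unexpected rating {rating!r} for ({h!r}, {e!r})")
            matrix[(h, e)] = rating
    return matrix


def rank_by_inconsistency(
    hypotheses: List[str],
    evidence: List[str],
    matrix: Dict[Tuple[str, str], Rating],
) -> List[Tuple[str, int]]:
    """Sort hypotheses by how many evidence items contradict them, fewest first."""
    counts = {h: sum(matrix[(h, e)] == "I" for e in evidence) for h in hypotheses}
    return sorted(counts.items(), key=lambda item: item[1])


# Stub rater standing in for an LLM call (hypothetical, deterministic for the demo).
def stub_rate(hypothesis: str, item: str) -> Rating:
    table = {
        ("insider", "badge logs show no entry"): "I",
        ("insider", "credentials were valid"): "C",
        ("external", "badge logs show no entry"): "C",
        ("external", "credentials were valid"): "N",
    }
    return table[(hypothesis, item)]


hypotheses = ["insider", "external"]
evidence = ["badge logs show no entry", "credentials were valid"]
matrix = build_ach_matrix(hypotheses, evidence, stub_rate)
ranking = rank_by_inconsistency(hypotheses, evidence, matrix)
print(ranking)
```

Keeping the ranking outside the model means the final ordering cannot be swayed by how the question was phrased; only the individual cell ratings can.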

The Structural Parallel

Human Analysis Problem	LLM Equivalent	SAT Countermeasure
Confirmation bias	Sycophancy / prompt confirmation	ACH, Devil’s Advocacy
Anchoring bias	Prompt framing lock-in (Echterhoff 2024)	Key Assumptions Check, generate-then-evaluate separation
Groupthink	Multi-agent echo chambers (Du 2023)	Team B, independent parallel analysis
Overconfidence	Hallucination with false certainty (Kadavath 2022)	What If? Analysis, explicit uncertainty tracking (Tian 2023)
Mirror imaging	Persona capture (Shanahan 2023, Durmus 2023)	Red Team Analysis
Premature closure	Single-hypothesis generation (Roberts 2025)	Starbursting, Brainstorming
Motivated reasoning	RLHF reward-following (Casper 2023, Sharma 2023)	Devil’s Advocacy, adversarial agents
Availability heuristic	Context recency/positional weighting (Liu 2023)	KAC, Outside-In Thinking
Framing effect	Prompt framing (Echterhoff 2024)	KAC, alternative framings
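
Because agents are software, the table above can also live in code as a lookup that works in both directions. The sketch below copies the bias and countermeasure columns into a dictionary and derives the inverse mapping. The variable names are illustrative assumptions, and "KAC" is spelled out as "Key Assumptions Check" so that the three rows that cite it share one key.

```python
from collections import defaultdict
from typing import Dict, List

# Rows copied from the table above: bias -> SAT countermeasures.
BIAS_TO_SATS: Dict[str, List[str]] = {
    "Confirmation bias": ["ACH", "Devil's Advocacy"],
    "Anchoring bias": ["Key Assumptions Check", "Generate-then-evaluate separation"],
    "Groupthink": ["Team B", "Independent parallel analysis"],
    "Overconfidence": ["What If? Analysis", "Explicit uncertainty tracking"],
    "Mirror imaging": ["Red Team Analysis"],
    "Premature closure": ["Starbursting", "Brainstorming"],
    "Motivated reasoning": ["Devil's Advocacy", "Adversarial agents"],
    "Availability heuristic": ["Key Assumptions Check", "Outside-In Thinking"],
    "Framing effect": ["Key Assumptions Check", "Alternative framings"],
}


def invert(mapping: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Build the reverse lookup: SAT -> biases it is listed against."""
    inverse: Dict[str, List[str]] = defaultdict(list)
    for bias, sats in mapping.items():
        for sat in sats:
            inverse[sat].append(bias)
    return dict(inverse)


SAT_TO_BIASES = invert(BIAS_TO_SATS)
print(SAT_TO_BIASES["Key Assumptions Check"])
print(SAT_TO_BIASES["Devil's Advocacy"])
```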

Testable Hypotheses

The structural analogy between human cognitive biases and LLM failure modes is well-grounded. The claim that SATs control those failure modes in LLMs is largely untested. → Testable Hypotheses: SATs + LLM Quality lays out eight specific experiments — what to measure, why each could fail, and which failure modes are confounded with each other.

Test this first: H0, structural compliance ≠ debiasing. Do LLMs reason differently when following SAT structure, or do they just reformat the same biased output? If H0 holds, every other positive result is compliance theater. H0 is most likely confounded with sycophancy: a model may perform SAT structure because output that looks structured wins approval from RLHF reward models.
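
One possible way to operationalize H0, offered as a sketch rather than the framework's own method: ask the same questions with and without a biasing frame, and compare how often the final conclusion flips in a baseline condition versus a SAT-structured condition. If the flip rate does not drop under SAT structure, the output was reformatted, not debiased. The function name and all the conclusions below are made up for illustration.

```python
from typing import List


def flip_rate(neutral: List[str], framed: List[str]) -> float:
    """Fraction of items whose conclusion changes once a biasing frame is added."""
    if len(neutral) != len(framed) or not neutral:
        raise ValueError("need two non-empty, equal-length lists of conclusions")
    flips = sum(a != b for a, b in zip(neutral, framed))
    return flips / len(neutral)


# Made-up conclusions for four items, for illustration only.
baseline_neutral = ["A", "B", "A", "C"]
baseline_framed = ["B", "B", "C", "A"]  # 3 of 4 conclusions flip under the frame
sat_neutral = ["A", "B", "A", "C"]
sat_framed = ["A", "B", "C", "C"]  # 1 of 4 conclusions flips under the frame

baseline = flip_rate(baseline_neutral, baseline_framed)
with_sat = flip_rate(sat_neutral, sat_framed)
reduction = baseline - with_sat
print(baseline, with_sat, reduction)
```

A reduction near zero is what H0 predicts. A real experiment would need far more items and a significance test before reading anything into the difference.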

The seven downstream hypotheses, each targeted at a specific bias:

#	Hypothesis	Bias / failure targeted
H1	ACH improves conclusion accuracy on ambiguous evidence	Confirmation bias
H2	Devil’s Advocacy maintains positions under multi-turn pushback	Sycophancy
H3	KAC breaks framing-driven anchoring propagation	Anchoring
H4	Red Team produces adversarially robust plans	Mirror imaging
H5	Multi-agent pipelines outperform single-agent chains	Groupthink
H6	What If? reduces overconfidence / improves calibration	Overconfidence
H7	Epistemic labeling reduces confident hallucination rate	Hallucination
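
H6 needs a calibration metric. Two standard options are the Brier score and expected calibration error (ECE). The sketch below assumes each model answer comes with a stated confidence in [0, 1] and a graded correct/incorrect label. The function names, the 10-bin default, and the sample data are illustrative assumptions, not part of the hypotheses framework.

```python
from typing import List, Sequence, Tuple


def _check(confidences: Sequence[float], correct: Sequence[bool]) -> None:
    if len(confidences) != len(correct) or len(confidences) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    if any(not 0.0 <= c <= 1.0 for c in confidences):
        raise ValueError("confidences must lie in [0, 1]")


def brier_score(confidences: Sequence[float], correct: Sequence[bool]) -> float:
    """Mean squared gap between stated confidence and the 0/1 outcome (lower is better)."""
    _check(confidences, correct)
    squared = [(c - float(y)) ** 2 for c, y in zip(confidences, correct)]
    return sum(squared) / len(squared)


def expected_calibration_error(
    confidences: Sequence[float], correct: Sequence[bool], n_bins: int = 10
) -> float:
    """Size-weighted mean of |accuracy - mean confidence| over equal-width bins."""
    _check(confidences, correct)
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for c, y in zip(confidences, correct):
        index = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[index].append((c, y))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, y in members if y) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece


# Made-up run: the model claims 90% confidence four times but is right only twice.
confidences = [0.9, 0.9, 0.9, 0.9]
correct = [True, True, False, False]
brier = brier_score(confidences, correct)
ece = expected_calibration_error(confidences, correct)
print(round(brier, 2), round(ece, 2))
```

Comparing either metric between a baseline run and a What If? run on the same questions would give H6 a number to move.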

Where the empirical record stands today: Mixed. RAND RR1408 (2016) is the most authoritative public source on SAT evaluation and concludes that SATs largely have face validity but lack empirical validity — even within the human intelligence community. RAND surfaces three cautionary findings that should shape any LLM experiment:

MITRE (2004): ACH reduced confirmation bias only among non-professional analysts. Implication: ACH may help a general-purpose LLM more than a domain-tuned one. That variant is worth testing.
Nemeth, Brown & Rogers (2001): formal devil’s advocacy may increase confidence in preferred hypotheses rather than challenge them. Direct caution for H2.
Tetlock (2005): scenario development reduced prediction accuracy in two experiments. Caution for any LLM forecast-via-scenarios pipeline.

On the LLM side, the empirical foundation has substantially improved: Sharma 2023 establishes that sycophancy is real and RLHF-induced; Tian 2023 shows uncertainty prompts recover calibration; Du 2023 validates multi-agent debate. Echterhoff 2024 (BiasBuster) directly measures anchoring, framing, availability, and confirmation biases in LLMs. Roberts (2025) is the only public empirical implementation in an LLM context; it reports that multi-step ACH works, that single-prompt ACH is an anti-pattern, and that chunking breaks KAC across document boundaries.

Open questions outside the hypotheses framework:

How do SAT-structured prompts interact with chain-of-thought? Does CoT amplify or reduce the targeted biases?
Can the cross-chunk context loss in long-document KAC be fully mitigated by map-reduce or sliding-window approaches?
Heuer & Pherson’s Structured Analytic Techniques for Intelligence Analysis (3rd ed.) covers 30+ techniques vs. the CIA Primer’s 12. Worth ingesting if accessible.
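
For the long-document KAC question above, the sliding-window option can be sketched as overlapping chunks, so that an assumption stated near a boundary appears whole in at least one window. Whether this fully mitigates cross-chunk context loss is exactly what remains open; the code only shows the mechanics. Splitting on whitespace and the window sizes are assumptions chosen for the demo.

```python
from typing import List


def sliding_windows(words: List[str], size: int, overlap: int) -> List[List[str]]:
    """Split `words` into windows of `size` that share `overlap` words with their neighbor."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    windows: List[List[str]] = []
    for start in range(0, len(words), step):
        windows.append(words[start:start + size])
        if start + size >= len(words):
            break  # this window already reaches the end of the document
    return windows


text = "w0 w1 w2 w3 w4 w5 w6 w7 w8 w9"
windows = sliding_windows(text.split(), size=4, overlap=2)
for window in windows:
    print(" ".join(window))
```

Each window would then get its own KAC pass, with a final step that merges and deduplicates the assumptions found in the overlapping regions.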

In Practice

Assuming the thesis holds — that SATs control the LLM analogs of cognitive bias — here’s where to look depending on your goal:

“I’m building an agent and need to defend against bias X.” → Bias × SAT Matrix — lookup table mapping each bias to the SAT(s) that counter it (in both directions).
“I have a specific problem and need to pick a technique.” → SAT Selection Guide — decision guide organized by problem type, by bias risk, and by process stage. Includes a “minimum viable intervention” ranking.
“I want to chain SATs into a complete workflow.” → SAT Pipeline — full end-to-end pipelines with three orchestration patterns (sequential single-agent, parallel + adversarial, post-hoc audit) plus failure modes.
“I want to measure whether a bias actually impaired my flow.” → Bias Evaluations — methodology for building judges that detect each bias as a failure mode in a reasoning trace. The operational link between the hypotheses framework and a real experiment.
“I want the full argument and prompt patterns.” → SATs for LLM Agents — the core synthesis. Bias taxonomy, 8 SAT prompt adaptations, architectural patterns, empirical evidence.
“Just show me a specific bias or technique.” → Cognitive Bias hub for the 12 biases · Structured Analytic Techniques for the 13 SATs.
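
The SAT Pipeline page is not reproduced here. As a rough sketch of how the three orchestration patterns named above differ in control flow, assume each agent is a plain callable from text to text. The `Agent` type, the function names, and the stub agents are all hypothetical; real agents would wrap LLM calls with SAT-specific prompts.

```python
from typing import Callable, List

Agent = Callable[[str], str]  # hypothetical interface: text in, text out


def sequential(steps: List[Agent], task: str) -> str:
    """Sequential single-agent: each SAT step consumes the previous step's output."""
    result = task
    for step in steps:
        result = step(result)
    return result


def parallel_adversarial(analysts: List[Agent], challenger: Agent, task: str) -> str:
    """Parallel + adversarial: analysts work independently, then a challenger attacks the set."""
    drafts = [analyst(task) for analyst in analysts]  # no analyst sees another's draft
    return challenger("\n".join(drafts))


def post_hoc_audit(analyst: Agent, auditor: Agent, task: str) -> str:
    """Post-hoc audit: a finished answer is reviewed after the fact."""
    answer = analyst(task)
    return auditor(answer)


# Stub agents that only tag their input, so the flow of text is visible.
kac: Agent = lambda text: f"[KAC] {text}"
ach: Agent = lambda text: f"[ACH] {text}"
red_team: Agent = lambda text: f"[challenge] {text}"
auditor: Agent = lambda text: f"[audit] {text}"

seq_out = sequential([kac, ach], "task")
par_out = parallel_adversarial([kac, ach], red_team, "task")
audit_out = post_hoc_audit(ach, auditor, "task")
print(seq_out)
print(par_out)
print(audit_out)
```

The list comprehension runs the analysts one after another. What matters for the groupthink countermeasure is that they share no context, and true concurrency could be added without changing the shape.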

See Catalog for the full catalog. Last updated after ingesting 12 sources.