Base-Rate Neglect — Orange Pill Wiki
CONCEPT

Base-Rate Neglect

The cognitive bias of ignoring prior probabilities in favor of case-specific information — a systematic error that Tetlock documented in expert forecasters and that AI's narrative fluency amplifies.

Base-rate neglect is the tendency to underweight or ignore the statistical frequency of an outcome in a reference class when making predictions about a specific case. Physicians estimating the probability that a patient with a positive test result actually has a disease often ignore the base rate of the disease in the population, focusing instead on the test's sensitivity. Forecasters predicting whether a specific technology will be transformative often ignore the base rate of transformative-technology predictions coming true, focusing instead on the technology's impressive features. Tetlock's research demonstrated that experts are particularly vulnerable to base-rate neglect because their domain knowledge provides rich case-specific details that overwhelm the austere discipline of considering the prior probability.

In the AI Story


The canonical demonstration involves a screening test with ninety-five-percent sensitivity (it correctly identifies ninety-five percent of sick patients) and a five-percent false-positive rate, applied to a population where the disease prevalence is one in a thousand. A positive result has only about a two-percent probability of indicating actual disease, because the false positives (five percent of 999 healthy patients ≈ 50) overwhelm the true positives (ninety-five percent of one sick patient ≈ 0.95). Most physicians, when given this problem, estimate ninety-five percent — confusing the test's sensitivity with its positive predictive value. The error is not mathematical incompetence but base-rate neglect: the doctors attend to the case-specific information (this patient tested positive) and ignore the prior probability (the disease is rare).
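The arithmetic above can be made explicit in a few lines. This is a minimal sketch of the Bayes calculation, using the article's numbers; the function name is illustrative, not from any standard library.

```python
def positive_predictive_value(sensitivity, false_positive_rate, prevalence):
    """Probability of disease given a positive test, via Bayes' theorem."""
    true_positives = sensitivity * prevalence               # sick patients flagged
    false_positives = false_positive_rate * (1 - prevalence)  # healthy patients flagged
    return true_positives / (true_positives + false_positives)

# Sensitivity 95%, false-positive rate 5%, prevalence 1 in 1,000:
ppv = positive_predictive_value(0.95, 0.05, 0.001)
print(f"{ppv:.1%}")  # roughly 1.9%, not 95%
```

Note that the answer collapses from ninety-five percent to about two percent purely because of the prevalence term — the quantity the intuitive estimate leaves out.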

In technological forecasting, base-rate neglect manifests as the systematic overweighting of a technology's impressive features relative to the historical frequency of technologies with impressive features actually transforming civilization within predicted timeframes. The printing press did transform civilization — but the transformation took a century, not a decade, and the specific predictions made by contemporaries about how it would transform were mostly wrong. Nuclear power, artificial intelligence (1960s version), nanotechnology, virtual reality (1990s version) — each was predicted to revolutionize everything within a generation. The base rate for such predictions coming true on the predicted timeline is low. The superforecaster uses this base rate as the starting estimate and adjusts upward only when the specific technology demonstrates features that genuinely distinguish it from the reference class.

AI tools exacerbate base-rate neglect by providing fluent, detailed, case-specific narratives on demand. Asked to analyze whether a particular AI application will succeed, an LLM generates a sophisticated assessment based on the application's features, the market opportunity, the team's capabilities. The assessment sounds like analysis. It is, in fact, an inside-view simulation that has not been calibrated against the base rate of startup success (low), the base rate of AI application success (lower), or the base rate of confident AI-generated business predictions being accurate (unknown but probably not high). The professional who accepts the AI's analysis without independently considering the base rate has outsourced exactly the cognitive operation that Tetlock's research shows is most essential to accuracy.

Origin

The phenomenon was first systematically documented by Kahneman and Tversky in their 1973 paper 'On the Psychology of Prediction,' which introduced the base-rate fallacy as a violation of Bayesian reasoning. Bayes' theorem specifies how to update a prior probability (the base rate) in light of new evidence (case-specific information), and the theorem is mathematically optimal. Human intuition violates the theorem systematically, underweighting priors and overweighting case-specific data. Subsequent research extended the finding across domains, and Tetlock's forecasting studies provided the largest-scale demonstration that the bias persisted even in experts making predictions in their own fields. The phenomenon is robust, replicable, and resistant to correction through simple education.
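For reference, the theorem the intuition violates can be written, for a hypothesis $H$ and evidence $E$, as:

```latex
P(H \mid E) = \frac{P(E \mid H)\, P(H)}{P(E \mid H)\, P(H) + P(E \mid \neg H)\, P(\neg H)}
```

Here $P(H)$ is the prior — the base rate — and base-rate neglect amounts to computing the numerator while ignoring the $P(H)$ and $P(\neg H)$ terms that weight it.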

Key Ideas

Prior probability anchoring. Every forecast should begin with the base rate — the frequency of the outcome in a reference class — before adjusting for case-specific evidence.

Case-specific seduction. Vivid details, causal narratives, and unique features are cognitively available and emotionally compelling — they overwhelm the abstract base rate even when the base rate is more informative.

Expert vulnerability. Domain experts are more susceptible to base-rate neglect than novices, because expertise provides more case-specific information to attend to.

Bayes' theorem as corrective. Formal application of Bayesian updating forces appropriate weighting of priors and evidence — the mathematics is a discipline the unaided mind does not spontaneously practice.

AI amplification of bias. Large language models generate rich inside-view narratives without base-rate consideration, providing users with sophisticated-sounding analyses that have not been Bayesian-adjusted.

Further reading

  1. Kahneman, D., & Tversky, A. (1973). 'On the Psychology of Prediction.' Psychological Review, 80(4), 237–251.
  2. Bar-Hillel, M. (1980). 'The Base-Rate Fallacy in Probability Judgments.' Acta Psychologica, 44, 211–233.
  3. Koehler, J.J. (1996). 'The Base Rate Fallacy Reconsidered.' Behavioral and Brain Sciences, 19(1), 1–53.
Part of The Orange Pill Wiki · A reference companion to the Orange Pill Cycle.