CONCEPT

Reference Class Forecasting

The methodology — developed by Flyvbjerg and adopted as UK government policy — that corrects planning forecasts by forcing the planner to compare the current project against the statistical distribution of outcomes for the reference class of structurally similar completed projects.

Reference class forecasting is Flyvbjerg's operational remedy for the planning fallacy. The method is simple in structure and extraordinarily effective in practice. Rather than relying on the planner's inside view — the detailed, optimistic narrative of why this particular project will succeed — the forecaster identifies a reference class of structurally similar completed projects and calibrates the current forecast against the empirical distribution of their outcomes. The method works because it replaces the optimism-producing cognitive mode with the outside view: the statistical reality of what actually happened when comparable projects were attempted. The UK government adopted it as official policy for infrastructure appraisal in 2003, and it has since influenced planning practice across dozens of countries.

In the AI Story

The intellectual lineage runs through Kahneman and Tversky's work on the planning fallacy and the distinction between inside and outside views that grew out of it. The inside view is narrative, specific, intuitive; it is the mode in which planners naturally operate and the mode that produces the planning fallacy. The outside view is statistical, comparative, base-rate-sensitive; it is the mode that correctly identifies the planner's project as a member of a population whose outcomes can be empirically characterized. Reference class forecasting is the procedural discipline that forces planners out of the inside view and into the outside view.

The method has three operational steps. First, identify the reference class: the set of structurally similar completed projects whose outcomes are documented. Second, compile the distribution of outcomes in that class: cost overruns, schedule delays, benefit shortfalls. Third, calibrate the current forecast against that distribution, typically by applying an uplift to the naive estimate. The simplest uplift is the average overrun in the reference class; a more careful one is read off the distribution according to how much risk of overrun the decision-maker is willing to accept. The UK Treasury's Supplementary Green Book Guidance specifies standardized uplifts by project category, replacing the prior practice of project-specific estimation with adjustments grounded in the documented record of comparable projects.
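The three steps can be sketched in a few lines of Python. This is a minimal illustration, not the Green Book procedure: the overrun figures are invented, the function names (`mean_uplift`, `uplift_for_risk`, `calibrated_forecast`) are hypothetical, and the risk-based variant uses a plain empirical percentile as an assumed stand-in for however a real appraisal would derive its uplift.

```python
import math
from statistics import mean

# Hypothetical reference class: cost overruns of ten completed projects,
# each expressed as a fraction of its original estimate (0.25 = 25% over).
# These numbers are invented for illustration only.
REFERENCE_OVERRUNS = [-0.05, 0.00, 0.10, 0.15, 0.20, 0.25, 0.40, 0.45, 0.60, 1.00]


def mean_uplift(overruns):
    """Simplest calibration: uplift equal to the average overrun."""
    return mean(overruns)


def uplift_for_risk(overruns, acceptable_risk):
    """Risk-based calibration (an assumption of this sketch).

    Returns the smallest observed overrun that would have covered at least
    (1 - acceptable_risk) of the reference projects.
    """
    if not overruns:
        raise ValueError("reference class is empty")
    if not 0 < acceptable_risk < 1:
        raise ValueError("acceptable_risk must be strictly between 0 and 1")
    ordered = sorted(overruns)
    # How many reference projects the uplifted budget must have covered.
    # round() guards against floating-point noise before taking the ceiling.
    needed = math.ceil(round((1 - acceptable_risk) * len(ordered), 9))
    return ordered[max(needed, 1) - 1]


def calibrated_forecast(naive_estimate, uplift):
    """Apply the uplift to the planner's inside-view estimate."""
    return naive_estimate * (1 + uplift)


if __name__ == "__main__":
    naive = 100.0  # hypothetical inside-view estimate, in any currency unit
    for risk in (0.5, 0.2):
        uplift = uplift_for_risk(REFERENCE_OVERRUNS, risk)
        budget = calibrated_forecast(naive, uplift)
        print(f"acceptable risk {risk:.0%}: uplift {uplift:.0%}, budget {budget:.1f}")
    print(f"mean uplift: {mean_uplift(REFERENCE_OVERRUNS):.0%}")
```

With this invented data, accepting a 50% chance of overrun gives an uplift of 20%, while accepting only a 20% chance raises it to 45%. The point of the sketch is that the uplift comes from what happened to comparable projects, not from the planner's own narrative about this one.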

Applied to AI, reference class forecasting would require proponents of current systems to identify the reference class of previous technologies predicted to achieve general intelligence and calibrate their forecasts against actual outcomes. The exercise would be sobering. The reference class is large, spanning expert systems, connectionist networks, and deep learning, and the outcomes are uniformly disappointing relative to predictions. Honest comparison would reveal that the current claims bear a structural resemblance to those earlier ones. But the exercise is not performed, because uniqueness bias prevents proponents from acknowledging that a reference class exists.

The resistance to reference class forecasting in AI is not accidental. The method's corrective force depends on forcing the comparison that the bias most wants to avoid. Proponents of current systems are not unaware of previous AI winters; they simply insist that this time is categorically different, which is unfalsifiable and is precisely the argumentative structure that produced every previous AI winter. Reference class forecasting would treat the unfalsifiable insistence as evidence of the bias it is designed to correct rather than as evidence that the bias does not apply.

Origin

Flyvbjerg developed the methodology in the early 2000s through his megaproject research. The UK Treasury adopted it as policy through the Supplementary Green Book Guidance in 2003. Flyvbjerg and COWI's 2004 procedures report codified the standardized uplifts that became the operational backbone of UK capital investment appraisal.

Key Ideas

Outside view discipline. The method forces the planner to treat the current project as one member of a population whose outcomes can be empirically characterized.

Three operational steps. Identify the reference class, compile the distribution of outcomes, apply the distribution to calibrate the current forecast.

Empirically validated. Systematic comparison of forecasts produced with and without reference class forecasting shows substantial accuracy improvements across project types.

Resisted through uniqueness bias. The main obstacle to adoption is the planner's insistence that no comparable reference class exists, which is precisely the distortion the method is designed to detect.

Applicable to AI. The reference class of previous general-intelligence predictions exists and is documented; it is simply not applied to current claims.

Appears in the Orange Pill Cycle

Bent Flyvbjerg — On AI
