Tversky's work on overconfidence, conducted with Kahneman and Baruch Fischhoff through the 1970s and 1980s, established that calibration errors are not random but systematic — biased in the direction of too much confidence. Even experts, even subjects explicitly warned about the bias, even subjects offered financial incentives for accurate calibration, continued to overestimate their reliability. The bias operates below deliberation.
The AI-era manifestation connects to Byung-Chul Han's critique of smoothness, translated into cognitive terms. Human calibration relies on cues: effort, difficulty, time-to-answer, visible uncertainty. When a human expert produces a judgment with difficulty, the difficulty itself signals appropriate humility; when she produces it easily, the ease signals fluent expertise. AI output is uniformly effortless from the evaluator's perspective, which breaks the signal. Both accurate statements and hallucinations arrive with identical surface properties.
The Deleuze incident in You On AI illustrates the pattern precisely. The passage Claude produced was elegant, structured, referenced. It read as insight. The philosophical reference was wrong, but the wrongness was invisible at the surface level. An evaluator relying on representativeness judges output by its surface match to the prototype of good output, and the overconfidence induced by smoothness confirms the match.
The problem compounds recursively. Recent work suggests that LLMs trained on human-generated text absorb the cognitive biases present in that corpus — including patterns consistent with loss aversion and overconfidence. The human evaluator's miscalibration therefore meets AI output that has itself absorbed miscalibration. The system-level overconfidence is not corrected by either side. It is amplified through their interaction.
The calibration research program began with Fischhoff, Slovic, and Lichtenstein's work in the 1970s on hindsight bias and confidence assessment. Tversky contributed both theoretical framing and key experiments showing that even experts exhibit poor calibration on tasks within their domain.
The application to AI was developed after Tversky's death, but the framework applies directly. Mechanistic interpretability research has begun to identify cases in which AI systems' internal confidence correlates poorly with actual accuracy, making their miscalibration structurally analogous to human overconfidence and similarly resistant to correction.
Systematic miscalibration. Confidence assessments are not randomly inaccurate but biased toward excess confidence, especially for judgments near the limits of knowledge.
Cue decoupling. AI output decouples the normal calibration cues (effort, difficulty, struggle) from the underlying quality, breaking the calibration mechanism.
Smoothness as seduction. The polish of AI output flatters the evaluator's judgment — accepting it feels like exercising taste rather than failing to verify.
Bidirectional amplification. Biased humans evaluating AI trained on biased human output produces a system-level overconfidence that neither component alone would generate.
Ascending friction as remedy. The verification work that smoothness makes it easy to skip is precisely the ascending friction of the AI era.
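The miscalibration named above can be made concrete. A minimal sketch, using invented data, of the standard measure from the calibration literature: the gap between an evaluator's mean stated confidence and their actual hit rate, where a positive gap indicates systematic overconfidence.

```python
# Sketch of the overconfidence gap: mean stated confidence minus accuracy.
# The (confidence, correct) pairs below are invented for illustration.

def overconfidence_gap(judgments):
    """judgments: list of (confidence, correct) pairs, where confidence
    is the stated probability of being right (0.0 to 1.0) and correct
    records whether the judgment was in fact right."""
    mean_confidence = sum(c for c, _ in judgments) / len(judgments)
    accuracy = sum(1 for _, ok in judgments if ok) / len(judgments)
    return mean_confidence - accuracy

# Hypothetical evaluator: states ~90% confidence, is right 70% of the time.
sample = [(0.9, True), (0.95, True), (0.9, False), (0.85, True),
          (0.9, False), (0.95, True), (0.9, True), (0.85, False),
          (0.9, True), (0.9, True)]

gap = overconfidence_gap(sample)  # 0.90 confidence - 0.70 accuracy = 0.20
```

A gap of zero would mean perfect calibration; the research described above found that, for judgments near the limits of knowledge, the gap is reliably positive.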
Some researchers argue that AI-induced overconfidence is a transient problem, solvable through better interfaces that display uncertainty estimates or through training on calibration. Others argue that the deeper problem is structural — that the smooth surface is a feature rather than a bug, optimized for engagement at the cost of epistemic honesty — and that interface solutions cannot overcome optimization pressures.