CONCEPT

Jagged Intelligence

Andrej Karpathy’s adjective for the uneven capability profile of large language models—brilliant on some peaks, startlingly weak in adjacent valleys, with no smooth surface connecting them and no representative point from which a single test can tell you about the whole.

The intelligence of large language models is not uniform. It does not rise and fall together the way human competence tends to. A model can solve a problem that would stump a graduate student and then fail at a task a child would find trivial, sometimes within the same conversation. Andrej Karpathy calls this jagged intelligence, and the word does real work: it corrects two symmetrical errors people make when they encounter these systems. The first error extrapolates from the peaks—watching a model write an elegant essay and concluding that something approaching a mind is clearly present. The second extrapolates from the valleys—watching a model miscount letters in a word and concluding that the whole thing is a parlor trick. Both errors assume the intelligence is uniform, that a single sample tells you about the whole. Karpathy’s point is that neither the peaks nor the valleys are representative, because there is no representative point. The surface is jagged all the way down, and the only safe assumption is that somewhere just out of view it drops away. The concept connects directly to the Orange Pill cycle’s central concern: the fluency of a language model is seductive, and its confident wrongness is the signature hazard of the age—the decorrelation of fluency from authority that leaves users unable to tell, from surface alone, where the jagged surface is high and where it falls away.

In the [YOU] on AI Field Guide

The jaggedness concept is the cycle’s most precise empirical description of why the orange pill moment is accompanied by both genuine wonder and genuine danger. Edo Segal observes that Claude Code can produce a twenty-fold productivity multiplier and still fail at elementary tasks that no human collaborator would get wrong. The experience is disorienting precisely because human competence does not work this way: a person who can draft a legal brief can also count the letters in a word. Jaggedness names the difference and explains it structurally.

The concept also grounds the cycle’s insistence that the autonomy slider must be calibrated domain by domain rather than set once for all tasks. A model that aces a benchmark may still fail catastrophically just outside the benchmark’s coverage, in a region no one thought to test. Karpathy’s point is that confidence in one region of capability licenses no confidence in any other—which is why human oversight must remain calibrated to the cost of a confident error in the specific domain at hand.
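One way to picture domain-by-domain calibration is a small sketch that sets the level of oversight from the expected cost of an unreviewed error in each domain. Everything here is an illustrative assumption rather than anything the source specifies: the `DomainProfile` type, the `autonomy_level` function, the thresholds, and the example numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainProfile:
    """Hypothetical record of what has been measured in one domain."""
    name: str
    observed_error_rate: float  # measured in this domain only, in [0, 1]
    cost_of_error: float        # illustrative units per undetected confident error


def autonomy_level(profile: DomainProfile, review_budget: float = 1.0) -> str:
    """Choose an oversight level from the expected cost of one unreviewed output.

    The thresholds are assumptions made for illustration; a real deployment
    would set them from its own risk tolerance.
    """
    expected_cost = profile.observed_error_rate * profile.cost_of_error
    if expected_cost >= review_budget:
        return "human approves every output"
    if expected_cost >= review_budget / 10:
        return "human spot-checks"
    return "autonomous"


# Invented example numbers: the same error rate leads to different settings
# once the cost of a confident error in that domain is taken into account.
profiles = [
    DomainProfile("code formatting", 0.01, 1.0),
    DomainProfile("internal summaries", 0.05, 5.0),
    DomainProfile("legal citations", 0.05, 500.0),
]

settings = {p.name: autonomy_level(p) for p in profiles}
for name, level in settings.items():
    print(f"{name}: {level}")
```

The point of the sketch is that the slider takes a domain as input: a measurement in one domain never sets the level for another.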

Origin

Karpathy developed the concept through his years building and studying neural networks, and articulated it most clearly in the context of the large language models that emerged in the early 2020s. The jaggedness, he argues, is not a transitional artifact of immature systems but a structural consequence of how these systems are made. Large language models are optimized by a process utterly unlike the one that produced human minds. We were shaped by evolution under pressure to survive and reproduce in a physical and social world, which gave our competence a certain coherence—the things we are good at hang together because they all served the same underlying project of staying alive.

Large language models were shaped first by imitating human text, then by collecting rewards on narrow tasks, then by being tuned to satisfy human preference judgments. These pressures are unrelated to one another and serve no single coherent project. The result is a capability profile with no center of gravity: strong wherever the training happened to be strong, weak wherever it happened to be weak, with no reason for the strengths and weaknesses to correlate. The jaggedness is the statistical shadow of a training process that aimed at many unrelated targets in succession.

Key Ideas

Neither peak nor valley is representative. The two natural responses to encountering a capable AI system—extrapolating from an impressive demonstration or dismissing the whole on the basis of an elementary failure—both assume a uniform surface. Jaggedness refutes both. The model that passes a bar exam and cannot reliably count the letters in “strawberry” is not almost a mind with a few bugs remaining, nor is it a statistical puppet. It is a system whose capability profile has no smooth surface connecting the peaks and valleys, and the only reliable approach is to probe each domain separately rather than assuming from one sample.

Consequences for deployment. If intelligence were uniform, you could establish a model’s general competence with a few tests and trust it broadly. Because intelligence is jagged, you cannot. The same system capable of sophisticated reasoning can be defeated by an elementary manipulation—coaxed into revealing what it should not, or led into an obvious error by a slightly unusual prompt. The jaggedness means that confidence in a model’s performance on one task licenses no confidence on the next, and that deployment in high-stakes settings requires domain-by-domain verification rather than a single general-competence assessment.
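The difference between a single general-competence score and domain-by-domain verification can be sketched with a toy evaluation log. The data below is invented for illustration, and the helper names `aggregate_rate` and `pass_rates` are hypothetical; the sketch only shows how an aggregate figure can look healthy while one domain sits in a valley.

```python
from collections import defaultdict

# Invented evaluation log of (domain, passed) pairs, for illustration only.
results = (
    [("contract drafting", True)] * 9
    + [("contract drafting", False)]
    + [("code refactoring", True)] * 10
    + [("letter counting", True)]
    + [("letter counting", False)] * 3
)


def aggregate_rate(results):
    """Single overall pass rate: the 'general competence' view."""
    return sum(passed for _, passed in results) / len(results)


def pass_rates(results):
    """Pass rate per domain: the view a jagged profile requires."""
    totals = defaultdict(lambda: [0, 0])  # domain -> [passes, attempts]
    for domain, passed in results:
        totals[domain][0] += int(passed)
        totals[domain][1] += 1
    return {domain: p / n for domain, (p, n) in totals.items()}


overall = aggregate_rate(results)
per_domain = pass_rates(results)
weakest = min(per_domain, key=per_domain.get)

print(f"overall pass rate: {overall:.2f}")
for domain, rate in per_domain.items():
    print(f"  {domain}: {rate:.2f}")
print(f"weakest domain: {weakest}")
```

In this toy log the overall rate is above 0.8 while one domain passes only a quarter of the time, and a domain that was never logged has no entry at all, which is the case the paragraph above warns about.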

The understanding question reframed. Jaggedness reframes the debate about whether AI systems truly understand. The understanding question assumes a binary: either the system understands or it does not. The binary does not fit a jagged surface. The honest description is that the system exhibits behaviors that look like understanding in some regions and behaviors that look like its absence in others, and the regions do not respect the boundaries our intuitions expect. Karpathy is comfortable leaving the metaphysical question open while insisting on the empirical description. What matters operationally is not whether to call it understanding but where the jagged surface is high and where it is low, which can only be discovered by probing.

A discipline of humility. To work well with these systems is to internalize their jaggedness: to admire the peaks without trusting them, to expect the valleys without being surprised, and to treat every impressive output as a local rather than global fact about the system. The fluency that makes these systems so useful is the same fluency that makes them dangerous: they speak in the register of expertise even when wrong, and their mistakes do not announce themselves. Jaggedness is the constant reminder that competence here is local, not global, and that the only safe assumption is that somewhere just out of view the surface drops away.

Debates & Critiques

The primary debate around jagged intelligence is whether the jaggedness is a permanent feature of systems trained as current language models are, or an artifact of insufficient scale and training that will be smoothed out as the systems improve. Optimists point to the steady advance of benchmarks across many domains as evidence that the profile is becoming less jagged over time; Karpathy’s response is that the surface smooths in the regions that benchmarks cover while remaining jagged at the edges the benchmarks do not test—and that the edges are where the consequential failures occur. A related dispute concerns whether jaggedness is unique to AI or a general feature of any sufficiently narrow optimization process: a chess engine is superhuman at chess and helpless at everything else; a language model is simply jagged over a much larger domain. Critics argue this observation should make us more cautious, not less, since the wider domain means the valleys are harder to locate in advance.
