PERSON

Gregory Chaitin

The mathematician who proved that some truths are true for no reason at all—and in doing so built the only rigorous yardstick we have for what any artificial intelligence can and cannot do.

Gregory Chaitin spent a career proving that mathematics is not what mathematicians wished it to be. Where Hilbert had dreamed of a complete and mechanical edifice, Chaitin found at the foundations a number whose digits are random: not unknown, not merely hard, but random in the strict and terrible sense that no theory substantially shorter than the digits themselves can produce them. He called it Omega, the halting probability, and it is the most concentrated piece of irreducible mathematical truth ever exhibited.

Chaitin’s deepest insight is also the simplest to state: understanding is compression. To comprehend a phenomenon is to find a description shorter than the phenomenon itself (a law, a theory, a program) from which the phenomenon can be regenerated. Newton did not memorize the positions of the planets; he found equations from which all those positions follow, and the equations are vastly shorter than the data. This is not a metaphor. There is a precise and well-known equivalence between prediction and compression: a model that assigns probability p to a text can encode that text in about -log2 p bits, so the better the prediction, the shorter the description. Minimizing that code length is, almost word for word, the objective a large language model pursues when it learns to predict the next token across the whole written record of humanity. The machines that now unsettle us are compression engines.

Chaitin built the mathematics of what compression can and cannot reach. He founded the field of algorithmic information theory independently of, and at roughly the same time as, Andrei Kolmogorov in Moscow and Ray Solomonoff in the United States, none of them aware of the others.

In the [YOU] on AI Field Guide

The cycle that began with [YOU] on AI returns to Chaitin as the provider of the one kind of limit on AI that cannot be contested: not a philosophical argument about consciousness, not an empirical claim about present capabilities, but a mathematical theorem that applies to any finite system whatsoever. The result is this: a system carrying a finite quantity of information cannot derive conclusions containing substantially more information than it started with, any more than a set of axioms can prove theorems richer than itself. This conservation law binds every machine that will ever be built. Whatever understanding is, whatever creativity is, neither allows a system to manufacture information it does not have. The limit is not soft; it is a theorem, and it permits no engineering workaround.

Chaitin’s framework gives the clearest available account of why large language models fail when and where they do. They are compression engines, and a compression engine is reliable exactly to the extent that its territory is compressible—that genuine regularities exist for it to find. Where the territory is compressible, the model interpolates brilliantly. Where it is incompressible, the model extrapolates a regularity that does not exist and produces confident confabulation, because in incompressible territory there is no shorter description to find and therefore nothing for a compressor to grip. The hallucination is not a malfunction grafted onto an otherwise sound process; it is the compressor doing the only thing it can when there is no pattern. More troublingly, Chaitin’s own theorems prove that a compressor cannot in general know whether it is in compressible territory or not—determining compressibility is itself one of the uncomputable problems—which is why the machine that confabulates does not know it is confabulating.

His concept of Omega—the halting probability—is the permanent ceiling on prediction. The animating ambition behind the largest AI systems is predictive: forecast the next word, the next frame, the next state of the world, with ever-increasing accuracy. Omega is the proof that this ambition meets a wall no scaling can breach. Its digits are irreducibly random: no machine, having seen any finite number of Omega’s bits, can predict the next one better than chance, because there is no relationship between the bits to exploit. Omega does not prove that machines cannot be intelligent or dangerous or transformative; it proves something specific: that prediction has a ceiling, that the unpredictable is real and ineliminable, and that no machine can be a perfect forecaster of an unpredictable world.

Origin

Born in Chicago in 1947 and raised partly in Argentina, Chaitin attended the Bronx High School of Science and City College of New York, where he developed algorithmic information theory as a teenager in the mid-1960s, independently of Kolmogorov and Solomonoff. The founding idea was a definition of randomness: a string is random if the shortest computer program that produces it is not substantially shorter than the string itself. A pattern is precisely a compression; where no compression is possible, there is no pattern. This definition, precise even though the shortest-program length it relies on cannot in general be computed, founded the field.
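The definition of randomness above can be illustrated with an ordinary compressor. The following is a minimal sketch, assuming that zlib's output size serves as a rough, computable upper bound on shortest-program length; the true quantity cannot be computed. The helper name compressed_size and the two sample inputs are hypothetical, chosen only for illustration.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Return the length in bytes of the zlib-compressed form of data.

    This is only an upper-bound proxy for shortest-program length:
    a better compressor could always do at least as well.
    """
    return len(zlib.compress(data, 9))


# 10,000 bytes with an obvious regularity: "01" repeated.
patterned = b"01" * 5000
# 10,000 bytes drawn from the operating system's entropy source.
random_like = os.urandom(10_000)

patterned_size = compressed_size(patterned)
random_size = compressed_size(random_like)

print(f"patterned:   {len(patterned)} -> {patterned_size} bytes")
print(f"random-like: {len(random_like)} -> {random_size} bytes")
```

The patterned input shrinks to a small fraction of its length, while the random-like input does not shrink at all. A failure to compress with zlib does not prove that a string is random in Chaitin's sense; it shows only that this particular compressor found no pattern.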

He spent decades as a researcher at IBM’s Thomas J. Watson Research Center, writing more than ten books including Meta Math! and The Unknowable, and turned late in his career to metabiology, an attempt to make biological evolution mathematical by treating DNA as software and evolution as a random walk through the space of all possible programs. He defined the Omega number in 1975, proving it to be uncomputable and irreducibly random: a perfectly well-defined real number that gives the halting probability of a randomly chosen program, and of whose digits any formal theory can determine only finitely many, the number bounded by the information content of the axioms one starts with. He remains, in his late seventies, a provocation in both mathematics and the philosophy of mind.

Key Ideas

Understanding is compression. To comprehend is to find the shortest program that generates the data. This is both a definition and a measurable criterion: the complexity of a thing is the length of its shortest description, and understanding a thing is possessing a description shorter than the thing itself. A training run that minimizes prediction loss is minimizing the number of bits needed to encode its data under the model, and in that precise sense it is searching for a shorter program that reproduces the regularities in the data. This means the degree of a model’s understanding is, in principle, measurable: how much does it compress? And the reliability of its outputs is a function of how compressible its territory is.
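The link between prediction loss and description length can be made concrete with a toy predictor. The sketch below is an illustration under stated assumptions, not a description of how any real language model is trained: it uses a hypothetical adaptive byte model with add-one smoothing and sums -log2 of the probability it assigns to each next byte. An arithmetic coder driven by the same probabilities would emit roughly that many bits. The function name code_length_bits and the sample text are invented for the example.

```python
import math
from collections import defaultdict


def code_length_bits(data: bytes, order: int = 1) -> float:
    """Sum of -log2 p(next byte | previous `order` bytes) over data.

    The model is a toy: per-context byte counts with add-one smoothing
    over 256 possible byte values, updated after each prediction.
    """
    counts = defaultdict(lambda: defaultdict(int))
    totals = defaultdict(int)
    bits = 0.0
    for i, symbol in enumerate(data):
        context = data[max(0, i - order):i]
        p = (counts[context][symbol] + 1) / (totals[context] + 256)
        bits += -math.log2(p)
        # Learn from the byte only after paying the cost of predicting it.
        counts[context][symbol] += 1
        totals[context] += 1
    return bits


text = b"the cat sat on the mat. " * 200

uniform_bits = 8 * len(text)             # no model at all: 8 bits per byte
order0_bits = code_length_bits(text, 0)  # predicts from byte frequencies
order2_bits = code_length_bits(text, 2)  # predicts from the last two bytes

print(f"uniform: {uniform_bits} bits")
print(f"order 0: {order0_bits:.0f} bits")
print(f"order 2: {order2_bits:.0f} bits")
```

Each step up in predictive power shortens the encoding of the same text, which is the sense in which a lower prediction loss and a shorter description are the same measurement.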

Omega, the uncomputable number. The halting probability is perfectly well defined, encodes the answer to infinitely many mathematical questions, and cannot be computed past a finite initial segment by any algorithm whatsoever. It is, in Chaitin’s exact phrase, a mathematical fact that is true for no reason—irreducibly random, without pattern, beyond the reach of any machine. Omega is the object that lives exactly at the ceiling of compression, and it proves that the ceiling exists and is not a temporary frontier to be pushed back by better engineering. Whatever the machines achieve, they achieve within the computable; Omega sits just outside it.
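For readers who want the object itself, the standard definition can be written down in one line. The notation here is an assumption, since the entry uses none: U is a fixed universal prefix-free machine, p ranges over its programs, |p| is the length of p in bits, K is prefix program-size complexity, and c is a constant that depends only on U.

```latex
% Halting probability of the universal prefix-free machine U
\Omega_U \;=\; \sum_{p \,:\, U(p)\ \text{halts}} 2^{-|p|},
\qquad 0 < \Omega_U < 1

% Irreducible randomness: the first n bits of \Omega_U cannot be
% produced by any program shorter than about n bits
K(\Omega_1 \Omega_2 \cdots \Omega_n) \;\ge\; n - c
\quad \text{for all } n
```

The sum converges because the halting programs of a prefix-free machine form a prefix-free set, so the Kraft inequality applies. The second line is the formal content of "true for no reason": each additional bit of Omega costs roughly one additional bit of program.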

Incompleteness as a shortage of information. Chaitin’s reformulation of Gödel’s incompleteness theorem turns it from a clever self-referential trick into a conservation law: a formal system with L bits of information cannot prove that any specific string has complexity substantially greater than L. Incompleteness is not a rare pathology but the generic condition; the provable truths are islands in a sea of true-but-unreachable facts. Applied to any AI system, which is a finite object with a finite quantity of information in its weights, this implies that the system cannot derive conclusions containing substantially more information than it possesses.
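The conservation law can be stated compactly. The symbols below are assumptions added for illustration: F is a consistent, computably axiomatized formal system whose description takes L bits, K(s) is the length of the shortest program that outputs the string s, and c is a constant that depends on the choice of programming language rather than on s.

```latex
% Chaitin's incompleteness theorem: F cannot certify high complexity
\text{for every string } s : \quad
F \nvdash \text{``} K(s) > L + c \text{''}

% Yet the unprovable statement is true for almost every string,
% because there are too few short programs to go around
\bigl|\{\, s : K(s) \le L + c \,\}\bigr| \;<\; 2^{\,L + c + 1}
```

The second line is a counting argument: there are fewer than 2^(L+c+1) programs of length at most L + c, so only finitely many strings can have complexity that low. Every other string satisfies K(s) > L + c, and F can prove it for none of them.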

Creativity and randomness. Chaitin’s metabiology suggests that genuine novelty requires randomness as an information source: a deterministic system can only rearrange what it holds, while a system with a random source can climb to arbitrarily high complexity over time, as biological mutation enables evolution to generate genuinely new forms. Applied to generative AI: the deterministic, knowledge-laden core of a model can only recombine what it learned; the stochastic sampling that makes outputs non-deterministic is the only candidate source of fresh information. Whatever genuine novelty these systems achieve comes, paradoxically, from the noise.
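The contrast between a deterministic core and stochastic sampling can be sketched in a few lines. This is a toy illustration, assuming a fixed list of four hypothetical scores in place of a real model's output; the function names and the numbers are invented for the example.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities that sum to 1."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    """Deterministic decoding: always choose the highest-scoring index."""
    return max(range(len(logits)), key=lambda i: logits[i])


def sample_pick(logits, rng, temperature=1.0):
    """Stochastic decoding: draw an index in proportion to its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.5, 0.5, 0.1]  # hypothetical scores for four candidate tokens

# One hundred greedy decisions: the same scores always give the same answer.
greedy_runs = {greedy_pick(logits) for _ in range(100)}

# One hundred sampled decisions: the random source introduces variation.
rng = random.Random(0)
sampled_runs = {sample_pick(logits, rng) for _ in range(100)}

print("greedy choices: ", sorted(greedy_runs))
print("sampled choices:", sorted(sampled_runs))
```

The greedy path adds nothing that was not already in the scores, while the sampled path differs from run to run only because of the bits supplied by the random number generator. Seeding the generator makes even that variation reproducible, which underlines that the fresh information comes from the random source and not from the model.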

Mathematics as quasi-empirical. Since some mathematical truths cannot be proved from any fixed finite set of axioms, Chaitin argued that mathematics should be done more like physics: overwhelming computational evidence for a proposition should suffice to adopt it as a working axiom, as a physicist adopts a well-confirmed law. This is, almost exactly, the epistemology of large language models, and it imports both the power and the danger of evidential reasoning: evidence, however vast, is not proof, and confident confabulation is the natural failure mode of a system whose surface fluency does not diminish in regions where the information beneath it is thin.
