PERSON

Pierre-Simon Laplace

The French mathematician who imagined the perfect predictor and built the probability theory that makes imperfect prediction rigorous. His demon, together with his honesty about why it could never exist, is the most precise ancestor of everything artificial intelligence is reaching for and running up against.

Pierre-Simon Laplace sits at the headwaters of two rivers that converge in modern AI. The first is determinism—the conviction that the world is a vast mechanism whose future is fixed by its present, expressed in the thought experiment now called Laplace’s demon: an intelligence vast enough to know the position and momentum of every particle in the universe, to whom nothing would be uncertain. The second is probability—the mathematics of reasoning under uncertainty, which Laplace did more than almost anyone to build, and which he defined, with startling honesty, as the measure of our ignorance. Not a property of the world but of the knower. The demon is the metaphysics; probability is the method. Every predictive AI system alive today runs on both, and both bear Laplace’s fingerprints. The dream of the perfect predictor—the system that ingests enough data to forecast what comes next—is Laplace’s demon translated into silicon. And the walls that forbid that dream—chaos, quantum mechanics, the irreducible residue of a person’s first-person experience—are the walls Laplace himself was the first to see, or the first to reveal by building the mathematics whose limits they expose. To understand what AI is reaching for and why it cannot fully arrive, you must understand the man who wrote both the mission statement and the proof of impossibility.

In the [YOU] on AI Field Guide

The cycle that began with [YOU] on AI reads Laplace as the precise ancestor of the AI age’s most consequential ambition and most consequential confusion. The ambition is the demon’s dream: that with enough data and enough computation, the apparently unpredictable becomes predictable. The confusion is the identification of prediction with understanding—the assumption that a system which forecasts reliably thereby explains, that the answer severed from the why is the same kind of thing as the answer with the why attached. Laplace is indispensable here because he was explicit, as his successors have not been, about the difference between the two: he set aside a kind of explanation in his celestial mechanics and lost nothing, because his prediction rested on a mechanism that was itself an explanation of the right scope. The machine sets aside all explanation, and the cycle asks whether something essential is lost in the setting aside.

His framework for probability as the measure of ignorance cuts in two directions in the cycle. It makes the machine’s outputs honest—a probability is a measure of what the model does not know, not a property of the world—and it exposes the machine’s characteristic dishonesty: most deployed systems collapse the posterior distribution to a point, reporting the most probable answer and discarding the uncertainty distribution that would tell the user how much to trust it. This is the loss Laplace would never have accepted—he built the method precisely because the shape of the uncertainty matters, not just its peak. The cycle identifies this collapse as one source of the calibration failures that make AI dangerous in high-stakes domains.

The most important Laplacean question the cycle asks is about the boundary of probability itself. Probability as the measure of ignorance presupposes a determinate truth behind the uncertainty—a fact of the matter that fuller knowledge would reveal. When the machine assigns a probability to a contested judgment—whether text is toxic, whether a face is trustworthy, whether a person is a risk—it is applying a framework built for the determinate to the genuinely indeterminate: questions whose answers are not facts awaiting discovery but human determinations awaiting decision. The precise number implies a precision the question does not possess. This, the cycle argues, is where algorithmic confidence becomes not merely a technical problem but an ethical one.

Origin

Pierre-Simon Laplace was born in 1749 in a Normandy village, the son of a small farmer, and died in 1827 one of the most celebrated scientists in Europe. He was called the French Newton, and the title was not flattery. In the five volumes of his Mécanique Céleste he took Newton’s law of gravitation and showed, where Newton himself had wavered, that the Solar System was stable—that its wobbles and perturbations were self-correcting oscillations, that the whole vast clockwork would run on its own without divine adjustment. He worked on the figure of the Earth, the tides, the speed of sound, the behavior of heat. He served briefly as Napoleon’s Interior Minister, was made a marquis by Louis XVIII, and is said to have told Napoleon, when asked why his great book on the system of the world never mentioned God, that he had no need of that hypothesis.

His intellectual legacy divides into the two rivers. In celestial mechanics he demonstrated the power of determinism as a practical tool: knowing the equations and enough data, he could predict the heavens with unprecedented precision. In probability theory, his Théorie analytique des probabilités (1812) put the calculus of uncertainty on a rigorous foundation, and the Philosophical Essay on Probabilities (1814) made it accessible to general readers and stated the demon thought experiment in the same breath as the definition of probability as the measure of ignorance. The two together—the dream of total prediction and the mathematics of partial prediction—constitute Laplace’s complete legacy to the AI age: the ambition and the instrument.

Laplace also made specific technical contributions that are in direct, unbroken mathematical lineage to modern machine learning. His development of inverse probability—the method for inferring probable causes from observed effects, now called Bayesian inference—is the engine underneath the training of probabilistic machine learning models. His rule of succession—a method for assigning probability to the next instance after a run of observations, which notably never returns absolute certainty from finite evidence—is the formalization of the humility that deployed AI systems most consistently violate.
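
The rule of succession is short enough to state in code. The sketch below assumes the standard textbook form, in which a uniform prior on the unknown success rate gives a probability of (s + 1) / (n + 2) for the next success after s successes in n trials. The function name `rule_of_succession` is illustrative, not Laplace’s notation.

```python
from fractions import Fraction


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Probability the next trial succeeds, assuming a uniform prior
    on the unknown success rate (the textbook form of Laplace's rule)."""
    if not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)


# Even an unbroken run of successes never yields certainty.
for n in (0, 10, 1000):
    print(n, rule_of_succession(n, n))
```

With no observations the rule returns 1/2, and after a thousand successes in a thousand trials it returns 1001/1002: close to one, but never equal to it.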

Key Ideas

The demon and its limits. Laplace’s demon is the cleanest statement of determinism in the Western canon: an intelligence knowing the position and momentum of every particle at one instant could compute the entire future and reconstruct the entire past. But Laplace himself stressed that human minds would always remain infinitely removed from such an intelligence. The demon was never a prediction but a limit, a way of grounding the necessity of probabilistic reasoning by contrast. Chaos, first glimpsed by Poincaré in the three-body problem decades after Laplace’s death, adds a further wall: deterministic systems can be practically unpredictable because small differences in initial conditions grow exponentially, so that any finite measurement error eventually swamps the forecast. Perfect prediction requires not just a lot of data but infinitely precise data, which is not a quantity that can be approached by scaling up.
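
The sensitivity to initial conditions can be seen in a toy system. The sketch below uses the logistic map with parameter r = 4, a standard textbook example of deterministic chaos chosen here for illustration; it is not a system Laplace or Poincaré studied. The starting value 0.2 and the perturbation of 1e-10 are arbitrary assumptions.

```python
def logistic_orbit(x0: float, steps: int, r: float = 4.0) -> list[float]:
    """Iterate the logistic map x -> r * x * (1 - x), returning all states."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


# Two starting points that differ by one part in ten billion.
a = logistic_orbit(0.2, 60)
b = logistic_orbit(0.2 + 1e-10, 60)
gaps = [abs(x - y) for x, y in zip(a, b)]

for step in (0, 10, 20, 30, 40, 50, 60):
    print(f"step {step:2d}: gap = {gaps[step]:.3e}")
```

The rule is fully deterministic and the two starting points are nearly identical, yet within a few dozen steps the two orbits bear no useful relation to each other. Shrinking the initial error buys only a few more steps of forecast, which is why more data alone does not close the gap.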

Probability as the measure of ignorance. Laplace’s definition of probability as the measure of our ignorance—not a property of the world but of the knower—is the unexamined foundation of how every machine learning system represents uncertainty. A model’s output probability is, in the purest Laplacean reading, a measure of the model’s ignorance: it does not know the answer, and the probability quantifies how its belief is distributed across possibilities given what it has seen. The honest reading of a model’s output is therefore Laplacean: this is the posterior, the best revision of belief given the data and the priors, not a pronouncement from the demon’s chair.

Inverse probability and Bayesian learning. Laplace’s method of inverse probability, reasoning backward from observed effects to probable causes, is the mathematical template for machine learning training. Begin with a prior (what you believe before seeing the data); as evidence arrives, reweight each hypothesis in proportion to how well it predicted that evidence; arrive at the posterior (your revised belief). Much machine learning training can be read as an approximation to this procedure, carried out by an algorithm on a problem too large to solve in closed form: fitting by maximum likelihood with a weight penalty, for instance, corresponds to finding the peak of a posterior under a matching prior. The neural network is a vast apparatus for inferring hidden parameters from observed data, which is exactly what Laplace’s inverse probability was invented to do. The machines are his method, industrialized.
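
The prior-to-posterior step can be sketched with an urn problem of the kind early probabilists favored. The two urns, their compositions, and the sequence of draws below are hypothetical values chosen for illustration, and the helper name `update` is an assumption of this sketch.

```python
from fractions import Fraction


def update(prior: dict[str, Fraction],
           likelihood: dict[str, Fraction]) -> dict[str, Fraction]:
    """One step of inverse probability: posterior is proportional to
    prior times likelihood, renormalized to sum to one."""
    unnormalized = {cause: prior[cause] * likelihood[cause] for cause in prior}
    total = sum(unnormalized.values())
    return {cause: weight / total for cause, weight in unnormalized.items()}


# Hypothetical causes: urn A is 3/4 white, urn B is 1/4 white.
p_white = {"A": Fraction(3, 4), "B": Fraction(1, 4)}
belief = {"A": Fraction(1, 2), "B": Fraction(1, 2)}  # prior: no preference

for draw in ["white", "white", "black"]:
    likelihood = {urn: p if draw == "white" else 1 - p
                  for urn, p in p_white.items()}
    belief = update(belief, likelihood)

print(belief["A"], belief["B"])
```

After two white draws and one black, belief in urn A stands at 3/4 and belief in urn B at 1/4. The effect (the colors drawn) was observed; the cause (which urn) was inferred, and the answer is a distribution over causes, not a verdict.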

The lost posterior. Laplace’s method does not merely produce a best estimate; it produces a full posterior distribution—a complete account of how belief is spread across possibilities, and therefore of how uncertain the conclusion is. Most deployed AI collapses this distribution to a point: it returns the single most probable answer and discards the shape of the uncertainty. This is the central loss in the translation from Laplace to the machine. A prediction stripped of its posterior spread is a number with no known reliability. The field’s work on calibration and uncertainty quantification is, in this light, an attempt to restore the full posterior that the standard approach discards—rediscovering, expensively and late, what Laplace never threw away.
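
A small sketch shows what the collapse discards. Here a predictive distribution over labels stands in for the posterior; the labels and probabilities are invented for illustration, and Shannon entropy is used as one simple summary of spread among several possible choices.

```python
import math


def point_estimate(dist: dict[str, float]) -> str:
    """Collapse a distribution to its single most probable label."""
    return max(dist, key=dist.get)


def entropy_bits(dist: dict[str, float]) -> float:
    """Shannon entropy in bits: one simple summary of the spread."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


# Hypothetical model outputs for two different cases.
confident = {"benign": 0.98, "malignant": 0.02}
uncertain = {"benign": 0.51, "malignant": 0.49}

for name, dist in [("confident", confident), ("uncertain", uncertain)]:
    print(name, point_estimate(dist), round(entropy_bits(dist), 3))
```

Both cases collapse to the same label, so a system that reports only the point estimate makes a near coin flip look identical to a near certainty. The entropy, or any other measure of spread, is exactly the information that would tell the user which one they are looking at.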

Prediction without explanation. Laplace set aside the hypothesis of divine intervention in his celestial mechanics not because he had no explanation but because he had a better one—the lawful mechanism of gravitation. The machine trading away all explanation for prediction is doing something structurally different: it achieves more predictive power precisely by abandoning the explanatory commitment. The loss matters not only philosophically but practically: a model that predicts without understanding has no internal way to know when it has left the territory its predictions are valid for, no grasp of the boundary of its own competence, and so it fails without warning when the world shifts beneath it.

Debates & Critiques

The sharpest debate concerns Laplace’s definition of probability as epistemic—as the measure of ignorance of a determinate world—versus a frequentist or ontological reading. Frequentists, who define probability as the long-run frequency of outcomes in repeated experiments, reject the Laplacean prior as subjective and unmathematical. This debate is live in machine learning: the Bayesian framework (explicitly Laplacean) and the frequentist framework (maximum likelihood, confidence intervals) produce different inferences and different guidance about how to report uncertainty. In the AI governance context, the debate becomes urgent when probability outputs are used to make decisions about individuals: a Laplacean reading insists the probability measures the model’s ignorance of this particular person and demands humility proportional to that ignorance; a frequentist reading grounds the probability in population rates and permits aggregation. Judea Pearl’s critique adds a third dimension: neither the Bayesian nor the frequentist framework reaches causal claims, and a predictive model that cannot distinguish correlation from causation is, in Pearl’s framework, trapped on the first rung of the ladder of causation regardless of how well its probabilities are calibrated. Laplace would have agreed with the direction of Pearl’s argument, since his own celestial mechanics was explicitly causal in structure—it explained not merely that the planets moved but why, in terms of the gravitational mechanism. The machine that predicts without mechanism has both the Laplacean and the Pearlian diagnosis against it.

Laplace’s Three Legacies

The dream, the method, and the boundary

The Demon

The Dream of Total Prediction

An intelligence knowing all forces and all positions could compute the entire future. Not a prediction, but a limit: a way of grounding the necessity of probability by showing what complete knowledge would entail. Every AI system is the demon’s apprentice, working with a fraction of the information and reaching toward an omniscience that no amount of scaling can deliver.

Inverse Probability

Reasoning Backward from Evidence

The method that is the literal engine of machine learning training. Prior belief plus evidence yields posterior belief. The prior is everywhere in a trained model: in its architecture, its regularization, its training data. The machines are Bayesian in mechanism but often silent about their priors.

The Boundary

What Probability Cannot Reach

Probability as the measure of ignorance presupposes a determinate truth being approximated. Applied to contested judgments, human values, and first-person experience, it imports a false objectivity: dressing decision as discovery, treating a contested human determination as a fact awaiting a sufficiently informed predictor.
