You On AI Field Guide · Isaac Newton
PERSON

Isaac Newton

The mathematician who compressed the cosmos into three laws of motion and one of gravitation, and whose legacy now haunts artificial intelligence as the ghost of everything these machines refuse to do: derive, explain, and understand.
Newton gave Western civilization its template for knowledge: find the law, write it down, derive the rest. In the Principia Mathematica of 1687 he showed that the same force pulling an apple toward the ground holds the Moon in its orbit—and from three laws of motion and one of gravitation he derived the tides, the precession of the equinoxes, and Kepler's planetary ellipses. That act of compression—the buzzing cosmos reduced to symbols a student can write on an index card—set the standard for what understanding itself was supposed to look like. Then we built large language models that predict magnificently without possessing any such law, and the confrontation became unavoidable. Newton's most famous sentence, Hypotheses non fingo—I frame no hypotheses—sits like a rebuke beside an industry that ships behavior it cannot interpret; yet the calculus he co-invented to read planetary orbits now powers every gradient descent run in every data center on earth. He is the figure against whom the shock of the present moment registers most sharply, the patron saint of comprehension meeting the age of prediction without comprehension.

In the [YOU] on AI Field Guide

The cycle that began with [YOU] on AI asks what it means to take the orange pill—to see the machine clearly, without narcotic hype or paralyzing fear. Newton is the most exacting instrument for that seeing, because he defines the very thing these machines refuse to be. To read him alongside contemporary AI is to be handed a measuring instrument with two scales: comprehension on one side, capability on the other. The models score off the charts on capability and register almost nothing on comprehension in Newton's sense, and he is the thinker who makes the gap between the two scales legible as something more than a marketing complaint.

His calculus, the method of fluxions he developed in the 1660s and later used to turn his laws into predictions, is the direct ancestor of the backpropagation algorithm that trains neural networks. The chain rule he and Leibniz bequeathed is the mathematical spine of every training run. But the directionality has inverted: Newton's calculus ran from known law to deduced consequence; backpropagation runs from desired consequence to an unknown function that approximately produces it. The tool is continuous; the epistemology has reversed. This reversal is, in microcosm, the entire transformation the cycle traces.
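The inversion can be made concrete with a toy sketch. It assumes a single free-fall law, s = ½·g·t², and synthetic measurements; the function names, learning rate, and step count are illustrative choices, not anything taken from Newton or from a real training system. The forward function runs from law to consequence; the fitting loop runs the other way, using the chain rule to recover the parameter from observed consequences.

```python
# Toy contrast: law -> consequence versus consequence -> parameter.
# All names and numbers are illustrative assumptions.

def fall_distance(g, t):
    """Forward direction: a known law (s = 1/2 * g * t^2) yields the prediction."""
    return 0.5 * g * t * t

def fit_g(times, observed, lr=0.01, steps=500):
    """Inverse direction: start from a guess and descend the mean squared error.

    The gradient comes from the chain rule:
    dLoss/dg = mean(2 * (pred - obs) * dpred/dg), with dpred/dg = 0.5 * t^2.
    """
    g = 1.0  # arbitrary starting guess
    n = len(times)
    for _ in range(steps):
        grad = 0.0
        for t, s_obs in zip(times, observed):
            pred = fall_distance(g, t)
            grad += 2.0 * (pred - s_obs) * (0.5 * t * t)
        g -= lr * grad / n
    return g

times = [0.5, 1.0, 1.5, 2.0]
observed = [fall_distance(9.81, t) for t in times]  # synthetic "measurements"
g_hat = fit_g(times, observed)
print(round(g_hat, 3))
```

Even here the form of the law is still supplied by hand, and only one number is learned. A neural network applies the same chain-rule machinery to a very large number of parameters with no such form assumed, which is where the epistemology parts ways with Newton's.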

The Mechanistic Paradigm

Newton's clockwork universe, in which every future state is fixed by the present, a determinism so complete that Laplace imagined an intelligence who could predict all of history from a single snapshot, is the limiting case against which the cycle tests what, if anything, in a human life exceeds mechanism. A trained model is Laplacean in much this sense: with identical input, identical weights, and no sampling, it produces identical output every time (up to numerical quirks of the hardware). Engineers inject randomness deliberately, by sampling from the model's predicted distribution instead of always taking the likeliest token, to keep its output from turning rigid and repetitive. Strip that away and the model is the clockwork Newton built, scaled up and trained on human text, talking back. The cycle's question, what remains irreducibly ours, is sharpest when held against Newton's clock.
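A small sketch of that point, under stated assumptions: the three scores below stand in for a model's output over three candidate tokens, and `greedy` and `sample` are hypothetical helper names rather than any library's API. Picking the top score is pure clockwork; sampling adds the injected randomness, and even that becomes repeatable once the random seed is fixed.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(logits):
    """Deterministic choice: always the index of the highest score."""
    return max(range(len(logits)), key=lambda i: logits[i])

def sample(logits, temperature, rng):
    """Stochastic choice: draw an index in proportion to its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [2.0, 1.5, 0.3]  # hypothetical scores for three candidate tokens
greedy_picks = {greedy(logits) for _ in range(100)}
rng = random.Random(0)  # fixed seed: the "randomness" is itself reproducible
sampled_picks = {sample(logits, 1.0, rng) for _ in range(200)}
print(greedy_picks)
print(sampled_picks)
```

The greedy set holds a single index no matter how often it is run, while the sampled set spreads across several. Reusing the same seed replays the same draws, so the variation comes from the sampling step and not from the model.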

And then there is the secret Newton the legend omits: the man who wrote over a million words on alchemy and biblical prophecy, who believed nature and scripture alike encoded a hidden rational order he might decode. He was not two men—the rigorous physicist and the occult enthusiast. He was one man whose faith was that the world is fully writable, that a sufficiently penetrating mind can read it. The large language models that work without finding any writable law are the first serious evidence that this faith may apply only to the simple corner he explored.

Large Language Models

Origin

Isaac Newton was born on Christmas Day 1642 (Old Style) in Woolsthorpe, Lincolnshire, a premature infant so small, his mother later said, that he could have fit in a quart pot. His father died before his birth; his mother remarried and left him with his grandmother for nine years, a wound that never fully healed. He entered Trinity College, Cambridge, in 1661, found the curriculum medieval, and supplemented it obsessively with Descartes, Galileo, and Kepler. During the plague years of 1665–66, when Cambridge closed and Newton retreated to Woolsthorpe, he invented the calculus, discovered the composition of white light, and began working out his theory of universal gravitation. He was twenty-three. He published almost none of it for decades.

Gradient Descent

The Philosophiæ Naturalis Principia Mathematica appeared in 1687 at the urging of Edmond Halley, who also paid for its printing. It was immediately recognized as a work without precedent: the universe reduced to a single mathematical framework, with the known celestial phenomena derivable from first principles. Newton had achieved what no scientist before or since has achieved in a comparable sweep: he had compressed the cosmos. His method was unmistakable: find the generating rule, derive the rest, and refuse to assert what the evidence does not force. Hypotheses non fingo, from the General Scholium added in the 1713 edition, became the motto of a new standard of rigor.

The Fluency-Authority Decorrelation

He served as Warden and Master of the Royal Mint, pursued alchemy and biblical chronology with the same obsessive intensity he brought to physics, presided over the Royal Society, and was knighted in 1705. He died in 1727, leaving behind a Principia that would steer spacecraft three centuries later and a private archive that revealed a man far stranger, and far more human, than the legend.

Mechanistic Interpretability

Key Ideas

Comprehension as the engine of prediction. For Newton, to predict and to understand were the same act. You could foretell a comet's return because you grasped the gravitational mechanism governing it; the successful prediction certified the understanding. Machine learning has pried these apart: a weather model can outpredict physics-based simulations without containing any physics, a protein-folding system predicts structures without modeling the forces that fold them, a language model predicts the next token in a legal argument without comprehending law. The divorce is genuine and disorienting. It was nearly unthinkable in the framework Newton built, and now it is a daily industrial fact.
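As a rough illustration of prediction flowing from a law, Newton's gravitation implies Kepler's third law, which for bodies orbiting the Sun reduces to T² = a³ when the period T is in years and the semi-major axis a is in astronomical units. The sketch below assumes a round figure of about 17.8 AU for Halley's comet and ignores the tugs of the planets, which in practice shift each return by a few years; the function name is a hypothetical one chosen for this example.

```python
def orbital_period_years(semi_major_axis_au):
    """Kepler's third law for bodies orbiting the Sun, in units where
    T is in years and a is in astronomical units: T^2 = a^3."""
    return semi_major_axis_au ** 1.5

HALLEY_A_AU = 17.8  # approximate semi-major axis; an assumed round figure

# Sanity check: a body at 1 AU (Earth) should take one year.
print(round(orbital_period_years(1.0), 2))
# The comet's period follows from the same law, with no data about past returns.
print(round(orbital_period_years(HALLEY_A_AU), 1))
```

One line of law, one measured orbit size, and the return date follows to within a few years. Nothing in the function was fitted to earlier sightings of the comet, which is the sense in which the prediction certifies the understanding.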

Prediction vs. Construction

Laws versus correlations. A Newtonian law states a necessary, mechanism-backed relationship that supports counterfactuals—if you doubled the mass, the force would double, whether or not anyone ever measures it. A correlation is a regularity in observed data carrying none of these guarantees. Neural networks are, at bottom, magnificent correlation engines. They detect and exploit statistical regularities at a scale no human could match, and the fluency-authority decorrelation—the model's tendency to assert false claims with the confidence of true ones—is exactly this: the correlation-is-not-causation problem wearing a linguistic face.
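The difference shows up as soon as a fitted regularity is asked about conditions it never saw. The sketch below is a deliberately simple stand-in, not a model of how neural networks work: it assumes an inverse-square law with constants folded into 1, fits a straight line to samples from a narrow range of distances, and then compares the two both inside and far outside that range. All names and ranges are illustrative.

```python
def law(r):
    """Inverse-square relationship, with constants folded into 1 for simplicity."""
    return 1.0 / (r * r)

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

# "Training data": distances only between 1.00 and 1.10.
train_r = [1.0 + 0.01 * i for i in range(11)]
a, b = fit_line(train_r, [law(r) for r in train_r])

def line(r):
    """The learned regularity: accurate only where it was fitted."""
    return a * r + b

in_dist_error = abs(line(1.05) - law(1.05))   # inside the sampled range
out_dist_error = abs(line(3.0) - law(3.0))    # far outside it
print(in_dist_error, out_dist_error)
```

Inside the sampled range the line is nearly indistinguishable from the law. At three times the distance it is badly wrong and even predicts a negative value, while the law answers the counterfactual without having been measured there.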

Neural Networks

Hypotheses non fingo. When pressed to explain why gravity acts across empty space, Newton refused to invent a mechanism he could not justify. The contemporary temptation runs the other way: to dress up opaque systems with confident post-hoc narratives, to let a model emit a plausible-sounding “explanation” of its own behavior that is itself just more generated text, unmoored from actual computation. Mechanistic interpretability is the field's most Newtonian response—the attempt to actually see what is happening inside, rather than narrating it.

The world-as-text wager. Newton's deepest faith—expressed through physics, alchemy, and biblical chronology alike—was that reality is fundamentally intelligible, fully writable, available to a sufficiently penetrating mind. Large language models work without finding any such compact structure, through billions of inscrutable parameters, suggesting that for the most complex and human-relevant domains there may be no short text to read. The success of the unintelligible method is the most unsettling evidence the wager has ever faced.

Debates & Critiques

The central debate Newton poses for our moment concerns whether the Newtonian standard, comprehension as the precondition of reliable prediction, was a universal truth or a local one, true of the simple corner of reality Newton happened to explore and false everywhere else. Optimists argue that emergent capabilities in large models constitute something like understanding, that a system which predicts text well enough must build an internal model amounting to genuine comprehension. The Newtonian counter is that this confuses the shadow for the substance: the model learns what typically follows from what, not why anything follows from anything, and it fails, confidently and without insight, the moment the world departs from its training distribution, exactly as Newton would predict of a system that knows correlations rather than laws.

A second question concerns interpretability: whether Newton's discipline of asserting only what the evidence forces, rather than inventing comfortable explanations, should require the AI field to say plainly that its systems work and it does not yet know why, or whether the post-hoc narratives these systems generate constitute a new kind of self-knowledge that Newton's framework cannot evaluate.

The deepest open question is the most Newtonian: whether the world is the kind of thing that can be fully written down, and whether a mind is.

The Newtonian Triad

Three fracture lines between Newton's science and machine learning
Fracture One
Comprehension vs. Prediction
Newton welded prediction and understanding into a single act: you forecast because you grasp the cause, and grasping the cause is what prediction proves. Machine learning has falsified this in the most concrete possible way—systems predict magnificently while comprehending, in Newton's sense, nothing. The divorce is not a bug. It is what the method is.
Fracture Two
Laws vs. Correlations
A Newtonian law holds under intervention and across all conditions; a correlation is a regularity that may dissolve the moment the world departs from the data it was learned from. Neural networks are correlation engines, and every hallucination, every confident wrongness, is the correlation-is-not-causation problem dressed in fluent prose.
Fracture Three
Transparency vs. Opacity
Newton believed that genuine knowledge is transparent knowledge—a law you can write on a card, a deduction you can check. Machine competence lives in billions of weights no one can read. The opacity is not incidental: it is the price of the method, and Hypotheses non fingo demands we name that price rather than paper over it.

Further Reading

  1. Isaac Newton, Philosophiæ Naturalis Principia Mathematica (1687; trans. I. Bernard Cohen and Anne Whitman, University of California Press, 1999)
  2. Isaac Newton, Opticks: or, A Treatise of the Reflections, Refractions, Inflections and Colours of Light (1704; Dover reprint, 1952)
  3. Richard Westfall, Never at Rest: A Biography of Isaac Newton (Cambridge University Press, 1980)
  4. James Gleick, Isaac Newton (Pantheon, 2003)
  5. Pedro Domingos, The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World (Basic Books, 2015) — for the contrast between rule-based and learned intelligence