You On AI Field Guide · Terrence Sejnowski The You On AI Field Guide Home
TxtLowMedHigh
PERSON

Terrence Sejnowski

Physicist-turned-neuroscientist who co-invented the Boltzmann machine, built NETtalk, and helped found computational neuroscience—the man who built the early learning machines by hand, and has spent forty years refusing to mistake what they achieved for what they understood.
Terrence Sejnowski is one of a small number of people who can say, without exaggeration, that they helped build the thing the rest of us are now arguing about. Trained as a physicist at Princeton under John Hopfield, he crossed into neurobiology at Harvard Medical School and never fully left either world, holding throughout his career the double question: how does the brain learn, and how can a machine be made to learn the same way? In 1985 he co-invented the Boltzmann machine with Geoffrey Hinton—a stochastic, energy-based network that learned by reducing the gap between what it dreamed on its own and what the world had shown it—and two years later built NETtalk, a program that learned to read English aloud starting from babble, which people who heard it described as watching a machine cross a line they had assumed only living things could cross. He helped found computational neuroscience as a field, started the journal Neural Computation, and for decades led the NeurIPS conference where most of the breakthroughs of the modern era were first announced. What makes him indispensable to the cycle is his particular vantage: an optimist about the science and a realist about its limits, he will say in the same breath that the language models astonished even him and that you cannot trust your intuition about any of it—a discipline earned by half a century of being surprised by his own field, and the most useful posture available to anyone trying to think clearly about what we have made.
Terrence Sejnowski
Terrence Sejnowski

In the [YOU] on AI Field Guide

The cycle that began with [YOU] on AI asks what it would mean to take the orange pill—to see the machine clearly, without the narcotic of hype or the paralysis of fear. Sejnowski is among the clearest seers available, not because he was right about everything but because he watched his field be wrong about a great deal and built his judgment around the discipline of not trusting his intuition in a domain that punishes certainty. When he says the transformer architecture astonished him—that nobody expected it—he is being honest about the limits of foresight in a field where surprise is the rule. That honesty is worth more than any confident prophecy.

His lens reframes the cycle's central wager: that capable machines do not dissolve the human question but sharpen it. Sejnowski sharpens it toward learning. If intelligence is something that is learned from data rather than installed by design, then the line between the natural and the artificial mind runs through a process both share—and the question of what we are becomes inseparable from the question of how anything learns at all. His computational neuroscience framework insists on studying brains and machines as members of one family, each illuminating the other, and that two-way method is exactly what the cycle's comparative intelligence project requires.

Co-Evolution of Brain and Learning
Co-Evolution of Brain and Learning

The diagnosis he supplies on learning explains a pattern that has puzzled observers of contemporary AI: systems of extraordinary fluency that fail in absurd, basic ways. Sejnowski locates the failure not in inadequate scale but in the nature of what is learned. These systems distill the statistical shadow of human knowledge—the accumulated regularity of human text—without the embodied contact with the world that produced the text. The knowledge is real but secondhand, learned from the record rather than from the world, which is why a system can draft a competent legal brief and then invent a case that does not exist with equal confidence. It has learned the surface with extraordinary fidelity. It has not touched the substance. He stands in the cycle's gallery as the witness who built the early systems and can therefore testify, with authority no outsider could claim, about both how far they have come and exactly where they still are not.

Origin

Born in 1947 in Cleveland, Ohio, Sejnowski trained as a physicist and earned his doctorate at Princeton under John Hopfield—who would shortly demonstrate that ideas from statistical physics could describe how a network of neurons stores memories. The migration from physics to biology was the key to everything that followed. Physicists are trained to look past surface complexity for the simple law beneath, to treat a tangled system as the expression of a few underlying principles. Sejnowski applied that instinct to the brain and to its artificial imitations alike, refusing to let the messiness of biology or the artificiality of the machine obscure the shared question: how does a network of simple elements, none of them intelligent on its own, come to do intelligent things?

Ludwig Boltzmann

The approach was a minority position for most of his career. The dominant paradigm held that intelligence was symbol manipulation—write down knowledge as rules and have the machine reason over them. Neural networks were regarded, after a wave of early enthusiasm collapsed in the 1970s, as a refuted idea. Sejnowski worked through the lean years, building the Boltzmann machine with Geoffrey Hinton in 1985 and NETtalk with Charles Rosenberg in 1986, demonstrating that networks could learn what no one could program—that the approach was starved of scale, not broken in principle. He was generous in retrospect about the critics: they had good reasons, rooted in the genuine limitations of small networks on slow hardware. What they could not see was that the failures were failures of scale, not of principle.

Large Language Models
Large Language Models

The hardware eventually caught up, driven in part by graphics chips developed for video games. When it did, the dormant approach woke almost overnight and did things that decades of symbolic engineering had failed to achieve. Sejnowski watched the vindication with characteristic candor: he had been right about the broad bet and repeatedly surprised by the specifics. The transformer architecture was a genuine astonishment. The capacity of language models trained only on text to do as much as they do without grounding was, by his account, something nobody expected. He moved to the Salk Institute, founded the journal Neural Computation in 1989, wrote The Computational Brain with Patricia Churchland, and published his history of the era, The Deep Learning Revolution, as a witness who had endured both the defeat and the strange, partially unforeseen victory.

Emergent Capabilities
Emergent Capabilities

Key Ideas

The Boltzmann machine and energy-based learning. Borrowing from Ludwig Boltzmann's physics of thermal systems, Sejnowski and Hinton built a network whose units decided their states stochastically—noisy on purpose—and learned by reducing the difference between what it produced when the data was clamped onto it and what it produced when running free. The machine learned by closing the gap between its dream and the world. This is the conceptual ancestor of today's generative models: diffusion systems add and remove noise in a process that rhymes deeply with annealing a system toward low energy, and the insight that learning and generating are two faces of one competence—the same connection strengths serving both—runs from 1985 to the present.

The Disembodied Generative Model
The Disembodied Generative Model

NETtalk and the proof of principle. When NETtalk learned to read English aloud—arriving from babble at intelligible speech through exposure to examples, with no pronunciation rules provided—it demonstrated the connectionist thesis made audible: meaningful internal representations are not designed in advance but emerge from the pressure to perform a task. The network had spontaneously organized its hidden units by phonetic category, grouping vowels from consonants, sounds by how they are made in the mouth—distinctions linguists recognize but that NETtalk had been told nothing about. The corollary that Sejnowski has never let go is that NETtalk did not know what the words meant: it had learned the map from spelling to sound without any contact with the things the words referred to—a limitation that returns, vastly magnified, in today's language models.

Patricia Churchland

Computational neuroscience as a two-way bridge. The founding conviction of computational neuroscience is that brains and machines that learn are best understood in conversation, each correcting and informing the other. When a deep network trained to recognize objects develops internal layers whose responses resemble those of neurons in the visual cortex, that resemblance is data. It does not prove the brain works the same way, but it constrains hypotheses and suggests experiments. The brain also teaches the machines what they still lack: it learns from a fraction of the data, runs on twenty watts, and learns new things without overwriting the old. Each gap marks something the brain knows how to do that the engineers have not yet discovered.

Geoffrey Hinton

The primacy of learning. Sejnowski's deepest conviction, expressed across four decades, is that intelligence is not installed but acquired: the capacities we care about arise from adjustment in response to data, not from a design laid down in advance. A programmed system does what its designers specified; a learning system discovers competences its designers did not specify and often cannot describe. The knowledge in these systems is the accumulated regularity of human text or experience, compressed into weights: real but secondhand, learned from the record rather than the world. This explains both the power and the limit.

The grounding surprise. NETtalk taught Sejnowski that a system could learn the surface of language without touching its meaning—that grounding was necessary for genuine understanding. He expected ungrounded systems to hit a wall. When large language models trained only on text did far more than the surface should allow, he acknowledged the surprise candidly. His interpretation is careful: human language encodes in its structure an enormous amount of the knowledge humans use it to express, so a system that has mastered the structure inherits much of that knowledge secondhand. The line between understanding and sophisticated symbol manipulation is blurrier than the field expected—and intellectual honesty requires admitting the surprise rather than explaining it away.

Debates & Critiques

The deepest debate Sejnowski's work opens is whether scale will eventually bridge the gap between learning the surface of intelligence and possessing its substance. Optimists argue that sufficiently large models, trained on the traces of grounded human experience, already display something functionally equivalent to grounded understanding—that the distinction Sejnowski draws between secondhand knowledge and real knowledge is a distinction without a practical difference at sufficient resolution. Sejnowski counters with the same empirical humility he has always modeled: the field has repeatedly mistaken what was hard for what was easy and been humbled, and confidently predicting either that grounding is necessary or that it is dispensable has been proven wrong in both directions. A second debate concerns the two-way bridge of computational neuroscience: critics argue that the resemblances between artificial networks and brain recordings are superficial, artifacts of the training objective rather than genuine convergence on biological mechanism, while Sejnowski holds that even superficial resemblances constrain hypotheses in ways that matter scientifically. The sharpest unresolved problem is consciousness: Sejnowski, who built his career on the claim that the brain is a computational system, cannot say whether consciousness is implementation-independent—whether a sufficiently capable artificial system might have an inner life—or whether it depends on something specific to biology that no engineering can reproduce. He holds the question open with the same discipline he has brought to every other question the field kept declaring settled.

The Three Levels of a Learning System

Sejnowski's framework for reading brains and machines together
The Algorithm
What It Computes
The information-processing task a system solves — its objective and the strategy it uses. Sejnowski insists that asking what a brain or network computes is the right level of analysis, borrowing Marr's framework to make brains and machines commensurable.
The Mechanism
How It Learns
The specific procedure by which a system adjusts to data — stochastic energy minimization in the Boltzmann machine, backpropagation in deep networks, Hebbian plasticity in biology. Mechanisms can be compared across the two kinds of learning system, and the differences are as informative as the similarities.
The Substrate
What It Runs On
Silicon or wetware — and the differences are vast: neurons are slow, analog, and massively parallel; chips are fast, discrete, and largely sequential. Sejnowski insists on taking substrate seriously: the brain runs on twenty watts and learns from far less data, which means biology has discovered efficiencies the engineers have not.

Further Reading

  1. Terrence J. Sejnowski, The Deep Learning Revolution (MIT Press, 2018)
  2. Patricia S. Churchland & Terrence J. Sejnowski, The Computational Brain (MIT Press, 1992; 25th anniversary ed. 2016)
  3. David H. Ackley, Geoffrey E. Hinton & Terrence J. Sejnowski, “A Learning Algorithm for Boltzmann Machines,” Cognitive Science 9 (1985), 147–169
  4. Terrence J. Sejnowski & Charles R. Rosenberg, “Parallel Networks That Learn to Pronounce English Text,” Complex Systems 1 (1987), 145–168
  5. Barbara Oakley & Terrence Sejnowski, “Learning How to Learn” (Coursera, 2014–present) — among the most widely enrolled online courses ever offered
Explore more
Browse the full You On AI Field Guide — over 8,500 entries
← Home0%
PERSONBook →