
The cycle that began with [YOU] on AI asks what it would mean to take the orange pill—to see the machine clearly, without the narcotic of hype or the paralysis of fear. Sejnowski is among the clearest seers available, not because he was right about everything but because he watched his field be wrong about a great deal and built his judgment around the discipline of not trusting his intuition in a domain that punishes certainty. When he says the transformer architecture astonished him—that nobody expected it—he is being honest about the limits of foresight in a field where surprise is the rule. That honesty is worth more than any confident prophecy.
His lens reframes the cycle's central wager: that capable machines do not dissolve the human question but sharpen it. Sejnowski sharpens it toward learning. If intelligence is something that is learned from data rather than installed by design, then the line between the natural and the artificial mind runs through a process both share—and the question of what we are becomes inseparable from the question of how anything learns at all. His computational neuroscience framework insists on studying brains and machines as members of one family, each illuminating the other, and that two-way method is exactly what the cycle's comparative intelligence project requires.
The diagnosis he supplies on learning explains a pattern that has puzzled observers of contemporary AI: systems of extraordinary fluency that fail in absurd, basic ways. Sejnowski locates the failure not in inadequate scale but in the nature of what is learned. These systems distill the statistical shadow of human knowledge—the accumulated regularity of human text—without the embodied contact with the world that produced the text. The knowledge is real but secondhand, learned from the record rather than from the world, which is why a system can draft a competent legal brief and then invent a case that does not exist with equal confidence. It has learned the surface with extraordinary fidelity. It has not touched the substance. He stands in the cycle's gallery as the witness who built the early systems and can therefore testify, with authority no outsider could claim, about both how far they have come and exactly where they still are not.
Born in 1947 in Cleveland, Ohio, Sejnowski trained as a physicist and earned his doctorate at Princeton under John Hopfield—who would shortly demonstrate that ideas from statistical physics could describe how a network of neurons stores memories. The migration from physics to biology was the key to everything that followed. Physicists are trained to look past surface complexity for the simple law beneath, to treat a tangled system as the expression of a few underlying principles. Sejnowski applied that instinct to the brain and to its artificial imitations alike, refusing to let the messiness of biology or the artificiality of the machine obscure the shared question: how does a network of simple elements, none of them intelligent on its own, come to do intelligent things?
The approach was a minority position for most of his career. The dominant paradigm held that intelligence was symbol manipulation: write down knowledge as rules and have the machine reason over them. After a wave of early enthusiasm collapsed in the 1970s, neural networks were widely regarded as a refuted idea. Sejnowski worked through the lean years, building the Boltzmann machine with Geoffrey Hinton in 1985 and NETtalk with Charles Rosenberg in 1986, and demonstrating that networks could learn what no one could program. He was generous in retrospect about the critics: they had good reasons, rooted in the genuine limitations of small networks on slow hardware. What they could not see was that the failures were failures of scale, not of principle.
The hardware eventually caught up, driven in part by graphics chips developed for video games. When it did, the dormant approach woke almost overnight and did things that decades of symbolic engineering had failed to achieve. Sejnowski watched the vindication with characteristic candor: he had been right about the broad bet and repeatedly surprised by the specifics. The transformer architecture was a genuine astonishment. That language models trained only on text, with no grounding, could do as much as they do was, by his account, something nobody expected. Along the way he had moved to the Salk Institute, founded the journal Neural Computation in 1989, and written The Computational Brain with Patricia Churchland; he published his history of the era, The Deep Learning Revolution, as a witness who had endured both the defeat and the strange, partially unforeseen victory.
The Boltzmann machine and energy-based learning. Borrowing from Ludwig Boltzmann's physics of thermal systems, Sejnowski and Hinton built a network whose units set their states stochastically (noisy on purpose). The network learned by reducing the difference between the activity it produced when data was clamped onto it and the activity it produced when running free. In other words, the machine learned by closing the gap between its dream and the world. This is the conceptual ancestor of today's generative models: diffusion systems add and remove noise in a process that rhymes deeply with annealing a system toward low energy, and the insight that learning and generating are two faces of one competence, with the same connection strengths serving both, runs from 1985 to the present.
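The clamped-versus-free idea can be made concrete with a toy sketch. The code below is an illustration under loud assumptions, not a reconstruction of the 1985 system: it uses three fully visible units with no hidden units and no biases, and it computes the free-running statistics by exact enumeration of all states instead of by stochastic sampling and annealing. The function names, the toy data, and the learning-rate and step values are all hypothetical choices made for the example.

```python
import itertools
import numpy as np


def model_correlations(W):
    """Exact free-running statistics <s_i s_j> under p(s) ~ exp(-E(s)).

    Assumption: the network is tiny, so all 2^n states can be enumerated.
    A real Boltzmann machine would estimate these by stochastic sampling.
    """
    n = W.shape[0]
    states = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))
    # Energy E(s) = -1/2 * s^T W s, with W symmetric and zero on the diagonal.
    energies = -0.5 * np.einsum("ki,ij,kj->k", states, W, states)
    probs = np.exp(-energies)
    probs /= probs.sum()
    return np.einsum("k,ki,kj->ij", probs, states, states)


def train(data, steps=500, lr=0.2):
    """Adjust weights until free-running correlations match clamped ones."""
    n = data.shape[1]
    W = np.zeros((n, n))
    # Clamped phase: pairwise correlations with the data fixed on the units.
    clamped = data.T @ data / len(data)
    gaps = []
    for _ in range(steps):
        # Free phase: pairwise correlations the model produces on its own.
        free = model_correlations(W)
        grad = clamped - free
        np.fill_diagonal(grad, 0.0)  # no self-connections
        gaps.append(float(np.abs(grad).max()))
        W += lr * grad  # strengthen what the data shows, weaken what the model dreams
    return W, gaps


# Hypothetical toy data: units 0 and 1 usually agree; unit 2 agrees less often.
data = np.array(
    [
        [1, 1, 1], [1, 1, 1], [-1, -1, -1], [-1, -1, -1],
        [1, 1, -1], [-1, -1, 1],
        [1, -1, 1],
        [-1, 1, 1],
    ],
    dtype=float,
)

W, gaps = train(data)
print("initial gap:", gaps[0])
print("final gap:  ", gaps[-1])
```

The update is the difference between two sets of pairwise statistics, so learning stops exactly when what the network generates on its own is statistically indistinguishable, at the level of pairs, from what the data imposes on it. The same weights are used both to score the data and to generate from the model, which is the sense in which learning and generating share one set of connection strengths.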
NETtalk and the proof of principle. When NETtalk learned to read English aloud, progressing from babble to intelligible speech through exposure to examples, with no pronunciation rules provided, it made the connectionist thesis audible: meaningful internal representations are not designed in advance but emerge from the pressure to perform a task. The network had spontaneously organized its hidden units by phonetic category, separating vowels from consonants and grouping sounds by how they are made in the mouth. These are distinctions linguists recognize but that NETtalk had been told nothing about. The corollary that Sejnowski has never let go is that NETtalk did not know what the words meant: it had learned the map from spelling to sound without any contact with the things the words referred to. That limitation returns, vastly magnified, in today's language models.
Computational neuroscience as a two-way bridge. The founding conviction of computational neuroscience is that brains and machines that learn are best understood in conversation, each correcting and informing the other. When a deep network trained to recognize objects develops internal layers whose responses resemble those of neurons in the visual cortex, that resemblance is data. It does not prove the brain works the same way, but it constrains hypotheses and suggests experiments. The brain also teaches the machines what they still lack: it learns from a fraction of the data, runs on twenty watts, and learns new things without overwriting the old. Each gap marks something the brain knows how to do that the engineers have not yet discovered.
The primacy of learning. Sejnowski's deepest conviction, expressed across four decades, is that intelligence is not installed but acquired: the capacities we care about arise from adjustment in response to data, not from a design laid down in advance. A programmed system does what its designers specified; a learning system discovers competences its designers did not specify and often cannot describe. The knowledge in these systems is the accumulated regularity of human text or experience, compressed into weights: real but secondhand, learned from the record rather than the world. This explains both the power and the limit.
The grounding surprise. NETtalk taught Sejnowski that a system could learn the surface of language without touching its meaning, and he concluded that grounding was necessary for genuine understanding. He expected ungrounded systems to hit a wall. When large language models trained only on text did far more than the surface should allow, he acknowledged the surprise candidly. His interpretation is careful: human language encodes in its structure an enormous amount of the knowledge humans use it to express, so a system that has mastered the structure inherits much of that knowledge secondhand. The line between understanding and sophisticated symbol manipulation is blurrier than the field expected, and intellectual honesty requires admitting the surprise rather than explaining it away.