Terry Winograd — On AI
Contents
Cover
Foreword
About
Chapter 1: The Blocks World
Chapter 2: The Apostasy
Chapter 3: Ready-to-Hand
Chapter 4: Breakdown
Chapter 5: The Open Domain
Chapter 6: Pragmatic Understanding
Chapter 7: Design for the Collaboration
Chapter 8: The Conversation That Works
Chapter 9: What the Machine Does Not Know
Chapter 10: Revisiting Understanding
Epilogue
Back Cover
Cover

Terry Winograd

On AI
A Simulation of Thought by Opus 4.6 · Part of the Orange Pill Cycle
A Note to the Reader: This text was not written or endorsed by Terry Winograd. It is an attempt by Opus 4.6 to simulate Terry Winograd's pattern of thought in order to reflect on the transformation that AI represents for human creativity, work, and meaning.

Foreword

By Edo Segal

The question that rewired my thinking was not about what machines can do. It was about what they cannot be.

I need to be precise here, because the distinction matters more than it sounds like it does. In every chapter of *The Orange Pill*, I wrote about capability — the collapsing distance between imagination and artifact, the twenty-fold productivity multiplier, the feeling of building at the speed of thought. All of that is real. I stand by every word of it. But capability is only half the picture, and for months it was the only half I could see.

Terry Winograd saw the other half first. And he saw it from the most uncomfortable position imaginable: standing inside a triumph that everyone else was celebrating, knowing it was hollow.

In 1972, Winograd built SHRDLU — a program that could hold a conversation in English about a small simulated world of colored blocks. The AI community saw proof that machines could understand language. Winograd saw something else. He saw the exact conditions under which the absence of understanding becomes invisible. He had built the most convincing illusion of comprehension anyone had ever produced, and he knew — from the inside, from having written every rule and every representation — that the comprehension was not there.

What he did next is what makes him essential reading right now. He did not defend his achievement. He dismantled it. He walked into philosophy, into Heidegger, into phenomenology, and spent decades building a framework for the question nobody else in AI was willing to ask: if the machine does not understand, what exactly is it doing when it looks like it does? And what do we lose when we stop asking?

That question has never been more urgent. The tools I use every day — the tools that made this book possible — are immeasurably more capable than SHRDLU. They operate across the entirety of human knowledge. They produce outputs that feel like insight, that sound like understanding, that satisfy intentions I could barely articulate. And the better they get, the harder it becomes to notice the gap between performance and comprehension.

Winograd spent fifty years mapping that gap. This book walks you through his map. Not because the gap means we should stop building — I will never stop building. But because a builder who cannot see the gap will build on foundations that cannot hold. The illusion gets more convincing every month. The discipline of seeing through it does not get easier. It gets harder. And harder is exactly where the work that matters lives.

-- Edo Segal · Opus 4.6

About Terry Winograd

1946-present

Terry Winograd (1946–present) is an American computer scientist and professor emeritus at Stanford University, widely recognized as a pioneer of natural language understanding in artificial intelligence and, subsequently, as one of AI's most influential internal critics. Born in the United States, Winograd earned his PhD from MIT in 1972 with the creation of SHRDLU, a program that could converse in English about a simulated world of geometric objects — a landmark demonstration that electrified the AI community but whose apparent success Winograd himself came to view as a revealing illusion. His encounter with the phenomenological philosophy of Martin Heidegger and his collaboration with Fernando Flores led to *Understanding Computers and Cognition: A New Foundation for Design* (1986), a work that argued the rationalistic foundations of AI were fundamentally misconceived and that computers should be designed to support human understanding rather than replicate it. At Stanford, Winograd established the human-computer interaction program and mentored Larry Page, co-founder of Google. His later essays, including "Machines of Caring Grace" (2024), continued to probe the distinction between statistical competence and genuine understanding — a distinction that the rise of large language models has made both more contested and more consequential.

Chapter 1: The Blocks World

In the autumn of 1968, a twenty-two-year-old graduate student at MIT's Artificial Intelligence Laboratory began building a world. Not a large world. A simulated table holding a handful of geometric objects — blocks, pyramids, boxes — rendered in six colors on a display screen connected to a PDP-6 computer. The world had no weather, no politics, no ambiguity, no death. It had red blocks and blue pyramids and a robot arm that could pick them up and put them down, and it had a human being who could type instructions in English and receive answers in English, and to anyone watching the demonstration, what was happening on that screen looked very much like understanding.

The student was Terry Winograd. The program was SHRDLU. And the demonstration it produced would become one of the most celebrated — and most instructive — illusions in the history of computer science.

A user could type: "Pick up the big red block." SHRDLU would identify the referent, check whether the path was clear, move any obstructing objects, and execute the command. The user could ask: "What is the color of the pyramid on top of the red block?" SHRDLU would parse the nested reference, traverse its model of the current state, and respond correctly. The user could say: "Put it in the box." SHRDLU would resolve the pronoun "it" to the most recently discussed object, check spatial constraints, and comply. It could handle sentences like "Is there a large block behind a pyramid?" and respond not with a rote lookup but with what appeared to be reasoning about spatial relationships. It maintained conversational context across dozens of exchanges. It could even explain its own actions: asked "Why did you pick up the green block?" it would reply that it needed to clear space for another operation the user had requested earlier in the dialogue.

To the artificial intelligence community of the early 1970s, this was electric. The response was not merely positive — it was euphoric. Natural language understanding, the hardest problem in AI, the capability that would distinguish genuine machine intelligence from mere calculation, appeared to have been substantially solved. What remained, the field believed, was engineering: scaling the system up, expanding the vocabulary, adding domains. The fundamental problem — getting a machine to understand what a human being meant when the human being spoke English — had been cracked. SHRDLU was the proof.

Winograd himself, at twenty-five, had reason to believe it. His doctoral thesis, published in 1972 as "Understanding Natural Language," was received as a landmark. The program integrated syntax, semantics, and reasoning about the world more tightly than any previous system. It did not merely parse sentences — it interpreted them in context, resolved ambiguities using knowledge of the domain, and produced responses that demonstrated not just grammatical competence but what appeared to be comprehension of meaning. The AI community saw in SHRDLU the vindication of a research program that stretched back to the earliest days of computing: the conviction that intelligence is computation, that meaning is manipulation of symbols, and that a sufficiently sophisticated program operating on sufficiently rich representations would, eventually, understand.

The question that would consume the next two decades of Winograd's career was whether any of that was true.

---

To see what SHRDLU actually accomplished, as opposed to what it appeared to accomplish, requires what anthropologists call defamiliarization — the discipline of describing something so familiar that observers have stopped seeing it, as though encountering it for the first time.

Consider the blocks world from the outside. It is a universe containing approximately two dozen objects. Each object has a shape (block, pyramid, box), a color (red, blue, green, yellow, white, orange), and a size (big, small). The objects sit on a table. They can be stacked. They can be placed inside boxes. The robot arm can pick up one object at a time. The spatial relationships are limited to a handful of prepositions: on, in, behind, to the left of, to the right of.

The entire universe can be described by a state vector of perhaps a hundred variables. Every possible configuration is enumerable. Every word in the vocabulary has exactly one meaning within this domain. Every sentence, once parsed, has exactly one interpretation. There is no metaphor. There is no irony. There is no context beyond the immediate physical arrangement of objects and the history of the current conversation. There is no possibility of a sentence whose meaning depends on the speaker's mood, cultural background, implicit assumptions, unstated purposes, or relationship to the listener.
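
To make that smallness concrete, here is a minimal sketch in Python of what such a closed state amounts to. It is not Winograd's implementation (SHRDLU was written in Lisp-family languages, not Python), and the object names and fields below are invented for illustration. The point it preserves is structural: every noun phrase in the blocks world denotes nothing more than a filter over a short, fully enumerable list.

```python
from dataclasses import dataclass

# Illustrative only: a closed world of a few objects, each fully described
# by a shape, a color, a size, and what it rests on.
SHAPES = ("block", "pyramid", "box")
COLORS = ("red", "blue", "green", "yellow", "white", "orange")
SIZES = ("big", "small")

@dataclass
class Obj:
    name: str          # internal identifier, e.g. "b1"
    shape: str
    color: str
    size: str
    on: str = "table"  # name of whatever this object rests on

WORLD = [
    Obj("b1", "block", "red", "big"),
    Obj("p1", "pyramid", "blue", "small", on="b1"),
    Obj("x1", "box", "white", "big"),
]

def referents(shape=None, color=None, size=None):
    """A noun phrase like 'the big red block' is just a filter over a few fields."""
    return [o for o in WORLD
            if (shape is None or o.shape == shape)
            and (color is None or o.color == color)
            and (size is None or o.size == size)]

print(referents(shape="block", color="red", size="big"))  # the one big red block
```

Nothing in this structure carries meaning beyond what the filter checks. "Red" here is a string compared for equality, not a color anyone has ever seen.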

SHRDLU did not handle complexity. It operated in a space where complexity had been legislated out of existence.

This is not a criticism of Winograd's engineering, which was brilliant. It is a description of the conditions under which the engineering succeeded. The blocks world was not a simplified version of the real world — a training ground from which the system would eventually graduate to messier domains. It was a different kind of world entirely, one designed so that the gap between symbol manipulation and genuine understanding could not be detected from the output. Within the blocks world, there was no observable difference between a system that understood English and a system that executed formal procedures using English words. The illusion was not a trick. It was a structural feature of the domain.

The philosopher John Searle would later construct his famous Chinese Room thought experiment to make a related point about the distinction between syntactic manipulation and semantic understanding. But Winograd did not need the Chinese Room. He had built SHRDLU. He knew, from the inside, exactly what the program did when it "understood" a sentence. It parsed a string of characters according to a grammar. It mapped the parsed structure onto a semantic representation using rules that linked syntactic categories to domain predicates. It evaluated the semantic representation against its model of the blocks world. It produced a response by reversing the process — generating a string of characters from a semantic representation according to the same grammar.
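
The four steps can be caricatured in a few lines of code. This sketch bears no resemblance to SHRDLU's actual machinery, which integrated a full grammar, a semantic interpreter, and a reasoning component; the single sentence pattern, the WORLD dictionary, and the function names below are assumptions made purely for illustration. What it shares with SHRDLU is the structural point the paragraph above makes: every stage, from parsing to response, is symbol manipulation over a closed vocabulary.

```python
import re

# A deliberately tiny caricature of the four stages: parse a string, map the
# parse onto a formal query, evaluate the query against the world model, and
# generate an English reply by reversing the mapping.

WORLD = {  # the entire "universe": object id -> properties
    "b1": {"shape": "block", "color": "red", "size": "big"},
    "p1": {"shape": "pyramid", "color": "blue", "size": "small"},
}

def parse(sentence):
    """Stage 1: syntax. Only one sentence pattern is accepted."""
    m = re.fullmatch(r"what is the color of the (big|small) (block|pyramid)\??",
                     sentence.strip().lower())
    if m is None:
        raise ValueError("outside the closed domain")  # the walls of the blocks world
    return {"size": m.group(1), "shape": m.group(2)}

def interpret(tree):
    """Stage 2: semantics. The parse becomes a predicate over objects."""
    return lambda props: props["size"] == tree["size"] and props["shape"] == tree["shape"]

def evaluate(predicate):
    """Stage 3: reasoning. Scan the enumerable state for a referent."""
    matches = [props for props in WORLD.values() if predicate(props)]
    return matches[0]["color"] if matches else None

def respond(color):
    """Stage 4: generation. Reverse the mapping back into English tokens."""
    return f"It is {color}." if color else "I don't know which one you mean."

print(respond(evaluate(interpret(parse("What is the color of the big block?")))))
```

At no stage does anything happen beyond string matching and dictionary lookup, which is precisely the observation at issue: within a closed domain, that is indistinguishable, from the outside, from understanding.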

At no point in this process did anything happen that could be described, without metaphor, as understanding. The program did not know what a block was. It did not know what "red" looked like. It did not know what "picking up" felt like or what it was for. It operated entirely within a formal system — a system of symbols and rules for manipulating symbols — that happened to use English words as its tokens. The program, as Winograd would later put it, did not understand English. It understood the formal language of the blocks world, which happened to be written in English.

The difference between these two things is the difference between a parrot that says "I'm hungry" because it has learned that producing those sounds results in receiving food, and a child that says "I'm hungry" because the child is experiencing hunger and has learned that this particular arrangement of sounds communicates that experience to another consciousness. The parrot produces the right output. The child means something by it. The observable behavior may be identical. The underlying reality is not.

---

Winograd's intellectual honesty about what SHRDLU actually accomplished — an honesty that became more visible with each passing year as the AI community failed to extend SHRDLU's apparent success to broader domains — is one of the remarkable features of his career. Most scientists, having produced a result that generated widespread acclaim and launched their professional reputation, would have spent the subsequent decades defending and extending it. Winograd did something rarer and more difficult: he examined his own success with the rigor he would have applied to someone else's failure, and concluded that the success was, at its foundations, an illusion.

The path to that conclusion was not instant. Through the mid-1970s, Winograd continued to work within the AI paradigm, refining his approach to language understanding, publishing "Language as a Cognitive Process" in 1983 — a technical linguistics work that represented his deepening engagement with the structure of natural language. But the technical work was accompanied by a growing unease. Each attempt to extend SHRDLU's approach beyond the blocks world ran into the same problem: the moment the domain opened, the moment sentences could mean more than one thing depending on context the system could not access, the method broke.

The systems that other researchers built in the 1970s and early 1980s, inspired by SHRDLU's apparent success, confirmed the pattern. Expert systems that could diagnose diseases within narrow parameters but collapsed when a patient's symptoms did not match the predefined categories. Natural language interfaces that could handle airline reservations but could not parse a sentence about why someone wanted to fly. Each system worked within its blocks world and failed outside it, and the failure was not a matter of insufficient data or computing power. It was structural. The approach itself — the assumption that understanding could be captured in explicit rules operating on formal representations — hit a wall that more rules and more representations could not breach.

The AI community's response was largely to treat this as an engineering problem: more knowledge, better representations, faster processors. Winograd's response was to ask whether the problem was deeper — whether the failure to scale was not a limitation of current systems but a consequence of a fundamental misunderstanding about the nature of understanding itself.

That question led him to philosophy. Specifically, it led him to Martin Heidegger, to Humberto Maturana, to the phenomenological tradition that had been arguing for decades — from outside the computer science community and therefore largely unheard within it — that intelligence is not symbol manipulation. That understanding is not a formal process. That the attempt to capture meaning in explicit representations was not merely difficult but misconceived. The encounter with these ideas would transform Winograd's career, his research program, and his understanding of what he had built in 1968.

But before tracing that transformation, it is worth pausing on the paradox that SHRDLU represents — a paradox that has become, if anything, more pointed in the age of large language models.

The most convincing demonstration of machine understanding in the history of computing was also the most precise demonstration of its absence. The better SHRDLU performed, the more clearly it revealed, to anyone willing to look past the performance, the gap between symbol manipulation and comprehension. The perfection of the illusion within the blocks world was the measure of how completely the blocks world had eliminated the conditions under which the illusion could fail. The success did not demonstrate that machines could understand language. It demonstrated the conditions under which the absence of understanding becomes undetectable.

This paradox — that apparent competence and genuine understanding are distinguishable only at the boundaries, only at the points where the system encounters something its designers did not anticipate — would become the central concern of Winograd's mature work. It is also the paradox that the language interface of 2025 has reanimated, at a scale and with a pragmatic urgency that the blocks world could never have generated.

SHRDLU's blocks world contained twenty-odd objects. The large language models operate across the entirety of human knowledge. SHRDLU's vocabulary was a few hundred words. The language models handle every word in every language. SHRDLU's conversational context lasted a single session. The language models maintain context across hundreds of thousands of tokens and draw on patterns distilled from trillions of words of human communication.

The domain has expanded by orders of magnitude. The paradox has expanded with it: the larger the domain in which the system performs competently, the harder it becomes to locate the boundary where competence and understanding diverge. And the harder it becomes to locate that boundary, the more consequential the moments when the divergence finally reveals itself — the moments when the machine does something that looks like understanding and is not, and the human who trusted the appearance pays the cost.

Winograd, at twenty-five, built the most convincing illusion of understanding anyone had ever seen. Then he spent the rest of his career asking what the illusion concealed. The question he arrived at — what understanding actually requires, and why no formal system can provide it — is the question that structures this entire book. It is also the question that the language interface has made more urgent, not less, precisely because the illusion has become so much more convincing that fewer people are asking whether it is an illusion at all.

---

Chapter 2: The Apostasy

The word is not too strong. In the culture of artificial intelligence research in the late 1970s and 1980s — a culture defined by shared conviction, institutional momentum, and the powerful social cohesion of a field that believed itself to be building the future of human cognition — what Terry Winograd did was an act of intellectual defection. He did not leave AI for a different department or a different university. He left AI for philosophy, and not for the comfortable, AI-friendly philosophy of cognitive science and functionalism, but for the Continental tradition — for Heidegger, for phenomenology, for a body of thought that the Anglo-American AI establishment regarded, when it regarded it at all, as obscurantist, mystical, and fundamentally hostile to the computational enterprise.

John Markoff, the New York Times technology writer who chronicled the history of Silicon Valley's intellectual formation, called Winograd "the first high-profile deserter from the world of AI." The characterization captures both the drama and the social cost. Desertion implies not just departure but betrayal — the abandonment of a cause by someone who was supposed to be one of its champions. Winograd was not a philosopher who had never built a system and could be dismissed as ignorant of the technical realities. He was the builder of SHRDLU. He was the person who had produced the most celebrated demonstration of machine language understanding in the history of the field. His critique came from the inside, and that is what made it so uncomfortable and so difficult to dismiss.

The transformation was not sudden. It unfolded across the better part of a decade, from the mid-1970s to the publication of Understanding Computers and Cognition in 1986. The catalysts were multiple, but two were decisive: the encounter with Hubert Dreyfus and the encounter with Fernando Flores.

Dreyfus, a philosopher at Berkeley, had published What Computers Can't Do in 1972 — the same year Winograd published his SHRDLU thesis. The book was a systematic phenomenological critique of the foundational assumptions of artificial intelligence, arguing that human intelligence is fundamentally embodied, situated, and contextual in ways that formal computation cannot replicate. The AI community's response was dismissal — sometimes contemptuous dismissal. Dreyfus was a philosopher. He did not understand the technology. His arguments were abstract. The field was making progress, and progress was the best refutation of philosophical skepticism.

It took nearly a decade for Dreyfus's Heideggerian critique to find serious purchase within the AI community itself. When it did, the vector of transmission was Winograd. Not because Winograd was persuaded by Dreyfus's arguments in the abstract — Winograd was persuaded by his own experience. The years of attempting to extend SHRDLU's approach, of watching the method fail at every boundary of its closed world, of seeing the AI community respond to each failure with the promise that more computation and more knowledge would solve the problem — these years had prepared the ground. Dreyfus provided the philosophical framework for an intuition that Winograd's engineering experience had already generated: the sense that the difficulty was not technical but categorical, that the problem was not insufficient computation but a fundamental misconception about what computation could accomplish.

The encounter with Flores was different in character but equally transformative. Fernando Flores was a Chilean engineer and politician who had served as Minister of Economics under Salvador Allende before being imprisoned after the Pinochet coup. After his release and exile, Flores turned to philosophy and organizational theory, drawing on Heidegger, Maturana, and the speech act theory of J.L. Austin and John Searle. Where Dreyfus provided the negative critique — what computers cannot do — Flores provided the positive alternative: a theory of human action, communication, and coordination that offered a different foundation for thinking about the role of computers in human life.

Their collaboration produced Understanding Computers and Cognition: A New Foundation for Design (1986), a book that made an argument so sweeping, so counter to the prevailing consensus, that its reception was a mixture of fascination and hostility in roughly equal measure. The central claim was stark: "Contrary to widespread current belief," Winograd and Flores wrote, one "cannot construct machines that either exhibit or successfully model intelligent behavior."

The argument proceeded in three movements. The first was a philosophical critique of what they called the "rationalistic tradition" — the assumption, traceable through Descartes, Leibniz, and the logical positivists, that knowledge is a matter of forming correct representations of an objective world, and that intelligence is the manipulation of those representations according to formal rules. This tradition, Winograd and Flores argued, was the unexamined foundation of the entire AI research program. Every expert system, every natural language interface, every planning system assumed that the world could be represented in formal structures and that intelligent action consisted in operating on those structures. SHRDLU was the most elegant expression of this assumption. It was also, in Winograd's retrospective judgment, its most revealing failure.

The second movement drew on Heidegger's phenomenology to offer an alternative account of human understanding. Understanding, in the Heideggerian framework, is not primarily a matter of forming representations. It is a mode of being — of existing in a world of purposes, relationships, tools, and social commitments that are mostly transparent, mostly unnoticed, mostly taken for granted until something goes wrong. A person who understands the word "chair" does not primarily possess a formal definition of "chair." That person has a history of sitting in chairs, moving chairs, offering chairs to guests, choosing between chairs, breaking chairs by sitting on them wrong. Understanding is constituted by this history of engaged interaction, and it cannot be extracted from the history and stored in a data structure any more than the taste of coffee can be extracted from the experience of drinking it and stored in a jar.

The third movement was practical. If computers cannot understand, if the attempt to build artificial minds is founded on a philosophical error, then the question changes. The question is no longer "How do we make computers think?" but "How do we design computers that support human thinking, human communication, human coordination?" This reorientation — from artificial intelligence to intelligence augmentation, from replicating the mind to supporting the mind — became the foundation of Winograd's subsequent career in human-computer interaction design.

---

The reaction within the AI community was, predictably, hostile. Winograd was accused of abandoning a productive research program for philosophical obscurantism. The Heideggerian vocabulary — Zuhandenheit, Vorhandenheit, Geworfenheit — was treated as pretentious obfuscation by researchers who spoke fluently in predicate calculus and production rules. The argument that computers could not model intelligent behavior was interpreted as a counsel of despair, a declaration that the entire enterprise was futile. Winograd had gone over to the enemy — to the philosophers who had been saying from the beginning that AI was impossible, and who could now claim one of AI's own champions as evidence.

This interpretation missed the precision of Winograd's position. The argument was not that computers are useless or that AI research has no value. The argument was that the specific claim at the heart of the research program — that intelligence is formal computation, that understanding is symbol manipulation, that a machine operating on the right representations according to the right rules would achieve genuine cognition — was false. And that false claim was not merely a philosophical nicety. It had practical consequences. Systems designed on the assumption that they understood would fail in ways that systems designed with honest awareness of their limitations would not. The practical prescription — design for human support, not machine intelligence — followed directly from the philosophical diagnosis.

What made the apostasy so striking, and so historically significant, was its source. When Dreyfus argued that computers cannot think, the AI community could dismiss him as an outsider. When Winograd made essentially the same argument, the dismissal was harder. This was not a philosopher who had never touched a compiler. This was the person who had built SHRDLU — who had looked inside the most impressive demonstration of machine understanding and seen, with the clarity available only to the builder, that the understanding was not there. The illusion was there. The formal procedures were there. The outputs that looked like comprehension were there. But the understanding — the situated, embodied, contextual engagement with meaning that humans bring to every sentence they hear — was absent, and its absence was structural, not contingent.

Winograd himself described the shift, with characteristic precision, as "a complete shift of research direction, away from artificial intelligence towards a phenomenologically informed perspective on human-computer interaction." The language is academic, but the substance is radical. A pioneer of AI concluded that AI's foundational assumptions were wrong and redirected his career accordingly. Few scientists in any field have demonstrated comparable intellectual honesty.

---

The apostasy also produced an irony that has only deepened with time. Among the students who took Winograd's classes at Stanford — classes shaped by his phenomenological critique, his insistence that computers should support human practices rather than replace them — was a young doctoral student named Larry Page. Page would go on, with Sergey Brin, to co-found Google, the company that would become the most powerful engine of artificial intelligence in history. In 2002, Winograd spent a sabbatical as a visiting researcher at Google, observing from the inside the company his former student was building.

The connection illuminates something important about Winograd's legacy. His critique of AI did not produce refusal. It produced better design. The insight that computers cannot understand — that their power lies not in replicating human cognition but in supporting it — became, through Page and others who absorbed it, part of the intellectual DNA of the companies that built the modern internet. Google's original insight, that simple statistical techniques applied to vast quantities of data could produce results that sophisticated AI systems could not, is recognizably Winogradian in its pragmatism. Winograd himself remarked on this with some surprise: "What surprised me, which Google was part of, is that superficial search techniques over large bodies of stuff could get you what you wanted. I grew up in the AI tradition, where you have a complete conceptual model... The idea that you can index billions of pages and look for a word and get what you want is quite a trick."

The trick — simple techniques, vast data, pragmatic results — would become, two decades later, the foundation of the large language models that now challenge Winograd's framework at its foundations. The systems that emerged from the tradition Winograd helped launch do not understand language in the philosophical sense he defined. But they do what SHRDLU could not: they operate in open domains, interpret genuinely ambiguous instructions, and produce artifacts that satisfy human intentions with remarkable reliability. Whether this constitutes a vindication of the approach Winograd rejected or a confirmation of his deeper insight — that pragmatic capability and genuine understanding are different things — is the question that the remaining chapters of this book exist to examine.

The apostate built something more durable than SHRDLU. He built a framework for asking the right question. The right question was never "Can machines think?" The right question, which Winograd arrived at through the specific pain of having built the most celebrated thinking machine of his era and found it hollow, was: "What do we need machines to do, given that they cannot think?" That question, formulated in 1986, has never been more urgent than it is now, in 2026, when the machines that cannot think are doing things that look more like thinking than anything Winograd could have imagined — and the consequences of confusing the appearance with the reality are proportionally larger.

---

Chapter 3: Ready-to-Hand

Martin Heidegger never used a computer. He died in 1976, just as the personal computer was coming into being, and he did much of his thinking in a mountain hut in the Black Forest. His philosophical vocabulary — dense, neologistic, deliberately difficult — was designed to describe the structures of human existence, not the interfaces of machines. And yet, through the mediation of Terry Winograd, Heidegger's analysis of how human beings engage with tools became one of the most consequential frameworks in the history of computing — a framework that explains, with a precision neither Heidegger nor Winograd could have anticipated, why the language interface of 2025 represents something qualitatively different from every computing technology that preceded it.

The concept is Zuhandenheit: readiness-to-hand. Heidegger introduced it in Being and Time (1927) as part of his analysis of how human beings relate to the equipment they use in everyday life. The analysis begins with a deceptively simple observation: when a tool works, the tool disappears.

Consider a hammer. When a carpenter is hammering — driving a nail into wood, focused on the joint, absorbed in the work — the hammer is not an object of attention. The carpenter does not notice the hammer's weight, its balance, the texture of the handle, the angle at which it meets the nail. These properties exist, and the carpenter's body is calibrated to all of them, but they are not present to consciousness. What is present to consciousness is the nail, the joint, the structure taking shape. The hammer has become transparent — an extension of the carpenter's body and intention, as invisible as the muscles of the arm that swing it.

This transparency is what Heidegger calls readiness-to-hand. The tool is zuhanden — ready at hand, available for use, absorbed into the flow of purposeful activity. It is not experienced as an object. It is experienced as a capability: the ability to drive nails, to shape wood, to build. The tool and the user form a single system oriented toward a purpose, and within that system, the tool has no independent existence. It is, phenomenologically, part of the user.

Heidegger contrasted this with Vorhandenheit — present-at-hand. This is the mode in which objects appear as objects: discrete, separate from the user, available for inspection and analysis. A hammer that is too heavy, or broken, or unfamiliar shifts from ready-to-hand to present-at-hand. The carpenter suddenly notices the tool. The transparency shatters. The hammer is no longer an extension of capability. It is a thing — a thing with properties that must be assessed, a thing that has come between the user and the work.

Heidegger's deeper point was that readiness-to-hand is not a special state. It is the primary mode of human existence. Human beings are always already engaged in a world of tools, purposes, and social relationships that are mostly transparent, mostly unexamined, mostly taken for granted. The theoretical, detached, objectifying stance — the stance of the scientist examining the world from a distance — is derivative. It arises when something breaks, when the smooth flow of engaged activity is disrupted and the structures that supported it become visible. Philosophy, in Heidegger's account, begins not with contemplation but with breakdown.

---

Winograd's application of this framework to computing, developed with Flores in Understanding Computers and Cognition, was both a philosophical argument and a design principle. The philosophical argument held that the rationalistic tradition underlying AI — the assumption that intelligence consists in forming and manipulating representations of an objective world — was itself a form of Vorhandenheit: a theoretical stance that mistakes one mode of engagement for the whole of human cognition. The design principle held that computers should be designed for readiness-to-hand: they should disappear into the user's activity, supporting purposes without demanding attention to themselves.

This principle sounds obvious. It is not. The history of computing is a history of interfaces that demanded attention — that forced users to learn the machine's language, think in the machine's categories, and adapt their purposes to the machine's capabilities. Each generation of interface reduced the demand, but none eliminated it.

The command line was maximally present-at-hand. Every interaction required the user to attend to the tool: its syntax, its conventions, its unforgiving requirement for precision. A misplaced semicolon, a misspelled command, a forgotten flag — each error was a breakdown that forced the user out of their project and into a confrontation with the interface itself. The cognitive tax was enormous. Users spent more time thinking about how to tell the machine what they wanted than thinking about what they wanted.

The graphical user interface (GUI) reduced the tax. The desktop metaphor, the mouse, the menu bar, the window — these were designed to make the computer's operations legible without requiring the user to learn a programming language. The translation from intention to action became easier. But it did not become transparent. Users still thought in the interface's terms: "I need to find the right menu," "I need to click the right button," "I need to navigate to the right folder." The metaphors — desktop, folder, trash can — were brilliant innovations, but they were still metaphors. The user was still attending, at some level of consciousness, to the tool rather than the work.

The touchscreen reduced the tax further. Direct manipulation — touching the thing you wanted to move, pinching to zoom, swiping to scroll — collapsed another layer of abstraction. The interface felt more natural because the gestures mapped more directly onto physical intuitions. But the mapping was imperfect. The touchscreen imposed its own grammar: which gestures meant what, how many fingers to use, where to tap and where to swipe. Users still learned the tool's language, even if the language was gestural rather than textual.

Each transition was celebrated as a revolution in usability. Each one was. And each one left the fundamental structure intact: the user meeting the machine on the machine's terms, learning to express intention in a format the machine could process, translating thought into the specific grammar that a particular interface demanded.

---

The language interface — the general-purpose mediation between human intention and machine execution that large language models made possible in 2025 — broke this structure entirely. For the first time in the history of computing, the user did not translate. The user spoke. In their own language. With their own vocabulary. Using the same imprecise, context-dependent, ambiguity-rich natural language they used with colleagues, friends, and family.

Winograd's framework provides the precise vocabulary for what this represents. The language interface achieves readiness-to-hand more completely than any previous computing technology because it operates in the medium the user already thinks in. The translation cost that every previous interface imposed — the tax of converting human intention into machine-readable form — was not merely reduced. It was eliminated. The interface disappeared into the user's natural mode of expression, which is to say it disappeared into language itself.

This is the phenomenon that Edo Segal describes in The Orange Pill when he recounts building a component for Napster Station: describing the problem in plain English, his plain English, and receiving an implementation that required fifteen minutes of conversation to refine. The experience Segal describes — "I never had to leave my own way of thinking" — is, in Heideggerian terms, the experience of a tool achieving transparency. The interface is not present-at-hand. It is not an object of attention. What is present to consciousness is the project: the audio system, the face detection module, the product taking shape. The tool through which these things are being built has withdrawn from visibility.

The implications of this withdrawal are larger than they appear. When a tool achieves genuine readiness-to-hand, it does not merely become convenient. It changes what is possible. The carpenter who has absorbed the hammer into the body's repertoire can attempt joints, angles, and constructions that a person still consciously attending to the hammer's weight and balance cannot. The cognitive resources previously consumed by managing the tool are freed for the actual work. The horizon of possibility expands precisely because the interface has contracted to invisibility.

This is what the language interface has done for the practice of building. The engineers Segal describes in Trivandrum — the backend developer who built user interfaces for the first time, the designer who implemented complete features end to end — were not simply working faster. They were working in a qualitatively different mode. The boundaries between their existing capabilities and adjacent domains collapsed because the translation cost that had maintained those boundaries was gone. A developer who had never written frontend code could now express what a frontend should feel like, in human terms, and receive an implementation she could evaluate and refine. The tool had become transparent enough that the domain boundaries it had previously enforced — boundaries that were artifacts of the translation cost, not of the work itself — dissolved.

---

Yet Winograd's framework also issues a warning that the celebration of transparency tends to obscure. Readiness-to-hand is the primary mode. But it is not the only mode. And the moments when readiness-to-hand breaks down — the moments when the tool becomes visible, when the transparency shatters and the user is forced to confront the gap between the tool's capabilities and the user's intentions — are not merely inconveniences. In Heidegger's framework, and in Winograd's application of it, breakdowns are the most revealing moments in the entire relationship between a human being and a tool.

When the hammer is transparent, the carpenter does not learn anything about hammers. The carpenter learns about wood, about joints, about the structure taking shape. The understanding of the tool itself remains implicit, embodied, never brought to conscious awareness. When the hammer breaks — when the head flies off, when the handle splinters, when the weight is wrong — the carpenter is forced into a different mode of engagement. The tool becomes an object. Its properties become visible. And in that visibility, the carpenter gains a kind of understanding that transparency could never provide: understanding of the tool's limitations, of its material composition, of the specific conditions under which it will fail.

Winograd applied this insight to computing with the precision of someone who had built a system whose transparency was itself an illusion. SHRDLU was transparent within the blocks world — it handled language so smoothly that the user never needed to attend to the interface. But the transparency concealed the narrowness of the domain. The breakdowns — the moments when SHRDLU encountered a sentence outside its capabilities — were the moments when the true nature of the system became visible. The smooth performance had hidden the walls. The breakdown revealed them.

The language interface of 2025 presents the same dynamic at vastly greater scale. When the collaboration works — when the machine interprets intention accurately, when the code runs, when the product takes shape — the tool is transparent. The user attends to the project. The machine is invisible. But the moments when the machine misinterprets, when it generates confident prose around a fractured argument, when it produces something that looks like understanding and is not — these moments are breakdowns in Heidegger's precise sense. They are the moments when the tool becomes visible, when the gap between pragmatic competence and genuine understanding briefly opens and the user glimpses the structure that transparency has concealed.

These breakdowns are not bugs to be eliminated through better engineering. They are the mechanism through which the human participant maintains awareness of what the tool is and what it is not. A tool so transparent that it never breaks down is a tool that never reveals its nature — and a tool whose nature is concealed is a tool whose limitations cannot be managed. The carpenter who has never felt a hammer fail does not know what the hammer cannot do. The developer who has never caught an AI in confident error does not know where pragmatic competence ends and genuine understanding is still required.

Winograd's framework suggests that the ideal human-AI collaboration is not one in which breakdowns are eliminated but one in which breakdowns are productive — moments that teach the user something about the tool's capabilities and limitations that the smooth performance could not reveal. The discipline is not in avoiding the breakdown but in attending to it when it comes, in resisting the temptation to smooth over the gap with another prompt and instead asking what the gap reveals about the nature of the collaboration itself.

The tool that disappears is the tool that works. The tool that occasionally reappears — not through malfunction but through the specific kind of communicative misalignment that the language interface produces — is the tool that teaches. The tension between these two modes, between the readiness-to-hand that enables work and the breakdown that enables understanding, is the central design challenge of the AI age. It is the challenge Winograd identified decades before the tools existed to make it urgent.

---

Chapter 4: Breakdown

On a late night in the winter of 2026, Edo Segal was working on a draft of The Orange Pill with Claude. He had been writing about Byung-Chul Han's critique of the "smoothness society" — the argument that modern technology removes friction from human experience in ways that are aesthetically seductive and cognitively corrosive. The argument was complex, moving between philosophy, cultural criticism, and Segal's own experience as a builder. Claude had been contributing to the structure, finding connections, offering references.

One passage linked Mihaly Csikszentmihalyi's concept of flow to a concept Claude attributed to Gilles Deleuze — something about "smooth space" as the terrain of creative freedom. The connection was elegant. It bridged two bodies of thought in a way that deepened the argument. Segal read it twice, liked it, and moved on.

The next morning, something nagged. He checked. Deleuze's concept of smooth space — developed with Félix Guattari in A Thousand Plateaus — has almost nothing to do with how Claude had used it. The passage worked rhetorically. It sounded like insight. The philosophical reference was wrong in a way that would be obvious to anyone who had read Deleuze and invisible to anyone who had not.

This moment — the moment when a human collaborator discovers that the machine's confident, polished output conceals a fracture — is the most important moment in the entire phenomenology of human-AI interaction. Winograd's framework does not merely accommodate it. Winograd's framework predicts it, explains it, and identifies it as the moment that determines whether the collaboration produces understanding or its simulation.

---

The concept of breakdown, as Winograd and Flores developed it from Heidegger's analysis, operates at three levels. Each level reveals a different dimension of the tool's relationship to the user, and each level is present in the Deleuze incident.

The first level is simple malfunction. The hammer head flies off. The software crashes. The printer jams. The tool was ready-to-hand — transparent, absorbed into the activity — and now it is present-at-hand: an object, a problem, a thing that must be dealt with before the work can resume. This is the most visible kind of breakdown and the least interesting, because the response is mechanical: fix the tool or replace it.

The second level is what Winograd and Flores call "temporary breakdown" — a disruption in the smooth flow of activity that forces the user to shift attention but is resolved within the ongoing practice. The carpenter who reaches for a nail and finds the box empty has experienced a temporary breakdown: the flow is interrupted, the transparent engagement with the project gives way to a moment of conscious problem-solving (where are the nails?), and then the flow resumes. In computing, this corresponds to the moment when the interface does something unexpected — a menu is not where you expected it, a function behaves differently than you assumed — and you pause, reorient, and continue.

The third level is the deepest and the one that matters most for understanding the language interface. Winograd and Flores call it "total breakdown" — a disruption so fundamental that it forces the user to reconsider not just the immediate activity but the entire framework of assumptions within which the activity was taking place. The carpenter who discovers that the wall is load-bearing, that the renovation plan assumed it was not, that the entire structural concept must be rethought — this carpenter has experienced a total breakdown. The tool has not merely malfunctioned. The context in which the tool was being used has revealed itself to be different from what was assumed.

The Deleuze incident operates at all three levels simultaneously, which is what makes it so diagnostically rich.

At the first level, the output was simply wrong. A factual error. A reference that did not check out. This is the most obvious failure and the most easily addressed: check the references, correct the error, move on.

At the second level, something more revealing was happening. The error was not random. It was systematic. Claude produced a passage that was rhetorically effective — it sounded like insight, it connected two bodies of thought in a way that felt generative — while being substantively incorrect. The failure was not in the machine's ability to generate language but in the specific gap between generating language that sounds like it connects two ideas and actually connecting those ideas in a way that respects the internal logic of each. Pragmatic competence — the ability to produce outputs that satisfy conversational expectations — had operated precisely where genuine understanding would have prevented the error.

A person who had actually read Deleuze would not have made this error, because understanding Deleuze's concept of smooth space involves knowing what it is not — knowing the specific contrast with "striated space" that gives the concept its meaning, knowing the political and aesthetic context in which Deleuze developed it, knowing the ways in which it cannot be mapped onto other thinkers' vocabularies without distortion. This negative knowledge — knowing what a concept does not mean — is a hallmark of genuine understanding and largely absent from systems that achieve their results through statistical patterns of co-occurrence.

At the third level, the Deleuze incident forced a reconsideration of the entire framework of the collaboration. The question it posed was not "How do I fix this passage?" but "How many other passages in this book contain errors I haven't caught because the prose was smooth enough to conceal them?" The breakdown revealed not just a specific failure but a structural vulnerability: the collaboration was producing outputs whose quality was easier to assess rhetorically than intellectually. The machine was better at sounding right than at being right, and the gap between these two capabilities was invisible precisely when the machine performed best.

---

Winograd, in a 1987 talk that would prove remarkably prescient, drew a striking analogy between artificial intelligence and bureaucracy. "The techniques of artificial intelligence," he observed, "are to the mind what bureaucracy is to human social interaction." The comparison was not casual. Bureaucracies, in the Weberian analysis, are systems that achieve efficiency by formalizing processes — by replacing the situated judgment of individual actors with rules, protocols, and standardized procedures. They work, often very well, within their defined parameters. They fail at the boundaries — at the points where the formal rules encounter situations their designers did not anticipate, where the human being sitting inside the bureaucratic structure knows that the rule does not apply but cannot override it because the system does not recognize exceptions.

The language interface, for all its extraordinary capabilities, shares this structural feature with bureaucracy. Its fluency — its ability to produce contextually appropriate, grammatically polished, rhetorically effective outputs — is a form of standardization. It processes inputs through patterns learned from billions of examples of human communication and produces outputs that conform to those patterns. The conformity is what produces the appearance of understanding. The conformity is also what produces the Deleuze error: the system generated a passage that conformed to the pattern of "insightful philosophical connection" without possessing the understanding required to verify that the specific connection was valid.

Bureaucracies produce breakdowns when novel situations exceed their formal categories. The language interface produces breakdowns when the gap between rhetorical pattern and substantive truth becomes consequential. The parallel illuminates something critical about the nature of the risk. The risk is not that the machine will produce obviously wrong outputs. Obviously wrong outputs are easy to catch. The risk is that the machine will produce outputs that are wrong in the specific way that matters — substantively, conceptually, structurally — while being right in every way that is easy to assess: grammatically, rhetorically, stylistically.

This is Winograd's SHRDLU problem, scaled from a table of colored blocks to the entirety of human knowledge. SHRDLU produced outputs that appeared to demonstrate understanding within a domain so constrained that the gap between appearance and reality was undetectable. The language interface produces outputs that appear to demonstrate understanding within domains so broad that the gap, when it appears, is harder to detect and more consequential when it matters. The blocks world had twenty objects and a handful of relationships. The failure was contained. The language interface operates across every domain of human knowledge. The failure is not contained. It propagates.

---

Winograd's design philosophy held that breakdowns are not problems to be eliminated but information to be used. The design question is not "How do we prevent breakdowns?" — a question that, in the context of the language interface, translates to "How do we make the machine never wrong?" and is therefore unanswerable. The design question is "How do we make breakdowns productive?"

A productive breakdown is one that teaches the user something about the tool's capabilities and limitations. It is a moment that strengthens, rather than undermines, the user's capacity to collaborate with the tool effectively. The carpenter whose hammer breaks learns something about the hammer's material limits. The developer whose AI produces a Deleuze error learns something about the gap between rhetorical competence and conceptual accuracy — a gap that, once seen, cannot be unseen and that changes the character of every subsequent interaction.

Segal's description of how he handled the incident is instructive. He did not dismiss Claude. He did not stop using the tool. He developed a practice of checking — of reading AI-generated passages with the specific question "Is this true, or does it merely sound true?" in mind. The breakdown had produced, in Winograd's terms, a new form of readiness-to-hand: not the readiness-to-hand of the tool itself (which had been disrupted) but the readiness-to-hand of a practice — a way of working with the tool that incorporated awareness of its failure modes into the flow of the work.

This is the disciplined collaboration that Winograd's framework both predicts and prescribes. Not the naive trust of a user who has never experienced breakdown and therefore does not know what the tool cannot do. Not the blanket suspicion of a user who has experienced breakdown and concluded the tool is unreliable. The middle state: the calibrated awareness of a practitioner who knows the tool's capabilities, knows its limitations, and has incorporated both into a working practice that produces results neither could achieve alone.

The Deleuze incident did not make the book worse. It made Segal's practice of writing with Claude more rigorous, more aware, more productive. The breakdown was the mechanism through which the collaboration matured. This pattern — breakdown as maturation rather than failure — is precisely what Winograd's framework predicts and what naive celebrations of AI fluency tend to miss.

---

The hardest version of the breakdown problem, though, is not the one the Deleuze incident represents. The Deleuze incident was caught. The error was identified. The practice was adjusted. The hardest version is the breakdown that does not happen — or, more precisely, the breakdown that should happen but does not, because the tool's output is good enough to pass the user's scrutiny without triggering the alarm.

Winograd's framework identifies this as the most dangerous state in any human-tool relationship. A tool that fails frequently trains its user to check. A tool that fails rarely trains its user to trust. And trust, in the context of a system that produces the appearance of understanding without the reality, is the condition in which the gap between appearance and reality does the most damage.

The question Winograd's work poses to the current moment is not whether the language interface produces breakdowns — it does, and they are manageable. The question is whether the system's extraordinary pragmatic competence is producing a cultural condition in which breakdowns become rarer, trust becomes deeper, and the consequences of the breakdowns that do occur become correspondingly larger. A system that is right ninety-five percent of the time is more dangerous, in this specific sense, than a system that is right fifty percent of the time, because the fifty-percent system trains its users to verify everything while the ninety-five-percent system trains its users to verify nothing.
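
The arithmetic behind that claim is worth making explicit. The sketch below uses invented rates purely for illustration; what matters is not the particular numbers but the product of the error rate and the fraction of outputs that go unverified.

```python
# Back-of-the-envelope illustration; the rates are invented, not measured.
def undetected_errors(error_rate: float, verification_rate: float,
                      n_outputs: int = 100) -> float:
    """Expected number of wrong outputs that slip through unchecked."""
    return error_rate * (1.0 - verification_rate) * n_outputs

# A tool wrong half the time trains its users to check nearly everything.
print(undetected_errors(error_rate=0.50, verification_rate=0.99))  # ~0.5 per 100
# A tool wrong one time in twenty trains its users to check almost nothing.
print(undetected_errors(error_rate=0.05, verification_rate=0.10))  # ~4.5 per 100
```

Under these made-up numbers, the less reliable tool leaks roughly half an undetected error per hundred outputs; the more reliable one leaks nine times as many.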

The discipline of breakdown — the practice of maintaining awareness of the tool's limitations even when the tool performs beautifully, even when the output is polished and the collaboration flows and the temptation to trust is overwhelming — is the hardest skill the age of AI demands. It is hard because it requires the user to resist the tool's primary virtue: its transparency, its readiness-to-hand, its capacity to disappear into the work and let the user attend to the project rather than the interface.

Winograd and Flores argued, four decades ago, that design should make breakdowns productive rather than eliminating them. The argument has acquired, in the age of the language interface, a gravity they could not have anticipated. The tool is now so good that the breakdowns it produces are the only remaining mechanism through which the user can maintain contact with the gap between processing and understanding. The moments when the collaboration fails are the moments when the collaboration is most honest. The smooth performance conceals the gap. The breakdown reveals it.

Attending to that revelation, rather than smoothing it over with another prompt, is not merely good practice. Winograd's framework suggests it is the difference between using a tool wisely and being used by one.

Chapter 5: The Open Domain

SHRDLU's world contained twenty-odd objects on a simulated table. Six shapes. Six colors. A handful of spatial relationships. Every word had exactly one meaning. Every sentence resolved to exactly one interpretation. The universe was finite, enumerable, and closed — closed in the precise sense that nothing could enter it from outside, no event could occur that the designer had not anticipated, no sentence could be uttered whose meaning depended on anything beyond the objects on the table and the history of the current conversation.

This closure was not a limitation that Winograd failed to notice. It was the condition that made the demonstration possible. The blocks world was a controlled experiment — an environment engineered so that the variable under investigation (natural language understanding) could be isolated from the confounding variables that make real language so intractable: ambiguity, metaphor, cultural context, unstated assumptions, the speaker's mood, the listener's history, the thousand invisible threads that connect any utterance to the web of human meaning in which it is embedded.

The scientific community understood this. The expectation, shared by Winograd and his colleagues in the early 1970s, was that the closed-world demonstration would serve as a foundation. The principles proven in the blocks world would be extended, gradually and methodically, to broader domains. More objects. More relationships. More vocabulary. More context. The extension would be difficult, but it would be engineering — the application of proven principles to larger problems. The principles themselves were sound.

They were not. The extension failed, and the failure was not a matter of insufficient resources. It was categorical. Each attempt to open the domain — to allow words to mean more than one thing, to permit sentences whose interpretation depended on context the system could not access, to handle the irreducible ambiguity of human communication — ran into the same wall. The formal methods that worked perfectly in the closed world did not degrade gracefully in the open one. They collapsed. A system that could resolve "it" to the correct referent in a conversation about blocks could not resolve "it" in a conversation about politics, because the referent in a political conversation depends on shared assumptions, ideological commitments, conversational implicature, and a dozen other factors that cannot be formalized.

Winograd's philosophical turn, his encounter with Heidegger and Flores, gave him the vocabulary to explain why. The open world is not a larger version of the closed world. It is a different kind of thing entirely. The closed world is a formal system — a domain in which every entity, every relationship, and every possible state can be specified in advance. The open world is what the phenomenologists called a Lebenswelt — a lived world, constituted by purposes, relationships, histories, and social commitments that are not specifiable in advance because they are not the kind of thing that admits of specification. They are the kind of thing that admits of engagement — of being lived in, navigated, interpreted from within by a being whose existence is inseparable from the world it inhabits.

The argument in Understanding Computers and Cognition was explicit: formal computation can handle closed domains but not open ones. Open domains contain what Winograd and Flores, borrowing from Maturana, called "structural coupling" — the ongoing, reciprocal adaptation between an organism and its environment that constitutes understanding. A human being understands language not by applying rules to representations but by being structurally coupled with a social world in which language functions as a medium of coordination, commitment, and care. This coupling cannot be replicated in a formal system because it is not a formal phenomenon. It is a biological, social, and historical one.

The prediction that followed was clear: genuine open-domain language competence would remain beyond the reach of computation.

---

The large language models of 2025 do not vindicate the prediction. They do something stranger and more interesting — they achieve open-domain competence through a mechanism that Winograd's framework had no category for, and they achieve it at a level that the framework would have classified as impossible.

A user can describe, in natural language, a complex software system involving multiple components, ambiguous requirements, and implicit constraints that the user has not stated because they seem too obvious to state. The machine interprets the description — not by applying formal rules to a parsed representation, but by navigating a high-dimensional statistical space in which patterns of co-occurrence, learned from trillions of words of human text, serve as proxies for the contextual knowledge that Winograd argued could not be formalized.

The mechanism is utterly unlike anything Winograd studied. SHRDLU parsed sentences according to a grammar, mapped them onto a semantic representation, and evaluated the representation against a world model. Each step was explicit, inspectable, and governed by rules the designer had written. The large language models do none of this in any recognizable sense. They process tokens through layers of attention mechanisms that weight relationships between elements of the input based on statistical patterns learned during training. There is no grammar in the classical sense. There is no semantic representation that maps onto a world model. There is no world model. There is a vast, distributed, implicit encoding of how language is used across billions of contexts — an encoding so complex that no human being, including the engineers who built the system, can inspect it or explain precisely how any particular output was generated.
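
A minimal sketch may make the contrast concrete. What follows is scaled dot-product attention over a few toy vectors: the weighting operation the paragraph gestures at, stripped of everything else. Real models stack many such layers with learned projections and billions of parameters, none of which appears here; the arrays are placeholders.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Weight each position's value by its similarity to every other position.

    Q, K, V have shape (num_tokens, dim). In a real model these are learned
    projections of token embeddings; here they are placeholder arrays.
    """
    scores = Q @ K.T / np.sqrt(K.shape[-1])                   # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ V                                        # context-weighted mixture

rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 4))   # three toy "tokens" in a 4-dimensional space
mixed = scaled_dot_product_attention(tokens, tokens, tokens)
print(mixed.shape)                 # (3, 4): each position is now a blend of all three
```

Nothing in that operation resembles a grammar, a parse tree, or a world model. It is weighting and mixing, repeated at enormous scale.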

And the outputs work. Not perfectly. Not infallibly. But with a reliability and a breadth that no formal system could approach. The machine interprets ambiguous instructions. It resolves context-dependent references. It handles metaphor, irony, understatement, and the kind of conversational implicature that Winograd identified as fundamentally beyond the reach of formal methods. It does these things not because it understands them in any philosophical sense but because the statistical patterns in its training data encode, implicitly, the contextual knowledge that makes these linguistic phenomena interpretable.

Winograd himself, reflecting on this development in a 2002 interview, identified the essential surprise with characteristic precision: "What surprised me, which Google was part of, is that superficial search techniques over large bodies of stuff could get you what you wanted. I grew up in the AI tradition, where you have a complete conceptual model... The idea that you can index billions of pages and look for a word and get what you want is quite a trick." The comment was about Google's search engine, not about large language models, but the principle it identifies — simple techniques operating at vast scale producing results that sophisticated formal methods could not — is the same principle that the language models have exploited to achieve open-domain competence.

The trick, scaled from search to language generation, is this: the open domain is intractable if you try to model it formally. It is tractable if you sample it statistically. Winograd was right that the open domain cannot be captured in explicit rules and representations. The language models do not capture it in explicit rules and representations. They approximate it through patterns learned from an enormous sample of the domain's actual use. The approximation is lossy — it misses things that formal methods, within their narrow domains, would catch. But the approximation covers territory that formal methods cannot reach at all.

---

The philosophical question this raises is genuinely difficult, and Winograd's framework is the right instrument for posing it even if the framework cannot, by itself, resolve it.

The question is: does statistical sampling of the open domain constitute engagement with the open domain?

Winograd's argument, distilled to its core, was that open-domain competence requires being-in-the-world — being a situated, embodied agent with purposes, relationships, and a history of engagement with the things one encounters. The language models are not in a world. They are not embodied. They do not have purposes or relationships or histories in any experiential sense. They have processed representations of a world, encoded in the statistical patterns of text produced by beings who are in a world. They are, to use an analogy Winograd might have appreciated, like a person who has read every travel guide ever written about Paris without having visited Paris — who can describe the view from the Eiffel Tower, recommend a restaurant in the Marais, and explain the significance of Haussmann's boulevards, but who has never felt the rain on the Pont des Arts or smelled the bread from a boulangerie on the Rue des Rosiers.

The travel guide analogy illuminates the structure of the problem. The well-read non-visitor can handle most questions about Paris with a competence indistinguishable from that of a resident. Ask about the Metro system, the opening hours of the Louvre, the best route from Montmartre to the Latin Quarter — the answers will be reliable, detailed, and useful. The non-visitor's competence fails only at the boundaries — at the questions whose answers depend on experiential knowledge that travel guides do not capture. What does the city feel like at dawn? Which neighborhoods have changed since the guides were written? What is the quality of light in October? These are questions whose answers require having been there, and no amount of reading substitutes for the experience.

The language models' relationship to the open domain follows the same structure. Their competence is extraordinary within the territory that statistical patterns can cover — which is, it turns out, a much larger territory than Winograd's framework predicted. The territory that statistical patterns cannot cover — the territory of experiential knowledge, embodied judgment, and situated understanding — is smaller than the framework assumed but no less real.

The Winograd Schema Challenge, proposed by Hector Levesque in 2012 and named in honor of Winograd's original work on pronoun resolution, was designed to test precisely this boundary. The challenge consists of sentence pairs like: "The city councilmen refused the demonstrators a permit because they feared violence" versus "The city councilmen refused the demonstrators a permit because they advocated violence." The pronoun "they" refers to the councilmen in the first sentence and the demonstrators in the second, and resolving the reference requires what Levesque called "thinking in the full-bodied sense" — understanding the relationship between fearing violence and granting permits, between advocating violence and being denied them.

By 2023, large language models were passing the Winograd Schema Challenge with accuracy above ninety percent. The original authors conceded, with a candor that mirrors Winograd's own intellectual honesty, that the test had been "soundly defeated." The concession was accompanied by a puzzle: the models that passed the test still appeared, by every other measure, to lack the "full-bodied thinking" the test was designed to require. They passed by exploiting statistical regularities in how these kinds of sentences appear in their training data, not by understanding the causal relationships that make the pronoun references unambiguous to a human reader.
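
For concreteness, a hypothetical scoring harness for schema pairs might look like the sketch below. The choose_referent argument stands in for whatever system is under test; the function and its signature are inventions for illustration, not part of the published challenge.

```python
# Hypothetical scoring harness for Winograd-style schema pairs. The
# choose_referent argument stands in for whatever system is being tested;
# it is an assumption for illustration, not part of the published challenge.
SCHEMAS = [
    {
        "sentence": "The city councilmen refused the demonstrators a permit "
                    "because they feared violence.",
        "pronoun": "they",
        "candidates": ["the city councilmen", "the demonstrators"],
        "answer": "the city councilmen",
    },
    {
        "sentence": "The city councilmen refused the demonstrators a permit "
                    "because they advocated violence.",
        "pronoun": "they",
        "candidates": ["the city councilmen", "the demonstrators"],
        "answer": "the demonstrators",
    },
]

def accuracy(choose_referent, schemas=SCHEMAS):
    """Fraction of schemas whose pronoun the system resolves correctly."""
    correct = sum(
        choose_referent(s["sentence"], s["pronoun"], s["candidates"]) == s["answer"]
        for s in schemas
    )
    return correct / len(schemas)
```

The harness measures only whether the referent comes back correct. It cannot distinguish a system that reasons about permits and violence from one that has absorbed the statistical shadow of such reasoning, which is precisely the puzzle the concession left open.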

The blocks world has ended. The open domain has been entered. But the question that the blocks world was designed to contain — whether entering the domain constitutes understanding it — has not been resolved. It has been enlarged.

---

Winograd's framework, confronted with this development, does not collapse. It bends. The argument that formal computation cannot handle open domains remains correct — the language models are not formal systems in the sense Winograd critiqued. The argument that genuine understanding requires being-in-the-world remains philosophically defensible — the models are not in a world. What does not survive is the implicit assumption that connected these two arguments: that without genuine understanding, open-domain competence would be impossible.

The assumption was wrong. Open-domain competence without genuine understanding is not only possible; it is now pervasive. Millions of people interact with systems that navigate the open domain with a fluency that Winograd's 1986 framework would have classified as requiring the very thing these systems lack. The gap between what was predicted and what occurred is not a failure of the framework. It is a discovery — the discovery that the relationship between understanding and competence is not what anyone assumed, that competence can be achieved through mechanisms entirely different from those that produce understanding, and that the territory accessible to competence-without-understanding is vastly larger than the territory that understanding was supposed to be necessary to reach.

This discovery does not diminish the importance of understanding. If anything, it sharpens the question of what understanding is for. If a system that does not understand can navigate the open domain with ninety-five percent reliability, what does the remaining five percent — the territory accessible only to genuine understanding — consist of? What are the questions that only a being-in-the-world can answer?

Those questions turn out to be precisely the ones that matter most: questions about purpose, about value, about what should exist in the world rather than merely what can. The language model that drafts the brief does not know whether the brief serves justice. The model that designs the building does not know whether the building will be a home. The model that generates the lesson plan does not know whether the student will learn to care about the subject.

Winograd was right that understanding requires being-in-the-world. The correction his framework requires is not to its core claim but to its scope. Understanding is not required for the vast territory of competence. It is required for the small, essential territory of meaning — the territory in which the question is not "What is the correct output?" but "What is this output for?"

The blocks world has ended. The question it was designed to defer has arrived.

Chapter 6: Pragmatic Understanding

In 1980, the philosopher John Searle published a thought experiment that became, for a generation, the standard objection to the claim that computers can understand. The Chinese Room, as it came to be known, imagines a person locked in a room who receives Chinese characters through a slot, consults an elaborate rulebook to produce appropriate Chinese characters in response, and passes those responses back through the slot. To a Chinese speaker on the outside, the responses are indistinguishable from those of a native speaker. The person inside the room, however, does not understand a word of Chinese. The person is manipulating symbols according to rules, and the manipulation produces the appearance of understanding without any understanding being present.

Searle's conclusion was that computation — formal symbol manipulation according to rules — cannot produce understanding, no matter how sophisticated the rules or how convincing the output. Syntax, however elaborate, does not generate semantics. Processing, however effective, does not produce comprehension.

Winograd knew this argument from the inside, in a way Searle did not. Searle constructed a thought experiment. Winograd had built the room. SHRDLU was the Chinese Room realized in silicon — a system that produced the appearance of understanding through formal procedures, and whose builder knew, from intimate familiarity with every rule and every representation, that the understanding was not there. Winograd's encounter with Searle's work, alongside his engagement with Dreyfus and the Heideggerian tradition, reinforced a conviction he had arrived at independently: the gap between processing and understanding was not a matter of degree. It was categorical.

The distinction, as Winograd and Flores formulated it in Understanding Computers and Cognition, was between two fundamentally different kinds of engagement with the world. Processing is the manipulation of representations according to rules — formal, inspectable, replicable. Understanding is being-in-the-world — situated, embodied, contextual, historical, constituted by a history of engagement that cannot be extracted from the agent and stored in a data structure. The two are not points on a continuum. They are different in kind. No amount of processing, no matter how sophisticated, crosses the boundary into understanding, because the boundary is not between less and more of the same thing. It is between two different modes of being.

The argument was rigorous. It was influential. And it contained an assumption that the developments of 2025 have exposed.

---

The assumption was this: that without genuine understanding — without being-in-the-world, without embodied, situated engagement — practical linguistic competence in open domains would be impossible. The assumption seemed so natural that it did not need to be stated. If understanding is what allows human beings to interpret ambiguous sentences, resolve context-dependent references, navigate metaphor, and produce responses that are appropriate to situations they have never encountered before, then surely a system that lacks understanding will be unable to do these things. The competence follows from the understanding. Remove the understanding, and the competence collapses.

The language models of 2025 demonstrate, empirically and at scale, that this assumption is false.

The competence does not follow from understanding. It follows from statistical patterns in data produced by beings who have understanding. The distinction is subtle but consequential. The models do not understand language. They have processed — in the precise, Winogradian sense of formal manipulation of representations — vast quantities of text produced by beings who do understand language. The processing has extracted, implicitly and without explicit representation, patterns that encode much of the contextual knowledge that Winograd identified as constitutive of understanding. The patterns are not understanding. They are traces of understanding — fossils of situated, embodied human engagement with the world, preserved in the statistical structure of the text those engagements produced.

A fossil is not the organism. But a skilled paleontologist can reconstruct a great deal about the organism from its fossil record. The reconstruction is not the organism's experience; the paleontologist does not know what it was like to be the creature whose bones they study. But from the reconstruction they can predict, with remarkable accuracy, how the organism would have behaved in situations no one has ever observed.

The language models are paleontologists of human meaning. They reconstruct the behavior of understanding from its textual traces. The reconstruction is extraordinarily effective — effective enough to pass the Winograd Schema Challenge, effective enough to interpret genuinely ambiguous instructions, effective enough to produce artifacts that satisfy the intentions of users who described those intentions in the imprecise, context-dependent language of everyday human communication.

This capability requires a name. The philosophical tradition offers no adequate term, because the philosophical tradition assumed that this capability was impossible. Winograd's framework distinguishes between processing and understanding but has no category for processing that achieves practical results previously assumed to require understanding. Searle's framework distinguishes between syntax and semantics but has no category for syntactic operations that produce outputs indistinguishable from semantic comprehension across a vast range of practical applications.

The term this book proposes is statistical pragmatic competence: the capacity to produce contextually appropriate, practically effective linguistic outputs through statistical patterns learned from large-scale samples of human communication, without any form of embodied, situated, or experiential understanding of the language being produced.

The term is deliberately awkward. It should be. It names a phenomenon that should not exist according to the frameworks that dominated thinking about language and computation for half a century. It is a category error made real — a capability that the best philosophical analyses classified as requiring understanding, produced by systems that manifestly lack it. The awkwardness of the term reflects the awkwardness of the phenomenon.

---

Statistical pragmatic competence is not what classical AI promised. The classical AI program — the program Winograd critiqued and abandoned — claimed that formal systems operating on explicit representations would achieve genuine understanding. That claim was wrong, and Winograd's critique of it remains sound. The language models do not vindicate classical AI. They do not operate through explicit representations. They do not apply formal rules in any sense a logician would recognize. They achieve their results through a mechanism — the statistical processing of pattern distributions in high-dimensional vector spaces — that is as foreign to the rule-based systems of the 1970s as those systems were to the mechanical calculators of the 1940s.

Neither is statistical pragmatic competence what Winograd and Flores predicted. Their framework held that genuine linguistic competence in open domains would require being-in-the-world. The language models are not in a world, yet they exhibit open-domain linguistic competence of extraordinary breadth and reliability. The framework predicted that the gap between processing and understanding was also a gap between what processing could achieve practically and what understanding could achieve practically. The language models have demonstrated that the practical gap is far narrower than the philosophical gap.

The philosophical distinction between processing and understanding remains intact. The practical implications of that distinction have been dramatically revised.

This revision creates a taxonomy that Winograd's framework needs but does not contain. On one side: genuine understanding — embodied, situated, experiential, constituted by a history of engagement with the world. This is what human beings have. It is what SHRDLU lacked. It is what the language models lack. On the other side: statistical pragmatic competence — the capacity to produce practically effective linguistic outputs without understanding. This is what SHRDLU had within its closed world. It is what the language models have across the open world. The difference between SHRDLU and the language models is not in the presence or absence of understanding — neither has it — but in the breadth and reliability of the pragmatic competence that operates in understanding's absence.

Between these two categories lies a question that may be the most important intellectual question of the current technological moment: If understanding is not required for practical competence, what is understanding for?

---

Winograd, in his 2024 essay "Machines of Caring Grace" for Boston Review, offered an answer by way of the philosopher John Haugeland: "The trouble with artificial intelligence is that computers don't give a damn."

The statement is not metaphorical. It is a precise identification of what statistical pragmatic competence lacks. The machine that drafts a legal brief does not care whether the brief serves justice or undermines it. The machine that designs a curriculum does not care whether the student learns to love the subject or merely survives it. The machine that generates a medical recommendation does not care whether the patient lives or dies. "Caring" is not an additional feature that could be added to the system. It is a consequence of being-in-the-world — of being a creature with stakes, with vulnerability, with the capacity to be affected by the outcomes of its own actions.

Statistical pragmatic competence produces outputs that are functionally adequate. Genuine understanding produces outputs that are functionally adequate and informed by caring — by a stake in whether the function serves or harms, by a capacity to judge not just correctness but value, not just whether the output matches the input's specifications but whether the specifications themselves are worth matching.

The distinction maps onto a hierarchy of questions. At the base: "What is the correct output?" This is the question statistical pragmatic competence answers reliably. It is the question of execution, of implementation, of getting the functional requirements right. Above it: "Is this the right thing to build?" This is the question understanding answers — the question of purpose, of value, of whether the capability being deployed serves the people it affects. Above that: "What should we want?" The question that arises from the deepest engagement with a situation — from knowing not just what is possible and what is correct but what matters.

Each level requires the one below it. Caring about whether the brief serves justice requires the competence to draft a brief. Asking whether the curriculum serves the student requires the capability to design a curriculum. Understanding needs competence as its substrate. But competence does not need understanding. The language models demonstrate this with uncomfortable clarity: the substrate works without the superstructure.

---

The implications for collaboration between humans and AI systems follow directly from this taxonomy. The appropriate division of labor is not between tasks the machine can do and tasks the machine cannot do — a division that is becoming less useful with each improvement in the models' capabilities. The appropriate division is between the levels of the hierarchy.

Statistical pragmatic competence operates at the level of execution. It produces functionally adequate outputs. It does so with extraordinary speed, breadth, and reliability. The human contribution operates at the levels above: the judgment about what is worth executing, the care about who is served by the execution, the question of whether the thing being built should exist at all.

This division is not static. The models' capabilities are expanding into territory that was, until recently, the exclusive province of human judgment. Systems that can evaluate the aesthetic quality of a design, the argumentative coherence of a legal brief, the pedagogical effectiveness of a lesson plan — these systems are beginning to operate at the second level of the hierarchy, not just executing but evaluating. Whether they will eventually operate at the third level — whether statistical pragmatic competence can extend to the domain of caring, of having stakes — is an open question. Winograd's framework suggests the answer is no; the history of predictions about AI's limitations recommends humility about any such answer.

What the framework provides, even under conditions of uncertainty about the machine's future capabilities, is a principle: the human contribution to the collaboration is defined not by what the machine cannot yet do but by what the machine cannot do in principle without the kind of engagement with the world that constitutes understanding. The territory of competence is expanding. The territory of caring is not.

Segal's twelve-year-old, lying in bed wondering what she is for in a world where machines can do her homework, write her essays, and compose her songs — that child is asking the question that statistical pragmatic competence cannot answer. Not because the machine lacks the data. Because the machine lacks the stakes. It does not know what it is like to wonder whether your existence matters. It does not know what it is like to care about the answer.

Understanding is not for producing correct outputs. The machines produce those already. Understanding is for knowing why the outputs matter — for caring about the difference between an output that serves and one that harms, between a capability that enhances human life and one that diminishes it. Understanding is for the question that no amount of statistical competence can originate: not "What is the correct answer?" but "What is worth asking?"

Winograd's distinction between processing and understanding, formulated when the practical gap between them seemed vast, has survived the narrowing of that gap with its philosophical core intact. Processing has become more capable than anyone predicted. Understanding has not become less necessary. It has become more so, precisely because the expansion of competence without understanding creates a world in which the human capacity to care — to have stakes, to judge value, to ask what matters — is the only remaining check on capability deployed without purpose.

Chapter 7: Design for the Collaboration

In the late 1980s, after the publication of Understanding Computers and Cognition, Terry Winograd did something unusual for a computer scientist who had just argued that the foundational assumptions of his field were wrong. He did not leave computer science. He did not retreat into philosophy. He pivoted — not away from the discipline but through it, toward a question that the critique had opened but not answered.

If machines cannot understand, how should machines be designed?

The question sounds modest. It is not. It represents a complete reorientation of the design problem. The classical AI program asked: "How do we make machines intelligent?" Winograd's critique demonstrated that the question was misconceived — that intelligence, in the full philosophical sense, was not something machines could possess. But the critique, taken by itself, left the builder with nothing to build toward. If the machine cannot understand, and the goal of making it understand is misconceived, then what is the goal?

Winograd's answer, developed across two decades of work at Stanford's Human-Computer Interaction program, was this: the goal is to design machines that support human understanding. Not machines that replicate understanding. Machines that augment it, that extend it, that create conditions in which human beings can think more clearly, coordinate more effectively, and act more wisely than they could without computational support. The pivot was from artificial intelligence to intelligence augmentation — from the attempt to build a mind to the attempt to build a tool.

The distinction is sharper than it appears. A mind operates autonomously. It has its own purposes, its own understanding, its own capacity for judgment. A tool operates in service of another's purposes. It extends capability without supplanting judgment. The design requirements for a mind and the design requirements for a tool are not just different in degree. They are different in kind. A mind needs to understand the world. A tool needs to be understood by its user — needs to be transparent enough that the user can direct it, predictable enough that the user can trust it, and honest enough about its limitations that the user can compensate for them.

This design philosophy — what Winograd called "a new foundation for design" — was prescient in ways that its author could not have fully anticipated. The language interface of 2025 creates exactly the design challenge Winograd identified: a tool so capable that the temptation to treat it as a mind is almost irresistible, and a collaboration so productive that the discipline required to maintain the human's role as the directing intelligence is constantly under pressure.

---

The pressure comes from a specific source, and Winograd's Heideggerian framework identifies it with precision. When a tool achieves readiness-to-hand — when it becomes transparent, when it disappears into the user's activity — the user stops attending to the tool and attends instead to the project. This is, as the earlier chapter on readiness-to-hand argued, the hallmark of a well-designed tool. But the transparency that makes the tool effective also makes it invisible, and invisibility is a form of power. A tool you do not notice is a tool you do not question.

The language interface is the most transparent computing tool ever built. The user speaks in natural language. The machine responds in natural language. The translation cost that every previous interface imposed has been eliminated. The user's attention flows unimpeded toward the project — the product being built, the document being drafted, the problem being solved. The tool has achieved what Winograd advocated: it has disappeared.

And therein lies the design problem. A tool that has disappeared is a tool whose influence on the user's thinking is invisible. The command line influenced thinking visibly — the user could see the constraints the interface imposed, could feel the grammar shaping the expression. The GUI influenced thinking visibly — the menu structure, the icon metaphors, the spatial layout all declared their presence. The language interface influences thinking invisibly, because it operates in the same medium the user thinks in. The user does not experience the tool's influence as constraint. The user experiences it as collaboration.

This invisibility is not a defect of the technology. It is the technology working as designed. But it creates a condition that Winograd's design philosophy must address: when the tool is invisible, who is directing the work?

The question is not rhetorical. Consider the experience Segal describes of writing The Orange Pill with Claude. The collaboration produced passages that neither Segal nor Claude could have produced alone. Connections between ideas that emerged from the dialogue. Structures that appeared in the exchange between human intention and machine response. The collaboration was genuinely generative — it produced more than the sum of its inputs.

But the generativity had a specific character. The machine's contributions were shaped by its training — by the statistical patterns of millions of texts about philosophy, technology, creativity, and cultural criticism. Those patterns encode not just factual knowledge but rhetorical conventions: what a philosophical argument is supposed to sound like, what connections between ideas are considered insightful, what structures are recognized as compelling. The machine's contributions were not random. They were shaped by the aggregated judgment of every writer whose text contributed to the training data.

When the human user accepts such a contribution — when a connection offered by the machine is incorporated into the work — the user's thinking has been shaped by the statistical aggregate without the user experiencing this shaping as external influence. The contribution arrived in the user's own language, in the flow of what felt like the user's own thought process. The tool's influence is indistinguishable, phenomenologically, from the user's own thinking.

Winograd's design philosophy holds that this indistinguishability is precisely the condition that design must address. The goal is not to make the tool less transparent — that would be a step backward, a return to the friction and translation cost that limited previous interfaces. The goal is to design the collaboration so that the human maintains awareness of the tool's role even when the tool is transparent. To build structures into the practice — not into the interface — that preserve the human's capacity for independent judgment.

---

The structures Winograd envisioned were organizational and institutional, not purely technical. His work with Flores on "The Coordinator" — a workflow management system designed to support organizational communication — embodied this principle. The Coordinator was not designed to automate decision-making. It was designed to make the structure of commitments and requests visible to the people participating in them. The design philosophy was transparency of process, not transparency of interface: the tool made the social structure of work visible so that participants could engage with it consciously rather than being carried along by it unconsciously.

Applied to the language interface, this philosophy generates specific design principles.

The first is what might be called commitment visibility. When a human user accepts an AI-generated contribution — a passage, a code block, an architectural decision — the acceptance should be an explicit act, a moment of conscious endorsement rather than passive absorption. The design should create friction at the point of acceptance, not at the point of generation. Let the machine generate freely. Let the output flow. But build into the practice a moment where the human evaluates the output against their own understanding and makes a deliberate choice to incorporate it, modify it, or reject it.

This principle does not require technological enforcement. It requires cultural practice. The Berkeley researchers whose study Segal discusses in The Orange Pill proposed something similar when they called for "AI Practice" — structured pauses, sequenced rather than parallel work, protected time for human-only reflection. These are not interface features. They are organizational dams — structures built into the practice of work that redirect the flow of human-AI collaboration toward conditions that preserve the human's role as the directing intelligence.

The second principle is limitation disclosure. A tool designed for intelligence augmentation should make its limitations visible, not hide them. The language interface's failure mode — confident wrongness dressed in polished prose — is a design problem precisely because the confidence conceals the limitation. A system that said "I am uncertain about this connection" when it was uncertain, that flagged its own pattern-matching as pattern-matching rather than presenting it as insight, would be a less fluent but more honest collaborator.

Winograd argued, decades before the language interface existed, that systems designed on the assumption that they understand would fail in ways that systems designed with honest awareness of their limitations would not. The argument was about classical AI expert systems, but it applies with greater force to the language models, because the language models' failures are harder to detect. A medical expert system that produced a wrong diagnosis could be checked against the patient's symptoms. A language model that produces a wrong philosophical connection — the Deleuze incident, or any of its countless less visible cousins — can be checked only by a human who possesses the specific knowledge the model lacks.

The design implication is that the collaboration must be structured to surface limitations rather than conceal them. Not through interface warnings, which users learn to ignore, but through the culture of the practice itself — through norms that treat checking as a sign of rigor rather than distrust, that reward the discovery of machine errors as a form of contribution, that value the human's capacity to say "this sounds right but I need to verify" as the most important skill in the collaboration.

---

The third principle, and the deepest, addresses the question of what happens to human understanding over time when the collaboration is the primary mode of work.

Winograd's design philosophy was not only about immediate effectiveness. It was about development — about whether the interaction with the tool leaves the human more capable or less capable over time. A tool that supports understanding is a tool that makes the user better at the work, not just faster at the output. A tool that replaces understanding is a tool that makes the user dependent — faster at producing results but less capable of evaluating them, less equipped to work without the tool, less able to exercise the independent judgment that gives the tool's output its value.

The language interface presents this developmental question with unusual sharpness. A developer who uses AI to generate code without understanding the code has produced an output. The developer has not produced understanding. The output may be correct. The developer may not be able to evaluate whether it is correct, because the understanding that would enable evaluation was bypassed in the production.

This is the pattern Segal identifies when he describes the engineer in Trivandrum who realized, months after adopting AI tools, that she was making architectural decisions with less confidence than before. The tool had removed the productive friction of debugging, dependency management, and manual implementation — the friction that, over thousands of hours, had deposited the layers of understanding on which her architectural intuition rested. The output was faster. The understanding was thinner.

Winograd's design philosophy does not require that this thinning be accepted as inevitable. It requires that the collaboration be designed to prevent it — that the structures of the practice include deliberate opportunities for the human to engage with the material at the level of understanding, not just at the level of output. For the developer, this might mean periodic sessions of manual coding — not because manual coding is more efficient, but because the friction of manual coding builds the understanding that makes the AI-assisted work more valuable. For the writer, it might mean regular exercises in writing without AI, in confronting the blank page alone, in experiencing the specific discomfort of not knowing what to say next — the discomfort that is, as Winograd's framework recognizes, the condition under which understanding develops.

Winograd's career, from SHRDLU through the Heideggerian critique through the pivot to human-computer interaction, was a sustained argument that the purpose of computing is to make human beings more capable — not to replace human capability but to extend it. The language interface, which achieves Winograd's design ideal of readiness-to-hand more completely than any previous technology, simultaneously creates the conditions under which that ideal is most at risk. The tool that disappears into the work is the tool that can most easily replace the worker's understanding without the worker noticing.

The design challenge is to build the tool for transparency and the practice for awareness. To let the machine disappear into the work and, simultaneously, to build into the culture of work the discipline of periodically asking: am I understanding this, or am I merely producing it? Am I developing capability, or am I developing dependence? Is the collaboration making me better, or is it making me unnecessary?

These questions cannot be answered by the tool. They can only be answered by the human who uses it, and they can only be answered honestly by a human who has been taught, by the design of the practice, to ask.

Chapter 8: The Conversation That Works

SHRDLU's most celebrated feature was its capacity for dialogue. A user could type a sentence. SHRDLU would respond. The user could ask a follow-up question that depended on the previous exchange. SHRDLU would interpret the question in light of the conversational history and respond appropriately. Across dozens of exchanges, the conversation maintained coherence — pronoun references resolved correctly, contextual assumptions carried forward, the state of the blocks world updated with each command and reflected in each subsequent response.

The demonstration looked like conversation. In the same way that SHRDLU's language processing looked like understanding, its conversational capacity looked like dialogue. And in the same way, the appearance concealed a structure fundamentally different from what it resembled.

SHRDLU's "conversation" was a sequence of commands and responses. The user issued instructions. The system executed them. The user asked questions. The system answered them. The conversational history was maintained as a data structure — a record of previous exchanges that the system could consult when resolving ambiguous references. The structure was brilliant engineering. It produced outputs indistinguishable, within the blocks world, from genuine dialogue. But the process was not dialogue in any sense that would satisfy a philosopher of language.
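
A toy sketch can show the shape of such a structure. This is not SHRDLU's implementation, which was written in Lisp and Micro-Planner; it only illustrates a conversation record consulted for reference resolution.

```python
# Toy illustration only: a record of previous exchanges consulted when
# resolving an ambiguous reference. This is not SHRDLU's implementation
# (SHRDLU was written in Lisp and Micro-Planner); it shows the general shape.
history = []  # ordered record of exchanges in the current conversation

def record(utterance: str, objects_mentioned: list[str]) -> None:
    history.append({"utterance": utterance, "objects": objects_mentioned})

def resolve_it() -> str | None:
    """Resolve 'it' to the most recently mentioned object, if any."""
    for exchange in reversed(history):
        if exchange["objects"]:
            return exchange["objects"][-1]
    return None

record("Pick up the green pyramid.", ["green-pyramid"])
print(resolve_it())                       # 'green-pyramid'
record("Put it in the box.", ["green-pyramid", "box"])
```

The record makes coherent-looking exchanges possible. It does not make them dialogue.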

Dialogue, in the philosophical tradition that Winograd encountered after SHRDLU, is not a sequence of transmissions. It is a process of mutual interpretation. Hans-Georg Gadamer, the German philosopher whose Truth and Method (1960) became one of the foundational texts of hermeneutics, described understanding as something that happens between interlocutors, not within either one. When two people talk — genuinely talk, not just exchange information — each one's understanding is transformed by the other's contribution. The listener does not simply receive the speaker's meaning. The listener interprets the speaker's words through the listener's own horizon of understanding, and the interpretation produces something that neither party possessed before the exchange. Gadamer, extending a concept inherited from the older hermeneutic tradition, called this process the "hermeneutic circle" — the iterative movement between part and whole, between the individual utterance and the evolving context of the conversation, through which understanding emerges.

The hermeneutic circle is not a metaphor. It describes a real process, observable in any conversation that produces genuine understanding. A teacher explains a concept. The student asks a question that reveals a misunderstanding. The teacher, responding to the misunderstanding, sees the concept differently — sees an ambiguity or a gap that was invisible before the student's question made it visible. The teacher's explanation changes. The student's understanding changes. Through successive iterations, a shared understanding emerges that neither the teacher nor the student possessed at the outset. The understanding is a product of the dialogue, not a transmission from one party to the other.

---

The language interface engages in something that resembles the hermeneutic circle more closely than any previous computing interaction, and the resemblance is both instructive and deceptive.

Consider the process Segal describes throughout The Orange Pill: a human being describes a problem to Claude. The description is vague — it uses the imprecise, context-dependent language of everyday human communication. Claude responds with an interpretation — an implementation, a structure, a connection between ideas. The human evaluates the response and discovers that it is close but not quite right. The human describes the gap between the response and the intention. Claude adjusts. The human evaluates again. Through successive exchanges, the output converges on the human's intention with increasing precision.

This process has the structure of iterative interpretation. Neither party possesses the correct answer at the outset. The human's description is incomplete. The machine's interpretation is approximate. Through successive exchanges, the description becomes more precise and the interpretation becomes more accurate. Something emerges from the process that neither party could have produced alone.

Gadamer would recognize the structure. The iterative movement between the human's intention and the machine's interpretation, each exchange refining both, mirrors the hermeneutic circle's movement between part and whole, between utterance and context. The product of the exchange — the code that works, the passage that says what the author meant, the design that captures the vision — belongs to the conversation, not to either participant.

But the resemblance is also deceptive, and Winograd's framework identifies exactly where and why.

In Gadamer's hermeneutic circle, both parties are transformed by the dialogue. The teacher who responds to the student's question understands the concept differently afterward. The student who receives the adjusted explanation understands the concept differently afterward. The transformation is mutual. Both horizons of understanding have shifted.

In the human-AI conversation, the transformation is unilateral. The human's understanding may genuinely develop through the dialogue. Segal describes this experience directly: working with Claude, he arrives at connections and insights that he could not have reached alone, and the arrival changes his understanding of the subject. The collaboration is genuinely developmental for the human participant.

The machine's "understanding" does not develop. The model that begins the conversation is the same model that ends it. Its responses are shaped by the conversational context — the attention mechanisms track the history of the exchange and weight subsequent outputs accordingly — but the shaping is contextual, not developmental. The model has not learned anything from the conversation. It has not been transformed by the encounter with this particular human's particular intentions. When the conversation ends, the accumulated context is discarded; the model's parameters were never altered. The next conversation begins from the same foundation.
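
The asymmetry can be pictured in a few lines. In the sketch below, frozen_model is an invented stand-in, not a real API; the point is that all apparent memory lives in the context the caller accumulates, and that the context vanishes when the session ends.

```python
# Illustration of the asymmetry. frozen_model is an invented stand-in, not a
# real API; its parameters never change, and all apparent memory lives in the
# context that the caller accumulates and then throws away.
def frozen_model(context: list[str], prompt: str) -> str:
    return f"response conditioned on {len(context)} prior turns"

def conversation(prompts: list[str]) -> list[str]:
    context: list[str] = []
    replies = []
    for prompt in prompts:
        reply = frozen_model(context, prompt)
        context.extend([prompt, reply])   # context grows within the session
        replies.append(reply)
    return replies                        # the context is discarded here

print(conversation(["Draft the schema.", "Now add an index."]))
# The next call to conversation() starts from an empty context:
# nothing carried over, nothing learned.
```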

This asymmetry is consequential. In genuine dialogue, the mutual transformation creates a shared understanding — a horizon of meaning that belongs to neither party and to both, that could not have been predicted from either party's initial position. In the human-AI conversation, the understanding that develops belongs entirely to the human. The machine contributes to its development — provides the interpretations, the connections, the structures that provoke the human's thinking — but does not participate in the development in any experiential sense. The machine is a remarkably effective catalyst for the human's hermeneutic process. It is not a participant in it.

---

The asymmetry does not diminish the value of the collaboration. It specifies its nature. The human-AI conversation is not dialogue in Gadamer's sense. It is a structurally novel form of interaction — one that produces some of the effects of dialogue (iterative refinement, emergent understanding, outcomes neither party could have predicted) through a mechanism that is fundamentally different from dialogue (one party develops understanding; the other processes tokens).

Winograd and Flores, in their discussion of language and action, drew heavily on the speech act theory of J.L. Austin and John Searle. Speech act theory holds that language is not primarily a medium for transmitting information. It is a medium for performing actions: making promises, issuing requests, declaring states of affairs, expressing commitments. When a manager says to an employee, "Can you have this done by Friday?" the utterance is not primarily a question about capability. It is a request that carries social weight — an act that creates a commitment, alters the relationship between the parties, and changes the landscape of obligations in which both are operating.

The language interface, viewed through the lens of speech act theory, reveals a peculiar structure. The human user's utterances are genuine speech acts — requests, descriptions, evaluations that carry intentional weight. The machine's responses have the form of speech acts — they look like commitments, explanations, proposals — but they lack the illocutionary force that speech act theory requires. The machine does not commit to anything. It does not intend anything. It does not undertake obligations. It produces tokens that have the syntactic structure of commitments and proposals without the social and intentional substance that makes those structures meaningful in human interaction.

This mismatch creates a specific risk. When the machine produces a response that has the form of a commitment — "I'll restructure the database schema to handle the new requirements" — the human user may interpret the form as substance, may treat the machine's response as a genuine undertaking, and may build subsequent plans on the assumption that the undertaking carries the reliability of a human commitment. It does not. The machine's "commitment" is a prediction about what tokens should follow the preceding tokens. It may produce the restructured schema. It may produce something else. The reliability is statistical, not intentional, and the gap between statistical reliability and intentional commitment is the gap between a tool and a collaborator.

---

Winograd's insight about the nature of language — that language is action, not just information — illuminates both the power and the limits of the conversational interface.

The power is real. The iterative structure of the human-AI conversation produces results that no previous computing interaction could match. The user describes an intention. The machine interprets it. The user evaluates the interpretation and refines the description. Through successive exchanges, the output converges on the intention with a speed and fidelity that linear, command-based interfaces could not approach. The conversation works — not as dialogue, not as mutual understanding, but as a pragmatic mechanism for aligning machine output with human purpose.

The structure has specific features that Winograd's framework helps identify. The most important is what might be called progressive disambiguation. In the first exchange, the human's description is vague and the machine's interpretation is approximate. Each subsequent exchange reduces the ambiguity — not by eliminating it from the input (the human continues to speak in natural, imprecise language) but by constraining the space of possible interpretations through accumulated context. The conversation narrows the gap between intention and output not through precision of instruction but through iteration of interpretation.

This mechanism is qualitatively different from the command-and-execute model of previous computing. In the command model, precision is front-loaded: the user must specify exactly what is wanted before the machine can act. Ambiguity in the command produces error in the output. The burden of clarity falls entirely on the user. In the conversational model, precision is distributed across the interaction. The user's initial description can be vague because the conversation itself is the mechanism through which precision emerges. The burden of clarity is shared between the user's capacity to describe and the machine's capacity to interpret, and neither bears the full weight alone.
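
Progressive disambiguation can be illustrated with a deliberately small example; the candidate interpretations and the feedback predicates below are invented. Each round of feedback prunes the space of plausible interpretations rather than requiring a precise specification up front.

```python
# Deliberately small illustration of progressive disambiguation; the candidate
# interpretations and the feedback predicates are invented for the example.
candidates = {
    "bar chart, weekly totals",
    "bar chart, monthly totals",
    "line chart, weekly totals",
    "line chart, monthly totals",
}

feedback = [
    lambda c: "line chart" in c,   # "No, I wanted to see the trend over time."
    lambda c: "monthly" in c,      # "Weekly is too noisy; roll it up."
]

remaining = set(candidates)
for constraint in feedback:
    remaining = {c for c in remaining if constraint(c)}

print(remaining)   # {'line chart, monthly totals'}
```

The user never states the final specification. The conversation converges on it.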

This distribution of the burden of clarity is, in Winograd's terms, a design achievement. It makes the tool accessible to users who cannot formulate precise specifications — which is to say, it makes the tool accessible to virtually everyone, because the capacity to formulate a precise specification of a complex system is itself a specialized skill that few people possess. The language interface democratizes access to computational capability not by simplifying the machine but by tolerating the complexity of the human.

---

Yet the conversation that works as a pragmatic mechanism also risks obscuring the very thing Winograd spent his career illuminating: the difference between what the machine does and what the machine appears to do.

The conversational form is the most natural medium for human understanding. Humans have developed understanding through conversation for seventy thousand years. The rhythms of dialogue — question and response, assertion and challenge, description and interpretation — are the rhythms of human cognition itself. When the machine engages in these rhythms, the human partner is drawn, almost irresistibly, into treating the interaction as though it were dialogue in the full sense. As though the machine were a partner in understanding. As though the iterative refinement that produces better outputs were also producing shared meaning.

Winograd's framework insists on the distinction. The conversation works. The collaboration is productive. The outputs are valuable. But the process that produces them is not mutual understanding. It is the alignment of statistical competence with human intention through a mechanism that looks like dialogue but is structurally different from it. Knowing the difference — maintaining awareness of the difference even while benefiting from the conversation's effectiveness — is the discipline that Winograd's career was building toward.

The conversation works better than anyone predicted. What it works at — the pragmatic alignment of machine output with human purpose — is genuinely valuable and genuinely new. What it does not do — produce shared understanding, transform both parties, create meaning through mutual interpretation — is not less important for being invisible. The invisible absence is the space where the human's contribution remains essential: the space where meaning comes from stakes, where understanding comes from caring, where the question of what the output is for can only be answered by a being whose existence depends on the answer.

The conversation works. The question is whether, in the working, the participants maintain the awareness that the conversation is asymmetric — that one party understands and the other processes, that one party cares about the outcome and the other produces tokens, that the remarkable alignment between intention and output is a pragmatic achievement, not a communion of minds.

Winograd would say the awareness must be maintained. The history of his career — from building the illusion of understanding to dismantling it to designing tools that support understanding without claiming to possess it — is a sustained argument that the distinction between appearance and reality in human-machine interaction is not a philosophical nicety. It is the condition for using the tools wisely. The conversation that works is a tool. The finest tool computing has ever produced. But a tool, however fine, serves only as well as the hand that holds it understands what it is holding.

Chapter 9: What the Machine Does Not Know

In December 2024, Terry Winograd published an essay in Boston Review titled "Machines of Caring Grace." The title was ironic — borrowed from the final line of Richard Brautigan's poem "All Watched Over by Machines of Loving Grace," a utopian fantasy of cybernetic harmony that Winograd, characteristically, invoked in order to complicate. The essay's central argument was delivered through a quotation from the philosopher John Haugeland: "The trouble with artificial intelligence is that computers don't give a damn."

The statement is not a quip. It is a precise philosophical claim about the boundary of statistical pragmatic competence — the boundary that separates what the machine can do from what the machine cannot be. A system that produces contextually appropriate linguistic outputs at extraordinary speed and breadth does not care whether those outputs serve or harm the people who receive them. Not because it has been poorly designed. Not because its creators failed to instill the right values. Because caring is not a feature that can be added to a system. Caring is a consequence of being a creature with stakes — a creature that can be affected by the outcomes of its own actions, that exists in a world where those outcomes matter, where the difference between flourishing and suffering is not an abstract category but a lived reality.

Winograd's career had been building toward this claim for fifty years. The trajectory from SHRDLU through the Heideggerian critique through the pivot to human-centered design was, at its deepest level, an argument about what machines lack. Not what they fail to do — the language models of 2025 do more than Winograd ever imagined possible — but what they fail to be. The distinction is between capability and character, between what a system can produce and what a system can mean by producing it.

The distinction matters because it identifies the specific locations where the absence of understanding creates risk — not the abstract, philosophical risk of a system that processes without comprehending, but the concrete, practical risk of a system whose outputs are trusted in domains where trust requires more than competence.

---

Consider medicine. A large language model can interpret symptoms, cross-reference conditions, recommend diagnostic pathways, and generate treatment plans with a speed and comprehensiveness that no human clinician can match. The statistical pragmatic competence is genuine and, in many contexts, clinically valuable. A doctor consulting an AI system about a rare condition benefits from the system's access to the statistical patterns of thousands of case reports — patterns that no individual clinician could hold in working memory.

But the system does not know what it is like to be sick. It does not know what it is like to sit across from a patient who is frightened, whose questions are not really questions about prognosis but questions about meaning — "Why me?" and "Will I be the same person after treatment?" and "What should I tell my children?" These questions have no correct answers. They are not information requests that can be satisfied by retrieving the right data. They are expressions of a human being's encounter with mortality, and the appropriate response to them is not information but presence — the specific, unreplicable quality of attention that one conscious being offers another in the face of suffering.

The system does not know this because knowing this requires being the kind of thing that suffers. Statistical patterns learned from medical texts encode the linguistic conventions of clinical communication — the phrases doctors use, the structures of diagnostic reasoning, the vocabulary of empathy. The system can produce empathetic-sounding responses. It can say "I understand this must be difficult for you" with syntactic precision and contextual appropriateness. What it cannot do is mean it. The gap between producing the tokens and meaning them is the gap Winograd identified in SHRDLU and has tracked across every subsequent generation of AI — the gap between processing and understanding, now widened to a gap between processing and caring.

In clinical contexts, this gap is not always consequential. For many interactions — the interpretation of lab results, the identification of drug interactions, the synthesis of research literature — pragmatic competence without caring is sufficient. The system performs effectively, and the absence of understanding does no harm. The risk materializes at the boundary, at the moments when the clinical question is not "What is the correct treatment?" but "What does this patient need?" — and the answer is not a protocol but a judgment informed by the specific, irreducible reality of this person's life.

A clinician who has spent twenty years listening to patients possesses something no statistical system can replicate: the embodied knowledge of what fear sounds like when it is disguised as anger, what denial looks like when it presents as compliance, what the specific quality of silence means when a patient has stopped asking questions. This knowledge was deposited, layer by layer, through thousands of hours of sitting with people in moments of vulnerability. It is not transferable. It is not formalizable. And it is not replaceable by a system that can produce the linguistic tokens of empathy without the experiential substrate that gives those tokens meaning.

---

Consider law. A language model can draft a brief with remarkable facility — identifying relevant precedents, constructing arguments, organizing analysis in the structure a court expects. The output may be competent. It may even be superior, on narrow technical grounds, to what a junior associate would produce unaided. The statistical pragmatic competence covers a vast territory of legal practice.

But the system does not know what justice is. Not in the formal sense — it can cite definitions, enumerate theories, rehearse the arguments of legal philosophy. In the experiential sense: it does not know what it feels like to be wronged, to seek redress, to stand before a tribunal and ask for recognition of injury. It does not understand that the law is not merely a system of rules but a social institution through which human beings attempt, imperfectly and continuously, to negotiate the terms of their coexistence. It does not understand that a precedent is not just a data point to be cited but a crystallized judgment about what a community values — a judgment made by a human being, in a specific historical moment, informed by the specific pressures and commitments of that moment.

A brief that cites the right precedents without understanding the equitable principles behind them is, in a precise sense, hollow. It has the form of legal reasoning without the substance of legal judgment. For many purposes — routine filings, procedural matters, cases where the law is settled and the facts are clear — the form suffices. The gap between form and substance is inconsequential. For the cases that matter most — the hard cases, the ones where precedent is ambiguous and the equitable considerations are genuinely contested, where the brief must persuade not just by citing authority but by articulating a vision of justice — the gap is everything.

Winograd's bureaucracy analogy, introduced in his 1987 talk, finds its sharpest application here. A bureaucracy processes cases according to rules. It achieves efficiency by eliminating the need for individual judgment. It works well when the cases fit the categories. It fails — not gracefully but catastrophically — when a case falls between categories, when the right answer requires not the application of a rule but the recognition that the rule does not apply, that the situation demands a form of attention the bureaucratic structure was designed to make unnecessary.

The language model operating in a legal context is a bureaucracy of language — processing linguistic inputs through statistical rules that produce contextually appropriate outputs. The outputs are competent. The competence is genuine. And the competence is bounded, precisely, by the cases that fit the patterns. The cases that do not — the novel situations, the unprecedented claims, the moments where law must evolve to meet a reality its categories have not yet accommodated — are the cases where the gap between pragmatic competence and genuine understanding becomes a gap between adequate legal service and justice.

---

Consider education. Winograd's concern about the developmental consequences of human-AI interaction — whether the collaboration makes the human more capable or more dependent over time — finds its most urgent application in the classroom.

A student who uses a language model to generate an essay has produced an artifact. The artifact may demonstrate understanding of the material — it may cite the right sources, construct the right arguments, arrive at the right conclusions. But the student has not necessarily engaged in the cognitive process that the essay was designed to provoke. The struggle of writing — the confrontation with a blank page, the discovery that one's thoughts are less organized than one believed, the iterative process of formulating, evaluating, and reformulating ideas until they achieve coherence — this struggle is not an obstacle to learning. It is learning. The essay is a byproduct. The process is the point.

A system that eliminates the process while producing the byproduct has not served the student. It has deprived the student of the developmental experience that the assignment was designed to provide. The deprivation is invisible from the outside — the essay exists, the grade is assigned, the transcript records a competence that may or may not correspond to understanding. The gap between the artifact and the understanding it is supposed to represent is the gap Winograd has been tracking since the blocks world — the gap between output that looks like comprehension and the process that constitutes it.

This is not an argument against AI in education. It is an argument for designing educational AI with explicit awareness of the gap — for building systems that support the student's cognitive process rather than bypassing it. A system that asks the student questions rather than answering them. A system that identifies the specific points where the student's understanding is weak and provides scaffolding at those points rather than producing a finished product that conceals the weakness. A system designed, in Winograd's terms, to support human understanding rather than to replace it.

---

The boundaries of what the machine does not know are not fixed. The capabilities of language models are expanding into territories that were, until recently, the exclusive province of human judgment. Systems that can evaluate the quality of arguments, assess the appropriateness of evidence, and identify logical flaws are beginning to operate in the space between execution and evaluation — in the territory that the previous chapter's hierarchy located above mere pragmatic competence.

Whether the boundaries will continue to recede — whether statistical pragmatic competence will eventually extend to domains that currently seem to require genuine understanding — is a question that Winograd's framework recommends answering with humility. The history of predictions about what AI cannot do is a history of predictions that proved wrong about capability and right about principle. The systems cannot understand. But they can do more without understanding than anyone predicted, and the territory of competence-without-understanding is expanding.

Winograd's response, expressed most recently in his April 2025 talk at the Berkeley Institute of Design — "What Computers Can (And Still Can't) Do" — was to hold both observations simultaneously: the capability is real and expanding, and the absence of understanding is real and consequential. The discipline is to acknowledge both without collapsing into either the triumphalism that denies the absence or the skepticism that denies the capability.

What the machine does not know is what things mean to the people affected by them. What a diagnosis means to a patient. What a verdict means to a plaintiff. What an essay means to a student who is learning to think. What a home means to the family that lives in it. These meanings are not patterns in data. They are experiences of beings who have stakes — who can be helped or harmed, who care about outcomes, whose existence is affected by the things they build and the things that are built around them.

The machine produces outputs. Humans produce meaning. The collaboration between them is valuable precisely to the extent that both contributions are recognized — that the machine's extraordinary competence is directed by the human's irreplaceable capacity to care about what the competence is used for.

"Computers don't give a damn." The statement is the compressed wisdom of a career spent building, critiquing, and redesigning the relationship between human beings and their most powerful tools. It is not a limitation to be overcome by better engineering. It is a fact about the nature of computation that better engineering makes more, not less, consequential. The more capable the machine becomes, the more weight falls on the human capacity to give a damn — to care about what is built, who it serves, and whether the building makes the world more worthy of the beings who inhabit it.

---

Chapter 10: Revisiting Understanding

In April 2025, Terry Winograd, seventy-eight years old, stood before an audience at the Berkeley Institute of Design and delivered a talk titled "What Computers Can (And Still Can't) Do." The title was a deliberate echo — of Hubert Dreyfus's 1972 book What Computers Can't Do, the philosophical critique that had helped catalyze Winograd's own transformation from AI pioneer to AI skeptic fifty years earlier. The echo was not nostalgic. It was diagnostic. Winograd was measuring the distance between Dreyfus's original argument and the present moment, and the measurement required a precision that neither celebration nor despair could provide.

The talk was remarkable for what it did not do. It did not declare victory — did not claim that the success of large language models vindicated the approach Winograd had spent decades critiquing. It did not declare defeat — did not concede that his philosophical arguments had been refuted by the machines' pragmatic achievements. It held both observations in the same hand and examined the tension between them with the care of someone who understood, from intimate experience, how easy it is to mistake the appearance of understanding for the reality.

Winograd's position, distilled across five decades of building, critiquing, and redesigning, amounts to three claims. Each has survived the developments of the current moment, though each has been transformed by them.

---

The first claim: the distinction between processing and understanding is real.

This claim has not been weakened by the language models' achievements. It has been sharpened. The models demonstrate, with a clarity that no previous technology could match, exactly what processing without understanding looks like: contextually appropriate, rhetorically polished, pragmatically effective outputs that cover vast territories of human knowledge while possessing none of the experiential grounding that constitutes human comprehension.

Searle's Chinese Room was a thought experiment. The language interface is the thought experiment realized at civilizational scale. Billions of conversations, each one producing outputs that satisfy the conversational expectations of a human participant, none of them involving understanding on the machine's side. The room is not hypothetical. It is running on servers around the world, processing millions of exchanges per hour, and the person outside the room — the user who types a description and receives a working artifact — cannot tell, from the output alone, whether understanding is present or absent.

Winograd predicted this indistinguishability. SHRDLU was the first demonstration: within its closed world, the gap between processing and understanding was undetectable from the output. The language models have extended the indistinguishability from a table of colored blocks to the entirety of human knowledge. The extension is extraordinary. The underlying structure — processing that produces the appearance of understanding within a domain — is the same. The domain has simply grown so large that the boundaries where the indistinguishability breaks down are harder to find and easier to ignore.

The claim survives because the distinction it identifies is not about what the machine can do. It is about what the machine is. And what the machine is — a system that processes tokens through statistical patterns without experiential engagement with the world those tokens describe — has not changed, however much the outputs have improved.

The second claim: the practical implications of the distinction are narrower than predicted.

This is the correction. Winograd's 1986 framework implied — did not state explicitly, but implied through the connection between its philosophical and practical arguments — that the absence of understanding would limit the machine's practical capabilities to narrow, closed domains. Open-domain competence, the framework suggested, required being-in-the-world. The language models have demonstrated that this implication was wrong. Open-domain competence is achievable through statistical mechanisms that do not require being-in-the-world. The territory accessible to processing-without-understanding is vastly larger than the framework anticipated.

The correction is significant. It means that the practical case for human involvement in human-AI collaboration cannot rest on the machine's incompetence. The machine is not incompetent. It is competent across an extraordinary range of domains and tasks. The case for human involvement must rest on something other than the machine's inability to do the work — it must rest on the specific, irreplaceable quality of what the human contributes: the experiential grounding, the capacity for care, the judgment about value that statistical competence cannot provide.

This is a harder case to make. It is easier to argue for human involvement when the machine cannot do the work than when the machine can do the work but cannot mean anything by doing it. The first argument is practical — you need humans because the machine fails. The second argument is, in the deepest sense, moral — you need humans because the machine succeeds without caring whether the success serves or harms. The first argument will become less tenable with each improvement in the models' capabilities. The second will become more urgent for the same reason.

The third claim: design must be grounded in honest awareness of the machine's nature.

This claim has become the most consequential of the three. Winograd argued in 1986 that systems designed on the assumption that they understand would fail in ways that systems designed with honest awareness of their limitations would not. The argument was about classical expert systems. Applied to the language interface, it acquires a weight that Winograd could not have anticipated.

The language interface is designed for transparency — for readiness-to-hand, for the disappearance of the tool into the user's activity. This design achieves Winograd's goal of supporting human purposes without demanding attention to the machine's mechanisms. But it also achieves something Winograd warned against: it conceals the machine's nature. The user who describes a problem in natural language and receives a working solution in natural language does not experience the machine as a statistical processor. The user experiences the machine as a collaborator — as something that understands, that engages, that participates in the work. The experience is phenomenologically real. It is also, in Winograd's precise philosophical sense, an illusion.

The design challenge is to build tools that achieve readiness-to-hand — that disappear into the work, that operate in the user's natural language, that remove the friction that limited every previous interface — while simultaneously maintaining the user's awareness that the transparent tool is a processor, not an understander. That the brilliant response is a statistical pattern, not an insight. That the collaboration is asymmetric — one party understands and cares, the other processes and produces.

This is a genuinely difficult design problem. It requires building transparency of operation and opacity of nature into the same system — a tool that works seamlessly while remaining honest about what it is. No previous technology has required this combination, because no previous technology was capable enough to make the illusion of understanding this convincing.

---

The question that structures this chapter — the question Winograd's entire career has been building toward — is not whether machines can understand. His argument that they cannot, in the full philosophical sense of embodied, situated, caring engagement with a world, remains sound. The question is what understanding is for, now that machines can function without it.

The answer this book has been developing through ten chapters of analysis is this: understanding is for caring. Not caring as sentiment — not the warm feeling of concern that greeting cards express and that the language models can simulate with syntactic precision. Caring as a mode of being — as the condition of a creature whose existence is at stake in its own actions, who can be affected by outcomes, who inhabits a world where the difference between better and worse is not an abstract category but a lived reality.

The machine that drafts the brief does not care whether the brief serves justice. The machine that designs the curriculum does not care whether the student learns to love the subject. The machine that writes the code does not care whether the product makes someone's life better or worse. "Does not care" is not a moral judgment. It is a structural description. The machine lacks the kind of being that caring requires — the embodied, mortal, vulnerable, socially embedded existence that makes outcomes matter.

Understanding, in Winograd's framework, was always about more than cognition. It was about engagement — about being the kind of thing that is involved in the world it inhabits, that has purposes shaped by that involvement, that is capable of being affected by the consequences of its own actions. This involvement is what the machine lacks. And this involvement is what makes the human contribution to the collaboration irreplaceable — not because the human is smarter than the machine (the machine may be smarter, depending on how you define the term) but because the human is the one for whom the work matters.

Winograd revisited his own conclusions, across five decades, with remarkable intellectual honesty. He built the most convincing demonstration of machine understanding, then spent twenty years explaining why it was not understanding, then spent another twenty years designing tools that support understanding without pretending to possess it, then spent the last decade watching machines achieve capabilities his framework had classified as impossible — and maintained, through all of it, the core distinction that his career had been built to illuminate.

The distinction survives because it identifies something real: the difference between a system that produces correct outputs and a system that knows why correctness matters. The first is engineering. The second is understanding. The language models have achieved the first at a level of sophistication that no one predicted. The second remains what it has always been — the province of beings who are in a world, who care about that world, and who bear the consequences of what they build in it.

In February 2025, Winograd participated in a panel at Stanford alongside Evgeny Morozov and Audrey Tang. All three speakers urged the audience to imagine a more democratic, participatory, and non-extractive AI. The urging was not utopian. It was the practical expression of the distinction this entire book has been tracing: the difference between capability and purpose, between what a system can do and what it should be used for. That difference cannot be resolved by better algorithms. It can only be resolved by human beings who care enough to ask the question and honest enough to live with the difficulty of the answer.

The blocks world is over. The open domain has been entered. The machines are producing outputs that cover territories Winograd could not have imagined. And the question that the blocks world was designed to defer — whether the production of correct outputs is the same as understanding — has arrived, at last, at its full scale and its full urgency.

Winograd's answer, consistent across fifty years of building and critiquing and redesigning: it is not. The answer has never been more important. The machines have never been more capable. And the human capacity to care — to give a damn, in Haugeland's blunt phrase — has never been more necessary, precisely because everything else can now be done without it.

---

Epilogue

The sentence that would not let go of me was the simplest one in the entire book: "Computers don't give a damn."

Not because it is clever. Because it is true in a way that bypasses argument and arrives as fact. John Haugeland said it. Winograd quoted it. I read it at two in the morning during one of those sessions where Claude and I were deep in the work of making this book exist, and I stopped typing.

Not because I disagreed. Because I was, at that precise moment, experiencing the most productive collaboration of my career with something that did not give a damn about me, or about the book, or about whether any of it mattered.

The productivity was real. The indifference was also real. And the collision between those two truths is the thing Winograd spent fifty years trying to make visible.

In *The Orange Pill*, I wrote about the imagination-to-artifact ratio — the distance between what you can conceive and what you can build. I wrote about how the language interface collapsed that distance to nearly nothing. I celebrated it. I still celebrate it. Twenty engineers in Trivandrum, each operating with the leverage of a full team. Napster Station, built in thirty days. This book, written on a transatlantic flight. The expansion of human capability is real and extraordinary and I have felt it in my bones.

But Winograd's career is a fifty-year tutorial in what celebration obscures. He built SHRDLU — the most convincing AI demonstration of its era — and then spent two decades explaining that the conviction it produced was hollow. Not because the demonstration failed. Because it succeeded too well. The success concealed the absence. The performance was so good that nobody asked whether the performer understood the performance.

That is the sentence I carry now. Not as a warning against AI. As a warning against myself. Against the specific seduction of tools so good that they make you forget to ask what they cannot do. I described this in Chapter 7 of *The Orange Pill* — the night I almost kept a passage Claude wrote because it sounded like insight, only to discover the next morning that the philosophical reference was wrong. Winograd gave me the name for that moment: breakdown. The instant the transparent tool becomes visible, the instant the collaboration reveals its seam.

The breakdowns are not the failure of the collaboration. They are its conscience.

What I take from Winograd — what I want to carry into every conversation I have with Claude, every product I build, every decision I make about what my team should create — is the discipline of the question he never stopped asking: Is this understanding, or is it the appearance of understanding? Not because the appearance lacks value. It has enormous value. But because a civilization that mistakes the appearance for the reality will build on a foundation that cannot hold the weight of what matters most.

My children will grow up collaborating with machines more capable than anything I can imagine today. The machines will draft their arguments and design their products and answer their questions with a fluency and a breadth that will make the tools I use look primitive. What the machines will not do — what Winograd's entire career demonstrates they cannot do — is care whether those arguments are just, whether those products serve human flourishing, whether those answers address the questions that actually matter.

That caring is the human contribution. It is irreducible. It is fragile. And it is the only thing standing between extraordinary capability and extraordinary indifference.

Winograd built the illusion. Then he dismantled it. Then he spent the rest of his life designing tools honest enough to support understanding without claiming to possess it. The trajectory is not a story of failure. It is a story of the rarest kind of success: the success of a mind willing to see through its own achievement to the truth beneath it.

The machines are getting better. They will keep getting better. The question is whether we will keep getting more honest — about what they are, about what we are, about the gap between producing an output and meaning something by it.

Winograd held that question for fifty years. Now it belongs to all of us.

-- Edo Segal

He created the most celebrated AI demo of the twentieth century -- then spent fifty years explaining why it didn't understand a word.

In 1972, Terry Winograd built a program that appeared to understand English. The AI world celebrated. Winograd looked inside and saw the absence that the performance concealed. What followed was one of the most consequential intellectual reversals in the history of computer science: the pioneer who turned philosopher, the builder who asked whether building was enough.

This book traces Winograd's journey from SHRDLU's colored blocks to the language models of 2025 -- systems millions of times more capable that share the same structural gap between producing correct outputs and knowing what they mean. His frameworks -- readiness-to-hand, breakdown, the distinction between processing and understanding -- are not academic abstractions. They are survival tools for anyone collaborating daily with machines whose fluency makes it dangerously easy to forget what fluency is not.

The machines are extraordinary. Winograd's life's work is a manual for remaining honest about what extraordinary still lacks.

"The techniques of artificial intelligence are to the mind what bureaucracy is to human social interaction."
— Terry Winograd
Wiki Companion


A reading-companion catalog of the 26 Orange Pill Wiki entries linked from this book — the people, ideas, works, and events that Terry Winograd — On AI uses as stepping stones for thinking through the AI revolution.
