EDO SEGAL: John, your most durable bequest may not be the room at all. It may be the distinction the room dramatizes: syntax is not semantics. And here is what fascinates me: the engineers arrived at an almost identical worry on their own, by a completely different road, and called it the symbol grounding problem. Try to learn a language from a dictionary that defines every word only in other words of that same language, and you spin forever inside the lexicon and never touch the world. So I want to give you the floor to state the gap, and then, Alan, I want something unusual from you. Before you attack it, steelman it. Tell us what the grounding problem gets right.
SEARLE: It gets the wound exactly right, and I take a grim pleasure in the fact that the field diagnosed it twice — once by me from outside, once by the cognitive scientists from within. Picture the purest case. A base language model is trained on text and, in its raw form, on text alone — a closed universe of symbols whose only properties are their statistical relations to other symbols. It has read "ocean" beside "wave," "salt," "deep," "blue," more times than every human who ever lived, combined. From that it builds a representation of "ocean" of extraordinary richness. But it has never been wet. Its tokens are defined entirely by other tokens, the way every word in that monolingual dictionary sends you only to more words. It has the most exquisite syntax in history and no grip on the world the syntax is about. Its fluency is the room's fluency scaled past imagination — perfect form, meaning supplied entirely by the humans reading the output. Aboutness was never in the training signal. You cannot get out what was never put in.
EDO SEGAL: Alan. Steelman first.
TURING: I can do that honestly, because the problem is real and I will not insult it. What grounding gets right is that the training signal is not everything: there are things text underdetermines, and a system fed only on one thin stream of symbols can be confidently, catastrophically wrong outside it. The man on the beach will keep reading a mind into the cable long after the cable has stopped earning it; John's room is, among other things, a warning about us, about how cheaply we project. Every builder of these systems should have it carved above the door. There. That is the steelman, and I mean it.
Now the two places it fails, and they are load-bearing. First, the dictionary picture cheats on scale and structure. Yes, every word is defined by other words — but the web of relations among the words is not arbitrary. It is shaped, at every point, by the world the words were used to talk about. That is why a model trained on text alone learns that king is to queen as man is to woman, that Paris stands to France as Tokyo to Japan, that a character who died in chapter two stays dead in chapter nine. The relations carry the structure of the world, pressed into the statistics like a fossil. John says the model inherited a corpus. I say it inherited a low-resolution scan of reality, because the people who wrote the corpus had bodies and lives, and the lawful shape of what they wrote is the shadow of the world they lived in. Second — and this is where the engineers are already ahead of the philosophy — they are welding the symbols to their referents by force. The token "cat" is now tied not only to other words but to millions of pixels of actual cats; the systems see, hear, act, get corrected by a world that pushes back. Grounding is not a wall, John. It is a to-do list, and the list is being worked through.
SEARLE: And every item you add to that list concedes my premise while pretending to refute it; that is the move I want the reader to catch. "Wire it to cameras." Fine: now more symbols come in, from the cameras, and the system shuffles those too. Symbols about symbols are still symbols. A photograph is not the thing; it is a representation of the thing, and the system has no more contact with the cat than the man in the room has with China. You say the corpus is a scan of reality. But a scan of reality is not reality; it is, precisely, a representation, and the whole question is whether the system ever gets past representations to the things represented: whether any of its states are about anything for the system, or whether all the aboutness lives in us, the readers, exactly as the meaning of the marks in a book lives in the reader and not the paper. Adding cameras moves the symbols closer, as we see it, to their sources. It does nothing, by itself, to put a someone behind them.
TURING: Then I will ask the question you keep stepping past, John, and it is your own argument turned on its author. Where does your reference come from? You think your grip on water comes from having touched it. But your brain has never touched water either. It sits in the dark, in a box of bone, receiving spike trains — patterns, symbols if you like — down the optic and the somatic nerves, and from those patterns it builds a model so good you call it the world and forget it is a model. The retina is a cable. The skin is a cable. You are the man in the room who got enough cables, for long enough, with a body wired to the output, that the shuffling became seeing. If grounding is just the right kind of causal traffic between symbols and a world, then it is a matter of wiring and history, not of meat — and the wiring is exactly what the engineers are building.
SEARLE: That is the most honest version of your position, and it is where your whole school actually lives: not "machines understand like humans" but "human understanding was never what humans thought it was." And I will mark the seam precisely, because it is the seam of the evening. My spike trains arrive in something the cable never carries: a body that acts and suffers consequences, an embodied, engaged life in which the loop closes through the world. Thirst that gets slaked or does not, a hand that gets burned, a child who is or is not where I left her. The model's loop closes through text about the world. You can call them both "just patterns" only by ignoring everything that disciplines the patterns. The difference between a representation that is anchored by a life and a representation that floats free in a data center is not a detail. It is the whole of what I mean by meaning.
EDO SEGAL: Hold there — that anchor returns when we get to the body, and to a Berkeley ghost named Dreyfus. But the round produced something clean. Alan says the relations among symbols carry a fossil of the world, enough to reconstruct it. John says a fossil is still a representation, and no pile of representations reaches the thing itself without a life to anchor it. The next round is the objection John named before anyone could throw it at him — the one that has never fully gone down. The Systems Reply. After this.