The Octopus, the Bear, and the Hierarchy

Page 1 · The Octopus, the Bear,

EDO SEGAL: Emily, most people in our audience have heard the phrase "stochastic parrot," but the deeper machinery is an earlier thought experiment, from your 2020 paper with Alexander Koller. I'd like you to tell it the way you'd tell it to a smart fifteen-year-old, and then, Geoff — I want you to do something unusual for a debate. Before you attack it, I want you to steelman it. Tell us what the octopus gets right.

BENDER: Happily. Two people are stranded on separate islands, connected by an old underwater telegraph cable. They pass messages back and forth, in English. Deep below, a hyperintelligent octopus taps the cable. It cannot see the islands. It has never seen a coconut, a sunrise, a person. All it observes — for months, years — is the pattern of signals: which sequences follow which. It is a superb statistician, so eventually it can do something remarkable: it cuts the cable and impersonates one islander, and the other islander doesn't notice, because most of conversation is well-trodden pattern.

Then one day the islander writes: I'm being chased by a bear, I have two sticks and a coconut, tell me how to defend myself — quickly. And the octopus has nothing. Not because it's stupid — because what's needed now isn't the pattern of bear-talk; it's bears. Sticks as levers and clubs, the physics of a charging animal, the difference between advice that sounds right and advice that keeps you alive. The octopus has only ever had access to form. Meaning — the relation between the signals and the world — never traveled down the cable. It could not have. That's the octopus's situation, and it is exactly, structurally, the situation of a language model: a brilliant statistician of a wake it has never seen the boat of.

EDO SEGAL: Geoff. Steelman first.

HINTON: I can do that honestly, because the experiment is genuinely good. What it gets right is this: the training signal matters, and there are things text underdetermines. The octopus warns, correctly, against assuming that competence in the distribution implies competence outside it. Every machine learning researcher should have it tattooed somewhere discreet. It also gets right something Emily doesn't always get credit for: the warning about us — that the islander on the beach will keep reading minds into the cable long after the cable has stopped deserving it. That's real, it's important, and the industry exploits it. There's the steelman.

Now the two places it fails, and they're both load-bearing. First: the octopus is data-starved in exactly the way that matters. It taps one cable — two people's chatter. Our systems tap, in effect, every cable humanity ever laid: the physics textbooks and the survival manuals and a hundred thousand accounts of what bears do and what sticks do and what worked and what got someone killed. Text at that scale isn't gossip about the world; it's a low-resolution scan of it. Redundant, lawful, cross-referenced from a million angles. Emily says aboutness never traveled down the cable. I say: structure traveled down the cable, and structure is most of what aboutness is. The proof is embarrassing in its directness — ask a modern system the bear question. It gives you the answer that keeps you alive. Not because bear-attacks-with-coconuts litter the training data, but because bears and levers and fear do, in ten million decomposed pieces, and the network composed them. That's what the hierarchy does. In vision it built edges into textures into objects without being told to; in language it builds tokens into syntax into situations into something I have no better word for than concepts. I've watched that hierarchy self-assemble for forty years. At the bottom it's statistics. At the top it isn't anymore — that's the whole lesson of deep learning, and it's the lesson the octopus is built to prevent you from learning.

BENDER: It answers the bear question now. The bear question is in the literature — it's in my paper, which is in the training data, along with this entire genre of gotcha and reply. Geoff, you can't cite the system's performance on the canonical counterexample as evidence; that's the contamination problem wearing a bow. And "low-resolution scan of the world" — I want to flag what that move does, because it's the central move of your whole school. It quietly converts text produced by people who have a world into the world. The scan metaphor assumes what's in question. Here's what I'll grant, and it's not nothing: a system trained on the wake of a billion boats is a vastly better wake-model than my octopus. What I won't grant is the category jump. More wake is more wake. Show me the mechanism — the actual mechanism, not the adjective "emergent" — by which pattern over symbols acquires reference, the connection to the non-symbolic, and I'll retire the octopus myself.

HINTON: The mechanism is the same one you use. You think your reference comes from touching water? Your brain sits in the dark, in a box of bone, receiving spike trains — patterns. It never touches water either. It builds a model from regularities in signals, and the model is so good you call it the world. I'm not being mystical; this is textbook. The retina is a cable, Emily. You are the octopus that got enough data.

BENDER: That is the most honest version of your position I've ever heard you give, and it's where I think your school actually lives: not "machines understand like humans" but "human understanding was never what humans thought it was." And there — no, I want to mark this carefully, because it's the seam of the whole evening. My spike trains come with something the cable never carries: a body that acts, an environment that pushes back, stakes — thirst that gets quenched or doesn't, a child who is or isn't where I left her. Embodied, engaged, accountable activity — the loop closes through the world. The model's loop closes through text about the world. You can call them both "just signals" only by ignoring everything that disciplines the signals.

HINTON: And when the systems act — when they use tools, run code, see, get feedback from a world that pushes back? That's not hypothetical; that's deployed. Does your line move?

BENDER: Ask me what the systems with those capabilities were actually trained to optimize and who audited it, and then we'll see whether the line moves or the marketing did.

EDO SEGAL: Stay in this round one more beat, because there's a ghost at this table and I'd like to seat him properly. Hubert Dreyfus. Berkeley philosopher, wrote What Computers Can't Do in 1972, spent his life arguing — from Heidegger, from the phenomenology of skill — that computers would never achieve real intelligence because intelligence is embodied, situated, a matter of coping with a world rather than processing symbols about one. The AI establishment of his day despised him. And Geoff — here's the delicious part — Dreyfus was attacking symbolic AI, the school that you also spent thirty years fighting. His arguments against rules and representations read, today, almost like a connectionist manifesto. So whose ancestor is he? Each of you gets to claim him. Make your case.

HINTON: Oh, he's mine, and I'll prove it with a memory. Dreyfus said intelligence isn't rule-following — that the experts can't articulate their expertise because the expertise was never propositional in the first place; it's pattern, acquired through experience, resident in something more like a body than a logic engine. That is precisely what a neural network is: competence without articulable rules, knowledge smeared across weights, skill that can't explain itself. We vindicated him. The irony is that he didn't live to see his argument win by changing sides — the "computers" he said couldn't do it were the symbol machines, and he was right, and the thing that could do it was built on his side of the argument: learning, pattern, the refusal of explicit rules. If Bert were here I'd tell him: you were right about everything except which machines you were right about.

BENDER: That's a beautiful eulogy and a selective one. What Dreyfus actually said was embodied, Geoff — not just "non-symbolic." Situated. In a world that pushes back, with stakes, with a body whose skillful coping is the understanding. You kept the "no rules" half and quietly dropped the "being in the world" half, because the half you dropped is the half your systems still don't have. A network trained on text is exactly what Dreyfus would have called a degenerate case: all pattern, no situation. He'd look at your language models and say what he said about every generation's AI — that we've mistaken the articulable shadow of intelligence for the thing, again — and then he'd note, with that Berkeley relish, that this time we built the shadow out of everyone's articulations at once, which makes it a much more convincing shadow. He's my ancestor, Geoff. You just buried him with the wrong family.

HINTON: Then let's honor him properly and put the question where he'd put it: in the body. Robots with these models inside them are learning manipulation from experience as we speak. When the network copes — Dreyfus's word — when it skillfully copes with a world that pushes back, do you inherit his position or abandon it?

BENDER: When the coping is general, unstaged, and audited by someone the vendor doesn't pay — ask me then. Dreyfus also taught us how many demos the field can stage per decade.

EDO SEGAL: Hold that thread — it returns in the round on what the death cross is measuring. But the next round belongs to a bird. Emily, you gave the world a phrase that escaped the academy and entered the language, and Geoff, you've called the worldview behind it the last gasp of human exceptionalism. Stochastic parrots — what the phrase illuminates, and what it hides. After the break.
