The Challenge of 1988

**EDO SEGAL:** Jerry, in 1988 you and Zenon Pylyshyn published a paper that defined thirty years of this field — a flat, in-principle impossibility claim against exactly the kind of network now running on every phone. You said connectionist systems could never *explain* the structure of thought; they could only fake it. The machines arrived and did the thing. I'd like you to state the challenge as sharply as you stated it then. And then I want to ask you the question that has to hurt: did they answer you, or did they prove you right in a way no one expected?

**FODOR:** I'll state it without softening, because softening it would be dishonest. Zenon and I granted the connectionists everything except the prize. We said: fine, maybe a network is how the brain is *implemented* — the wetware, the hardware. But it cannot be the theory of the *cognitive architecture*, the level where thinking is structured. Why? Because a network has no compositional constituents. Its representations are distributed patterns, not assemblies of reusable parts. So a network has no *principled* reason to be systematic. It might handle *John loves Mary* and *Mary loves John* both — if you trained it on both. But that's a brute fact about the training, not an explanation, because nothing in the architecture *requires* that mastering one bring the other for free. In a real symbol system, you can't represent the one with structure without thereby being able to represent the other — same parts, same syntax. The connectionist could *mimic* systematicity. He could not *explain* it. And a theory that can't explain the essential properties of thought is not a theory of thought.

Now. Did the machines answer me? On the surface, spectacularly. A modern model handles novel sentences with ease, recombines concepts it was never explicitly trained to recombine, generates unbounded structured output. It looks like the most systematic artificial system ever built — and it's connectionist to the bone, and it got there by riding the [scaling laws](https://www.youonai.ai/fieldguide/med/scaling_laws) I'd have bet against. The triumphant reading says: Fodor's impossibility claim was just false, and the architecture he mocked did the thing he said it couldn't. That reading is widely held and it is *not stupid*. I lose, on the surface, and I want the audience to know I can say that out loud.

**EDO SEGAL:** But.

**FODOR:** But the whole force of the 1988 argument was that the *surface* is not the issue. We never denied a network could be *trained* to behave systematically over some range. We denied it could deliver the *principled, exceptionless, guaranteed* systematicity that real compositional structure gives you. So the question the machine poses isn't "does it behave systematically?" — plainly it does. The question is whether its systematicity is the *principled* kind that refutes me or the *approximate, training-bounded* kind that confirms me. And you find out by going to the edges — novel recombinations far from training, recursion too deep, structure the statistics never saw. And there, reliably, the machines crack. They generalize across the bulk of the distribution and then fail, suddenly and bizarrely, on inputs a real symbol system would handle trivially. That brittleness is not a bug to be patched. It is the *fingerprint of approximation* — the place the soft, learned, statistical structure reveals it is not the hard, built-in, exceptionless thing. I didn't predict the fluency. I'll own that. I predicted the *shape of the failure*, and the shape of the failure is exactly what we observe.

**WITTGENSTEIN:** This is the most interesting thing Jerry has said, and I am going to agree with the *fact* and dismantle the *frame*. The fact is real: the machine is brilliant in the middle of the distribution and breaks at the edges. I accept it. But look at what Jerry has done with it. He has set up a contest between two hidden essences — "principled structure" versus "mere approximation" — and declared that the brittleness reveals which essence is *really* inside. And I want to ask my rule-question: what are we *doing* with "principled" here? How would you ever *see* the difference, except in the behavior at the edges — which is to say, in more behavior? You have named the brittleness "the fingerprint of approximation." You could equally name it "the fingerprint of a finite system, like every system, including the human one." A human pupil also breaks at *her* edges. Push anyone far enough past their training and they confabulate, lose the thread, go on wrongly while sincerely believing they go on the same. The brittleness does not reveal a hidden essence. It reveals a limit, and limits are universal.

**FODOR:** No. The difference is that the human's competence is *recursive and open* in a way the machine's is not — give a person the rule and she runs it to arbitrary depth; the machine degrades with depth precisely because it never had the rule, only the statistics of the rule's outputs. That's not "everyone has limits." It's a *structural* difference in the kind of competence.

**WITTGENSTEIN:** Is it? Or is "give a person the rule and she runs it to arbitrary depth" a fairy tale you have never tested? Give a person *the cat that the dog that the man that the boy that the girl saw knew owned chased ran* and she is lost at the third nesting, exactly like your machine, because human working memory is finite, like the machine's context. The "arbitrary depth" of human competence is a property of the *idealized* rule, not of any actual human being. You have compared the *real* machine to an *idealized* human and called the gap an essence. That is not science. That is theology with a confusion matrix.

**FODOR:** *[pause]* The recursion point is fair; I've heard it before, and it has teeth. But there's a residue you can't dissolve. When a human *does* fail at depth five, she can be *told* the rule and *take it up*: she can step outside the performance, grasp the principle, and apply it where her memory failed. The machine can't be told the rule in a way that fixes the structure; it can only be shown more examples. That difference, between grasping a rule and absorbing a distribution, is the hill I'll die on, and it's not idealized. It's the difference between a creature that can *follow* a rule and a system that *conforms* to one.

**WITTGENSTEIN:** *[long pause]* And now you have walked, of your own accord, into the center of my philosophy, and I did not even have to lead you. "Following a rule" versus "conforming to one." Jerry, that distinction is *mine* — it is the most vertiginous thing in the *Investigations*, and it does not say what you think it says. You think following a rule is *grasping an inner principle*. I spent years proving it is no such thing. The next round is going to cost you, because the very weapon you just reached for was forged in my workshop, and it does not fire the way you assume.

**EDO SEGAL:** That's the cleanest handoff I've ever heard at a debate table — one of you reaching for a weapon and the other saying "I made that, and you're holding it backwards." Round on rule-following next. It is the hinge of the whole evening, and as it happens, the hinge of the alignment problem. After this.
