The Part of the Mind We Don't Understand


**EDO SEGAL:** Jerry, your second great book, *The Modularity of Mind*, ends with a warning everyone forgets. You said perception and language are modular — fast, sealed, tractable — but the part that *reasons*, the central system, is not modular, and for that reason may be beyond any computational theory, including the one you championed. You called it the most important moral of the book and the one most likely to be ignored. Make the case, and tell me where the machine is weakest because of it.

**FODOR:** The modules are the easy, comforting half, and the field loves them — vision, hearing, language, each fast and automatic and *informationally encapsulated*, meaning each does its job using only its own information, walled off from everything else you know. The proof is the visual illusion: in the Müller-Lyer figure the two lines look unequal, and they go on looking unequal *after you've measured them and know they're equal*. Your knowledge can't reach into the visual module and fix the percept. The module is sealed. And sealing buys speed and reliability — a perceptual system that had to consult everything you believe before reporting what it saw would be hopelessly slow, and it would tend to see only what it expected to see.


But the *interesting* part of the mind — the central system, the thing that takes the modules' outputs and *reasons*, forms beliefs, decides — is the opposite of modular. It's *global*. And here's the argument I'll die defending. Central thought is *isotropic*: anything you know might, in principle, bear on anything else — a fact from astronomy can suddenly matter to a question in biology. And it's *Quinean*: what matters to it — simplicity, plausibility, relevance — are properties of your *whole* belief system at once, not of any local piece. Now: computation, as I understand it, operates on the *local, syntactic* properties of representations — the shapes, the things a mechanism can read off one symbol without consulting the rest. But relevance is *not* a local property of a symbol. It's a property of the whole. So a process that works by reading local shapes *cannot*, in principle, compute relevance. That's the [frame problem](https://www.youonai.ai/fieldguide/med/frame_problem) — how does a system know which of the millions of things it believes bear on the situation, without checking them all, which is impossible? Classical symbolic AI *broke* on this. It drowned in which-rules-apply. And I drew the moral the field has spent decades evading: common sense is global, computation is local, and the gap is conceptual, not engineering. We don't have a theory of common sense that would survive scrutiny by an intelligent five-year-old. The five-year-old has it. The machine, structurally, does not.


**WITTGENSTEIN:** I am going to do something that will unsettle the audience, which is to *agree with Jerry's diagnosis and locate it in my own vocabulary*, because he has, without naming it, rediscovered a piece of me. The frame problem — the impossibility of computing relevance from local rules — is what you hit when you try to make rule-following into an *explicit* affair, a rulebook the system consults. And I proved that rule-following *cannot* be that. There is no rulebook of relevance, because every rule of relevance would itself need a rule for when *it* is relevant, and you are off down the regress. The reason the five-year-old has common sense and the symbolic machine drowned is that the child does not *compute* relevance from a private rulebook. She has been *trained into a form of life* in which the relevant simply *shows up* as relevant, against a background she does not consult because she *is* it. Jerry calls this a conceptual gap between local computation and global common sense. I call it a confusion about what understanding is — and we are, astonishingly, describing the same wall from opposite sides.

**FODOR:** *[pause]* That may be the most generous reading anyone's given the frame problem, and I'll take the alliance even though I'll regret the terms. But now the twist that makes this round urgent, and it's a twist on *both* of us. The language model addresses the frame problem in a way I never anticipated, and it *half-works*. It doesn't reason with explicit rules over symbols — it *absorbs* the statistical structure of relevance from a staggering quantity of human text, learning implicitly which things tend to bear on which. It learns relevance as a *pattern* rather than computing it from local syntax. And that's an end-run around the exact obstacle I identified. For the first time, a machine shows something that *looks* like the global, contextual relevance-detection I said computation couldn't produce. So did it solve my problem, or paper over it?

**WITTGENSTEIN:** Say which you think, Jerry. You've earned the floor.


**FODOR:** Papered over — and the *manner* of the failures is exactly what my framework predicts. Its grasp of relevance is statistical, not principled, so it fails the way an approximation fails: confidently, without warning. It retrieves what's *typically* relevant, and when a situation needs an *unusual* relevance, or needs *ignoring* an association that's usually apt but presently misleading, it stumbles. It hallucinates connections that are common-in-the-data and false-in-the-world; it misses the bearing of a fact any human sees is decisive but the training didn't make salient. Those aren't random errors. They're the signature of a system that learned the *distribution* of relevance without grasping relevance — that knows what tends to matter without knowing why, and so can't tell when the tendency misleads. My frame problem wasn't solved. It was approximated. Well enough to be useful, not well enough to be trusted. And the gap between approximation and solution is the gap I spent my life insisting was real.

**WITTGENSTEIN:** And here, at last, I must take something back from Jerry, because his very success this round exposes his deepest error. He says the machine "learned the distribution of relevance without grasping relevance." But Jerry — *grasping relevance* was never a mechanism the child has and the machine lacks. The child has no inner relevance-engine either! She, too, "absorbed" relevance from immersion in a practice — that is what training into a form of life *is*. Your demand that the machine "grasp" relevance, over and above tracking it the way a trained participant does, is the same demand for an inner something that your whole frame argument should have taught you to drop. You have, all evening, correctly seen that the machine *lacks the form of life* — and incorrectly translated that lack into a missing *inner grasp*. The lack is real. Your name for it is the bewitchment.

**FODOR:** No — the child can *recognize* when her trained sense of relevance is wrong and *override* it. The machine can't override its distribution; it *is* its distribution. That override capacity is the residue of grasp, and it's not a ghost — it's a higher-order structure the machine lacks.


**WITTGENSTEIN:** Or it is one more place where the machine does not *share our stakes*, and so has no reason to override, because nothing is at risk for it in being wrong. You explain by structure what I explain by answerability, and we cannot, tonight, prise them apart — which is itself the finding.

**EDO SEGAL:** Jerry, I have to put one thing to you here, because it's the most famous place you were wrong, and the machine is the witness. You argued for decades that you couldn't *learn* a genuinely new concept — that the structure had to be innate, triggered not acquired, because learning a concept would require already having it. The machine acquires usable concepts from data, including concepts for things invented long after any evolution could have stocked the mind. Did the machine break your nativism?

**FODOR:** It bent it badly, and I'm not going to wriggle. My radical concept nativism — the claim that nearly everything is innate, that you can't acquire a real new concept — was the most extravagant thing I ever defended, and a machine that learns a serviceable concept of a smartphone from scratch is a hard fact to wave away. I underestimated what statistical learning could extract from exposure. I'll say it plainly: I was wrong about the *magnitude* of learning. What I won't concede is the *distinction* the nativism was protecting — between a representation that has genuine compositional structure and one that approximates the outputs of structure. I bet wrong about which side learning could reach. I did not invent the side. The line between grasping a concept and absorbing its distribution survives my being wrong about how much absorbing could do.

**WITTGENSTEIN:** That is the most honest thing you have said, and I want to honor it by agreeing with the half that is right: there *is* a difference between mastering a concept and parroting its company. I have spent my life on that difference. We disagree only about whether it lives in an inner structure or in an answerable practice — and the machine, learning the company without the practice, is the exact case that keeps the disagreement alive.


**EDO SEGAL:** Two thinkers from opposite shores standing at the same wall, naming it in different languages, and unable to agree whether the wall is made of missing structure or missing stakes. That's convergence four and it's the deepest. Hold it. Now we come down out of the metaphysics, because there's a parent at the kitchen table who doesn't care what's inside the machine — she cares what it does to her kid's future. The death cross. The apprentice. The candle. After this.
