PERSON

Jerry Fodor

The combative philosopher who took the computer metaphor for the mind more seriously than anyone alive—then spent forty years using that seriousness to draw a hard line around what a computer could and could not be, a line that the large language model now sits directly on, daring us to say which side it is on.

Jerry Fodor is the philosopher who built the most rigorous available theory of how a mind could be a machine, and then used the same rigor to insist that no machine anyone was actually building could be a mind. His two foundational contributions—the Language of Thought hypothesis and the modularity thesis—were not only about the mind; they were about the conditions any computational system must satisfy to count as thinking, and those conditions are exactly what the dominant AI architecture of our moment appears to violate. He held that genuine thought requires discrete symbols with stable meanings combined by syntactic rules—a Mentalese, a language in which thinking is done. Large language models have no such symbols: they operate on continuous vectors, distributed representations in high-dimensional spaces, with no discrete constituents to recombine and no syntax in his strict sense. And yet they produce compositional, systematic language of staggering fluency, doing the thing he said required a Language of Thought without, apparently, having one. The question his framework forces—whether the machines have solved the problem he posed or found a way to approximate its solution well enough that the difference is hard to see from the outside—is the deepest question about what these systems are, and it cannot be answered by admiring the output. Fodor’s gift was not a verdict but a set of distinctions sharp enough that you could see exactly what you were claiming when you claimed a machine could think: the difference between behaving systematically and being systematic, between a statistical approximation of compositional structure and the real thing, between a system that processes symbols and a system that produces symbol-shaped output without any symbol inside.

In the [YOU] on AI Field Guide

The cycle asks what these systems actually are, beneath the fluency. Fodor is the thinker who most precisely arms the reader for that inquiry. His central distinction—between a system that has a property and a system that behaves as if it had one—is the instrument the cycle most needs when confronted with AI outputs that sound like understanding, look like reasoning, and pass tests that were once reserved for minds. He was not opposed to machines that think; he was opposed to the confusion of fluent output with the structure that produces it, and that confusion is everywhere in the current discourse.

His modularity thesis, which came in two halves, illuminates the AI landscape with unusual precision. The comforting half—that perception and language are modular, fast, and tractable—is the half the field has instinctively moved toward, building specialized models for specialized tasks, retrieval systems that fetch facts from sealed stores, mixture-of-experts architectures that route inputs to dedicated sub-networks. The devastating half—that central cognition, the system that reasons, decides, and draws on everything you know at once, is precisely not modular and is for that reason not understood—is the half that the cycle keeps returning to, because it is where the machines most conspicuously struggle. The hallucinations, the coherence failures, the inability to hold a long argument together without contradiction: these are the signatures of a system that handles the local and the modular with grace and fails at exactly the global, integrative cognition that Fodor declared beyond computational reach.

The challenge of 1988, which Fodor issued with Zenon Pylyshyn to the connectionist movement that would eventually produce the large language model, is the most important specific argument in this context. The challenge rested on two properties of thought—systematicity and productivity—that Fodor held could only be explained by compositional symbolic structure, not approximated by connection weights. The machines have since displayed both properties to a degree no one expected, and the honest question is whether they answered the challenge or merely met it at a level of coverage broad enough that the underlying difference is hard to find. Fodor’s framework predicts where to look: at the edges of the training distribution, where genuine compositional structure and statistical approximation of it come apart.

His contribution to the cycle’s humanistic concerns is less about the machines than about the thinking we bring to them. He was the clearest available voice for the proposition that fluency is not intelligence, that the output tells you nothing about the process, and that the question of what is inside a system matters independently of how well the system performs. In an age that tends to slide from impressive performance to strong claims about machine minds, this insistence on structural rather than behavioral criteria is exactly the discipline the cycle needs.

Origin

Jerry Fodor was born in New York in 1935 and spent most of his career at MIT and Rutgers, dying in 2017, the year before the systems that would most decisively test his framework arrived. He was a polemicist by temperament, relishing argument, attacking positions he thought confused, and picking fights with behaviorists, connectionists, evolutionary psychologists, and Darwinians with equal energy; the polemical style could obscure the depth of the philosophical work beneath it. He was also, by most accounts, the most important philosopher of mind of his generation.

The Language of Thought hypothesis, proposed in the 1975 book of that name, was Fodor’s foundational contribution: the claim that thinking is done in a mental language, with its own symbols and its own combinatorial syntax, prior to and independent of any spoken tongue. When you believe that the cat is on the mat, there is, on this account, a structured representation in your head, composed of a symbol for the cat, a symbol for the mat, and a relation between them assembled according to rules. The belief has parts, and the parts are the same parts that show up in other beliefs. This is not a metaphor; it was a literal hypothesis about the format of cognition.

The Modularity of Mind (1983) added the second great contribution, arguing that the mind is divided into specialized, informationally encapsulated input systems—modules that do their work in sealed isolation from the rest of what you know—and a central system that reasons, decides, and draws on everything at once, whose non-modular character was, he argued, the deepest reason why cognitive science could explain the peripheral and not the central. His 2000 book The Mind Doesn’t Work That Way sharpened this into a thesis about the limits of the whole computational approach: global, abductive reasoning—inference to the best explanation from the totality of one’s beliefs—cannot be mechanized by operating on the local syntactic properties of representations, and since central cognition is precisely such reasoning, the computational theory of mind, however good at the modules, cannot explain the thing that matters most.

Key Ideas

The Language of Thought. Genuine thinking requires a system of internal symbols—Mentalese—with two faces: a syntactic shape that a mechanical process can manipulate, and a semantic content that the shape carries. Computation that works by manipulating the shapes while respecting the meanings is the only account, Fodor held, of how thought can be at once physical and rational. A large language model operates on vectors, not symbols in this sense, and the question whether a sufficiently structured vector space can implement what a Language of Thought is—whether the two are levels of description of the same thing rather than alternatives—is the live technical debate his framework generates.

Modularity and its limits. The mind’s input systems—vision, language—are fast, domain-specific, and informationally encapsulated, each doing its work without consulting the rest of what you know. This encapsulation buys reliability at the cost of flexibility. The central system, which integrates the outputs of the modules and reasons across the full web of belief, is not modular, and for that reason not understood. The field has begun rediscovering the value of modularity—specialized sub-models, retrieval systems, routing architectures—and in doing so has converged, slowly and without quite admitting the lineage, on the modular design Fodor described. But the part he said we did not understand remains the hardest: the global, integrative, relevance-sensitive reasoning that the machines most conspicuously fail to perform reliably.

Systematicity and productivity. Fodor and Pylyshyn argued in 1988 that any mind that can think one thought can think the systematically related thoughts (anyone who can think John loves Mary can think Mary loves John) and that this systematicity follows necessarily from compositional structure, not contingently from training. Productivity is the companion property: a finite mind can entertain an unbounded range of novel thoughts, which a finite stock of symbols and combinatorial rules explains directly. A connectionist system could mimic systematicity but not explain it. The machines of our moment display systematicity far beyond what anyone then imagined, and the Fodorian question is whether they display the principled, exceptionless kind that follows from genuine structure, or the statistical, coverage-bounded kind that follows from having seen enough examples. Careful probing with compositional-generalization benchmarks suggests the latter: the machines are systematic within the distribution they have seen, and brittle at its edges, in exactly the way approximation rather than structure predicts.
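The systematicity argument can be made concrete with a toy sketch. It is an illustration only, not Fodor and Pylyshyn's own formalism and not a model of any real neural network: the names Thought, compositional_thinker, and memorizing_thinker are hypothetical, and the memorizer is a deliberate caricature of a system with no constituent structure at all. The point it shows is narrow: when thoughts are built from reusable parts by a combinatorial rule, the converse thought comes for free; when thoughts are stored as unanalyzed wholes, it does not.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Thought:
    """A structured representation: a relation applied to two constituents."""
    relation: str
    agent: str
    patient: str


def compositional_thinker(names, relations):
    """Every thought the combinatorial rule can build is available."""
    return {
        Thought(rel, agent, patient)
        for rel in relations
        for agent, patient in permutations(names, 2)
    }


def memorizing_thinker(seen):
    """Only thoughts encountered as unanalyzed wholes are available (a caricature)."""
    return set(seen)


names = ["John", "Mary"]
relations = ["loves"]
seen = [Thought("loves", "John", "Mary")]

structured = compositional_thinker(names, relations)
memorized = memorizing_thinker(seen)

converse = Thought("loves", "Mary", "John")
print(converse in structured)  # True: follows from the structure itself
print(converse in memorized)   # False: never encountered as a whole
```

Real networks sit somewhere between these two extremes, which is exactly why the question of which kind of systematicity they have is hard to settle from behavior alone.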

The frame problem and abduction. Central cognition is global and abductive: it draws on everything you know, assesses candidate beliefs by how they fit the whole web, reasons to the best explanation. These are properties of whole belief systems, not of individual symbols, and a process that works on the local syntactic properties of representations has no access to them. The frame problem—how a system knows which of everything it knows is relevant to the situation at hand—was the rock on which classical symbolic AI broke, and Fodor held it was a conceptual impossibility, not merely an engineering difficulty. The large language model addresses the frame problem by a different means: absorbing the statistical structure of relevance from vast text corpora, learning implicitly which things tend to go with which. This approximates relevance without computing it, and fails precisely where approximation fails: on situations where the unusual relevance is decisive and the statistical tendency misleads.

The third thing. Fodor framed the science of mind as a war between symbolic and connectionist approaches, assuming they exhausted the options. The large language model is neither: connectionist in its substrate, displaying symbolic-looking behavior in its conduct. It is a statistical approximation of a symbol system—implementable in a distributed substrate, achieving compositional behavior by learning the patterns that compositional structure produces, exhibiting systematicity and productivity across the bulk of the distribution while failing at its edges in ways that genuine structure would not. Fodor gave us the wrong taxonomy and the right concepts, and the concepts—the distinction between having a property and approximating it, between structure that is built in and structure that is learned—are what let us see what the machines are.

Debates & Critiques

The central debate Fodor’s framework generates is whether the large language model answered the challenge of 1988 or merely papered it over at unprecedented scale. The triumphant reading says it answered the challenge: connectionism did the thing Fodor and Pylyshyn declared impossible in principle, achieving systematic, productive, compositional behavior without discrete symbols, and the impossibility claim was simply wrong. The Fodorian reading says the machines achieved behavioral systematicity, which Fodor never denied was possible, while the principled systematicity (the kind that follows from compositional structure and holds everywhere, not just where the training distribution reaches) remains undemonstrated and probably absent. The careful compositional-generalization benchmarks, which probe the systematic recombination of elements the model has not seen in combination, tend to support the Fodorian reading. But the issue is genuinely contested, and what makes it practically urgent rather than merely scholastic is that the answer determines whether scaling leads to general intelligence or to an ever-better imitation of it. If the machines have genuine compositional structure, more scale means more mind. If they have a statistical approximation, more scale means a more seamless approximation: harder to distinguish from the real thing, but not the same thing. Fodor’s framework cannot tell us which world we are in. It tells us that the difference is the whole question, and that fluent output is the one kind of evidence that cannot decide it.
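What it means to probe "recombination of elements the model has not seen in combination" can be sketched as a data split. The following is a minimal illustration under stated assumptions, not the construction used by any particular benchmark: the vocabulary and the function name held_out_combination_split are hypothetical. It builds a training set in which every individual element appears while one specific combination is withheld for testing.

```python
from itertools import product


def held_out_combination_split(verbs, modifiers, held_out):
    """Split all verb+modifier pairs so that held-out pairs appear only in the test set."""
    all_pairs = list(product(verbs, modifiers))
    train = [pair for pair in all_pairs if pair not in held_out]
    test = [pair for pair in all_pairs if pair in held_out]
    return train, test


# Hypothetical toy vocabulary, chosen only for illustration.
verbs = ["jump", "walk", "run"]
modifiers = ["twice", "left", "slowly"]
held_out = {("jump", "twice")}

train, test = held_out_combination_split(verbs, modifiers, held_out)

# Each part of a test pair is seen in training, but never in this combination.
train_verbs = {verb for verb, _ in train}
train_modifiers = {modifier for _, modifier in train}
parts_seen = all(v in train_verbs and m in train_modifiers for v, m in test)
combos_unseen = all(pair not in train for pair in test)

print(len(train), len(test))      # 8 1
print(parts_seen, combos_unseen)  # True True
```

A system with genuine compositional structure should handle the withheld pair as easily as the others; a system that has only learned the patterns present in the training pairs has no guarantee of doing so.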

Fodor’s Three Distinctions

The conceptual tools the large language model most needs and most pressures

The Hypothesis

Language of Thought

Genuine thinking requires discrete symbols with stable shapes that carry meanings into systematic combination. The large language model operates on continuous vectors without such symbols. Whether the geometry of a well-trained vector space implements what a Language of Thought is, or merely approximates its behavioral outputs, is the live question Fodor’s framework forces.

The Architecture

Modularity and Its Limits

The input systems are modular; the central system is not; and the central system is where thinking actually happens. The field has rediscovered the value of modularity for the peripheral systems. The hardest part — global, integrative, relevance-sensitive reasoning — remains exactly where Fodor said it would be: beyond the reach of the computational approaches we have built.

The Challenge

Systematicity and Approximation

Any mind that can think one thought can think the systematically related thoughts, and this follows necessarily from compositional structure. The machines display systematicity across the bulk of their training distribution and fail at its edges — which is the signature of approximation rather than structure, of having learned the patterns that compositionality produces without possessing the syntax that guarantees them everywhere.
