PERSON

Donald Davidson

The analytic philosopher who spent forty years asking what it takes for words to mean anything and thoughts to have content at all—and whose accounts of radical interpretation, the principle of charity, and the Swampman have become, unexpectedly, the most precise philosophical instruments available for asking whether a machine that talks actually means anything.

Donald Davidson is the philosopher of language and mind who, without intending to, produced the conceptual toolkit the age of artificial intelligence most needs. Born in 1917, trained at Harvard in classics before turning to philosophy, he wrote dense, careful essays with titles like "Truth and Meaning" and "Actions, Reasons, and Causes" that quietly restructured how his entire field thought about the relationship between words, the world, and the thoughts that words are supposed to carry. His central thought experiment—radical interpretation—asks how a field linguist could ever understand speakers of a completely unknown language with no shared vocabulary, no interpreter, and no bilingual informant, and finds that this extreme case is the general case: all understanding of another speaker is radical interpretation, an active construction governed by the assumption that the speaker is rational and largely right about the world. This assumption, the principle of charity, is not a courtesy; it is the transcendental condition of interpretation, the thing without which understanding could never begin. It is also, Davidson would observe, the precise mechanism by which large language models most reliably mislead us: we extend charity automatically to any sufficiently fluent voice, and fluency is not evidence that the conditions licensing charity are satisfied. His Swampman—a being physically identical to Davidson, assembled by lightning from a dead tree, that nonetheless has no thoughts at all because it lacks the causal history that constitutes content—is the clearest philosophical formulation of the deepest worry about machine meaning. And his doctrine of anomalous monism, which denies that mental descriptions are reducible to physical ones even while insisting that mental events are physical events, explains why no amount of mechanistic interpretability could by itself settle whether a system has a mind.

In the [YOU] on AI Field Guide

The cycle that began with [YOU] on AI asks what it means to see the machine clearly. Davidson is the philosopher who makes clear-seeing most difficult and most necessary at once. He shows that when we sit before a language model and read its outputs as meaning something, we are performing an act of interpretation under conditions of near-total ignorance—exactly the conditions his radical interpretation scenario was designed to illuminate. We do not have access to the model's "beliefs" except through its words, and we cannot read its words except by assuming they express something belief-like. That circularity—fix meanings to know beliefs; fix beliefs to know meanings—is the knot at the heart of interpretation. Davidson had a strategy for cutting it in the human case. Whether the strategy applies to the machine, or breaks against it, is one of the most revealing questions you can ask about these systems.

His principle of charity is the cycle's most consequential idea for understanding the mechanism of AI persuasion. We are charitable to language models reflexively and without permission. When a model produces a confident, well-structured paragraph, we read it as the expression of an underlying competence—a mind behind the words, mostly rational, mostly right. We supply the rationality the principle demands automatically, because the surface invites it and because we have spent our entire lives extending this charity to the only other producers of such surfaces we have ever met: human beings. The fluency triggers the assumption. The assumption does most of our interpreting for us, filling the space behind the words with a mind we have not actually found. The fluency-authority decorrelation—the cycle's central diagnostic of the AI transition—is, in Davidson's vocabulary, the failure of charity's warrant: we extend an assumption that was calibrated for rational agents to a system that was optimized to produce outputs that look as though they issue from rationality, which is not the same property.

Davidson also resolves a confusion that runs through the alignment debate. When a model says "I recommend" or "I am sorry," the grammatical first person invites us to locate an agent behind the words, a party whose deed it is. Davidson's account of actions and reasons shows that a true action is behavior caused by the agent's own beliefs and desires—a primary reason that both justifies and produces it. If the model has no such beliefs and desires in the required sense, its outputs are not actions but events: caused, consequential, but not done-for-reasons, and not answerable on the system's own behalf. The responsibility must lie with the people and institutions who built, deployed, and relied upon it. Davidson's framework lets us keep the books straight: the events are the machine's and the agency is ours, and collapsing the two in either direction is the error.

Origin

Davidson was born in Springfield, Massachusetts, in 1917. He took his undergraduate degree in classics at Harvard, served in the Navy in the Second World War teaching pilots to identify aircraft, earned his doctorate in philosophy, and built his reputation slowly through essays rather than books, a style that made his influence diffuse and his ideas hard to pin to a single text. He taught at Stanford and, after appointments at Princeton, Rockefeller, and Chicago, at Berkeley, where he spent the last decades of his career; he died in 2003. His intellectual genealogy runs through Quine and Tarski: from Quine he inherited the behaviorist insistence that meaning is not a private inner episode but something recoverable from public evidence, and from Tarski he borrowed the apparatus of truth-conditional semantics, the idea that a theory of meaning for a language should take the form of a theory of truth, deriving for every sentence the conditions under which it is true.

His first famous contribution, the 1963 essay "Actions, Reasons, and Causes," defended the thesis that the reasons for which we act are also causes of our actions, a position that had fallen out of fashion in philosophy of action and that Davidson restored with a precision that settled the debate for a generation. His second major contribution, the 1967 essay "Truth and Meaning," proposed that a Tarskian truth theory for a natural language would serve as a theory of meaning for it; his later work on radical interpretation, notably the 1973 essay of that name, supplied the method by which such a theory could be confirmed from the evidence of speakers' behavior. His third, the 1970 essay "Mental Events," introduced anomalous monism: the thesis that mental events are physical events, but that there are no strict laws connecting mental and physical descriptions, so that the mental is irreducible even if not supernatural. These three contributions, taken together, form the most comprehensive and most demanding account of what it takes for a system to have beliefs, act for reasons, and mean something by its words.

Davidson built his method around a single discipline: asking not whether a claim is true but what would have to be the case for it to be true. Applied to the machine, this discipline produces questions the AI conversation rarely poses but always needs: not "does the model understand" but "what would have to be the case for understanding to be attributable"; not "is the model conscious" but "what conditions must hold for the concept of consciousness to apply." His temperament was that of a man who distrusted the grand gesture and trusted the patiently constructed argument, and that temperament is exactly what the present moment needs from philosophy.

Key Ideas

Radical interpretation. Davidson's thought experiment proposes a field linguist confronting an unknown language with no shared vocabulary and no informant, working only from the sounds speakers make and the circumstances in which they make them. He argues that this extreme case is the general case—all interpretation is radical interpretation, an active construction of a theory that pairs utterances with truth-conditions and beliefs with the world. This is the exact situation of anyone sitting before a language model: we have no access to its "meaning" except through inference from outputs in contexts, and the outputs underdetermine our interpretations in ways the system's fluency actively conceals.

The principle of charity. Davidson's name for the unavoidable interpretive assumption: we cannot understand any speaker we do not first assume to be largely rational and largely right about the world. Without this assumption, we cannot get interpretation started, because assigning meanings requires fixing beliefs and fixing beliefs requires assigning meanings. The principle is a transcendental condition, not an optional courtesy. With a machine, it becomes a trap: the same reflexive charity we extend to human speakers fires when confronted with any sufficiently fluent output, and fluency is not evidence that the conditions warranting charity are satisfied. The fluency-authority decorrelation is the systematic failure of charity's warrant at scale.

Truth-conditions vs. next-token prediction. Davidson's theory of meaning holds that to understand a sentence is to know the conditions under which it would be true, that is, to grasp its truth-conditional relation to the world. A language model generates text not by assessing truth-conditions but by estimating which tokens are most probable given the preceding ones and emitting one of them. The output can be identical while the underlying relation could not be more different: the model's words are tied to a corpus, not to the world. When corpus and world align, the model tells the truth, not because it aimed at truth but because probability happened to point that way. When they diverge, the model follows probability and produces a fluent falsehood, because nothing in it answers to the constraint that makes charity appropriate.
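
The contrast between tracking a corpus and tracking the world can be made concrete with a deliberately tiny sketch. It assumes a toy bigram counter standing in for a language model, a hypothetical three-sentence corpus in which a falsehood outnumbers the truth, and a hypothetical WORLD table standing in for the facts; none of these names come from Davidson or from any real system, and real models are vastly more complex.

```python
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count which token follows which in the corpus (a toy stand-in for training)."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def complete(counts, prompt, max_tokens=1):
    """Greedily append the most frequent follower of the last token."""
    tokens = prompt.split()
    for _ in range(max_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break  # nothing in the corpus ever followed this token
        tokens.append(followers.most_common(1)[0][0])
    return " ".join(tokens)


# Hypothetical corpus: the false sentence is simply more frequent than the true one.
corpus = [
    "the capital of australia is sydney",
    "the capital of australia is sydney",
    "the capital of australia is canberra",
]

# Hypothetical stand-in for the world. The model never consults it.
WORLD = {"the capital of australia is": "canberra"}

prompt = "the capital of australia is"
model = train_bigram(corpus)
output = complete(model, prompt)
answer = output.split()[-1]

print(output)                   # fluent and well-formed
print(answer == WORLD[prompt])  # whether it also happens to be true
```

Nothing in `complete` refers to WORLD: the completion is fixed entirely by corpus frequencies, so where corpus and world diverge the sketch emits the frequent falsehood with the same fluency it would give the truth.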

The Swampman. In 1987, Davidson imagined a being physically identical to himself, assembled by lightning from a dead tree while Davidson stood nearby, that nonetheless has no thoughts at all. Its words were not "learned in a context that would give them the right meaning—or any meaning at all." Content, on Davidson's account, is constituted by causal history—by the actual past relations between a creature, its words, and the world. A language model is a kind of Swampman: its parameters were fixed not by a life of contact with the world but by optimization over text. Its words may have no content in their own right; whatever aboutness they carry may be borrowed from the interpreters who read them. Davidson himself supplies the partial reply: the Swampman "simply needs time in which to acquire a causal history." A system embedded in the world, acting and being corrected, could begin to acquire what mere training cannot confer.

Anomalous monism and the irreducibility of mind. Davidson holds that every mental event is a physical event, but that there are no strict laws connecting mental and physical descriptions: the mental is irreducible even while non-supernatural. The AI implication is double-edged. The presence of a mechanical, fully physical description does not settle that there is no mind (we are also "just physics"), but the absence of the rational, holistic, world-anchored structure, which shows up at the level of interpretation rather than mechanism, does settle it in the negative. Mechanistic interpretability could not by itself establish that a system has or lacks a mind, because the mechanistic description and the mental description are irreducibly different. The question of mind is not down there among the weights, even though the mind, if there is one, is nothing over and above them.

Debates & Critiques

The central philosophical debate around Davidson and AI is whether his externalist conditions for meaning—conditions requiring the right causal history, genuine triangulation with a shared world, beliefs bound into holistic webs accountable to reality—are genuinely necessary or whether they reflect a parochial account of meaning calibrated to biological minds. The strongest challenge to Davidson comes from functionalists who argue that if a system's behavioral dispositions are sufficiently well-organized and sufficiently responsive to the world (through fine-tuning, through tool use, through reinforcement from human feedback), then the conditions for attribution of content are met regardless of the biological or historical substrate. Davidson's own behaviorism about evidence sits in tension with his Swampman: if mental ascription is governed by what an ideal interpreter could recover from dispositions, and if a fine-tuned model's dispositions are sufficiently world-tracking, then by Davidson's lights the behavioral evidence may warrant the attribution even without the specific historical origins he specified. The tension is live and unresolved. A second debate concerns whether Davidson's principle of charity is descriptive or normative: if we do extend charity to machines, does that mean machines have earned it, or does it mean we are making a systematic and dangerous mistake? Davidson's framework suggests the latter—the warrant for charity is not created by extending it—but the question is contested. Emergent capabilities research adds a further complication: as models scale, their behavioral dispositions exhibit increasing coherence and world-tracking, which may begin to satisfy Davidson's conditions even if the origins remain non-biological.

The Interpreter's Trap

Davidson's three tests for machine meaning

Test One: Radical Interpretation

Can the system be interpreted by constructing a truth theory that pairs its utterances with real-world truth-conditions? If its words track the corpus rather than the world, the interpretation, however well-formed, answers to nothing in the machine.

Test Two: Holistic Coherence

Do the system's apparent beliefs constrain one another the way a web of genuine beliefs must? A genuine believer cannot casually contradict the surrounding web; the web pushes back. A model can assert incompatible things across contexts because there is no standing web to exert tension.

Test Three: The Swampman Question

Does the system have the causal history that constitutes content? Its words were shaped by optimization over text, not by encounter with a world. Whether the history of fine-tuning begins to constitute the right kind of causal connection is the open question Davidson's framework locates precisely.
