CONCEPT

The Indeterminacy of Translation

Quine’s argument that two complete translation manuals for a language can fit all possible behavioral evidence while assigning incompatible meanings, and that the words of a language model may refer to nothing more determinate than a human’s.

A linguist stands beside a native speaker in an unfamiliar jungle; a rabbit darts past; the speaker says “Gavagai”; and from this small scene Willard Van Orman Quine launched one of the most radical attacks on meaning in twentieth-century philosophy. The behavioral evidence, that is, the pattern of assent and dissent across every possible situation, cannot decide whether “gavagai” means rabbit, undetached rabbit part, temporal stage of a rabbit, or the fusion of all rabbits. Each reading is compatible with every possible observation, because the speaker assents in exactly the same circumstances under all of them. There is no further fact that fixes the reference. This is the inscrutability of reference, and it is the standard illustration of the larger thesis of the indeterminacy of translation: one could construct two complete translation manuals for a language, each perfectly consistent with all behavioral evidence, that nonetheless assign incompatible meanings. Quine’s conclusion was not that we cannot discover the correct meaning, as though truth were hidden, but that there is no correct one to discover. Applied to large language models, the thesis lands with redoubled force: the model’s entire training signal is behavioral, it never stood beside the rabbit in the jungle, and so the question of what its word “rabbit” refers to has no more determinate answer than the linguist’s, and possibly less, because the rabbit was never even present. The model’s semantic predicament is the human predicament with the dial turned, and that is not a comfort: it means the favorite question of the AI debate (“but does it really mean anything?”) may be, in Quine’s sense, a question with no fact of the matter behind it, for machines or for us.

In the [YOU] on AI Field Guide

The cycle holds up the AI system as a mirror for human self-understanding, and no mirror is more unsettling than Quine’s. The machine’s apparent failure to grasp determinate meanings is, on his analysis, our own condition revealed—the indeterminacy we had been managing through shared practice and a shared world, now exposed by a system that has the practice (the distributional competence) without the world (the perceptual rootedness). The orange pill, in this frame, is not the comfortable discovery that the machine is fundamentally different from us but the less comfortable discovery that it is not as different as we wished, and that the features we thought were ours alone may have been less firmly ours than we supposed.

The practical upshot for the cycle is a recalibration of the question. Rather than asking whether the model has meaning (which Quine’s framework suggests has no determinate answer), the cycle asks about reliability and grounding: in what domains does the model’s output track the world reliably enough to be trusted, and in what domains does the lack of perceptual grounding produce the familiar failures—confabulation, stale claims, fluent error about how things actually are? These are empirical questions that the indeterminacy thesis frames rather than forecloses.

Origin

The indeterminacy of translation was developed in Quine’s 1960 masterwork Word and Object and followed from the combination of two earlier commitments: his holism (beliefs face experience only as a corporate body, not individually) and his behaviorism about meaning (there are no meanings in the mind, only dispositions to verbal behavior). If holism means no sentence has its own private confrontation with experience, and behaviorism means the totality of possible behavior is all the evidence there is for meaning, then meaning is underdetermined by all possible evidence—which is the indeterminacy thesis.

Quine carefully distinguished inscrutability of reference (the reference of an individual word cannot be fixed by behavioral evidence) from indeterminacy of translation (the meaning of a whole sentence cannot be fixed). The first is the easier claim to establish; the second, which concerns sentences taken whole, is the one Quine regarded as the more serious and controversial thesis, and it does not simply follow from the first. Both have been contested. Donald Davidson argued that charity in interpretation, the principle that we attribute to speakers the beliefs that make their assertions most rational, goes some distance toward fixing meaning. Quine replied that charity is itself one more convention imposed by the interpreter, not a discovery about the speaker’s mind, and so does not escape the indeterminacy.

Key Ideas

Inscrutability of reference. A word’s reference cannot be fixed by behavioral evidence, however complete, because different reference schemes—rabbits vs. rabbit-stages, say—are compatible with all the same assents and dissents. There is no fact of the matter about which scheme is correct. Applied to LLM embeddings: the embedding for “rabbit” encodes distributional role, not reference. Whether it refers to enduring animals or their temporal stages is not settled by the behavioral evidence from which the embedding was built.
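
The claim that different reference schemes yield the same assents and dissents can be made concrete with a toy model. The sketch below is in the spirit of Quine’s later proxy-function argument rather than the gavagai case itself: it takes a small domain, remaps every object through a one-to-one function, and reinterprets names and predicates accordingly. The world, the names `a`, `b`, `c`, the predicate `Tall`, and the helper functions are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy world: two rabbits and one tree.
DOMAIN = ["rabbit_1", "rabbit_2", "tree_1"]

# Scheme A: the "intended" interpretation of names and predicates.
NAMES_A = {"a": "rabbit_1", "b": "rabbit_2", "c": "tree_1"}
PREDS_A = {"Gavagai": {"rabbit_1", "rabbit_2"}, "Tall": {"tree_1"}}

# A proxy function: any one-to-one mapping of the domain onto itself.
PROXY = {"rabbit_1": "tree_1", "rabbit_2": "rabbit_1", "tree_1": "rabbit_2"}


def reinterpret(names, preds, proxy):
    """Build a rival scheme by pushing every referent through the proxy."""
    new_names = {name: proxy[obj] for name, obj in names.items()}
    new_preds = {pred: {proxy[obj] for obj in ext} for pred, ext in preds.items()}
    return new_names, new_preds


def assents(pred, name, names, preds):
    """The speaker assents to 'pred(name)' iff the referent is in the extension."""
    return names[name] in preds[pred]


def assent_pattern(names, preds):
    """Assent or dissent for every atomic sentence of the toy language."""
    return {
        (pred, name): assents(pred, name, names, preds)
        for pred in sorted(preds)
        for name in sorted(names)
    }


# Scheme B: same sentences, systematically different referents.
NAMES_B, PREDS_B = reinterpret(NAMES_A, PREDS_A, PROXY)

pattern_a = assent_pattern(NAMES_A, PREDS_A)
pattern_b = assent_pattern(NAMES_B, PREDS_B)
print(pattern_a == pattern_b)
```

Under scheme B the name `a` picks out the tree rather than a rabbit, yet the two assent patterns coincide, because a one-to-one remapping preserves membership in every extension. The toy covers only atomic sentences over a finite domain; it is an illustration of the structure of the argument, not a demonstration of the full thesis.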

The double underdetermination of model words. Quine’s linguist at least has the jungle and the rabbit; the model has only text written by others who had the jungle and the rabbit. The reference of human words is underdetermined by all possible behavioral evidence; the reference of model words is underdetermined by that plus the absence of perceptual grounding, which is where the linguist’s behavioral evidence comes from in the first place. The model is in the gavagai situation, with no rabbit at the bottom.

The deflationary lesson. Quine did not conclude that human words have determinate meaning while machine words do not. He concluded that the very notion of a determinate, fully-fixed meaning—the inner glow of significance behind the words that both enthusiasts and skeptics assume must be present or absent in the machine—is a philosopher’s fiction in both cases. The indeterminacy is the human condition, which the machine inherits and, by its starkness, exposes. Meaning talk earns its keep not through a correspondence to inner determinacy but through the reliable coordination of behavior it enables in a shared world.

Debates & Critiques

The indeterminacy thesis has been contested on both technical and philosophical grounds. On the technical side, some philosophers of language argue that causal theories of reference, according to which a word refers to whatever causally grounded its initial use in the community, break the Quinean deadlock, since the rabbit, not the rabbit-stage, was what the first users of the word “rabbit” causally encountered. The Quinean response is that the causal grounding itself is available only through behavior and interpretation, and so inherits the indeterminacy. For large language models, multimodal systems provide a partial answer: a model trained on images as well as text has a kind of causal contact with the world that narrows the indeterminacy, though Quine would note that perceptual stimulation, even genuine stimulation, does not by itself fix reference. The point of the thought experiment was precisely that even the perceptual evidence (the passing rabbit) does not settle gavagai. On the philosophical side, the deepest debate concerns whether the indeterminacy is a deflationary insight about what meaning always was or a reductio of the behaviorist framework that generates it. Defenders of a richer mentalist semantics argue that there is a fact about what speakers mean, accessible not through behavior but through phenomenology and the first-person perspective, which is exactly the territory Quine’s behaviorism refused to enter.
