The cycle’s central mirror is the AI system held up to human self-understanding. Quine’s contribution to that mirror is the most deflationary available: he spent his career arguing that the features we most want to attribute uniquely to human minds—determinate meanings, fixed references, foundations for knowledge—were never there in the first place, in us or in the philosophical tradition that tried to secure them. The machine’s semantic predicament is our own, magnified until we can finally see it. When we ask whether the model “really” understands its words, Quine’s framework answers: that question presupposes a determinacy of meaning that nothing, human or machine, ever possessed, and the honest response is not to answer it but to dissolve it by seeing clearly what meaning was always limited to.
This is not consolation; it is a sharper challenge. If the model’s semantic indeterminacy is our own, we cannot use that indeterminacy as the criterion that distinguishes it from us. We must look elsewhere, and Quine’s framework points exactly where: to the sensory surface, the perceptual rootedness in a world that provides the irritations from which all knowing begins. The model has no surface in this sense. It has only text—the frozen residue of others’ perceptual encounters with the world. The gavagai problem afflicts it with redoubled force, because not even the jungle and the rabbit are present; only the sentence about the jungle and the rabbit. This is the precise sense in which its knowledge is thinner than ours, and Quine’s empiricist framework is what locates the thinning exactly.
His naturalized epistemology also provides the most honest framing for AI evaluation. The right questions about a learning system are not foundationalist—what are the certain premises from which its outputs are derived?—but naturalistic: by what process did this system come to represent the world, and how reliable is that process? These are empirical questions that Quine would have recognized as continuous with his own project, and they are the questions that honest AI benchmarking is actually trying to answer, even when it doesn’t know Quine’s name.
Quine was born in Akron, Ohio, in 1908 and died in Boston in 2000, spending virtually his entire adult life at Harvard, where he became one of the most influential philosophers of the twentieth century and the philosophical adversary of Rudolf Carnap, the leading logical positivist of the age. His 1951 essay “Two Dogmas of Empiricism” is one of the most important papers in the history of analytic philosophy. It had two targets. The first was the analytic-synthetic distinction: the idea that some truths hold by virtue of meaning alone and are immune to empirical revision. The second was the dogma of reductionism: the idea that each meaningful sentence can be individually confirmed or refuted by its own slice of experience. Both dogmas, Quine argued, were unjustified and mutually supporting. Their joint abandonment led to his holism: if nothing is true by meaning alone, and sentences do not face the tribunal of experience individually, then the whole of knowledge forms one large interconnected web that meets experience only at its edges, and no part of it, however central, is in principle immune to revision.
His 1960 masterwork Word and Object pressed this into the theory of meaning, generating the indeterminacy of translation and the inscrutability of reference through the now-classic “gavagai” thought experiment. His 1969 essay “Epistemology Naturalized” proposed the transformation of epistemology into a branch of psychology: stop looking for the foundation of knowledge and start studying, empirically, how natural creatures actually get from sensory stimulation to a theory of the world. This naturalistic turn is the move that makes him the unacknowledged founding philosopher of machine learning, which is exactly the natural-process account of knowledge formation he called for.
Confirmation holism. The totality of our knowledge forms a web that faces experience only at the edges, as a corporate body. No single belief is individually confirmed or refuted; what we revise when experience surprises us is a matter of strategy, not logic. This maps closely onto the distributed, holistic representation of a trained neural network: weights are shared across many outputs, so a single weight can bear on predictions far from the one being adjusted, surgical editing ripples through the system, and catastrophic forgetting is the disaster Quine’s picture predicts when revision is allowed to tear through the deep interior. The geography of revisability Quine drew by hand is now an empirical property researchers can measure.
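The rippling effect of a local edit can be seen in a toy sketch. It assumes a two-weight linear model rather than a real neural network, and every name and number in it (inputs, weights, target) is hypothetical and chosen only for illustration.

```python
# Toy sketch (hypothetical values): one shared weight vector serves several
# inputs, so an edit aimed at one input also shifts the outputs for the others.

inputs = {"a": (1.0, 2.0), "b": (2.0, 1.0), "c": (3.0, 3.0)}
weights = [1.0, 1.0]


def output(w, x):
    """Linear model: weighted sum of the two input features."""
    return w[0] * x[0] + w[1] * x[1]


before = {name: output(weights, x) for name, x in inputs.items()}

# "Surgical" edit: change only the first weight so that input "a"
# produces the target value instead of its current output.
target = 4.0
edited = list(weights)
edited[0] += (target - before["a"]) / inputs["a"][0]

after = {name: output(edited, x) for name, x in inputs.items()}

# The edit was aimed at "a" alone, yet every other input moved as well.
side_effects = {name: after[name] - before[name] for name in inputs if name != "a"}

if __name__ == "__main__":
    print("before:", before)
    print("after:", after)
    print("side effects:", side_effects)
```

The edit succeeds on its intended input, and the outputs for the other inputs change too, because the same weight participates in all of them. This is the small-scale analogue of a revision at one point in the web disturbing distant regions.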
Gavagai and the inscrutability of reference. The linguist in the jungle who correlates “gavagai” with passing rabbits cannot, from behavioral evidence however complete, determine whether the word means rabbit, undetached rabbit part, or temporal stage of a rabbit. Each reading is compatible with every possible observation of assent and dissent. This is the inscrutability of reference, the term-level companion of Quine’s indeterminacy of translation, and it applies to the language model with redoubled force: its entire training signal is recorded verbal behavior, it never touches a rabbit, and so the question of what its word “rabbit” refers to has no more determinate answer than the linguist’s. It possibly has less, because the jungle and the rabbit are not even present.
Underdetermination of theory. Many incompatible theories can fit all possible evidence equally well. In machine learning this is not a theoretical concern but a literal and reproducible fact: many weight configurations achieve the same training loss and test accuracy while differing internally. The choice among them is determined not by the data alone but by inductive biases—architecture, regularization, initialization. Quine’s account of why we prefer some theories to empirically equivalent rivals (simplicity, conservatism) is, almost word for word, a theory of inductive bias.
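The point about empirically equivalent weight configurations can be made concrete with a minimal sketch. It assumes a two-weight linear model and a hand-made dataset in which both features always coincide. The data, the three candidate "theories", and the choice of smallest squared norm as the stand-in for simplicity are all illustrative assumptions, not a claim about any particular training setup.

```python
# Toy sketch (hypothetical data): three weight vectors for y = w1*x1 + w2*x2.
# In the training data x1 always equals x2, so the data constrain only w1 + w2.

train = [((1, 1), 2), ((2, 2), 4), ((3, 3), 6)]  # ((x1, x2), y)


def predict(w, x):
    """Linear model: weighted sum of the two input features."""
    return w[0] * x[0] + w[1] * x[1]


def training_loss(w, data):
    """Sum of squared errors over the training set."""
    return sum((predict(w, x) - y) ** 2 for x, y in data)


def squared_norm(w):
    """Stand-in for 'simplicity': smaller weights count as simpler."""
    return w[0] ** 2 + w[1] ** 2


# Three rival "theories" that all satisfy w1 + w2 = 2.
theories = {"A": (2, 0), "B": (0, 2), "C": (1, 1)}

# All three fit the evidence perfectly.
losses = {name: training_loss(w, train) for name, w in theories.items()}

# They come apart only on an input unlike anything in the training data.
probe = (1, 0)
probe_predictions = {name: predict(w, probe) for name, w in theories.items()}

# An inductive bias, not the data, breaks the tie.
preferred = min(theories, key=lambda name: squared_norm(theories[name]))

if __name__ == "__main__":
    print("training losses:", losses)
    print("predictions on probe:", probe_predictions)
    print("preferred by minimum norm:", preferred)
```

All three candidates reach zero training loss, yet they disagree on the probe input, and the tie is broken only by the added preference for small weights. A different bias would license a different choice from the same evidence.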
Naturalized epistemology. Stop looking for indubitable foundations for knowledge and start studying, empirically, how natural systems get from stimulation to theory. This is the methodological premise of machine learning, which trains a physical system on stimulation and measures the reliability of the resulting theory. Every evaluation study is a piece of naturalized epistemology in Quine’s sense. His framework also forecloses the dream of grounding AI knowledge in a certified, foundational base: there is no such base, for machines or for us, and a model’s knowledge, like ours, is a web without foundations, justified only by the reliability of the process that produced it.