
The cycle initiated by [YOU] on AI asks what it means to see the machine clearly. The comprehension problem is the most precise answer to what you see when you look: a system of enormous competence and no understanding, performing the outputs of comprehension without the comprehension itself. This is not a temporary state that will resolve as systems scale; it is, in Wooldridge’s analysis, a structural consequence of learning from text without grounding in the world. Whether it can be resolved, and what would resolve it, is the deepest open question in the field.
The comprehension problem also clarifies the cycle’s most important practical message: the machine amplifies whatever judgment you bring to it, and it cannot supply the judgment it amplifies. A system without comprehension cannot recognize when it has been pushed past the boundary of what it reliably knows, and it cannot distinguish the case where its fluent output is well-grounded from the case where it is confident confabulation. The human who uses the system must supply that recognition and maintain the judgment that the system cannot have about its own outputs. The comprehension problem is the technical foundation of the cycle’s insistence on human agency in the loop.
The concept is elaborated in Wooldridge’s The Road to Conscious Machines (2020), published in the United States as A Brief History of Artificial Intelligence (2021), which appeared just before the large language model revolution made the problem acute. He anchors the discussion in the Chinese Room argument that philosopher John Searle published in 1980, which holds that syntax (the manipulation of symbols according to rules) is not sufficient for semantics (the grasp of meaning). The room produces correct Chinese because it has a complete rulebook; the person inside understands nothing because the rules connect symbols to other symbols, never to what those symbols mean. Wooldridge treats the argument as a clarification of what understanding would require rather than a proof that machines cannot have it, and he uses it to make precise what is absent in current systems.
The comprehension problem connects to the symbol grounding problem that Stevan Harnad formulated in 1990: symbols acquire meaning only through grounding in something that is not itself symbolic, ultimately in sensorimotor interaction with the world. A system trained entirely on text—on symbols derived from symbols—has no grounding in this sense. Its symbols connect to an enormous network of other symbols but to no perception, no action, no world that can be interrogated independently of what other humans have said about it.
Fluency Without Comprehension. The central observation is that fluency and comprehension are separable: a system can produce language indistinguishable from that of an understanding speaker while having no model of the world the language is about. The separation was always theoretically possible; the large language models have demonstrated it empirically at scale. The Turing test is commonly read as assuming that nothing could pass it without understanding; by many accounts the test has now effectively been passed, and the question of understanding is no clearer than before. The criterion has been met, and the question it was supposed to settle has not been settled.
Ungrounded Symbols. The system’s representations are anchored to patterns in training text, not to the world the text is about. When a model correctly states that Paris is the capital of France, it has not consulted any fact about Paris; it has reproduced a pattern that was overwhelming in its training data. When it incorrectly states something that would have been easy to verify against reality, it has not failed to check; it has no mechanism for checking, no world to check against. The grounding that would anchor the symbols to reality is structurally absent.
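The point can be illustrated with a deliberately tiny sketch. It is not how any real language model works; it is a frequency counter over a hypothetical six-sentence corpus, and the names CORPUS, train, and complete are invented for this illustration. It shows only the logical shape of the claim: the completer gives a right answer and a wrong answer by exactly the same route, and nothing in it could tell the two apart.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: this text is the only thing the "model" ever sees.
CORPUS = [
    "the capital of france is paris",
    "the capital of france is paris",
    "the capital of france is paris",
    "the capital of australia is sydney",    # a common but false claim
    "the capital of australia is sydney",
    "the capital of australia is canberra",  # the true claim, seen less often
]

def train(sentences):
    """Count which final word follows each sentence prefix in the corpus."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        context, last_word = " ".join(words[:-1]), words[-1]
        counts[context][last_word] += 1
    return counts

def complete(counts, context):
    """Return the most frequent continuation; no fact about the world is consulted."""
    seen = counts.get(context)
    if not seen:
        return None  # the pattern never occurred in the corpus
    return seen.most_common(1)[0][0]

model = train(CORPUS)
print(complete(model, "the capital of france is"))     # paris
print(complete(model, "the capital of australia is"))  # sydney
```

Both answers come from the same lookup. The first happens to be true and the second false, but the difference lies entirely in the world, which the completer has no access to. Any check against reality has to come from outside it.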
Characteristic Failure Modes. The comprehension problem predicts the specific failures that characterize current systems: hallucination (confident generation of false information, because there is no mechanism for distinguishing fact from plausible-sounding pattern); adversarial fragility (small perturbations that defeat the system while being invisible to any understanding observer, because the system has learned the surface patterns and not the underlying structure); and task degradation at the edges (fluent performance on typical cases, catastrophic failure on genuinely novel ones that require grasping a situation rather than recognizing its linguistic shape).
Integration as the Missing Ingredient. Wooldridge holds that we have some components of intelligence—systems that see, systems that reason, systems that converse—but no idea how to build a system in which these are unified into a single, grounded comprehension. A child understands language because the language is grounded in a world the child inhabits, perceives, and acts in. Integration of perception, action, and linguistic competence in a common representation is what comprehension requires, and it is precisely what the architecture of current large language models does not provide.