Sensorimotor contingency theory, developed by J. Kevin O'Regan and Alva Noë in a 2001 paper, holds that perception consists in the exercise of implicit practical knowledge about how sensory inputs change in response to bodily movement. To see is to implicitly know how visual appearance will vary with eye movement, head turn, and locomotion. To hear is to implicitly know how auditory experience will shift with movement toward or away from the source. This knowledge is not propositional; it is a bodily skill, a capacity for coordinated perception and action. The theory provides the operational core of Noë's enactive approach to perception.
There is a parallel reading that begins not from the phenomenology of skilled action but from the material conditions that make any sensorimotor loop possible. O'Regan and Noë describe perception as mastery of lawful relationships between movement and sensory change—but those relationships themselves are products of massive evolutionary investment in specialized transduction hardware, predictive processing architectures, and energy-efficient compression schemes that took hundreds of millions of years to refine. The theory treats these contingencies as available for mastery, but the capacity to detect and exploit them is itself the hard problem.
What matters for AI is not whether current systems lack bodies but whether the route to intelligence requires traversing the same evolutionary bottleneck. Biological perception solved an adaptive problem under brutal energetic constraints: how to navigate a three-dimensional world with minimal metabolic cost using limited bandwidth sensors. The sensorimotor patterns O'Regan and Noë describe are not just lawful relationships waiting to be discovered—they are the hard-won solutions to design problems that have no general answer. A language model trained on sensory descriptions acquires something categorically different from bodily skill, yes—but it may also be acquiring something that bypasses the need for that skill entirely. The question is not whether AI has the same route to perception but whether perception itself, as O'Regan and Noë define it, is one solution among many to the problem of environmental coupling. If intelligence can be substrate-independent, sensorimotor contingencies may mark not the essence of cognition but the specific implementation details of carbon-based adaptive systems.
The sensorimotor contingency theory was first articulated in J. Kevin O'Regan and Alva Noë's 2001 paper 'A Sensorimotor Account of Vision and Visual Consciousness' in Behavioral and Brain Sciences, which became one of the most cited and debated papers in consciousness studies. The paper argued that visual experience is not produced by constructing internal representations but by engaging in an active exploration of the environment governed by knowledge of sensorimotor contingencies — the lawful patterns by which visual inputs change with movement.
The theory explains several features of perception that the representational model struggles with. Change blindness — the failure to notice dramatic visual changes during saccades or interruptions — makes little sense if the visual system constructs detailed internal representations. It makes perfect sense if the visual system engages in active exploration governed by task-relevant contingencies. Sensory substitution, in which blind users learn to 'see' through tactile or auditory inputs, makes little sense if the sensory input determines the character of experience. It makes perfect sense if experience is determined by the perceiver's skilled use of the available sensorimotor contingencies.
For AI, the theory has direct implications. A system that lacks a body cannot possess sensorimotor contingencies — cannot know what it is like to see, because seeing is not the processing of visual data but the active, embodied exploration of a visual environment. A large language model trained on descriptions of vision has extensive propositional information about seeing without any of the practical knowledge that constitutes seeing itself. The distinction is not technical but categorical: the model has knowledge-that about visual perception without any knowledge-how.
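To make the knowledge-that/knowledge-how contrast concrete, here is a toy sketch (not from O'Regan and Noë; all names and the one-dimensional world are invented for illustration) in which "mastery of a sensorimotor contingency" is modeled as a learned mapping from movements to sensory changes, which the agent then exercises to act, rather than as a stored description of the world.

```python
# Toy model: "knowledge-how" as a learned action -> sensory-change mapping.
# The agent never builds a picture of the world; it only learns how its
# sensation (brightness) changes when it moves, then exploits that.

def brightness(pos, source=5.0):
    """Sensory input: brightness falls off with distance to a light source."""
    return 1.0 / (1.0 + abs(pos - source))

def learn_contingencies(positions, actions=(-1, 1)):
    """Learn, by moving, how each action changes the sensory signal."""
    model = {}  # (position, action) -> observed change in brightness
    for pos in positions:
        for act in actions:
            model[(pos, act)] = brightness(pos + act) - brightness(pos)
    return model

def move_toward_light(pos, model):
    """Exercise the learned contingencies: pick the movement that
    increases brightness, consulting no internal map of the source."""
    return max((-1, 1), key=lambda act: model[(pos, act)])

model = learn_contingencies(range(0, 11))
pos = 0
for _ in range(5):
    pos += move_toward_light(pos, model)
print(pos)  # → 5: the agent climbs the brightness gradient to the source
```

A system given only verbal descriptions of brightness would have knowledge-that about the gradient; this agent has (a cartoon of) knowledge-how: a practical grip on how sensation varies with movement, acquired by moving.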
The theory has been extended by Noë and others to perception in all modalities, to bodily awareness, and to cognition more generally. The central claim — that perceptual experience is constituted by the perceiver's practical mastery of sensorimotor patterns — is the operational core of the enactive approach and the specific technical mechanism by which embodiment is claimed to be constitutive of cognition.
J. Kevin O'Regan and Alva Noë, 'A Sensorimotor Account of Vision and Visual Consciousness', Behavioral and Brain Sciences 24 (2001), 939–1031. Extended in Noë's Action in Perception (2004) and related papers.
Perception as skilled exploration. To perceive is not to receive input but to exercise practical knowledge of how input changes with movement.
Implicit knowledge. Sensorimotor contingencies are known bodily, not propositionally.
Modality-specific patterns. Vision, touch, and audition each have distinctive contingency patterns that define what it is to perceive in that modality.
Change blindness explained. The failure to notice visual changes supports the view that we do not construct detailed internal representations.
AI's missing dimension. A disembodied system has no sensorimotor contingencies to master and therefore no perception in the full sense.
Critics have argued that the theory underspecifies which sensorimotor patterns are relevant to which experiences, and that the view cannot explain how brain-bound phenomena like dreams or hallucinations could have perceptual character. Defenders respond that these are cases of exercising sensorimotor knowledge offline, not counterexamples to the basic framework.
The core insight of sensorimotor contingency theory—that perception is constituted by practical mastery of lawful movement-sensation relationships—is correct as phenomenology and correct as an account of what biological perception is. O'Regan and Noë successfully explain change blindness, sensory substitution, and the modal character of different perceptual systems. The contrarian reading is correct that this mastery depends on specialized hardware forged by evolutionary constraint, but that strengthens rather than weakens the phenomenological claim: human perception is what it is because of how it is implemented.
Where the weighting shifts is on the question of necessity. For understanding human perception the theory is decisive (100% Noë), likewise for explaining why seeing feels different from hearing (100% Noë), and largely so for diagnosing what current AI systems lack (80% Noë). But on whether sensorimotor contingencies are constitutive of all possible cognition, the contrarian reading gains force (65%). The theory correctly identifies what makes biological perception perceptual, but it may be identifying a solution to a problem that admits of other solutions. A language model lacks practical knowledge of visual contingencies, yes; but if it can navigate linguistic space with comparable fluency, it may be exercising a different kind of contingency mastery entirely.
The productive frame is to treat sensorimotor contingency theory as specifying the mechanism of embodied cognition without claiming that embodied cognition exhausts the space of possible intelligence. O'Regan and Noë describe how perception works in systems like us. The open question is whether perception, so defined, is necessary for the functional achievements we care about—or whether it names one implementation strategy among several for coupling systems to structured environments.