Gesture research has progressively overturned the view that hand movements merely accompany speech. Tversky's work, alongside Susan Goldin-Meadow's, demonstrates that gestures encode spatial relationships that the speaker may not be able to articulate verbally, that blocking gesture impairs spatial reasoning, and that speakers gesture even when no one is watching. Gesture is part of how thinking happens, not a byproduct of thinking already completed. The implication for AI collaboration is stark: text-only interfaces systematically discard the gestural channel through which much spatial cognition occurs, impoverishing the communication between builder and machine.
When a builder describes a user flow to a colleague in person, her hands trace the temporal sequence: the user approaches here, the face is detected there, the response appears above. These spatial encodings carry information that the accompanying words often leave implicit or ambiguous. The colleague absorbs the spatial structure through the gestural channel even when the verbal channel is incomplete. When the same builder types the description into a prompt, the gestural channel is lost entirely.
This is a specific instance of Tversky's broader claim about embodied cognition: thinking happens in and through the body. The hand that gestures is not illustrating a completed thought but completing the thought itself. Research on speakers who lose the ability to gesture — through restraint or neurological damage — shows corresponding impairments in spatial reasoning tasks, suggesting that gesture is causally involved in the cognitive work, not merely correlated with it.
The AI implication extends beyond simple input methods. If gesture is part of spatial thinking, then interfaces that incorporate gestural input could restore a channel that current text-based interaction has eliminated. Emerging multimodal systems — accepting sketches, manipulations, and gestural input — point toward the recovery of this lost dimension. But until such interfaces mature, builders collaborating with AI work with one cognitive hand tied behind their backs.
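To make the missing channel concrete, here is a minimal, purely illustrative sketch in Python of what a multimodal prompt might carry alongside text. The GestureStroke and MultimodalPrompt types are hypothetical stand-ins, not part of any existing system or API; they simply show how a gesture trace and a rough sketch could travel with the words instead of being discarded.

```python
from dataclasses import dataclass, field


@dataclass
class GestureStroke:
    """One hand movement, captured as timestamped screen coordinates (hypothetical)."""
    points: list[tuple[float, float, float]]  # (x, y, seconds since start)
    label: str = ""  # optional spoken annotation, e.g. "user approaches here"


@dataclass
class MultimodalPrompt:
    """A prompt that keeps the verbal and gestural channels together (hypothetical)."""
    text: str                                   # what the builder says
    sketch_png: bytes | None = None             # a rough drawing of the layout, if any
    gestures: list[GestureStroke] = field(default_factory=list)

    def describe(self) -> str:
        """Flatten the spatial channel into text so even a text-only model sees it."""
        parts = [self.text]
        for g in self.gestures:
            start, end = g.points[0], g.points[-1]
            parts.append(
                f"[gesture{': ' + g.label if g.label else ''} "
                f"from ({start[0]:.0f}, {start[1]:.0f}) "
                f"to ({end[0]:.0f}, {end[1]:.0f})]"
            )
        return "\n".join(parts)


# Illustrative use: the user-flow description from above, with its gestures kept.
prompt = MultimodalPrompt(
    text="The user approaches, the face is detected, then the response appears.",
    gestures=[
        GestureStroke([(100, 400, 0.0), (300, 400, 0.8)], "user approaches here"),
        GestureStroke([(300, 400, 1.0), (300, 150, 1.6)], "response appears above"),
    ],
)
print(prompt.describe())
```

Even a thin representation like this, under the stated assumptions, gives the model explicit spatial coordinates to reason over rather than forcing it to infer layout from prose alone.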
There is also a subtle cost to the collaboration's output quality. When the builder's gestural reasoning is unavailable to the model, the model must infer spatial structure from impoverished verbal cues. Hallucinations and misunderstandings may occur not because the model is failing at language but because the builder's thought was never fully in language to begin with.
Tversky's gesture research developed in dialogue with Susan Goldin-Meadow's work at the University of Chicago from the 1990s onward. The collaboration produced a body of experimental evidence that has reshaped how cognitive scientists understand the relationship between language, gesture, and spatial reasoning.
Gesture as constitutive. Hand movements are not illustrations of thought but components of thinking, especially for spatial problems.
Blocked gesture, blocked thought. Experimental evidence shows that preventing gesture impairs performance on spatial reasoning tasks.
The gestural channel loss. Text-based AI interfaces eliminate the gestural dimension of human communication, reducing the information the builder can transmit.
Multimodal futures. Interfaces that accept gesture, sketch, and spatial manipulation alongside language could restore cognitive channels current AI tools systematically discard.