The canonical image schemas include CONTAINMENT (things are inside or outside categories, arguments fall within or outside a theory's scope, persons are in love or in trouble), BALANCE (emotional stability, fairness, mathematical equality — the scales of justice, a balanced argument), PATH (life as journey, career progression, narrative structure — stories go somewhere, projects move forward), and FORCE (causation, argumentation, motivation — strong and weak arguments, being driven by a goal). Each schema is grounded in a universal bodily experience: every human body has been contained, has maintained equilibrium against gravity, has moved along paths, has exerted and encountered force. The schemas are universal because the bodily experiences that produce them are universal.
What makes image schemas consequential for the AI debate is their specific embodied grounding. Large language models do not have bodies. They do not maintain equilibrium against gravity. They do not walk paths. They do not grasp objects or encounter resistance. The image schemas that structure human thought — the foundational patterns from which all abstract reasoning is built — are absent from their cognitive architecture entirely. And yet every sentence in their training data is saturated with these schemas. When an AI system produces the sentence "We need to get this project back on track," it produces an utterance structured by the PATH schema (the project moves along a path) and the FORCE schema (something has pushed it off, and effort must return it). The system produces the sentence because the statistical regularities of its training data capture the pattern. But the system has never walked a path. The image schema that gives the sentence its cognitive content for a human speaker is absent from the system's processing. The words are present. The experiential structure that gives the words meaning for an embodied mind is not.
This asymmetry defines the terms of human-AI collaboration. The human contribution to any collaboration with AI is not merely directional — telling the machine what to do — but evaluative: assessing whether the machine's output is grounded in genuine understanding or merely in statistical plausibility. The evaluation requires the image schemas the machine does not possess. When a human reads a passage and feels that something is off — before she can articulate what is wrong — she is performing an embodied assessment. The BALANCE schema registers an asymmetry. The FORCE schema detects insufficient resistance. The CONTAINMENT schema notes that the argument does not hold. These are not conscious deliberations. They are pre-reflective, somatic, enacted by the same neural circuits that compute physical balance and physical force and physical containment. They are available only to an embodied evaluator.
The concept of image schemas was developed by Mark Johnson in The Body in the Mind (1987) and elaborated by Lakoff in subsequent work. The framework drew on Jean Piaget's developmental psychology, phenomenological philosophy (particularly Maurice Merleau-Ponty), and emerging work in cognitive linguistics on the bodily foundations of grammar and meaning.
In Philosophy in the Flesh (1999) and The Neural Mind (2025), Lakoff and his collaborators argued that image schemas are not merely psychological constructs but are implemented in specific neural circuits: circuits that originally evolved for sensorimotor control and were repurposed for abstract thought through a process they call neural metaphorical mapping.
Pre-conceptual structure. Image schemas are not metaphors but the patterns from which metaphors are constructed, acquired through bodily experience before language develops.
Embodied grounding. The schemas are rooted in universal bodily experiences — containment, balance, motion, force — available to any creature with a body like ours.
Neural implementation. The schemas are implemented in sensorimotor circuits that are recruited for abstract cognition, making the body's neural architecture the mind's conceptual architecture.
Absence in disembodied systems. Image schemas are absent from large language models, which process the linguistic surface of schema-saturated text without access to the bodily grounding that gives the schemas cognitive content.
Evaluative function. Image schemas enable the embodied ground check — the pre-reflective somatic evaluation through which humans detect that something is off before they can articulate what.
The strong neural-implementation claim remains contested. While neuroimaging evidence supports some degree of sensorimotor involvement in abstract reasoning, both the specificity of that implementation and its necessity for concept formation are actively debated. Critics who favor disembodied accounts of cognition argue that abstract reasoning can proceed through amodal representations that do not require specific bodily grounding.