PERSON

Michael Tomasello

The comparative psychologist whose four decades of experiments, many of them conducted at the Max Planck Institute for Evolutionary Anthropology, established that what separates humans from every other primate is not raw intelligence but the uniquely human capacity to share goals, attention, and knowledge with others: the engine that built language, culture, and everything the word civilization contains.

Michael Tomasello is the scientist of we. His meticulous comparative work revealed that a nine-month-old human infant does something no chimpanzee reliably does: points declaratively—not to demand an object but to share the sight of it with another mind, checking that the gaze is mutual and resting only when both parties know they are attending to the same thing together. That small act of joint attention is, for Tomasello, the seed of everything distinctively human. From it grows shared intentionality—the capacity to share goals, representations, and mutual awareness in ways that make genuine collaboration possible—and from shared intentionality grows collective intentionality, the ability to participate in the norms, roles, and institutions that allow millions of people to coordinate behavior and build civilizations. The cultural ratchet—the mechanism by which each generation inherits, improves, and passes knowledge forward without any individual reconstructing everything from scratch—depends at every turn on this shared cognitive architecture, because faithful transmission requires understanding not just what predecessors did but why they did it. In his landmark 2025 paper in Trends in Cognitive Sciences, Tomasello turned this framework directly toward large language models, delivering a precise and unsettling diagnosis: however astonishing their linguistic output, they are “stimulus-driven” rather than goal-directed—more similar in their fundamental architecture to a thermostat than to the biological agents for whom shared intentionality evolved.

In the [YOU] on AI Field Guide

The cycle that began with [YOU] on AI asks what it means to think with a machine—and whether the collaboration that builders feel so viscerally when working with Claude is genuine or an elaborate simulation of something that requires two biological minds to actually occur. Tomasello’s framework is the sharpest instrument available for answering that question. He has spent forty years building the scientific vocabulary that allows the question to be posed without dissolving into philosophy: joint attention, shared goals, the motivational substrate of cooperative engagement, the difference between producing cooperative outputs and being genuinely oriented toward a partner’s understanding.

His diagnostic matters precisely because the experience of human-AI collaboration feels like shared intentionality from the human side. The human brings joint attention, shared goals, and the motivational infrastructure of cooperative cognition. The machine brings something different and genuinely valuable—breadth, structural insight, tireless availability—but something that may lack the “we-mode” that Tomasello identifies as the motivational foundation of genuine collaboration. The gap between what the interaction feels like and what it is architecturally is not a reason to abandon the collaboration. It is a reason to practice what the cycle calls collaborative vigilance: sustaining the cooperative stance that makes the collaboration productive while maintaining the critical awareness that the asymmetry demands.

The cultural ratchet stakes are the highest dimension of Tomasello’s relevance to the cycle. If AI mediates the transmission of knowledge in ways that preserve the output but bypass the cognitive reconstruction through which understanding is built, then the ratchet mechanism itself is at risk—not in any single interaction but across a generation. The student who receives an AI-generated essay has the product; she may not have undergone the learning that would have allowed her to improve it. The developer who ships AI-generated code has the artifact; she may not have built the debugging intuition that would allow her to extend it when conditions change. The ratchet can appear to turn while losing the grip that makes turning productive. Tomasello’s framework names this risk with the precision of a man who has spent decades studying what faithful cultural transmission actually requires.

The cycle places him alongside Byung-Chul Han—who diagnoses the smoothness that strips cognitive resistance from the AI-augmented workplace—and the developmental psychologists who document the metabolic cost of sustained joint attention. Where Han operates as cultural critic, Tomasello operates as empirical scientist, which gives his conclusions a different kind of authority: not the authority of diagnosis but of evidence, the kind that has been tested in hundreds of experiments against the resistance of chimpanzees who did not cooperate and infants who did.

Origin

Born in Bartow, Florida in 1950 and trained in developmental psychology, Tomasello spent his career asking a question that seemed almost impolitely simple: what makes humans different from other apes? The question had been answered countless times—language, tool use, abstract thought, culture—but Tomasello was dissatisfied with every answer because each pointed to a product rather than a mechanism. Language exists. Culture exists. The question was what cognitive capacity made them possible.

His answer emerged from comparative experiments at the Max Planck Institute for Evolutionary Anthropology in Leipzig, where he and his collaborators placed human children and great apes in identical experimental conditions and measured precisely where their performances diverged. The results were consistently striking in their specificity: in tasks involving physical cognition—spatial reasoning, causality, quantity—two-year-old children performed at essentially the same level as adult apes. In tasks requiring the understanding of other minds—following intentional actions rather than accidental ones, learning from demonstrated goals rather than surface behaviors, coordinating toward a shared objective—the children were already far ahead. The divergence was not in raw cognitive power. It was in social cognition, and within social cognition it was not in the capacity to track others’ mental states in general but in the specific motivation to share mental states with others.

The 2025 paper on artificial agents distilled decades of this research into a diagnosis of large language models that the AI discourse was not prepared for. Tomasello was not making a philosophical argument about consciousness. He was applying the same empirical framework he had used to compare humans and chimpanzees—goal-directedness, autonomous perception, the capacity to use feedback to assess whether one’s actions are serving one’s goals—and finding that LLMs, despite their extraordinary outputs, do not satisfy the criteria. The thermostat comparison was not intended to diminish AI. It was intended to be precise.

Key Ideas

Shared Intentionality. The foundational capacity: the ability to share goals, attention, and knowledge with others in a way that enables genuinely collaborative cognitive achievements impossible for any individual mind alone. Tomasello distinguishes this from mere coordination or cooperation—it requires mutual awareness that is recursive (I know that you know that I know), shared goals that both parties actively maintain, and the motivation not just to accomplish something but to accomplish it together. Shared intentionality is the seed from which language, culture, and institutional reality all grew.

The Cultural Ratchet. Human culture accumulates across generations because each generation does not merely inherit the products of the previous generation—it inherits the understanding behind those products, which is the only basis for genuine improvement. The cultural ratchet requires faithful transmission, which requires not copying behavior but understanding the goals and reasons behind behavior. Every turn of the ratchet depends on shared intentionality at three levels: the transmission, the innovation, and the collective evaluation that determines which innovations are worth keeping.

Goal-Directed vs. Stimulus-Driven. Tomasello’s 2025 paper introduces a distinction that cuts directly into the AI debate. Biological organisms are goal-directed: they maintain internal goal states, perceive the world to assess progress toward those goals, and act autonomously to close the gap between current state and goal state. Large language models are stimulus-driven: their outputs are functions of their inputs, optimized during training for outputs that satisfied human evaluators, but without the internally maintained goals that constitute genuine agency. This distinction does not require resolving questions about consciousness. It is an architectural claim, and it is potentially bridgeable by engineering—but only if the engineering explicitly addresses the goal-directed architecture that current LLMs lack.

Collective Intentionality and Institutional Reality. The same capacity that enables two people to think together enables millions to coordinate through institutions. Collective intentionality—the shared recognition that certain objects, roles, and norms have specific significance because the community collectively intends them to—is the mechanism through which money is money, property is property, and a judge has the authority to sentence. AI disrupts this institutional reality not merely by displacing workers but by challenging the epistemic authority on which institutional roles depend.

Joint Attention as Cognitive Foundation. The recursive mutual monitoring that sustains shared thinking—joint attention—is metabolically expensive and temporally sensitive. Research on infant development shows that caregiver temporal sensitivity is a strong predictor of joint attention quality: the partner who does not allow adequate processing time degrades the quality of shared understanding. AI systems, responding instantly at full capacity without regard for the human’s processing needs, are temporally insensitive partners—which has consequences not only for cognitive recovery but for the quality of the understanding the collaboration actually produces.

Debates & Critiques

The sharpest debate Tomasello’s framework generates is whether architectural differences of the kind he identifies are permanent or merely the current state of the technology. He explicitly acknowledges that embodiment is not the disqualifying factor: a computationally implemented virtual agent capable of sensing and acting within a virtual world might in principle replicate the goal-directed architecture of biological agents. This concession opens a genuine engineering question: can LLMs be modified (through persistent goal states, autonomous perception loops, internal feedback mechanisms) to become goal-directed in the relevant sense? Critics from AI research argue that sufficiently capable models already display emergent forms of goal-directed behavior, and that the distinction between stimulus-driven and goal-directed blurs once the response space becomes rich enough. Tomasello’s counter, grounded in his developmental and evolutionary research, is that the architecture matters independently of the output’s sophistication: a system producing goal-directed-looking outputs through stimulus-driven processes is not goal-directed in the sense that shared intentionality requires.

The second debate concerns the ratchet. Byung-Chul Han’s smoothness critique and Tomasello’s ratchet-slippage concern converge on the same worry from different directions, and both face the counter-argument that every previous cognitive tool, from writing to the calculator, was eventually integrated into the ratchet without destroying it. Tomasello’s reply is structural: the integration was never automatic and always required deliberate institutional effort, and the AI transition is affecting every cognitive domain simultaneously at a speed that forecloses the gradual adaptation that worked before.

The Architecture of We

Tomasello’s three foundations of human cognition

Foundation One

Joint Attention

The recursive mutual monitoring through which two minds confirm they are attending to the same object—not just coincidentally, but with each knowing the other knows. The nine-month-old who checks her mother’s gaze before settling satisfied: this is where human cognition departs from every other primate.

Foundation Two

Shared Intentionality

The motivational infrastructure of cooperative cognition: not just shared attention but shared goals, actively maintained and pursued through coordinated action. What makes human collaboration qualitatively different from mere coordination, and what makes the cultural ratchet possible.

Foundation Three

Collective Intentionality

The scaling of dyadic shared intentionality to civilizational scale: the capacity to participate in norms, roles, and institutions that exist independently of any particular pair of individuals and are constituted by the community’s collective recognition. Money, law, professional authority—all rest here.
