The Turing Test asks whether a machine can mimic individual human intelligence convincingly enough to fool an interlocutor. The question is psychological: can this system pass as a person? Star's alternative asks a different question: does this system strengthen or weaken the social bonds of the community that uses it? Does it incorporate differing viewpoints or flatten them? Does it distribute capability or concentrate it? Does it create conditions for collective deliberation or render deliberation unnecessary? The Durkheim Test is sociological rather than psychological, collective rather than individual, and its neglect for thirty-six years reflects the individualist assumptions embedded in the culture that produced both the Turing Test and the technologies it was designed to evaluate. Applied to the AI tools of 2025 and beyond, the test produces concerning results: democratization of capability is real, but the structural tendencies point toward weakened social bonds, flattened perspectives, and reduced collective deliberation.
Star was a sociologist of science whose work on classification, infrastructure, and distributed cognition made her an unusual voice in early AI debates. Her 1989 proposal — made at the founding conference of Distributed Artificial Intelligence — was not idle provocation. It grew from her conviction that AI systems are always already social, embedded in communities that use them, and that evaluating them outside of that social embedding produces a systematic misunderstanding of what they are and what they do.
The proposal was largely ignored because the AI field of the 1990s was dominated by cognitive modeling and symbolic reasoning, with evaluation criteria inherited from cognitive psychology and formal logic. The rise of large language models has not fundamentally changed the evaluation framework: contemporary AI systems are still evaluated primarily by their individual-level performance on benchmarks, and the collective consequences of their deployment are treated as secondary or as policy concerns separate from the technical evaluation proper.
The practical implementation of the Durkheim Test would require institutional infrastructure that does not currently exist: professional bodies with the normative authority to define collective evaluation criteria, the practical knowledge to apply them, and the legitimacy to make their assessments consequential. The EU AI Act classifies systems by individual-level risk categories. American executive orders emphasize safety and accountability at the individual level. None of the current regulatory frameworks ask the sociological questions Star's proposal identified as essential.
The results of applying the test to contemporary AI are specific and diagnostic. On bond-strengthening: the structural tendency is toward weakening as solo-building replaces team-building. On perspective-incorporation: the tool reflects the statistical distribution of its training data, a single (if vast) perspective, rather than genuinely disagreeing from a different position with different stakes. On capability distribution: the results are favorable — the democratization is real. On collective deliberation: the tendency is to make deliberation optional, and optional rituals erode over time.
Susan Leigh Star proposed the Durkheim Test in her 1989 paper "The Structure of Ill-Structured Solutions: Boundary Objects and Heterogeneous Distributed Problem Solving," presented at the AAAI Distributed Artificial Intelligence Workshop and later published in the proceedings. Her framing was deliberate: she was inverting the Turing Test's individualist assumptions to propose a collective alternative.
The proposal received little engagement from AI researchers in the years that followed. It has been rediscovered periodically by sociologists of technology and, most recently, by the AI ethics literature as individual-focused evaluation frameworks have proven inadequate to address the collective consequences of large-scale deployment. Star died in 2010; her sociology of infrastructure and boundary objects has had far wider influence than the Durkheim Test, though the test may yet prove her most prescient contribution.
- Sociological, not psychological. The test asks what the system does to communities, not whether it can imitate an individual.
- Four evaluation dimensions. Bond-strengthening, perspective-incorporation, capability distribution, and collective deliberation — each structurally distinct.
- Largely failed by current AI. The technology performs well on capability distribution, poorly on the other three dimensions.
- Requires institutional implementation. No individual can apply the test; it requires professional bodies with normative authority.
The Turing Test has been answered; the Durkheim Test remains open. The question of machine intelligence is effectively settled. The question of machine contribution to collective life is not.