The Turing Test asks whether a machine can mimic individual human intelligence convincingly enough to fool an interlocutor. The question is psychological: can this system pass as a person? Star's alternative asks a different question: does this system strengthen or weaken the social bonds of the community that uses it? Does it incorporate differing viewpoints or flatten them? Does it distribute capability or concentrate it? Does it create conditions for collective deliberation or render deliberation unnecessary? The Durkheim Test is sociological rather than psychological, collective rather than individual, and its neglect for thirty-six years reflects the individualist assumptions embedded in the culture that produced both the Turing Test and the technologies it was designed to evaluate. Applied to the AI tools of 2025 and beyond, the test produces concerning results: democratization of capability is real, but the structural tendencies point toward weakened social bonds, flattened perspectives, and reduced collective deliberation.
Star was a sociologist of science whose work on classification, infrastructure, and distributed cognition made her an unusual voice in early AI debates. Her 1989 proposal — made at the founding conference of Distributed Artificial Intelligence — was not idle provocation. It grew from her conviction that AI systems are always already social, embedded in communities that use them, and that evaluating them outside of that social embedding produces a systematic misunderstanding of what they are and what they do.
The proposal was largely ignored because the AI field of the 1990s was dominated by cognitive modeling and symbolic reasoning, with evaluation criteria inherited from cognitive psychology and formal logic. The rise of large language models has not fundamentally changed the evaluation framework: contemporary AI systems are still evaluated primarily by their individual-level performance on benchmarks, and the collective consequences of their deployment are treated as secondary or as policy concerns separate from the technical evaluation proper.
The practical implementation of the Durkheim Test would require institutional infrastructure that does not currently exist: professional bodies with the normative authority to define collective evaluation criteria, the practical knowledge to apply them, and the legitimacy to make their assessments consequential. The EU AI Act classifies systems by individual-level risk categories. American executive orders emphasize safety and accountability at the individual level. None of the current regulatory frameworks ask the sociological questions Star's proposal identified as essential.
The results of applying the test to contemporary AI are specific and diagnostic. On bond-strengthening: the structural tendency is toward weakening as solo-building replaces team-building. On perspective-incorporation: the tool reflects the statistical distribution of its training data, a single (if vast) perspective, rather than genuinely disagreeing from a different position with different stakes. On capability distribution: the results are favorable — the democratization is real. On collective deliberation: the tendency is to make deliberation optional, and optional rituals erode over time.
Susan Leigh Star proposed the Durkheim Test in her 1989 paper "The Structure of Ill-Structured Solutions: Boundary Objects and Heterogeneous Distributed Problem Solving," presented at the AAAI Distributed Artificial Intelligence Workshop and later published in the proceedings. Her framing was deliberate: she was inverting the Turing Test's individualist assumptions to propose a collective alternative.
The proposal received little engagement from AI researchers in the years that followed. It has been rediscovered periodically by sociologists of technology and, most recently, by the AI ethics literature as individual-focused evaluation frameworks have proven inadequate to address the collective consequences of large-scale deployment. Star died in 2010; her sociology of infrastructure and boundary objects has had far wider influence than the Durkheim Test, though the test may yet prove her most prescient contribution.
- Sociological, not psychological. The test asks what the system does to communities, not whether it can imitate an individual.
- Four evaluation dimensions. Bond-strengthening, perspective-incorporation, capability distribution, and collective deliberation — each structurally distinct.
- Largely failed by current AI. The technology performs well on capability distribution, poorly on the other three dimensions.
- Requires institutional implementation. No individual can apply the test; it requires professional bodies with normative authority.
The Turing Test has been answered; the Durkheim Test remains open. The question of machine intelligence is effectively settled. The question of machine contribution to collective life is not.