CONCEPT

Socialization vs. Training

Harry Collins's 2025 formulation of the deepest structural difference between how humans acquire collective tacit knowledge—through living inside a community that holds them accountable—and how AI systems learn—through processing the textual residue of a community they never inhabit.

Learning from the internet is not the same as socialization. Harry Collins stated this with characteristic directness in his 2025 paper on AI and the sociology of knowledge, and the distinction it draws is the most consequential claim his framework makes for the builders of artificial intelligence. Socialization is not high-bandwidth learning; it is a different kind of process entirely. A child acquires a language not by reading about language but by living inside a language-using community that corrects, praises, embarrasses, and rewards with attention the child's attempts to participate. The correction matters not because of the information it contains but because of the social consequences it carries: the child cares about getting it right because getting it wrong has consequences for belonging, identity, and the relationships that constitute the child's social world. A system trained on text cannot be embarrassed. It cannot experience the social consequences of norm violation. It has undergone no primary socialization and therefore cannot possess the collective tacit knowledge that socialization produces—including the moral compass that Collins identifies as a specifically social product, and the capacity to distinguish between a claim the community would recognize as warranted and a claim it would recognize as overreach. Large language models possess interactional fluency without social embedding, which means they can speak the language of expertise without having undergone the formation that makes expertise trustworthy.

In the [YOU] on AI Field Guide

The cycle opened by [YOU] on AI identifies the fluency of AI output as both its most impressive feature and its most dangerous one. The danger is not that the output sounds wrong; the danger is that the output sounds right when it is wrong, and that evaluating the difference requires exactly the knowledge that is most systematically absent from a system that was trained rather than socialized. Collins's distinction names the mechanism of this absence. The model may have processed a vast share of what the community has written. It has not attended a conference, received a rebuke from a senior practitioner, or discovered through embarrassment that it was misusing a concept it thought it understood. The text contains the community's conclusions. Socialization carries the community's standards for reaching conclusions: standards that are maintained through the community's ongoing social practices and cannot be derived from the published record.

The implications are most acute in domains where collective tacit knowledge is densest: professional communities with strong norms about what counts as good work, scientific communities with tacit standards for what counts as evidence, ethical communities with evolving shared senses of what is and is not acceptable. In all these domains, the model's interactional fluency can produce outputs that satisfy the surface criteria for expertise while violating the substantive criteria that the community actually holds. And the violation will be invisible to anyone who lacks the contributory expertise to hear it—which, as AI outputs proliferate, increasingly means nearly everyone.

The apprenticeship problem that Collins identifies is directly connected to this distinction. Apprenticeship is not merely practice at mimeomorphic tasks, those that can be carried out the same way every time regardless of context. It is the process through which the apprentice is socialized into the community: subjected to correction, held accountable to standards, and gradually incorporated into the social fabric of the practice. Remove the apprenticeship and you remove the socialization. A generation of practitioners who learned their craft by directing AI, rather than by undergoing the social formation that the craft historically required, may produce work that is competent by mimeomorphic standards and impoverished by polimorphic ones, that is, by the standards of actions whose correct performance varies with social context.

Origin

Collins articulated the socialization-training distinction most sharply in his 2025 paper 'Socialization vs. Training: Why Artificial Intelligence Cannot Acquire Collective Tacit Knowledge,' building on forty years of empirical work on communities of scientific practice. The distinction derives ultimately from Émile Durkheim's foundational claim that social facts are not reducible to individual psychological facts, and from the tradition of social epistemology that has developed it: the claim that some knowledge is constitutively social, residing in the practices of groups rather than the minds of individuals.

The concept converges with Harry Frankfurt's account of the difference between a wanton and a person. A wanton, in Frankfurt's vocabulary, acts without second-order attitudes toward its own actions—without caring whether its outputs are endorsed by its own evaluative structure. Training produces a system without the social formation that constitutes the analogous capacity at the collective level: a system that generates without the community accountability that would make the generation trustworthy. Socialization is the process that produces the moral compass Collins identifies as missing; training is the process that produces the fluency that makes the absence invisible.

Key Ideas

Accountability is constitutive. What makes socialization different from training is not the volume or quality of the input but the accountability relationship. The socialized learner is held accountable by a community whose judgment matters to them. The trained system is not held accountable in the social sense; it is optimized against a loss function. These are not equivalent processes, and they do not produce equivalent knowledge.

The moral compass as social product. Collins's claim that LLMs have no moral compass is not a claim about their character but about their formation. A moral compass is a product of collective tacit knowledge—of the community's sense of what is acceptable, maintained through ongoing social participation. It cannot be produced by training on descriptions of what is acceptable, because the descriptions are the explicit residue of a social practice, not the practice itself.

The surface without the structure. A system trained on the textual output of an expert community acquires the surface vocabulary, the canonical arguments, the standard framings. It does not acquire the structure beneath the surface: the implicit standards for when to use each argument, the community's evolving sense of what counts as good reasoning, the subtle norms that distinguish appropriate from inappropriate use of a concept. This is why Collins's gravitational wave physicists could not be replicated by an LLM: the reasoning they used to dismiss the fringe paper depended on the structure, not the surface.

Time is not the solution. The optimist's response to Collins's argument often takes the form: more training time, more diverse data, and more fine-tuning from human feedback will close the gap. Collins's reply is that these are all variations on training, not socialization, and that no variation on training can produce accountability in the sense that socialization requires. The gap is one of kind, not of degree.

Debates & Critiques

The most sustained debate about the socialization-training distinction concerns whether reinforcement learning from human feedback (RLHF) and related techniques represent a form of socialization. In RLHF, human raters evaluate model outputs and the model is trained to produce outputs that humans rate highly. The optimists argue that this is a form of accountability: the model is being corrected by humans in a way that resembles the correction that constitutes socialization. Collins's response is twofold. First, the raters themselves are not a community in the relevant sense—they are individuals applying explicit criteria, not a coherent social group with shared tacit standards. Second, the model does not experience the correction as a social event with consequences for its identity and belonging, because it has no identity that belongs to a community. The correction is a gradient update, not an embarrassment. The distinction between a gradient update and an embarrassment is, in Collins's framework, the distinction between training and socialization, and it is precisely this distinction that determines whether the resulting system can possess collective tacit knowledge. Harry Frankfurt's framework adds precision: the socialized person forms second-order attitudes about what they produce, endorsing some outputs as genuinely their own and repudiating others as inconsistent with their values. The trained system has no such structure. It generates without the evaluative hierarchy that makes caring—and therefore the moral compass that caring produces—possible.
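To make concrete what "the correction is a gradient update" means mechanically, the following is a minimal sketch, not a description of any production RLHF pipeline. It assumes a toy "model" that chooses among three fixed candidate outputs, with hypothetical rater scores standing in for human feedback. Real systems typically train a separate reward model and optimize with methods such as PPO under a KL penalty; here an exact policy-gradient step on the expected rating is swapped in for simplicity. All names (`softmax`, `rlhf_style_update`, `rater_scores`) are illustrative assumptions.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution over outputs."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rlhf_style_update(logits, rater_scores, learning_rate=0.5):
    """Take one gradient-ascent step on the expected rater score.

    For a softmax policy, d E[score] / d logit_i = p_i * (score_i - E[score]).
    The "correction" the model receives is nothing more than this
    arithmetic adjustment to its parameters.
    """
    probs = softmax(logits)
    expected = sum(p * r for p, r in zip(probs, rater_scores))
    grads = [p * (r - expected) for p, r in zip(probs, rater_scores)]
    return [l + learning_rate * g for l, g in zip(logits, grads)]


# Hypothetical setup: three candidate outputs, initially equally likely.
logits = [0.0, 0.0, 0.0]
# Hypothetical ratings: output 0 is liked, output 2 is disliked.
rater_scores = [1.0, 0.2, -1.0]

for _ in range(200):
    logits = rlhf_style_update(logits, rater_scores)

final_probs = softmax(logits)
print(final_probs)  # probability mass shifts toward the highest-rated output
```

In this sketch the ratings enter only as numbers in an objective: the disfavored output becomes less probable, but nothing in the loop corresponds to belonging, identity, or embarrassment, which is the contrast the paragraph above draws.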
