Socialization vs. training is the distinction Collins made most sharply in his 2025 AI & Society paper: 'learning from the internet is not the same as socialisation.' The internet contains the textual residue of social life. Socialization is the process by which humans are raised inside social life — the thousands of daily corrections, demonstrations, implicit lessons, and acts of embodied participation through which a child absorbs her community's collective tacit knowledge. A child does not learn norms by reading about them. She learns them by living inside them: by being corrected when she violates them, observing what is praised and punished, and absorbing the community's sense of what is appropriate in contexts no text could enumerate.
The distinction is decisive for the AI debate. An LLM trained on internet text has access to an extraordinary volume of text about what communities value, practice, and believe. But it has undergone no primary socialization. It has not been corrected by a parent, praised by a teacher, embarrassed by a peer. It does not know what it feels like to violate a norm, because it does not experience norms as constraints. It processes descriptions of norms. It does not live inside them.
This is why, Collins argues, large language models 'have no moral compass.' The phrase is not a moral judgment of the machines; it is a sociological observation. A moral compass is a product of collective tacit knowledge, transmitted through socialization and maintained through ongoing social participation. A system that has not been socialized cannot possess one. It can mimic the surface of moral reasoning — reproducing arguments about ethics with the same fluency it reproduces arguments about physics — but it cannot distinguish genuinely moral positions from merely rhetorically persuasive ones, because that distinction depends on collective tacit knowledge the machine has not acquired.
The practical implication is that no amount of additional training data can close the gap. The gap is not a quantity-of-text problem. It is a kind-of-process problem. To acquire what socialization provides, a system would need to undergo socialization — which means participation in the social life of a community, not exposure to the community's textual record. Whether any AI paradigm can provide such participation remains genuinely open.
Collins articulated the distinction most directly in his 2025 AI & Society paper on AI and the sociology of knowledge. The framing builds on decades of prior work on tacit knowledge, communities of practice, and the social constitution of expertise.
Textual residue, not social life. The internet captures what communities have written, not the processes through which they live.
Primary socialization is irreplaceable. The developmental process of being raised in a community has no textual substitute.
No moral compass. The absence of socialization explains the absence of reliable moral judgment in LLMs.
Not a data problem. More text does not provide what socialization provides; the kind of process matters more than the quantity of input.