The training corpus is not the entire landscape of human thought. It is a specific, historically contingent sample — over-representing English-language academic publications, digitized Western documents, and cultural traditions whose institutions produced the text streams now feeding AI systems. Oral traditions, indigenous knowledge, and non-digitized cultural production are largely absent. The corpus reflects the biases of institutions whose output was indexable.
Granovetter's embeddedness framework insists that economic action — including knowledge production — is never disembedded from social relations. The AI tool appears to offer disembedded knowledge, but the decisions about training data, model weights, access pricing, and output shaping are social decisions made by specific institutions reflecting specific priorities. The apparent neutrality is itself a social achievement.
The economics of access create structural stratification. Frontier models are available at price points that exclude large portions of the global population. The developer at a well-funded Silicon Valley startup accesses capabilities the Lagos developer cannot — not because of differences in talent but because inference costs exceed what peripheral economic contexts can sustain. The democratization of bridging capital is real but stratified.
A 2026 PNAS paper for which Granovetter served as board member, "Perceiving AI as labor-replacing reduces democratic legitimacy and political engagement," documented the political consequence. Across thirty-eight European countries and more than thirty-seven thousand respondents, perceiving AI as replacing rather than augmenting labor was associated with lower satisfaction with democracy and lower engagement with technology policy. Those most affected withdrew from the governance process shaping the outcome.
The structural analysis of gatekeeping power derives from Granovetter's embeddedness framework, extended by subsequent work on platform economics, corporate power, and algorithmic governance. Kate Crawford's Atlas of AI and Shoshana Zuboff's The Age of Surveillance Capitalism provide empirical extensions.
The concentration of AI infrastructure in a small number of companies — Anthropic, OpenAI, Google, and their peers — creates gatekeeping conditions unlike anything in the history of social networks. Previous communication infrastructures (telegraph, telephone, internet) were regulated as common carriers; AI models are not.
Infrastructure control, not bridging. AI companies do not connect clusters as a structural-hole bridge would — they architect the conditions under which all cross-cluster connection occurs.
Training data is socially embedded. The apparent universality of model outputs conceals specific decisions about whose documents, languages, and traditions are included.
Access stratification is structural. Pricing determines who can connect to the expanded knowledge landscape, producing a digital divide within the AI era that echoes prior technological transitions.
Invisible absences compound. What the model cannot say — the connections it cannot bridge because the relevant knowledge was excluded from training — is invisible to users and therefore structurally undetectable.
Political marginalization follows. The populations most displaced by AI are least positioned to influence its governance — a feedback loop the PNAS study empirically documented.
Whether any existing governance framework is adequate to the concentration of AI infrastructure is contested. Common-carrier regulation, public-option development, data trusts, and open-source alternatives have all been proposed; whether any of them will scale enough to counterbalance the concentration of frontier capability remains an open question.