Mediators of Individual Data — Orange Pill Wiki
CONCEPT

Mediators of Individual Data

Lanier and Weyl's proposed institutions — voluntary, member-governed organizations that would aggregate the bargaining power of individual data contributors and negotiate collective terms with AI platforms, analogous to labor unions, professional associations, and music collection societies.

Mediators of Individual Data (MIDs) are Lanier and Weyl's answer to a structural problem that individual rights-based approaches cannot solve: individual data contributors have no leverage against AI platforms, no matter how clearly their rights are articulated. A single novelist whose book was used in training cannot negotiate meaningfully with OpenAI. A single developer whose code was absorbed into Codex cannot extract compensation from GitHub. Individual consent frameworks, individual opt-out mechanisms, and individual litigation are all inadequate to the scale asymmetry. What is required is collective organization: institutions that aggregate the bargaining power of millions of individual contributors into a force capable of negotiating meaningful terms with the platforms that consume their work. MIDs are Lanier and Weyl's proposal for what those institutions would look like: voluntary, member-governed bodies that represent contributor interests, negotiate royalty rates and usage terms, distribute payments, and enforce compliance. The historical precedents are well established. The specific application to AI training data has not yet been built.

In the AI Story


The MID concept emerged from Lanier and Weyl's 2018 Harvard Business Review (HBR) article and has been developed in subsequent publications and policy advocacy. It belongs to a broader family of proposals that includes data sovereignty, platform cooperatives, and various forms of data trusts, all of which attempt to restructure the relationship between individuals and digital platforms so that individuals gain a meaningful collective voice.

The analogy to labor unions is deliberate and instructive. Unions emerged in response to a specific power imbalance: individual workers had no leverage against employers who could replace them at will, but collective workers could withhold labor in ways that employers could not ignore. The history of labor rights is the history of individual workers who had no power gaining power through collective action. The same principle applies, Lanier argues, to data contributors. Individually powerless, collectively indispensable.

The music industry offers the closest functional precedent. Collection societies like ASCAP, BMI, and SOCAN aggregate the rights of thousands of individual songwriters and publishers, negotiate blanket licensing terms with broadcasters and venues, collect royalty payments, and distribute them to members. The system is imperfect: distribution often favors already-successful artists, administrative costs consume a significant portion of revenue, and smaller creators still struggle to earn a living. But it demonstrates that collective rights management at scale is technically and institutionally feasible.
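The collect-and-distribute step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of how ASCAP, BMI, SOCAN, or any proposed MID actually computes payments: the function name `distribute_pool`, the flat administrative rate, and the pro-rata-by-usage rule are all hypothetical choices made for the example.

```python
from fractions import Fraction


def distribute_pool(pool_cents, admin_rate, usage_counts):
    """Split a blanket-license pool among members pro rata by usage.

    Hypothetical model: a flat administrative rate is taken off the top,
    and the remainder is divided in proportion to each member's recorded
    usage. Amounts are integer cents so that no money is lost to rounding.
    """
    admin_rate = Fraction(admin_rate)
    if not 0 <= admin_rate < 1:
        raise ValueError("admin_rate must be in [0, 1)")
    total_usage = sum(usage_counts.values())
    if total_usage <= 0:
        raise ValueError("no recorded usage to distribute against")

    admin_fee = int(pool_cents * admin_rate)
    distributable = pool_cents - admin_fee

    # Floor each member's exact share, then hand the leftover cents to the
    # members with the largest fractional remainders.
    exact = {m: Fraction(distributable * u, total_usage)
             for m, u in usage_counts.items()}
    payouts = {m: int(share) for m, share in exact.items()}
    leftover = distributable - sum(payouts.values())
    by_remainder = sorted(exact, key=lambda m: (exact[m] - payouts[m], m),
                          reverse=True)
    for member in by_remainder[:leftover]:
        payouts[member] += 1
    return admin_fee, payouts


# Illustrative figures only: a 1,000.00 pool with a 15% administrative rate.
fee, payouts = distribute_pool(100_000, "0.15", {"a": 900, "b": 90, "c": 10})
print(fee, payouts)
```

Even in this toy model the concentration noted above is visible: a strictly pro-rata rule sends most of the pool to the member with the most recorded usage, and the administrative rate is deducted before any member is paid.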

The MID proposal faces specific obstacles that distinguish it from earlier rights-management institutions. Training data spans virtually every form of creative and intellectual output: text, images, code, music, video, academic research, professional documentation. No existing organization represents contributors across all these categories. The contributors themselves are dispersed across industries, nations, and legal jurisdictions. And the AI platforms have strong incentives to prevent MIDs from forming, because any successful MID would raise the platforms' training data costs from zero to some positive number. The obstacles are real. They are not insurmountable. Many labor rights now taken for granted were declared unachievable before they were achieved.

Origin

The MID concept emerged from Lanier and Weyl's collaboration, which extended from the 2018 HBR article through subsequent policy work and Weyl's development of plurality frameworks with Audrey Tang and collaborators. The specific framing built on Weyl's economic work on radical markets and Lanier's long-standing argument that the digital economy required new institutional forms to replace the extractive architecture of siren servers.

The concept has been refined through engagement with existing efforts at collective data governance — data cooperatives, data trusts, privacy collectives — each of which represents a partial approach to what MIDs would systematize. None of these efforts has achieved the scale or coverage Lanier and Weyl envision, but their existence demonstrates that the direction is coherent and the building blocks exist.

Key Ideas

Collective bargaining is the proven mechanism. The histories of labor rights, consumer protection, and creative rights management all demonstrate that collective organization can rebalance power asymmetries that individual rights cannot address.

MIDs would be voluntary and member-governed. Like unions and collection societies, MIDs would function best when they represent their members' actual interests rather than being imposed from above. The voluntary structure is crucial for both legitimacy and sustainability.

Multiple MIDs would compete for members. Lanier and Weyl envision not a single monopoly MID but multiple competing organizations, each representing a different segment of contributors or offering a different approach to negotiation and distribution. Competition among MIDs would prevent the institutional capture that afflicts some existing collection societies.

The scale of required coordination is unprecedented. Training data contributors span virtually every profession, nation, and language. Organizing them at the scale required to negotiate meaningfully with global AI platforms is a political and logistical challenge without clear precedent.

The platforms will resist. AI companies have strong economic incentives to prevent MIDs from forming. Any successful MID would immediately transform training data from a free resource into a licensed input, raising costs substantially. The political economy of MID formation is therefore structurally adversarial.

Debates & Critiques

The dominant skepticism about MIDs holds that the specific conditions that made unions and collection societies effective (concentrated industries, identifiable worker populations, clear legal frameworks) do not exist for AI training data. Contributors are too dispersed, the legal landscape too unsettled, and the economic value of individual contributions too small to support the institutional overhead. Lanier's response is that these obstacles are genuine but not qualitatively different from those faced by earlier rights-management efforts in their formative stages. Overcoming them will take institutional innovation, political organizing, and time; they do not make the project impossible.

A related debate concerns whether MIDs should focus on licensing training data (negotiating fees for inclusion) or on royalty distribution (negotiating payments based on output usage). Lanier generally favors the royalty approach because it ties compensation to value creation, but acknowledges that licensing may be more tractable as an initial mechanism.
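The difference between the licensing and royalty approaches can be made concrete with a small sketch. Everything here is an assumption made for illustration: the function names, the fee of 500 cents per work, the 200 basis point royalty rate, and above all the `attributed_uses` counts, which presuppose some way of tracing outputs back to contributors that the source does not describe.

```python
def licensing_payouts(fee_per_work_cents, works_in_training_set):
    """Licensing model: a flat fee per included work, paid once,
    regardless of how the work later influences outputs."""
    return {member: fee_per_work_cents * count
            for member, count in works_in_training_set.items()}


def royalty_payouts(output_revenue_cents, royalty_bps, attributed_uses):
    """Royalty model: a share of output revenue (in basis points), split in
    proportion to hypothetical per-member attribution counts. Shares are
    floored to whole cents; any remainder is left undistributed here."""
    pool = output_revenue_cents * royalty_bps // 10_000
    total = sum(attributed_uses.values())
    if total == 0:
        return {member: 0 for member in attributed_uses}
    return {member: pool * uses // total
            for member, uses in attributed_uses.items()}


# Illustrative figures only.
works = {"novelist": 3, "developer": 40}          # works included in training
uses = {"novelist": 9_000, "developer": 1_000}    # hypothetical attributions

print(licensing_payouts(500, works))
print(royalty_payouts(10_000_000, 200, uses))
```

Under these made-up numbers the licensing model rewards whoever contributed the most works, while the royalty model rewards whoever is attributed the most output usage. The royalty calculation also depends on an attribution input that the licensing calculation does not need, which is one way to read the claim that licensing may be more tractable at first.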

Further reading

  1. Jaron Lanier and E. Glen Weyl, 'A Blueprint for a Better Digital Society,' Harvard Business Review (September 2018).
  2. Imanol Arrieta-Ibarra, Leonard Goff, Diego Jiménez-Hernández, Jaron Lanier, and E. Glen Weyl, 'Should We Treat Data as Labor?' AEA Papers and Proceedings (May 2018).
  3. Audrey Tang and E. Glen Weyl, Plurality: The Future of Collaborative Technology and Democracy (2024).
  4. Trebor Scholz and Nathan Schneider, eds., Ours to Hack and to Own: The Rise of Platform Cooperativism (OR Books, 2016).
  5. Sylvie Delacroix and Neil Lawrence, 'Bottom-up Data Trusts,' International Data Privacy Law (2019).
Part of The Orange Pill Wiki · A reference companion to the Orange Pill Cycle.