CONCEPT

The Enclosure of the Training Commons

The appropriation of commons-produced knowledge (Wikipedia articles, open-source code, Creative Commons works) as training data for proprietary AI models. Value is extracted from shared resources while the resulting capabilities are privatized, creating a parasitic dynamic that undermines the incentive structure sustaining the commons.

AI companies trained their language models on the accumulated output of commons-based peer production: Wikipedia's more than 60 million articles across all language editions, billions of lines of open-source code, Creative Commons–licensed cultural works, and publicly available research. This training data was freely accessible because communities of contributors had shared it under open licenses, creating a commons of knowledge and expression. The resulting AI models are overwhelmingly proprietary: owned by the companies that trained them, accessed through commercial APIs, and governed unilaterally by corporate boards. The commons fed the machine, and the machine's outputs are privatized. In Benkler's framework this is enclosure: the conversion of a shared resource into private property, extraction without reciprocity, and the disruption of the contribution ecology.

In The You On AI Field Guide

The enclosure operates at a structural level that existing intellectual property frameworks were not designed to address.
