The CARE Principles establish that indigenous data belongs to the communities that produced it and must serve their collective benefit. Developed through international indigenous consultation and formalized by the Global Indigenous Data Alliance, the principles hold that communities have authority to control how their data is collected, stored, used, and shared; that data handlers bear responsibility to respect community governance; and that data use must align with indigenous ethical frameworks including cultural protocols and values. Applied to AI, CARE challenges the industry's standard practice of treating all available data as training material, asserting instead that indigenous knowledge requires indigenous consent and governance.
The CARE Principles emerged from decades of indigenous activism against exploitative research practices. Historically, researchers extracted knowledge from indigenous communities—collecting specimens, recording languages, documenting practices—and returned little or no benefit while often violating cultural protocols governing what knowledge could be shared and how. The resulting publications, museum collections, and databases became resources for Western institutions while the source communities remained impoverished. The biopiracy debates of the 1990s—when pharmaceutical companies patented compounds derived from traditional medicines without compensating or acknowledging indigenous knowledge-holders—catalyzed demands for indigenous control over biological and cultural resources. The CARE Principles extend this sovereignty claim into the digital domain.
The principles directly challenge the open-data movement's assumption that making data freely available maximizes social benefit. For indigenous communities, unrestricted data availability can cause harm: sacred knowledge disclosed inappropriately, cultural practices misrepresented, community-held knowledge appropriated for commercial gain without consent. The CARE framework insists that some data should not be open—that communities have the right to maintain boundaries around their knowledge, to determine who may access it and for what purposes, and to ensure that use serves collective benefit rather than external extraction. This position conflicts with the AI industry's appetite for comprehensive training data, creating a governance tension that current policy frameworks have not resolved.
Ramesh Srinivasan's application of CARE to AI development emerged from his fieldwork with Zuni Pueblo and communities in Oaxaca. He documented how digital systems designed without indigenous input systematically failed to accommodate indigenous knowledge forms—oral traditions, communal ownership, sacred protocols, relational epistemologies. The CARE Principles provided a governance framework for what his ethnography revealed: that indigenous communities possess sophisticated knowledge about domains—ecology, agriculture, medicine, astronomy—that AI systems claim to address, and that including this knowledge requires not extraction but partnership, not data-scraping but negotiated collaboration governed by indigenous authority.
The tension between CARE and current AI practice is structural. Large language models depend on massive training corpora assembled by scraping available text. Obtaining indigenous community consent for every piece of knowledge embedded in these corpora would require a development process fundamentally different from the current industrial model—slower, more expensive, legally complex, and respectful of communities' right to refuse. The AI industry has not demonstrated willingness to accept these constraints. The resulting impasse reveals that the industry's commitment to inclusion is conditional: communities are welcome to benefit from AI tools provided they do not demand governance authority over the knowledge that makes those tools possible.
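To make the structural tension concrete, the sketch below imagines what a CARE-compliant corpus filter might look like: documents with identified indigenous provenance are admitted only when the source community has consented to the specific purpose at hand, and absence of consent is treated as refusal rather than permission. Everything here is hypothetical for illustration; the registry, community identifiers, and purpose labels are assumptions, not part of any real CARE tooling or existing pipeline.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical consent registry: maps a community identifier to the set of
# purposes that community has approved. Names are illustrative only.
CONSENT_REGISTRY = {
    "community-a": {"language-revitalization"},
    "community-b": set(),  # consent explicitly withheld for all purposes
}

@dataclass
class Document:
    text: str
    source_community: Optional[str]  # None = no indigenous provenance identified

def care_filter(docs, purpose):
    """Keep only documents whose source community consented to this purpose.

    Documents with indigenous provenance but no matching registry entry are
    excluded by default: absence of consent is refusal, not permission.
    """
    kept = []
    for doc in docs:
        if doc.source_community is None:
            kept.append(doc)  # no community claim attached; not governed here
            continue
        approved = CONSENT_REGISTRY.get(doc.source_community, set())
        if purpose in approved:
            kept.append(doc)
    return kept

corpus = [
    Document("public news article", None),
    Document("recorded oral history", "community-a"),
    Document("ceremonial knowledge", "community-b"),
]

# Only the unclaimed article and the consented oral history survive.
print([d.text for d in care_filter(corpus, "language-revitalization")])
```

Even this toy version shows why the constraint is costly for the current industrial model: it requires provenance metadata that scraped corpora do not carry, and it makes the retained corpus depend on purpose, so a dataset cleared for one use is not cleared for another.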
The CARE Principles for Indigenous Data Governance were developed through a multi-year international consultation process led by the Global Indigenous Data Alliance and formalized at the International Data Week conference in Gaborone, Botswana, in 2018. The principles built on earlier frameworks, including the United Nations Declaration on the Rights of Indigenous Peoples (2007), and on decades of indigenous resistance to extractive research. Tahu Kukutai (Māori) and John Taylor were instrumental in the development process. Ramesh Srinivasan's contributions included empirical documentation of digital colonialism and advocacy for extending data sovereignty into technology design processes. The principles have been adopted by research institutions, funding agencies, and indigenous organizations worldwide as a governance standard for research involving indigenous communities.
Collective benefit, not extraction. Data use must produce value for indigenous communities—not merely avoid harm, but actively contribute to community-defined goals and priorities such as language revitalization, health, and land stewardship.
Authority to control. Communities possess the right to determine data collection methods, access conditions, use permissions, and benefit-sharing arrangements—a governance authority that supersedes researchers' or corporations' claims.
Responsibility to respect protocols. Data handlers bear obligations to honor indigenous cultural protocols, decision-making processes, and ethical frameworks governing knowledge transmission and use.
Ethics grounded in indigenous values. Data governance must align with indigenous definitions of right relationship, reciprocity, and collective wellbeing rather than defaulting to Western ethical frameworks.
Challenge to AI's extractive model. Applied to artificial intelligence, CARE requires that indigenous knowledge cannot be scraped and processed without community consent—a demand that conflicts with the industry's current training-data practices and forces a reckoning with whose knowledge the amplifier was built to serve.