PERSON

Abeba Birhane

The cognitive scientist who opened the black boxes feeding the world’s largest AI models and found, inside them, a vast unaudited inheritance of human prejudice—demonstrating through empirical audit rather than argument that scale does not dilute harm, it concentrates it.

Abeba Birhane came to artificial intelligence from an angle almost nobody else takes: through cognitive science, by way of the embodied and enactive tradition that treats mind as an activity of a living body coping with a world rather than computation running on a chip. That starting point determined what she noticed when she turned to the systems marketed as artificial minds. What she noticed, repeatedly, was that the field had skipped a step. It was building tools to sort, classify, and predict human beings at planetary scale while almost nobody was checking what those tools were trained on. Her landmark audits of large-scale training datasets—exposing racist and misogynistic labels in ImageNet and finding that hateful content in LAION grew measurably as the dataset scaled from 400 million to two billion samples—converted diffuse worry into documented fact and forced one canonical resource offline. But her contribution is not merely forensic. Her framework of relational ethics proposes a different ontology: that a person is not a bounded data point but a node in a dense web of relations, constituted and reconstituted through interaction, and that any system promising to predict persons is therefore engaged in a quiet act of narrowing that no fairness metric can reach. Birhane argues in [YOU] on AI's terms that the deepest danger is not a machine becoming more like us but a humanity agreeing to become machine-readable, convenience by convenience, until the moving target holds still.

In the [YOU] on AI Field Guide

The cycle's central question—are you worth amplifying?—presupposes that the amplifier is at least pointed at you. Birhane is the thinker who opens the amplifier and inspects what it actually contains. Her audits of training datasets reveal that the systems amplifying human language and image at planetary scale were built on foundations that had never been examined, inheriting the web's worst patterns alongside its richest knowledge. The practitioner who takes the orange pill and decides to work with AI is, in Birhane's framing, working with a tool shaped by millions of decisions about what data to include, how to label it, and whose reality to treat as the universal default—decisions made without audit, released as infrastructure.

Her concept of the embodied mind against the disembodied machine runs directly through the cycle's concern with cognitive outsourcing. When a tool that has no body, no stakes, no situation, and no experience of what it means for something to matter becomes part of one's thinking—extended into cognition the way writing or instruments always have been—its biases and reductions do not bias from outside. They bias from within. The practitioner who outsources judgment to a disembodied system built on extracted and unaudited data is not merely using a flawed tool; she is incorporating a flawed model of the human into her own reasoning.

Birhane's positive contribution—relational ethics—offers the most philosophically demanding response to AI alignment in the cycle. Where alignment research typically asks how to specify human values precisely enough to optimize toward them, she argues that the problem is prior: the model of the human implicit in the entire enterprise—a discrete, stable, fully describable unit that can be measured and predicted—is false. Persons are relational, ambiguous, and constituted in webs of relation rather than bounded as data points. Any system built on the false model will reproduce the falseness at scale, and no debiasing algorithm reaches the root because the root is not a bias in the data. It is a way of seeing.

She stands in the cycle's gallery alongside Kate Crawford, who mapped the institutional infrastructure of AI's material reality, and alongside Andrew Abbott, who showed that the organizations building AI are not merely building tools but constructing jurisdictional arrangements that determine who has power over the systems that govern others. Birhane's specific contribution is to have gone inside the data—to have done the unglamorous, often disturbing work of opening the box and counting what is there, so that the conversation can no longer be conducted on the terms of those who built the system.

Origin

Birhane was born in Ethiopia and educated at the Open University and University College Dublin, where she earned a doctorate in the embodied and enactive cognitive science tradition under a complex software lab working on how minds actually behave. Her formation was as far from a machine learning lab as a mind can get while still producing AI scholarship: she came up through psychology and philosophy, through the traditions of Francisco Varela and Evan Thompson that insist cognition is not symbol-shuffling inside a skull but an activity of a whole body coping with an environment. This starting point produced a characteristic method: audit first, then reason. Do not theorize about the data. Open it and count.

Her first major AI work, undertaken with Vinay Prabhu in 2020, applied this method to ImageNet and the 80 Million Tiny Images dataset—canonical training sets on which a generation of vision systems had been built, cited and reused thousands of times without examination. What they found was ugly: racist and misogynistic labels, non-consensual imagery, and categories attaching slurs to photographs of real people who had never agreed to be there. MIT withdrew the dataset. The episode was not merely a finding about data quality; it was a finding about a field that had poured enormous intellectual energy into model architectures and benchmark scores while treating data collection as a solved and uninteresting problem.

Her LAION audits, undertaken with collaborators including Sang Han and Vishnu Naresh Boddeti, extended the method to the billion-scale datasets behind generative AI. The central finding inverted the field's foundational faith: scaling did not dilute hateful content, it concentrated it. As the dataset grew from 400 million to two billion samples, the Hate Content Rate rose by roughly twelve percent, and the anti-Black associations in models trained on the data grew worse with scale. The scaling hypothesis—the belief that capability and, implicitly, safety emerge from sheer size—had no answer for this, because the hypothesis had never been tested for harm the way it had been tested for capability.

Key Ideas

The model is not the world. The central error Birhane identifies across all of AI is the confusion of model and reality: the tendency to mistake a reductive snapshot for the thing it models. A face-recognition label is not a fact about a person. A risk score is not a future. A recommendation is not a preference. The confusion is not random; it bends reliably toward the assumptions of those who built the system and the patterns dominant in the data, which means it bends away from those already on the margins. The first act of resistance is keeping the distinction alive.

Scale scales harm. The audit of LAION demonstrated empirically what intuition should have suspected: that scraping more of the open web does not dilute its toxic fringe but amplifies it. The datasets that feed the most capable and widely deployed generative systems were built on a contaminated substrate, and the contamination grew measurably with scale. No amount of post-hoc debiasing reaches this root, because the root is in the data-generation process itself—in the decisions about what to scrape, what to include, and whose content to treat as raw material without consent.

Relational ethics over rational atomism. Birhane's positive framework begins from the proposition that existence is fundamentally co-existent in a web of relations. A person is not a discrete data point but a node in a dense net of relationships, contexts, and histories, continually constituted through interaction. Relational ethics therefore begins with the individuals and groups most impacted by a system, granting them a central rather than advisory role, and treats justice not as a property a system can possess once and for all but as an ongoing, situated practice to be renewed in each context.

The impossibility of automating ambiguity. Human beings are complex adaptive systems: open, dynamic, self-organizing, irreducible to their measurable averages, and entangled with environments they continuously remake. Attempting to predict human behaviour through machine learning is not merely technically difficult; it is a category error with consequences. The systems impose legibility—order, stability, predictability—onto something that is by nature active, fluid, and unpredictable. Worse, the attempt to capture the moving target may, deployed at sufficient scale, narrow the target until it holds still: not because the prediction became accurate but because the human became more predictable.

The audit as leverage. Birhane established the rigorous empirical audit as a form of political leverage. A measured Hate Content Rate or a documented set of dehumanizing labels in a dataset that a company uses cannot be dismissed the way a critic's worry can. The audit shifts the burden of proof. It converts the vague reassurance that systems are “broadly safe” into a specific and reproducible claim that can be contested. Independent auditing—conducted by researchers with no stake in the outcome and no contract that can be revoked when findings embarrass—is, in her account, the precondition for accountability that means anything.

Debates & Critiques

The central tension in Birhane's work is between the demand for structural change and the need for immediate practical action. Critics who broadly share her goals argue that refusing every technical fix—every debiasing algorithm, every fairness metric, every transparency tool—as a patch that leaves harm-producing structures intact can paralyze the field at the very moment when partial improvements would protect real people from real harm. Birhane's response is not to abandon technical work but to insist that it be accompanied by an honest account of what it cannot reach: a debiasing technique applied to a model trained on a dataset that scales hate, built by a research culture that ignores societal harm, serves concentrated power—such a technique may improve a metric while leaving the entire harm-producing pipeline intact. A second debate concerns the scope of the relational ethics framework: some argue that prioritizing those most impacted, while morally compelling, provides insufficient guidance for designing systems at scale, where the communities of impact are multiple, overlapping, and sometimes in conflict. Birhane acknowledges that her framework offers critical examination rather than tidy solutions—and she regards that honesty as a feature, not a failure, in a field that too often mistakes a clean solution for a complete one. The deepest disagreement is philosophical: whether human ambiguity and irreducibility make behavioural prediction impossible in principle, or merely very difficult in practice. Embodied cognitive science supports the in-principle limit; much of the machine learning community treats the limit as a technical frontier still to be crossed.

The Auditor's Triad

Birhane's three commitments that make accountability real

First commitment

Open the Box

Do not theorize about the data. Do not accept the field's self-presentation. Go inside the training set, annotate the labels, count the hate, document the non-consensual images. The audit converts diffuse anxiety into documented fact, and documented facts are the only basis on which accountability can be demanded.

Second commitment

Name the Direction

Bias in AI systems is not random error but directional error—bending reliably toward the assumptions of those who built the system and the patterns dominant in the data, away from those already marginalized. Naming the direction is the first step toward contesting it.

Third commitment

Center the Impacted

Justice is not a property a system can possess once and for all. It is an ongoing practice, built from the margins inward, beginning with those who bear the harm rather than those who deploy the system. Independence—from corporate funding, from the firms whose systems are audited—is the precondition for accountability that means anything.

In the [YOU] on AI Field Guide

Origin

Key Ideas

Debates & Critiques

The Auditor's Triad

Related Entries

Further Reading