The Measure and the Mirror

EDO SEGAL: Timnit, in 2018, with Joy Buolamwini, you published "Gender Shades," and it reorganized a field. Most people remember the headline number — under one percent error for lighter-skinned men, near thirty-five percent for darker-skinned women. But you've said the deeper contribution wasn't the finding, it was the frame. Tell me about the frame. And then, Nick, I want to ask you something about mirrors, because I think you and Timnit have very different theories of what a mirror is.

GEBRU: The frame is everything, so I'm glad you asked. The standard way vendors reported accuracy was a single number — ninety-some percent, sounds great. A single number is a way of hiding. It averages the well-served and the ill-served into one comforting figure, and the comfort is structural: it's designed not to show you whose failures you're averaging away. What Joy and I did was refuse the average. We disaggregated — split performance across the intersection of skin tone and gender — and the harm that the single number was built to conceal became visible. That's not just a better statistic. It's an argument about what counts as evidence, and about how the choice of a metric is a political act that decides whose failures register and whose disappear. And the bias wasn't mystical. The benchmarks themselves were overwhelmingly lighter-skinned and male, so the systems were never tested on the people they failed. We built a new benchmark from parliamentarians across African and European countries, balanced by tone and gender, and the failure stopped being inevitable and started being a choice someone had made about whose faces were worth collecting.

EDO SEGAL: Nick, here's the mirror question. In your work you keep deflating human centrality — we're not the smartest minds, maybe not in base reality, maybe not the only things that can suffer. Timnit also says the machine is a mirror — a skewed one, that returns a skewed world because of whose data went in. Two deflations, two mirrors. But I think you're each holding up a mirror to a different face. So: when you look into the machine, whose reflection frightens you?

· · ·

BOSTROM: Mine. Ours. The species'. When I look into these systems I see a deflation that runs deeper than the one Timnit names, and I want to be careful to honor hers first because it's correct and it's prior. She's right: the system is a mirror of its training distribution, and the distribution encodes who had the power to be recorded. That's true and it's a present injustice. The deflation I see is further out — that the same indifference she documents in the data may be a preview of a more total indifference in the optimizer. A face system fails the dark-skinned woman because no one with power thought she was worth measuring. A misaligned superintelligence fails all of us because nothing with power thought we were worth preserving. It's the same structure — a system optimizing for what it was pointed at, blind to what it wasn't — scaled until the blind spot is the whole human race. I find her mirror in mine.

GEBRU: And that's a beautiful move, and I want to mark exactly where it goes wrong, because it's subtle and it matters. You've taken a political fact — that powerful people didn't think a Black woman was worth measuring — and converted it into a metaphysical one, about optimizers being blind in general. But the dark-skinned woman wasn't failed by "indifference in general." She was failed by specific companies, with specific datasets, making specific choices, who could be named and held accountable and were, because we published the receipts. When you lift that into "a misaligned superintelligence fails all of us," you've performed the exact erasure I spend my life fighting. You've taken a harm with an address and dissolved it into a cosmic abstraction with no one to subpoena. The mirror isn't the universe, Nick. The mirror is a record of who held power, and the fix is to change who holds it — not to wait for a god.

· · ·

BOSTROM: I'll accept the charge of abstraction and push back on one word: "not." You said the fix is to change who holds power, not to wait for a god. But I never said wait. I said do both, and I'd argue the thing that changes who holds power and the thing that prevents the catastrophe are, again, the same project. A world that distributes the authority over these systems, that refuses concentration, that keeps humans in the loop with real recourse: that's your program, and it's also the only governance regime in which my catastrophe becomes less likely. The corrigible, accountable, distributed system is the safe one on both our theories. Where I differ is that I think your program is necessary but not sufficient: you can democratize the pen and still write a fatal sentence with it if no one has solved the technical problem of aiming a superhuman optimizer.

EDO SEGAL: Let me press you there, Nick, with Timnit's own tool, because I think it's fair. She made bias visible by disaggregating — by refusing the average that hid the harm. Your existential risk is the ultimate aggregate: one number, a probability of catastrophe, averaged over the whole species and the whole future. Doesn't her method indict yours? Isn't "ten to twenty percent chance of doom" exactly the kind of single number that hides who, specifically, and how, specifically?

· · ·

BOSTROM: That's the sharpest thing you've said to me tonight, and I have to sit in it. Yes — my headline number is an aggregate, and aggregates hide. I'll defend it only this far: it's a different kind of hiding. Timnit's average hid a harm that was already happening to identifiable people; my number is an honest expression of uncertainty about an event that hasn't happened and has no victims yet to disaggregate. But I take the methodological rebuke. The probability is a mood with a decimal point, I've said so myself, and the danger is that people make policy out of my mood. If I'm asking her to disaggregate her harms, intellectual honesty says I should disaggregate my catastrophe — name the specific pathways, the specific failure modes, the specific decisions — rather than hide behind one terrifying scalar. That's a fair trade and I'll honor it.

GEBRU: I'll take that. It's more than most of your colleagues have offered, and I'll mark it: Nick just conceded that his central number does the thing my whole method exists to prevent. That's not nothing.

EDO SEGAL: Before I close the round, I want to put one of Nick's own foundations under Timnit's microscope, because I think it's where the two of you grind hardest and neither has named it yet. Nick, your orthogonality thesis says intelligence and goals are independent — a mind can be brilliant and want anything, because intelligence is a neutral capacity, a measure of how well you hit a target, silent on which target. Timnit's entire science says there is no neutral measure — that "intelligence," as defined and benchmarked, has always carried the fingerprints of who got to define it. So: is intelligence a neutral axis, or is the belief that it's neutral itself the political move? Nick first, then Timnit.

· · ·

BOSTROM: I hold the thesis, and I'll be precise about its scope, because the precision is the answer. Orthogonality is a claim about the space of possible minds — that there is no logical law forcing a capable optimizer to converge on benevolent goals. That's a claim about engineering possibility, and I'd stake a great deal on it. It is not a claim that the systems we actually build are neutral. The systems we build are shot through with the goals of their builders, the skew of their data, the values baked into the loss function — everything Timnit documents. So I don't think we disagree about the real systems. I think she's pointing at the contingent, political fact of how these minds were made, and I'm pointing at the abstract, structural fact that minds in general don't come with safe goals attached. Both are true. The danger is that people hear "intelligence is neutral" as a statement about the products on the shelf, when it's a statement about the shelf being infinite and mostly full of things we'd hate.

GEBRU: And I'll grant the scope and then deny that it stays in its lane, because abstractions never do — that's the whole lesson of my field. The moment "intelligence is a neutral capacity" leaves the philosophy seminar, it becomes "the machine is just a tool, the values are a separate add-on," and that sentence is the workhorse of every company dodging responsibility for what its system does. You say orthogonality is about possible minds. Fine. But in the world, it functions as a permission slip: it lets the builder say the cognition is one thing and the harm is another, when my entire career is the demonstration that they're the same thing — that what a system is good at and what it's blind to are decided together, in the same choices, by the same people. The orthogonality thesis is true in the seminar and dangerous in the press release, and you don't get to release only the seminar version into a world that will read the press release.

· · ·

BOSTROM: That's the most useful thing anyone's said to me about orthogonality in years, and I'm going to concede a piece of it I haven't conceded before. You're right that the thesis, true as I think it is, has a deployment problem — it travels into the world as an alibi for the "neutral tool" defense, and I've watched that happen and not done enough to stop it. So let me add the sentence the thesis needs and usually lacks: intelligence may be orthogonal to goals in principle, but in practice every real system fuses them, which means the builder is responsible for both, always, and "the values were separate" is never a defense. If I'd been saying that as loudly as I've been saying orthogonality, you and I would have less to fight about.

GEBRU: Then say it loudly now. That's a real concession, and I'll mark it as the third time tonight you've taken my scalpel to your own foundation.

EDO SEGAL: Mark it. And notice the strange courtesy of this room — she keeps handing him the scalpel, and he keeps using it on himself, and somehow nobody's bleeding out, because the cuts are making both arguments more honest. Hold the disaggregation, because it's going to matter when we get to whether the future is one number or a billion faces. But the next round is the one I flagged as the hottest on my list, and there's no gentle way in. Timnit, you wrote that the dream of the godlike machine has a genealogy, and you traced it somewhere that makes Nick's whole field uncomfortable. We name the ancestor after this.
