HOUR ONE — THE CURVE AND THE RECEIPTS

Chapter 4

The Parrot and the Receipts

Page 1 · The Parrot and the

EDO SEGAL: Timnit, in 2021 you and Emily Bender, Angelina McMillan-Major, and Margaret Mitchell published "On the Dangers of Stochastic Parrots," and the phrase escaped the academy and entered the language. Most people who use it think it's a taunt. You'd say it's a spec sheet. Give us the spec. And Ray, before you answer, I want you to do something a debate rarely asks. Steelman it. Tell us what the parrot gets right, in your own words, before you tell us what it misses.

GEBRU: It's precisely a spec. A language model is a system for haphazardly stitching together sequences of linguistic forms it has observed in its training data, according to probabilistic information about how they combine — but without any reference to meaning. Stochastic: random in a patterned way. Parrot: the mimicry is real and the comprehension is absent. We chose the words because they were accurate, and because we could see the vocabulary war coming. The industry was already saying "understands," "reasons," "thinks," and every borrowed word was doing unpaid persuasive labor in funding rounds and policy rooms.

But the parrot was one section of the paper, and people forget the rest, and the rest is the part that matters most tonight. The paper was a cost accounting. The environmental and financial footprint of training these models, which falls hardest on the communities least likely to benefit and most exposed to climate harm. Training corpora scraped so indiscriminately that no human can document them, so the biases of whoever was loudest online get laundered into an authoritative voice and deployed on everyone, including the people the data erased. The opportunity cost of a whole field stampeding down one path because that's where the compute is. And the harm we listed last because it summarized the others: that fluent, ungrounded text at scale would pollute the information ecosystem. We wrote that before the flood. I take no pleasure in how it has aged.

EDO SEGAL: Ray. Steelman first. What does the parrot get right?

· · ·

KURZWEIL: I can do that honestly, because parts of it are simply correct and I won't pretend otherwise. It gets the costs right — the energy, the water, the scraping, the documentation problem. Those are real, they're measurable, and the industry has been cavalier about them. It gets the human reflex right: we do read minds into fluent text, automatically, and that reflex can be exploited and is being exploited. And it gets the warning right that competence inside a distribution doesn't guarantee competence outside it. Every engineer should have that tattooed somewhere. There's the steelman, and I mean it.

Now the place it fails, and it's load-bearing. The claim that there is "no reference to meaning" — that prediction over form can never reach what the form is about — is an empirical bet, and the bet is losing. To predict the next word well, across the full range of human discourse, you cannot get by on surface statistics. The text is about a world. Objects fall, a character who died in chapter two stays dead in chapter nine, a mother is older than her daughter. A predictor that fails to model those regularities pays in error, every time, across trillions of examples, and the only thing that reduces that error is an internal model of the world the text describes. The system isn't modeling the wake of language. It's reconstructing the boat from the wake, because the wake is lawful and the laws are the boat's. When Timnit says "no reference to meaning," she's describing the training signal. She is not describing what the network was forced to build in order to survive that signal.

· · ·

GEBRU: It answers the bear question now, Ray — the canonical counterexample is in the training data, along with this whole genre of gotcha and reply. You can't cite a system's performance on the exact example designed to break it as evidence the example was wrong; that's the contamination problem with a bow on it. And "reconstructing the boat from the wake" is a beautiful phrase doing illegitimate work. It quietly converts text produced by people who have a world into the world itself. Show me the mechanism — the actual mechanism, not the adjective "emergent" — by which statistics over symbols acquires reference, the connection to the non-symbolic, the grounding that a body in a world has and a corpus does not. Until then, "it understands" is exactly the borrowed word I refuse to lend you for free.

KURZWEIL: The mechanism is the same one you use. Your brain sits in the dark inside a skull and never touches the world either — it receives spike trains, patterns, and it builds a model so good you call it reality. You're a pattern recognizer that got enough data through a narrow channel. I wrote a whole book arguing the neocortex is a hierarchy of pattern recognizers, and the transformer vindicated the architecture. So when you demand I show how pattern becomes reference, I hand the demand back: show me how your pattern becomes reference, mechanism named, and if you can't, you can't hold the machine to a standard you can't meet yourself.

· · ·

GEBRU: Because my pattern comes with a body that acts and an environment that pushes back and stakes that are real — thirst that gets quenched or doesn't, a child who is or isn't where I left her, a consequence I cannot edit out of the next prompt. That's what disciplines my signals into reference. The model's loop closes through text about the world. Mine closes through the world. You can call both "just patterns" only by ignoring everything that makes one of them accountable. And here is where I want to be very precise, because it's the seam of the whole evening: even if you were right that there's a world-model in there, Ray, it would not touch my actual argument. Gender Shades wasn't about whether the system understands. It was about who it fails and who decided that failure was acceptable to ship. A system can have the most exquisite internal representations you like and still be a machine for denying darker-skinned women loans, because the question of what it understands and the question of who it harms are different questions, and your whole framework is built to make the second one disappear into the first.

EDO SEGAL: Before you answer that, Ray, I want to stay one beat longer on Gender Shades, because I don't think the audience has the full machinery of it yet, and the machinery is the argument. Timnit — most people hear "the system was biased" and think the fix is a better dataset. But you've said the genius of Gender Shades wasn't the finding, it was the frame. Tell them what you actually did with the numbers.

· · ·

GEBRU: I disaggregated. That's the whole move, and it sounds boring, and it's the most political act in measurement. Vendors reported a single accuracy number — ninety-something percent, sounds great. A single number is a way of hiding. It averages the well-served and the ill-served into one comforting figure, and the people failing disappear into the mean. So instead of one number, Joy and I split the test set across the intersection of skin tone and gender — darker and lighter, female and male — and suddenly you could see what the average was built to conceal: under one percent error for lighter men, nearly thirty-five percent for darker women. The bias wasn't hiding in the algorithm in some mystical sense. It was hiding in the choice of metric. Aggregate accuracy was the camouflage.

And the reason I insisted on the intersection — not race, not gender, but both at once — comes from a tradition the field would rather I didn't name: Black feminist thought, the standpoint that says harms compound where axes cross, and that a system audited only for gender or only for race will systematically miss the person who sits at the crossing of both. The darker-skinned woman a face system erases is not the sum of two separate problems. She's a distinct case that single-axis analysis is structurally guaranteed to miss. And here's the part that makes it science and not just justice: building the measurement to see her required first believing she was there to be seen — which a homogeneous field, full of people her failure never touches, could not imagine. That's why diversity was never decoration to me. It's epistemology. A field with blind spots builds those blind spots into its instruments, and then calls the result objective.
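Editor's sketch: the disaggregation described above can be illustrated in a few lines of Python. Everything here is an assumption made for illustration: the record layout, the function names `make_records` and `error_rate`, and above all the counts, which are invented toy numbers and not the Gender Shades results. The sketch computes one aggregate error rate, then the same rate split along a single axis, then split across the intersection of both axes.

```python
from collections import defaultdict

# Hypothetical toy counts per subgroup: (sample size, number of errors).
# These figures are invented for illustration; they are NOT Gender Shades data.
TOY_COUNTS = {
    ("lighter", "male"): (400, 2),
    ("lighter", "female"): (300, 18),
    ("darker", "male"): (200, 20),
    ("darker", "female"): (100, 30),
}


def make_records(counts=TOY_COUNTS):
    """Expand the counts into per-sample records: (skin_tone, gender, correct)."""
    records = []
    for (tone, gender), (n, errors) in counts.items():
        records += [(tone, gender, False)] * errors
        records += [(tone, gender, True)] * (n - errors)
    return records


def error_rate(records, key=lambda r: "all"):
    """Return {group: error rate}, where `key` maps a record to its group."""
    totals = defaultdict(int)
    wrong = defaultdict(int)
    for record in records:
        group = key(record)
        totals[group] += 1
        if not record[2]:
            wrong[group] += 1
    return {group: wrong[group] / totals[group] for group in totals}


records = make_records()
views = {
    "aggregate": error_rate(records),
    "by skin tone": error_rate(records, key=lambda r: r[0]),
    "by gender": error_rate(records, key=lambda r: r[1]),
    "intersection": error_rate(records, key=lambda r: (r[0], r[1])),
}
for name, rates in views.items():
    print(name)
    for group, rate in sorted(rates.items(), key=lambda kv: kv[1]):
        print(f"  {group}: {rate:.1%}")
```

With these toy counts the aggregate error is 7 percent, the worst single-axis group (darker skin tone) sits near 17 percent, and the darker female subgroup reaches 30 percent. Neither the aggregate figure nor either one-axis split reports the rate experienced by the worst-served group, which is the point of auditing at the intersection.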

· · ·

KURZWEIL: And I accept that entirely — the disaggregation is good science, the intersectional frame is correct, and a field that resists collecting the breakdown is a field organized not to know what it's doing to people. I'd only point out, and I think you'd agree, that this is a triumph of measurement, which means it's a triumph that scales — every model now gets audited the way you taught the field to audit. The correction propagates. That's the curve absorbing your insight.

GEBRU: It's the curve absorbing the insight only because human beings paid to force it in, Ray, and you keep wanting to launder the credit. Audits aren't free, they aren't mandatory, and the moment they become inconvenient the people running them get reorganized out of existence — I would know. Don't tell me the system self-corrects. Tell me who has to bleed to make it correct, and then tell me why we've built it so that they have to.

EDO SEGAL: That's the sharpest thing said tonight, and I want to make sure it lands. Ray, she conceded your hardest point for the sake of argument — fine, say there's a mind in there — and told you it doesn't matter to the harm. Does your framework have an answer to the harm that doesn't route through the promise that the curve will fix it later?

KURZWEIL: It has two. The near-term answer is exactly what she did: measure, disaggregate, embarrass, force the fix — I'm for all of it, and I'll say it cost her a job to do it and that's a scandal. The long-term answer is the one she'll hate, but I have to be honest about my actual view: a system that genuinely models the world is correctable in a way a pure parrot is not, because you can reason with it about its errors. The path through the harm runs through more capability, not less, applied with more accountability, not less. I want both pedals. She thinks the gas pedal is a lie. I think the brake without the gas is just a slower way to leave the village without a doctor.

· · ·

GEBRU: "Correctable because you can reason with it" — Ray, you just described why hype is dangerous in one sentence. The thing produces a confident falsehood, your industry invents the word "hallucination" so it doesn't have to say "the model has no idea what's true," and now you're telling me its world-model makes it correctable. A thing that understood the world would not need the euphemism. The euphemism is the confession.

EDO SEGAL: Hold there — both of you. We've found the fork. Ray says more capable means more correctable; Timnit says the need for the word "hallucination" is the disproof. Mark it. Because the next round is the one where the receipts get personal — Google demanded a paper be retracted, and the person who wrote it is sitting at this table. What does it tell us that this field fired the people who counted its costs? After the break.

· · ·
