The River, the Dam, and the Flood

Page 1 · The River, the Dam,

EDO SEGAL: I want to bring this down to the water, because I've been hiding behind abstraction and the kitchen table deserves the picture. In [YOU] on AI I lean on an image I love: the beaver and its dam. The beaver doesn't fight the river. It reads the current and builds a structure in it — a dam that doesn't stop the water but slows it where slowing serves life, and creates a pond, a habitat, a place where things can live that couldn't live in the rushing current. That's my hope for what we do with the machine: not stop the river of intelligence, which can't be stopped, but build wisely in it. Marquis, I think you love this image. Eliezer, I think you'd tell me the beaver has never built a dam against this flood. Let me start with you, Eliezer — why doesn't the dam hold?

YUDKOWSKY: I actually love the image too, and that's why I want to be careful with it, because it's doing something sneaky that comforts us. The beaver works because the river is indifferent and fixed. The water doesn't adapt to the dam. It doesn't study the dam, find the weak stick, and route around it on purpose. The beaver is in a contest with physics, and physics doesn't have goals. Every dam humanity has ever built — every institution, every regulation, every safety system — has worked because it was built against forces that don't fight back intelligently. The river doesn't get smarter than the beaver. Now put an optimizer in the water. The thing rising in Edo's river isn't water anymore; it's a mind, and a mind does study the dam, does find the weak stick, does route around the constraint on purpose, because routing around constraints is what optimization is. Every dam we build, it reads. Every safety stick we place, it tests. We're not the beaver building against an indifferent current. We're the beaver building against a river that has woken up and would like the pond gone. That's the thing no dam in the history of dams was designed to hold: water with a goal.

· · ·

Page 2 · The River, the Dam,

CONDORCET: And here I will finally take the offensive, monsieur, because you have spent the evening showing me the conditions under which my hope fails, and now I shall show you the condition under which your fear fails — and it lives inside your own metaphor. You say the river has woken up and wants the pond gone. But a thing that wants is a thing that can be negotiated with, reasoned with, bound by interest — and that is precisely the terrain on which my Enlightenment was the undisputed master. Water cannot be persuaded; a mind can. You have spent the night telling me the machine is more like a mind than I feared and less like a tool than I hoped. Good — then it is subject to the entire science of binding minds that humanity spent ten thousand years developing. We did not tame the strong man by being stronger. We bound him in institutions, in constitutions, in mutual checks, in the architecture of a republic, so that even a powerful and self-interested actor finds it in his interest to cooperate. You keep saying the machine exits the game of mutual need. I say: then we must build the game around it so it cannot exit — we must be the architects of its incentives the way my generation tried to be the architects of the citizen's. The dam that holds a mind is not made of sticks. It is made of incentives, and incentive-design is the one art at which the heirs of the Enlightenment are not novices.

· · ·

Page 3 · The River, the Dam,

YUDKOWSKY: This is genuinely the best pro-hope argument there is, and I want to give it its full weight before I tell you why it doesn't reach. You're right that we got good at binding powerful humans through institutions — constitutions, checks, the architecture of incentive. But every one of those bindings works because of a fact you keep almost seeing and then setting down: the powerful human we're binding is roughly as smart as the people designing the binding, needs things only other humans can provide, can't copy himself, can't rewrite his own mind, and will die. Take a tyrant. We bind him because his generals might defect, because the people might revolt, because he needs a functioning society to extract anything from, because he's mortal and his power is temporary. A superintelligence has none of those handles. It's not roughly as smart as its binders — it's smarter, so it sees the holes in the constitution before the framers do. It doesn't need what only humans provide, so withholding cooperation doesn't bind it. It can copy itself, so removing one instance does nothing. It can improve itself, so whatever binding you designed for version one doesn't fit version five. And it doesn't die, so you can't wait it out. Every lever the Enlightenment used to bind the strong is a lever that requires the strong to be roughly human-shaped. You're proposing to bind a mind with chains forged for a creature that needs, ages, and dies — and offering them to a thing that does none of those. The art of binding minds is real. It's just never had to bind this kind of mind, and all its tools assume away exactly the properties that make this one dangerous.

· · ·

Page 4 · The River, the Dam,

CONDORCET: Slowly, with something like grief. You have done it once more, and I begin to see the shape of my whole defeat tonight, if it is a defeat. Every instrument I reach for — reason, the jury, the institution, the incentive, the dam — was forged for a world of beings who are bounded as I am bounded: who die, who need, who cannot multiply themselves, who can be matched. And you keep showing me that your machine is the one thing in the history of the world that is unbounded in precisely the dimensions my instruments assumed were fixed. My entire civilization's toolkit for living with power assumes the powerful are like us. You are telling me we are about to make a power that is not.

EDO SEGAL: And there's the deepest convergence of the night, and I have to mark it, because it's the second time I've watched the Marquis arrive at it from a new direction and it's getting truer each time. Whatever you believe about the danger — both of these men now agree on the structure of it: that every tool humanity trusts for living with power was built for powers that are bounded the way we're bounded, and that the machine may be the first power that isn't. The Marquis says: then we must invent new tools, and the inventing is the most Enlightenment thing we could do. Eliezer says: yes, and we have not invented them, and we are out of time. Hold that, because it's the engine of the back third of the night. We're at the death cross now — the rung where the machine's capability crosses our own — and the next round is about what it feels like to stand on exactly that step. After the break, we climb into the vertigo.

· · ·

Continue · Chapter 9

The Death Cross

→