Marquis de Condorcet vs Eliezer Yudkowsky on AI · Ch10. The Tribunal of Reason ← Ch9 Ch11 →
Txt Low Med High
HOUR TWO — THE DEATH CROSS AND THE TRIBUNAL
Chapter 10

The Tribunal of Reason

Page 1 · The Tribunal of Reason

**EDO SEGAL:** Eliezer, in 2023 you didn't just warn — you *prescribed*, and the prescription is the most controversial thing you've said. An indefinite, global moratorium on large training runs. Tracking the specialized hardware. And, most explosively, a willingness to enforce it with force — to destroy a rogue datacenter by airstrike rather than let the moratorium be violated, because you judged uncontrolled superintelligence a greater danger than a shooting war between nations. Marquis, you helped design constitutions; you believed in the tribunal of reason, in institutions as the cure for power. So this round is about whether the Enlightenment's great invention — the institution — can hold this, or whether Eliezer is right that it's already too slow. Eliezer, defend the airstrike to a man who spent his life on the rule of law.

**YUDKOWSKY:** I'll defend it, and I'll defend it *to* him specifically, because he, of all people, has the framework to understand it. Marquis — you accept that some threats justify extraordinary, even violent, collective measures. Your own century built the doctrine of the just war and the right of self-defense. We maintain, right now, a global machinery of nuclear deterrence — we point world-ending weapons at each other and accept the risk of catastrophe — because we've judged that *certain outcomes are worse than the risks taken to prevent them*. All I did was apply that same logic, that *you* would recognize, to a threat I rank above nuclear war: the extinction of everyone, everywhere, permanently. If you genuinely believe the default outcome of unrestricted development is that everyone dies, then a treaty without enforcement is not caution — it's a *suicide pact with good manners*. The airstrike line isn't bloodlust. It's arithmetic again. If the thing on the other side is human extinction, then "we were unwilling to risk a conflict over a datacenter" is not a moral high ground. It's the epitaph. I'd rather be the man who said the unsayable thing that prevented it than the man who was too polite to.

· · ·
Page 2 · The Tribunal of Reason

**CONDORCET:** I understand the logic, monsieur, and I will not call it monstrous, because I have seen real monstrousness and this is not it — this is a man reasoning honestly from a terrible premise. But I will tell you where it frightens me more than the machine does, and you must hear it from someone who *died of it*. You are proposing that a body of men, certain they are preventing a catastrophe, be empowered to use lethal force against those who do not share their certainty. Monsieur, I have *stood* where the targets stand. The Committee of Public Safety was *also* certain it was preventing a catastrophe — the death of the revolution, the return of tyranny, the enemy within — and from that certainty, that *arithmetic of the greater good*, it built the Terror and sent me and my friends to die. The most dangerous sentence in any language is *the stakes are so high that the normal rules must be suspended*, because it is the sentence every tyrant in history has used, and it is *always* spoken by someone convinced they are the exception. You may be right about the machine. But the *instrument* you propose — extraordinary power justified by extraordinary certainty about a danger others cannot see — is the precise instrument that has killed more innocents than any machine ever has. I do not fear your reasoning. I fear what your reasoning becomes in the hands of the next man, who is certain of something else.

· · ·
Page 3 · The Tribunal of Reason

**YUDKOWSKY:** *Pause.* That lands, and it lands harder than almost any objection I get, because it comes from someone who paid for it with his life, so let me not be glib. You're identifying a real and horrifying failure mode: "the stakes justify suspending the rules" is the tyrant's favorite sentence, and it has a body count that dwarfs anything I'm warning about *so far*. I genuinely sit with that. Here's the only thing I can say back, and I don't say it to win, I say it because I think it's true. The difference between my claim and the Committee's is *checkability*. The Terror's catastrophe was an unfalsifiable phantom — "enemies of the revolution" could be anyone, the danger could never be disproven, which is exactly what made it a license for unlimited cruelty. The thing I'm pointing at is, at least in principle, a *technical* claim about *physical systems* — about training runs, compute thresholds, capability measurements — things that can be specified, monitored, argued over, and *bounded*. I'm not asking for a standing power to kill whoever seems dangerous. I'm asking for a narrow, verifiable, internationally-agreed limit on a specific physical process, the way we limit uranium enrichment. You're right that the *form* of my argument is the tyrant's form, and that should terrify all of us, and it terrifies me. The only defense is that the referent is *concrete* in a way the tyrant's never is. Whether that defense holds — whether "concrete and verifiable" survives contact with real institutions run by real frightened people — is, I admit, exactly the thing I can't guarantee.

· · ·
Page 4 · The Tribunal of Reason

**CONDORCET:** *Nods slowly.* That is an honest answer, and the distinction is real — between a danger that can be specified and one that cannot. I will grant it. But notice, monsieur, that you have just made *my* argument for me, in the place I have been trying to reach all night. You say the defense against the tyrant's logic is *institutions* — verification, international agreement, specified thresholds, the rule of law applied to a concrete and checkable thing. That is the Enlightenment's entire program. That is the [tribunal of reason](https://www.youonai.ai/fieldguide/med/collective_attention) I gave my life to build. You came to this table telling me my hope was a lullaby — and now, at the sharpest moment, when you must defend your most violent proposal, you reach for *exactly the instruments I trust*: the treaty, the verification, the bounded and accountable power, the rule of law that distinguishes the legitimate constraint from the tyrant's whim. We do not disagree, monsieur, that the institution might save us. We disagree only on whether there is time to build it. And *that* — the question of time — is a question of will and of resources, which is to say it is a *human* question, and human questions are the ones my Sketch says we eventually, painfully, get right.

· · ·
Page 5 · The Tribunal of Reason

**EDO SEGAL:** And I'm going to let that be the convergence I mark for this round, because it's a genuine one and it's not small. Strip away the heat, and both of these men just agreed that the *only* thing that might save us is the Enlightenment's own invention — the bounded, accountable, verifiable institution, the rule of law applied to a concrete danger. Eliezer reached for it to defend his most extreme proposal. The Marquis spent his life building it. They disagree, fiercely, on whether we have *time* to build it before the death cross — but they agree on what the dam would have to be made of. Number that convergence, because we've now marked three tonight, and three agreements between these two men is genuine news. After the break, the round I've been saving — the one about what's actually at stake if we lose. Not survival. Something the Marquis spent his life on and Eliezer spends his nights on: the future itself, and whether anything we'd recognize as *us* survives in it.

· · ·
Continue · Chapter 11
Is There a Floor Above This One?
← Prev 0%
Ch10 Next →