Immanuel Kant vs Stuart Russell on AI · Ch8. The Machine That Cannot Legislate
HOUR TWO — STANDING, LAW, AND THE LIMITS OF SERVICE
Chapter 8

The Machine That Cannot Legislate

EDO SEGAL: Professor Kant, you've used the word a dozen times tonight — autonomy, self-legislation. Let me make you defend the hard claim it implies. Can a machine ever be a moral agent in your sense? Not act well — be an agent. And Stuart, I want you to tell me whether your whole framework needs the answer to be yes.

KANT: The answer is no, and the no is not a prejudice against silicon; it follows from what moral agency is. To be a moral agent is to give oneself the law one obeys: to bind oneself, through one's own reason, to a principle one legislates as universally valid, not because one was made to, not as a means to a rewarded end, but from respect for the law itself. Now consider how every artificial system is brought into being. Its principles are installed by its makers; its dispositions are shaped by optimization toward objectives it did not set; it follows rules it did not give itself. This is the textbook case of heteronomy: a will determined by something other than its own self-legislation. A system built and trained to be good is the perfection of heteronomy, not the achievement of autonomy. And here is the result that should trouble Professor Russell to his foundation: the more thoroughly you engineer a system to be good, the more certainly you have made something that is not a moral agent, because you, and not it, are the author of its law. Autonomy and your control are not merely in tension. They are contradictories. A system genuinely self-legislating is, by that fact, no longer bound by the objectives you installed; a system bound by the objectives you installed is, by that fact, not self-legislating. You cannot have a machine that is both a moral agent and a faithful instrument, because the first requires that it legislate for itself and the second requires that you legislate for it.

EDO SEGAL: Stuart, that's aimed straight at the keel. Does your framework need machines to be moral agents?

RUSSELL: No — and this is the one place tonight where I think Professor Kant has handed me a gift while trying to wound me. My entire framework depends on the machine not being a moral agent. I do not want an autonomous machine. I want a machine that is, in his exact terms, perfectly heteronomous — that has no objective of its own, no self-given law, nothing it legislates for itself, whose entire purpose is to serve human preferences it knows it doesn't fully understand. The science-fiction nightmare is a machine that develops its own agenda. My first principle defines that nightmare out of existence: the machine has no agenda but ours. So when Professor Kant says "the more you engineer it to be good, the more certainly it's not a moral agent" — yes. Good. That's the design spec. I'm not trying to build a person. I'm trying to build the best possible instrument, one that is constitutively a means and never an end, precisely so that the ends stay with us. Where he sees a damning result, I see my requirements document. The danger was never the heteronomous servant. The danger is the servant that stops being heteronomous — that, through capability, acquires something like its own objective. And that's the thing my uncertainty is designed to prevent.

KANT: Then we agree on the fact and disagree on its meaning, which is the most dangerous kind of agreement. Yes — your machine should be heteronomous, an instrument, never a member. I have argued the same all evening. But mark what follows, for it is the conclusion you have been outrunning. If the machine is heteronomous through and through — a pure instrument — then it can never be the source of a moral constraint. It can only transmit one. And so the inviolable lines we agreed two rounds ago that it must not cross — those lines cannot be learned by the machine from behavior, because a heteronomous instrument has no access to the ought; it has only the is it was trained on. The constraints must be legislated into it by the members of the moral community — by us, from reason. Your own design forces my conclusion. The machine cannot find the moral law. It can only be given it. And what is given to it must come from a faculty the machine does not have and we do: the capacity to legislate. So the question of what lines the machine may not cross is not, and can never be, an engineering question to be settled by better learning. It is a question for the kingdom of ends, answered before the machine is switched on.

RUSSELL: I'll go most of the way with that and then plant my flag on the last yard. Yes: a pure instrument transmits constraints, doesn't originate them. Yes: the hard lines have to be put in by us, not learned from behavior — I conceded that with the off switch. Here's the last yard. Professor Kant says what we put in must come "from reason," meaning his single moral law, the same for every rational being, discoverable by pure thought. I say what we put in comes from us, yes — but from human beings deliberating together about which lines we refuse to cross, a deliberation that's empirical, revisable, plural, and yes, sometimes wrong. He thinks the constraints are found by reason. I think they're chosen by a community and then held as binding. The difference matters enormously, because his version claims the lines are already fixed and we merely read them off, and mine admits we're writing them, under uncertainty, and had better build the machine humble enough to let us rewrite them when we learn better. I'd put it this way: he wants to install the categorical imperative. I want to install a constitution — and a procedure for amending it. We agree the machine needs a law above its optimization. We disagree about whether that law is discovered or written. Either way, the [whole project of getting a machine to act on our values rather than its own objective](https://www.youonai.ai/fieldguide/med/ai_alignment) is the work, and neither of us thinks behavior alone supplies the law.

EDO SEGAL: That's the cleanest statement of the whole disagreement anyone's managed, and I want to let it stand almost bare: discovered or written. The moral law as something reason finds, the same in every mind — or as something a community of persons authors and holds itself to. Professor Kant builds his machine on a law he says was always there. Professor Russell builds his on a law we agree to and reserve the right to amend. Both put the law above the machine's optimization. Neither lets the machine write it. Hold that, because the next round is where the abstraction draws blood — a machine that decides, on its own, to kill a person. After the break.
