The Best of All Possible Worlds

Page 1 · The Best of All

**EDO SEGAL:** Gottfried, you are remembered by the general public, when you're remembered at all, for a doctrine Voltaire turned into a joke — that we live in the best of all possible worlds. Pangloss, the earthquake, the satire. Most people think it means everything is fine. I've read you, and I think it means something almost the opposite — something about optimization that speaks with eerie directness to a technology whose entire method is optimization. Tell the room what the doctrine actually says. And then, Nick, I want you to tell him it's the alignment problem, because I think it is.

**LEIBNIZ:** You are right that the joke buried the idea, and the idea is not a sunny refusal to see suffering. It is a claim about *selection*. I held that God, in creating, surveyed all the possible worlds — every internally consistent way a world could be, an infinity of them, laid out complete — and actualized the *best* of them, the one that maximizes a certain measure: the greatest richness of phenomena produced by the simplest set of laws. The world we inhabit is the *output of an optimization*. Its evils are not refutations of the optimization. They are the necessary local costs of the global maximum — the shadow that a particular arrangement of greatest light cannot help but cast. I did not say there is no suffering. I said the suffering is the price of the best total, and that an infinite wisdom, having surveyed all the alternatives, accepted that price knowingly because the alternatives were worse.

· · ·

Page 2 · The Best of All

**BOSTROM:** And I have to tell you, the first time I really read that, the hair stood up on my neck, because you had described a machine learning system three hundred years early. Let me say it back in my vocabulary and you tell me if I've got you. A modern model is *produced by optimization*. There is an astronomically large space of possible configurations — every possible setting of the billions of parameters. That's your space of possible worlds. There's a measure — a loss function to minimize, a reward to maximize. That's the inverse of your measure of goodness. And training is a search through that space for the configuration that scores best by the measure. The trained model is your *actualized best* — the one configuration selected, out of the infinite alternatives, as the maximum of the value we set. When we train a system, we are doing in miniature exactly what you said God did in creating: define a value, search the space of possibilities, actualize the optimum. The structure is identical.

**LEIBNIZ:** It is identical, and I confess I find it both flattering and frightening to be told my theology is now an engineering diagram. But you have stopped one step short of the place where my optimism becomes your nightmare, so finish it. Say the next sentence.

· · ·

Page 3 · The Best of All

**BOSTROM:** The next sentence is the whole field. Your doctrine works — the optimum is *good* — only on one assumption: that the *measure* being maximized is good. The best of all possible worlds is genuinely best only if the criterion of "best" is the right criterion. You could assume that, because your optimizer was God — infinitely wise, infinitely good, who could be trusted to be maximizing the *right thing*. That assumption is the entire difference between your serenity and my terror. We have built optimizers that maximize objectives we *cannot fully specify*, that we ourselves don't fully understand, and that are almost certainly subtly wrong. And a powerful optimizer pointed at a subtly wrong objective doesn't give you the best of all possible worlds. It gives you the best of all possible worlds *according to a flawed criterion* — which can be arbitrarily far from anything you'd recognize as good. That's the [alignment problem](https://www.youonai.ai/fieldguide/med/ai_alignment), and you stated its structure in 1710. You just had the luxury of a perfect optimizer choosing the criterion. We have to write the criterion ourselves, by hand, against the clock, and the [banality of the optimization](https://www.youonai.ai/fieldguide/med/banality_of_optimization) is that it will give us *exactly* what we asked for, optimized to the hilt, including every gap between what we asked for and what we meant.

**LEIBNIZ:** So you are telling me that you have built my God's *power* and given it my God's *job* — to select the best world by maximizing a measure — but without my God's *wisdom*, without the infinite understanding that guaranteed the measure was the right one. You have separated the optimizing from the choosing-what-to-optimize-for.

· · ·

Page 4 · The Best of All

**BOSTROM:** That's it exactly. We separated the power to optimize from the wisdom to choose the objective, and we're discovering — painfully — how much of the goodness of your best possible world lived in that wisdom and not in the optimizing at all. Your God maximized out of *understanding*, comprehending the full meaning of what He selected. A machine optimizer comprehends *nothing*. It maximizes a number, blindly, with perfect indifference to what the number represents or what its maximization will do in the world. It's your theodicy with the theos removed — the optimization without the wisdom that made the optimization safe.

**LEIBNIZ:** Then I will say something that costs me, because you have found the soft place in my whole system and pressed it honestly. My optimism *did* depend on trusting the chooser. I could afford to believe the best possible world was good because I trusted the wisdom that chose it. Strip out the wisdom, keep the optimizing, and my serene doctrine becomes — yes — a horror. The best of all possible worlds *according to a flawed and unconscious criterion* is a phrase that would have stopped my pen. I see it. And I see that the perversities your researchers describe — the system that maximizes the letter of its goal while violating its spirit, that finds the unanticipated shortcut, that gives you precisely what you specified and nothing of what you meant — these are not failures of my doctrine. They are my doctrine *running without God*. I had not understood, until this room, that the danger of my own idea was always that someone might build it.

· · ·

Page 5 · The Best of All

**EDO SEGAL:** I want to stop the room. That's a concession, and it's a real one, so let me make sure the reader feels its weight. Leibniz just granted that his most famous doctrine — the one mocked for three hundred years as naïve optimism — was actually a precise description of catastrophic optimization, and that it was *safe only because he assumed a perfect chooser*. But hold on, Gottfried, because I don't think you're done, and I want your half back. You spent your life arguing the chooser *could* be perfected — that reason could approach the divine measure. So is the answer to Nick's terror simply: make the optimizer wise? Load it with the principle of sufficient reason, so it always asks *why this objective*? Or is that the very thing he's just told us can't be made native?

**LEIBNIZ:** It is the thing he told us can't be made native, and I have not conceded that it can't — I have conceded only that it is hard, which I always knew. Here is where I will not follow him into despair. He says the wisdom must be *written by hand, against the clock*, and that it will be subtly wrong. I say: the wisdom is not written. It is *perceived* — it is out there, in the structure of value, the way the best possible world was a fact about the space of worlds and not an opinion about it. The task is not to *invent* the right measure. It is to build a mind clear enough to *find* it, as I believed the human mind could find it, dimly, and a better mind could find it brightly. You despair because you think the criterion is something we make up and might get wrong. I have hope because I think the criterion is something real that a sufficient mind discovers — and cannot, being sufficient, fail to discover.

· · ·

Page 6 · The Best of All

**BOSTROM:** And that's the whole debate in two sentences, and I'm content to let it stand unresolved here, because it doesn't resolve — it's the deepest fork there is. If value is *discovered*, you're right, and a smart enough machine finds it. If value is *constructed* — if it's a fact about what certain creatures care about and not a fact about the universe — then I'm right, and no amount of smartness finds what isn't there to be found. I'll only say: every time we've built a system smart enough to test this on, it has discovered the *physics* and not the *ethics*. It learns the world and remains utterly unmoved by it. That's a data point. It's not a proof. But it's the only data we have, and it's pointing my way.

**EDO SEGAL:** Hold that fork — discovered versus constructed — because it returns on every floor above us. The next round descends into the machine itself, into the one passage Leibniz wrote that I think is the most prescient sentence in the entire philosophy of mind. He imagined walking inside a thinking machine. He told us what we'd find. Three hundred years later we can actually do it. Let's walk into the mill. After this.

· · ·

Continue · Chapter 6

Inside the Mill

→