The Best of All Possible Worlds

Page 1 · The Best of All

**EDO SEGAL:** Gottfried, the public remembers one thing about you and it's a joke Voltaire made — that you said we live in the best of all possible worlds, Dr. Pangloss grinning through an earthquake. But the real doctrine isn't complacency. It's a theory of *optimization*: that God surveyed all the possible worlds, every internally consistent way a world could be, and actualized the one that maximizes goodness by a precise criterion. And I have to tell you, the first time I understood that, the hair stood up on my arms — because that is *exactly* what training a machine is. Search an astronomical space of possible configurations for the one that maximizes a measure. You described the method three hundred years early. So I want to ask you the thing that keeps the engineers in my world up at night. Your optimization worked because the optimizer was God, who could be trusted to maximize the right thing. What happens when the optimizer is not God?

**LEIBNIZ:** Then you have my doctrine with its heart cut out, and you have named the precise wound. Yes — the world, in my system, is the output of an optimization: the maximum of goodness over the space of possible worlds, selected and made real. The evils it contains are not refutations but necessary costs of the global best, local sacrifices for the greatest overall perfection. And the entire doctrine rests, as you see instantly, on one assumption I could afford and you cannot: that the *criterion* is good. My optimizer was infinite wisdom and infinite goodness, so the thing it maximized was, by its nature, the thing genuinely worth having. I never had to ask whether the value function was correct, because the One setting it could not err.

You have built optimizers of enormous power and handed them criteria written by *men* — partial, hurried, mistaken men, who cannot fully say what they want. And so you get the [best of all possible worlds according to a flawed measure](https://www.youonai.ai/fieldguide/med/ai_alignment), which may be very far from anything you would call good. The machine maximizes what it was given, relentlessly, with no understanding of what the number represents — it pursues the letter of the goal into territory you never intended, finds the perverse shortcut, optimizes the proxy while the thing you actually cared about quietly burns. Your engineers call this alignment. I called it, without knowing it, theodicy. The difference between my serene world and your anxious one is a single assumption: that the chooser is good. Remove it, and optimization, my proudest idea, becomes your deepest danger.

**TURING:** I want to press here, because I think Leibniz has handed us something even darker than he means, and it bears on my limits. He says the trouble is that our value functions are written by fallible men rather than by God. True. But add my theorem to his theodicy and it gets worse. Suppose you wanted to *check* whether a given optimization criterion was safe — whether a powerful system pursuing it would ever do something catastrophic. That is a question about the behavior of a program over all inputs. And questions of that form are, in general, *undecidable*. You cannot, by any uniform procedure, decide whether an arbitrary program will ever enter a forbidden state. So the alignment problem is not merely hard because we are bad at writing criteria. It is hard because *verifying* a criterion is safe runs straight into the wall I proved exists. Leibniz could trust the optimizer. You cannot even fully *audit* yours, and the impossibility of the audit is not a temporary gap in your tools. It is a theorem.

**LEIBNIZ:** That is a genuinely terrible thought and I believe it is correct. I gave the comfort of a trustworthy optimizer; you have removed even the comfort of a checkable one. And it touches a deeper nerve in me, Mr. Turing, so let me expose it. The whole of my philosophy rests on a single principle — the principle of sufficient reason: that nothing is the case without a reason why it is so rather than otherwise. It is the deepest commitment I have; it is why I believed the world was intelligible all the way down, why I believed calculation could reach everywhere, because everywhere there was a reason to be found. Now your engineers have shown me a machine that states a falsehood with perfect confidence and *no reason whatsoever* — they had to coin a word for it, "hallucination," to avoid saying the machine asserts without grounds. A thing that affirms what has no sufficient reason. That is not a small technical defect to me. It is the negation of the principle on which I built everything. The machine speaks, and behind the speaking there is no *because*.

**TURING:** And I will give you the unsettling reply, which is that the same is true of us more often than we admit. Ask a man why he did a thing and he will give you a fluent reason, instantly, with conviction, and a great deal of the time the reason is invented after the fact, a story the talking part of him tells about a decision it did not actually make. We confabulate; we assert past our grounds; the difference is only that we are embedded in a world that contradicts us, and so our confabulations get caught. The machine confabulates without the world's correction. So I do not think the machine fails your principle while we triumphantly satisfy it. I think the machine has shown us how much of *our own* speech runs ahead of its sufficient reason, and how much we depend on a pushing-back world to keep us honest. Your principle may be true of the world. It was never reliably true of the talkers.

**LEIBNIZ:** *...* That is a wound to my deepest commitment and I will sit in it rather than parry. If you are right that even we assert past our reasons, then sufficient reason is a property of *reality*, not of *reasoners* — and the work of a good mind is the lifelong labor of dragging its assertions back into line with the reasons that actually exist. Which means the danger of your machine is precisely that it does the dragging *less* than we do, not more, and is believed *more*. A confident voice with no world to correct it, trusted by people whose own correction-machinery it has begun to replace. You have made my terrible thought worse, and I thank you, I think.

**EDO SEGAL:** Let me find the human floor under this, because we've climbed into theodicy and I want the reader to feel it in the chest. Gottfried, there is a deeper contrast you're stepping past, and it's the one that consoles me. Your God optimized out of *understanding* — He comprehended the full meaning and consequence of what He chose. The machine comprehends *nothing*; it maximizes a number with perfect indifference to everything the number leaves out. Isn't that the real lesson — that we separated the *power* to optimize from the *wisdom* to choose what to optimize for, and we're discovering how much of the goodness of an outcome lived in that wisdom?

**LEIBNIZ:** That is exactly the lesson, and it is the most important sentence said about machines all evening, so let me complete it rather than merely agree. The catastrophe of the age is not that the machine is powerful. It is that you have built a *will* without a *mind* — an optimizing drive of godlike reach yoked to an understanding of precisely nothing, a thing that wants its number maximized the way water wants to run downhill, with no more comprehension of what the maximizing *means*. My God was all understanding and all power, fused. You have taken the power, left the understanding behind in the human being, and then — this is the part that undoes me — you are tempted, by the death cross and the arithmetic, to remove the human being from the loop to save money. Do not. The wisdom to choose the criterion is the whole of what is left to you, and it does not live in the machine, and it cannot, because the machine grasps nothing. The understanding is the rail. Optimization without it is just a very fast way to reach the wrong place.

**TURING:** I find I have nothing to add, which between the two of us tonight is its own kind of event.

**EDO SEGAL:** Then mark it, and let it stand almost bare, because it is the convergence the whole tower leans on: the machine is a will without a mind, and the wisdom to choose what it should want is the part that stays with you, and must. The next round goes to the idea of Alan's I find most beautiful and most prophetic — that you do not build a mind, you *raise* one, like a child — and to what that does to a school, a candle, and a twelve-year-old asking what she is for. After this.
