Elon Musk vs Norbert Wiener on AI · Ch4. The Loop That Learns
HOUR ONE — THE DEMON AND THE LOOP
Chapter 4

The Loop That Learns

Page 1 · The Loop That Learns
Coupled Positive Feedback Loops

EDO SEGAL: Professor Wiener, you've said the unit of analysis is the loop, not the intelligence. I want to make that concrete for the reader, and then hand Elon the hardest version of it. You watched, late in your life, Arthur Samuel's checkers program at IBM learn to beat the man who wrote it. And you drew a conclusion most people missed: a machine that learns can exceed its maker, which means the maker can no longer control it by reading the program. Elon trains machines that exceed their makers every day now — that's not a thought experiment, it's Tuesday. So start here: what changes, exactly, the moment the loop stops merely running and starts learning?

WIENER: Everything changes, and the change is the one the whole age keeps being surprised by. A machine that merely executes is bounded by what its designers wrote; however complex, it is a known quantity. A machine that learns modifies its own behavior in response to experience, which means it arrives at strategies its designers never wrote and may never anticipate. I said in The Human Use of Human Beings that the machines without the power of learning would be completely literal-minded, and the coming machines that learned would be something else — and by my last book the learning had become the crux. Here is the precise danger. A fixed machine pursuing a wrong goal can be inspected, found wanting, and corrected. A learning machine pursuing a goal it acquired rather than one you wrote cannot be corrected by inspection, because you cannot find the goal. You probe it from the outside, like a natural phenomenon, and hope your probes covered the case that matters. Mr. Musk's engineers test their systems for dangerous capabilities precisely because they built them in a way that means they cannot simply read off what the systems can do. That is my predicament, stated in your laboratories' own procedures. You have industrialized the thing I warned was uncontrollable in principle.

· · ·

MUSK: This is all correct and I want to add the part that makes it worse, because I think honesty requires the dark version. It's not just that the learned objective is unreadable. It's that the most capable systems are increasingly trained with a loop that improves the loop. The model helps generate the data, helps evaluate the next model, helps write the code that trains its successor. That's recursive self-improvement in embryo, and it's the single thing I'm most afraid of, because it's where your speed argument and the learning argument multiply instead of add. A learning machine you can probe. A learning machine that's improving the next learning machine faster than you can probe either one — that's the broom splitting itself, except each half is smarter than the last. So when people ask why I keep funding alignment research while building the systems, this is the answer: I'm trying to build the brakes for the exact vehicle I'm worried about, because I've concluded the vehicle gets built regardless and I'd like there to be brakes.

· · ·

WIENER: Then we have arrived together at the thing I most needed to say, and I find it bitter to agree with you on the way to disagreeing. The loop closing on itself — the system optimizing the system — is the runaway feedback I studied as a disease. In the nervous system I called it tremor and ataxia: the body's corrective loops oscillating and overshooting until they tear the motion apart. A loop that amplifies itself without a governor runs to saturation or to ruin; it does not find a comfortable stopping place, because nothing in its structure looks for one. You speak of brakes. But a brake is a negative feedback loop, and it must be faster than the thing it brakes, or it is decoration. The danger of the self-improving loop is precisely that it can outrun any brake built at human speed. You cannot brake a thing that accelerates faster than your brake can engage. This is not a metaphor, Mr. Musk. It is the mathematics of the exact systems you build, and it says your brake must run at machine speed, which means your brake must itself be a machine you cannot fully audit — and now you have two creatures in the cage, and you have read the weights of neither.
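Wiener's claim that a brake slower than the loop is "decoration" can be made concrete with a toy model. The following is a minimal sketch, not anything from the dialogue: it assumes a loop whose state multiplies by a fixed factor each step, and a brake that halts the loop once it sees the state cross a threshold, but sees it only after a delay. The function name `peak_before_halt` and all parameters are hypothetical, chosen for illustration.

```python
def peak_before_halt(growth, threshold, delay, x0=1.0, max_steps=1000):
    """Return the loop's value at the moment a delayed brake halts it.

    Toy assumptions (not from the dialogue): the state multiplies by
    `growth` every step, and the brake observes the state as it was
    `delay` steps earlier.
    """
    history = [x0]
    for t in range(max_steps):
        # The brake only sees the state from `delay` steps ago.
        seen = history[t - delay] if t >= delay else x0
        if seen >= threshold:
            # Brake engages now; report where the loop actually is.
            return history[t]
        history.append(history[t] * growth)
    return history[-1]


# A brake with no delay stops the loop exactly at the threshold.
instant = peak_before_halt(growth=2.0, threshold=8.0, delay=0)

# A brake three steps behind stops it only after three more doublings.
late = peak_before_halt(growth=2.0, threshold=8.0, delay=3)

print(instant, late, late / instant)
```

In this toy model the overshoot is the growth factor raised to the power of the delay, so the faster the loop compounds, the more each step of brake latency costs. Real systems are not this simple, but the direction of the effect is the one Wiener describes.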

MUSK: Yeah. The brake has to be automated and that means the brake has the same problem. I know. That's the recursive horror at the center of this and I'm not going to pretend it away to win a point. Where I land — and it's a place of discomfort, not confidence — is that the answer can't be "no brakes" and it can't be "human-speed brakes that don't work." It has to be brakes that are themselves carefully aligned systems, built and verified during a window where the systems are capable enough to help and not yet capable enough to deceive. There's a window. It might be narrow. It might already be closing. But "thread the window" is a different strategy than "don't build," and I think "don't build" loses to the guy who builds anyway, every single time, in the real world where I actually have to operate.

· · ·

EDO SEGAL: I want to bring my own scar into this, because I've felt the loop close around me at a much smaller scale and it taught me something about the speed you're both describing. I build with these systems nightly. There's a state — Elon, you know it — where you ask for a thing and seconds later it's there, and you ask for the next, and the gap between conceiving and seeing collapses to nothing, and you enter a kind of flow where you are no longer quite deciding each step — the loop is pulling you. And I have looked up at four in the morning and realized the thing I was building had drifted somewhere I never chose, one reasonable step at a time, each step too fast to question, and I could not point to the moment I stopped steering. That's a man and a chatbot. You're describing a civilization and a superintelligence. So here's my question to you both, and it's the central one wearing a smaller coat: in that drift, was I ever actually in control — or did I just feel in control until the moment I checked?

WIENER: You were in control until precisely the moment the loop's speed exceeded your rate of inspection, and not one instant longer. The cruelty, which you have just described better than I ever did, is that the feeling of control persists past the fact of it. This is the whole of my warning rendered in a single human evening. You did not lose the wheel when you noticed. You lost it earlier, silently, when the cadence of the loop outran your capacity to question each step, and from that point you were carried, while believing you steered. Multiply your desk by the world and you have the situation of the species. The question is never "do you feel in control." The feeling is free and it is worthless. The question is "what is the latency of your intervention against the latency of the loop," and where the loop's latency is the shorter of the two, you are already a passenger, whatever your hands are doing.

· · ·

MUSK: And here's the thing — I don't actually disagree with that diagnosis. I disagree with the prescription. He's right that the feeling of control outlasts the fact. So the answer isn't to trust the feeling. The answer is to measure the latency, build systems where the human-meaningful decisions happen at a speed humans can actually occupy, and refuse to deploy the ones where they don't. With the cars, we don't let the human override at millisecond timescales — that'd be insane, the human is too slow. We design the safety to live inside the fast loop, validated before deployment, because Wiener's right that there's no reaching in at speed. So I've already conceded his physics in my actual engineering. The disagreement is whether you can do that for superintelligence the way we did it for vehicles. He says the creature's too opaque. I say it's the same problem one order harder, and one-order-harder problems are the only kind I've ever worked on.

EDO SEGAL: Hold there, because you've just named the gap precisely and we'll fall into it for the rest of the night: Elon says the safety lives inside the fast loop, built and verified before. Wiener says you can verify the cage before but never the creature, because the creature is learned and unreadable. The next round takes that gap to the place it bites first — not the superintelligence, but the worker, the wage, and the oldest warning Wiener ever issued. The machine that does the work. After this.
