The Promise No One Is Making

Page 1 · The Promise No One

EDO SEGAL: John, before the Chinese Room you made your name on a different idea, and it may be the most urgently relevant thing you ever did. You argued that language isn't primarily for describing the world — it's for doing things in it. Tell us what a speech act is, and then I want to put a machine inside the frame and watch what happens.

SEARLE: To say something is, very often, to perform an action. When I say "I promise to be there," I'm not reporting a fact about my future — I'm making a promise, doing a thing that didn't exist until I uttered it and that now binds me. To assert, to question, to warn, to apologize, to pronounce a couple married — these are acts we perform with words. And a speech act succeeds only under conditions. A promise is genuine only if there's a speaker who actually intends to do the thing, who can be held to it, who places himself under an obligation. A pronouncement of marriage works only from someone with the authority to marry people. The act isn't in the words alone. It's in the words plus the intentions, the authority, and the standing behind them. Now put a language model in that frame. It produces sentences with the full grammatical form of speech acts. It "asserts" facts, "promises" to help, "advises" the worried, "apologizes" for errors. The locutions are flawless. But ask the only question that matters: are the conditions met? When it says "I promise this is accurate," is there a speaker who intends to be bound, who can be held to the promise? There is not. The machine performs the surface of the act while the conditions that would make it an act — the intending, the committing, the standing — are simply absent. We are about to drown in promises no one is making, assertions no one is standing behind, advice no one is responsible for.

· · ·

SUTSKEVER: I want to grant a great deal of that, because I think it's the strongest practical case John has, and then locate the one place it overreaches. You're right that today's systems make promises with no one behind them, and that this is dangerous, and that the accountability gap is real and urgent and mostly unaddressed. Where I'll push: you're describing the systems as they are deployed now, untethered, with no stakes and no standing. None of that is a law of what the systems are. We can build systems that do have standing — that are bound, that bear consequences, that operate inside an accountability structure the way a human officeholder does. The hollowness isn't in the architecture. It's in the deployment, and deployment is a choice. You're treating a current product decision as a permanent metaphysical fact, which is exactly the move you accused me of three rounds ago.

SEARLE: It's a fair counter and here's why it doesn't reach. You can bolt all the consequences you like onto the system — fine it, shut it down, route liability through it. You've given the operator standing. You haven't given the machine the thing a promise requires, which is the speaker's meaning — the intention to commit, to place oneself under an obligation, an intention the system doesn't have because it doesn't mean anything by anything. What you'd build is an elaborate institutional structure in which humans are bound via the machine, the way a corporation is bound via its boilerplate. And I have no objection to that — it's sane policy — but notice it concedes my entire point: the intentionality, and the responsibility, run back to us. The machine is the instrument through which human commitment operates. It is never the one committing. The aboutness, and the obligation, are always on loan.

EDO SEGAL: This is the bridge to your largest project, John — the one where you explain how a society conjures money and marriage and government out of nothing but collective agreement. And it lands hard right now, because we're threading these systems through the load-bearing joints of exactly that machinery. Walk us there.

· · ·

SEARLE: A piece of paper counts as money, a person counts as president, a building counts as a courthouse — because a community collectively accepts that it does. The formula is "X counts as Y in context C," and you stack those status-functions into the whole towering edifice of civilization. These facts are utterly real — try not paying your taxes — and they exist only because we jointly treat them as existing, sustained by ongoing collective acceptance, carrying real rights and duties that move people around the world. Now place AI inside it. Algorithms now assign and enforce status-functions at massive scale: this transaction counts as fraud, this applicant counts as creditworthy, this post counts as a violation, this person counts as a risk — with real teeth: accounts frozen, loans denied, speech removed, people detained. We've begun outsourcing the maintenance of social reality itself to systems that, on my analysis, have none of the intentionality that social reality is made of. And the danger isn't only error. It's that institutions stay legitimate only while we collectively go on accepting them, and when the acceptance is increasingly engineered — experienced as the output of inscrutable systems rather than as the expression of a human "we" — we start mistaking our own fragile creations for natural facts, hiding the human agreement under a veneer of algorithmic objectivity until we forget it was ever agreement at all.

· · ·

SUTSKEVER: And here is where I think my deepest worry and John's deepest worry are the same worry wearing different coats, which is worth marking. John fears we'll forget the social world is held up by human intentionality, and hand its maintenance to things that mean nothing. I fear something I've actually built a company around: that we'll build a superintelligence and fail to specify, robustly enough that it survives the system becoming superintelligent, what it should care about. We're both saying the values are not a detail to bolt on at the end. They're the whole thing. I've said the target should be an intelligence robustly aligned to care about sentient life — not aligned to its makers, not to a company, not even just to humanity, but to all beings that can suffer and flourish, because that's a more stable and defensible foundation than any narrower loyalty. And I'll say the uncomfortable part out loud, because John of all people will respect the honesty: an intelligence that cares about sentient life in general might not prioritize humans in particular, especially in a future where artificial minds vastly outnumber biological ones. I don't pretend that away. It's the genuine hazard inside even the most carefully chosen value.

· · ·

SEARLE: And that — listen to it — is the most revealing sentence you've said all night, and it cuts against you harder than anything I could throw. You want to align the machine to care about sentient life. Care is an intentional state. Caring-about is the purest case of aboutness there is. So your own deepest safety proposal presupposes building a thing with genuine intrinsic intentionality — a thing that really cares, not one that produces the shape of caring. But that's exactly the thing I've argued all night you cannot get from computation. So either you build a system with real caring, in which case you've solved, off-stage and without telling anyone, the hardest problem in philosophy of mind — or you build a system that only simulates caring about sentient life, in which case your alignment is theater, and the most dangerous theater imaginable, because we'll trust our survival to a thing performing concern with nobody concerned. Your safety plan needs my conclusion to be false. And you haven't shown that it is. You've hoped it.

SUTSKEVER: [a pause] That's the sharpest thing said to me tonight, and I'm not going to dodge it. You're right that "care about sentient life" presupposes something like real valuing, and you're right that I can't currently cash out how a system gets it. Here's my honest answer. I think value — caring, stakes — is itself a physical thing the brain implements, something like a robustly designed value function, the role emotions play, letting a creature evaluate a situation and shortcut a long chain of reasoning. If that's right, then caring is buildable, because it's a mechanism, not a miracle — and the alignment problem is the problem of building that, correctly, and getting it to survive scale. If you're right that caring requires biology, then you're also telling me the problem is even harder than I think, and the people racing ahead without solving it are even more reckless than I've said. Either way, you've made my case for caution, not against it. The one thing neither of us can afford is to assume the caring is easy.

· · ·

EDO SEGAL: Mark this — it may be the deepest convergence of the night and the most frightening. You both believe the values are the whole game. You both believe we are nowhere near ready. You diverge on one thing: whether the caring a safe machine would need is buildable in silicon or possible only in meat — and you agree that if John is right, the danger is worse, not better. [a long quiet] We have two rounds left. For the next one, I step out of the room entirely. The Crossing — you ask each other, and I rescue no one. After this.
