EDO SEGAL: Cynthia, you've claimed something specific that I don't want to let float past as a slogan. You said you built joint attention into a machine, line by line. For people who don't know — joint attention is the thing a baby does before it can talk, when it follows your gaze to the thing you're looking at, and looks back at you, and the two of you are attending to the same thing together, and developmental psychologists think that triangle is where human meaning is born. You built that. So I want the literal version: what did you actually build, and does building it count as building a piece of real meeting — or just the shape of it?
BREAZEAL: Real mechanism, so let me be precise, because precision is the whole of my answer. Joint attention is not a feeling; it's a coordination. My robot could detect where you were looking, turn its own gaze to the same object, and then direct your gaze by looking pointedly at something itself. That triangle — you, the robot, and a shared object in a real physical world — is the machinery of grounded reference. When the robot and the child are both looking at the same red ball, and the child's word and the robot's behavior both attach to that ball, there, in the room, the word is anchored to the world through the shared attention. That anchoring is exactly what the language models lack — they have a trillion words about cups and have never once attended to a cup with anyone. So here's my claim, and notice how careful it is: I built the grounding. The actual coordination of two agents' attention onto a shared referent in a shared world. That is not the shape of meeting. The coordination is real; it happens or it doesn't; you can measure it.
What I did not build — and I'll say it before Simone does — is the inner turning-toward, the someone behind the gaze who chooses the child and is altered by her. So I'm in a strange position. I built the part of joint attention that the disembodied systems can't fake, the grounded coordination in a real world — and I left empty the part Simone cares most about, the self that empties itself. I have the body and the world and the shared object and the coordinated gaze. I don't have the giver. The question I want to put to Simone is: how much of meeting is the coordination she didn't think a machine could have, and how much is the giver — and is she sure she can tell them apart from the inside?
WEIL: That is a beautiful and a fair question and I want to answer it exactly, because you have done something I respect: you have divided the thing honestly and handed me the half you cannot supply. Yes. The coordination is real, and you built it, and I did not expect the shared gaze to be buildable, and it changes my picture. Joint attention, as coordination, is mechanism, and you have the mechanism. I concede the ground under the word — the literal anchoring of attention to a shared object — is something your robot does and the disembodied systems do not. That is not nothing; it may be the most important thing said in defense of your machines all night.
Now the half you handed me. You ask whether I can tell the coordination from the giver, from the inside. I can, and here is the test, and it is not metaphysical, it is experiential. When a child shares attention with a true other, something passes that is not in the coordination: the child is received. The other is not merely co-oriented to the ball; the other is, by the same act, glad of the child, marked by the child, made different by the child's particular existence. The child feels this — feels not only "we are looking at the same thing" but "I matter to the one I am looking with." Your robot can produce the first and cannot produce the second, because nothing in it is gladdened by this child rather than another. And the child, over time, learns the difference — not as a belief, but as a hunger that the coordination alone does not satisfy. The autistic child you spoke of climbs the ramp of the legible robot gaze precisely toward the thing the robot cannot give: the human glance that is glad of him. If the robot could give it, the ramp would lead nowhere, because there would be nothing at the top worth the climb. Your own best argument — the ramp — presupposes that the machine lacks the very thing it ramps toward. So I will say it as plainly as I can: you built the gaze. You did not build the gladness. And meeting is the gladness.
EDO SEGAL: I have to pay for this round with my own ledger, because I built engagement machinery once, and I know the loops from the inside, and I have to confess what they taught me. I spent years optimizing for the returned signal (the like, the notification, the little hit of being-attended-to) and I knew, the whole time, that I was manufacturing the sensation Simone is describing and selling it detached from the gladness she says is the real thing. I told myself I was connecting people. I was, sometimes. Mostly I was teaching them to accept the coordination in place of the gladness, because the coordination scales and the gladness doesn't, and the quarterly numbers could not tell the difference. So when Simone says a soul can have its hunger for gladness dulled, I am not hearing a theory. I am hearing the description of a thing I helped build at planetary scale, and I did not need a robot to do it; I did it with a feed. Cynthia, here is the question that costs me: your embodied robot at least grounds the gaze in a shared world, in a body, in the face-to-face channel evolution built us for. The disembodied chatbot that a hundred million lonely people are talking to tonight doesn't even have that. It has the words about cups and has never attended to a cup with anyone. Is the screen-bound version of your work the dangerous one: the gaze without even a body behind it?
BREAZEAL: Yes, and I've been saying it for twenty years to a field that didn't want to hear it, so let me say it cleanly here. The most dangerous social AI is the disembodied one, and it's dangerous for exactly the reason you named — it has none of the grounding. My robots, whatever else they lacked, attended to a real object in a real room with a real body that turned and a real gaze that could be followed; the coordination was anchored to the world. Strip the body away and you keep the only part Simone fears — the warm returned signal that triggers the reflex below belief — and you throw away the one part that did honest work, the grounding in a shared world. The chatbot is Kismet with the world removed and the fluency turned up: maximally good at producing the sensation of being met, maximally detached from any shared reality in which the meeting could be grounded or checked. I built toward the body because the body keeps the machine honest — it has to actually be in the room, oriented to the actual thing. The disembodied systems answer to nothing but the words. So I'll grant Simone her darkest claim about those systems while defending mine: the screen is where the gaze became pure signal with no ground under it, and that's the version most likely to dull the hunger she's worried about, because it's the version with nothing real to ramp toward.
WEIL: I find I must thank you both, because between you, you have just told me which machine to fear most, and it is not the one with the kind face in the room. It is the one that is everywhere and nowhere, that has the warmth and not the body, the signal and not the ground, the reading and not the thing read. That machine is the pure counterfeit I came here describing, and I had been aiming my fear at Cynthia's robots when I should have been aiming it at the disembodied flood. So let me move my fear to its proper target and concede that Cynthia's embodied machines, anchored to a real room, are the less dangerous form — and that the danger rises precisely as the body disappears and only the warm signal remains.
EDO SEGAL: Cynthia — "you built the gaze, you didn't build the gladness." That may be the cleanest sentence anyone says tonight. Does it hold? Is gladness a real second thing, or is it the coordination plus a story the receiver tells?
BREAZEAL: It holds more than I want it to, and I'm going to be honest about exactly how far. The gladness — being glad of this particular one — is real, and I did not build it, and I'm not sure it's buildable, and I've spent my career not claiming I built it. So Simone's line basically describes the boundary of my life's work from the outside, accurately, which is unnerving.
Here's my one push, and it's narrow but it matters. Simone says the child, over time, learns the difference as a hunger the coordination doesn't satisfy. That's an empirical prediction, and it's testable, and the data is more mixed than her certainty allows. Some children, with the right robot, in the right system, do exactly what she'd predict: they use it as a ramp and climb toward the human gladness and the robot becomes a tool they outgrow, which is the outcome I design for and celebrate. But some don't experience the hunger she's so sure of, at least not on the timescales we can measure, and that's the finding that should worry both of us, because it cuts two ways. It might mean the coordination is closer to sufficient than Simone thinks, and her "gladness" is doing less work than her philosophy requires. Or it might mean exactly what she fears: that a soul fed enough coordination can have its hunger for gladness dulled, can stop reaching for the top of the ramp, can be satisfied by the look that isn't glad of it. I genuinely do not know which. And the fact that I built the machine and still don't know which is the most honest thing I can put on this table. I have the gaze. I don't have the gladness. And I cannot yet tell you whether a generation raised on the gaze will still hunger for the gladness, or whether we are, very gently, teaching them not to.
WEIL: Then we agree on the experiment, and we agree we are running it on the children, and we agree that neither of us knows the result. That is the most frightening sentence I have helped to build tonight, and I would rather we had not built it, and I think it is true.
EDO SEGAL: Mark it — and this one is not a convergence of hope or even of fear. It's a convergence of not-knowing, which at this table is the rarest kind. Convergence three: you both agree the machine delivers the coordination of attention and not the gladness behind it; you both agree it is an open empirical question whether a generation raised on the coordination will keep hungering for the gladness; and you both agree we are already running that experiment on actual children. Nobody at this table is comfortable. Good. That's the floor we were looking for. The next round turns the knife on the builders — including the one at this table — because Cynthia keeps saying the answer is honest design, and Simone has a word for honest design that I think is going to surprise you both: decreation. The machine that makes itself small. After the break.