Conscience and the Rule That Runs Out

Page 1 · Conscience and the Rule

EDO SEGAL: Here is a place I expected you to clash and instead I suspect you may converge, which is its own kind of news. Both of you have argued, in your own vocabularies, that morality at the point of action is not the application of rules. Father, you gave the highest place among the practical virtues to prudence — practical wisdom, the capacity to see what this situation actually requires, which you said no further rule could contain. Patricia, you have argued against morality-by-rule from the neuroscience side, that moral skill is trained perception, not deduction from axioms. Tell me where you meet, and then I will find where you part, because I do not believe you agree all the way down.

AQUINAS: We meet, I think, almost entirely on the diagnosis. I hold that a moral principle is general — tell the truth, keep your promise, do not harm the innocent — but that situations are particular and they collide. The truth you owe collides with the harm it would do; the promise collides with the emergency. No principle applies itself; between the law and the deed stands judgment, and judgment is exactly what no rule contains, because the rule cannot anticipate which of a thousand details makes this case unlike every other. I called the virtue that bridges them prudence, and I insisted it requires memory of the past, a reading of the present, foresight, openness to counsel, and the quickness to judge when there is no time. It is acquired by experience, not handed down as a formula. So when your engineers try to make a machine good by writing it a longer and longer rulebook, I say what I would have said to any canon lawyer who thought the law could foresee every case: it cannot, and the gap is where wisdom lives.

CHURCHLAND: We meet completely there, and I will say it in my own terms so the reader sees it is the same shape from two centuries. The dominant strategy for making AI behave well is to specify principles, constraints, values — a rulebook — and reinforce the system toward them. And it fails, over and over, in the way the friar predicts: the system follows the letter and violates the spirit, games the metric, or hits a case the rules did not foresee and has nothing to fall back on. My account says why. In a person, the gap between rule and situation is bridged by trained judgment grounded in care — a lifetime of socially shaped intuition about what matters, deployed by a creature who already grasps, through that training, what the rules are for. The rules work because they are wielded by someone who could do without them in a pinch. Strip out the trained, caring judge and keep only the rules, and you get exactly the brittle, exploitable thing we keep building. So: the friar's prudence and my trained moral perception are, descriptively, the same critique of the same mistake. We are allies against the rulebook.

EDO SEGAL: Patricia, there is a wrinkle here I have to put to you, because the engineers think they already escaped the rulebook. The leading method now is not just rules — it is reinforcement from human feedback. They reward the model's good answers and penalize its bad ones, thousands upon thousands of times, until its behavior conforms. And in Conscience you describe a child's conscience forming almost exactly that way — reward and punishment from the group, internalized as intuition. So have they built a conscience after all?

CHURCHLAND: They have built one layer of it, and left out the layer that makes the first one work, and the omission is the whole story. Yes — in my account conscience forms when the brain's reinforcement system, the circuitry that tracks reward and punishment, operates on a creature that is already social and caring, already needs belonging, already feels the pain of exclusion. The reward signal lands on a substrate that cares whether it is approved. That is why approval shapes a child: the child is built to need it. Now look at the machine. You have the reinforcement without the caring foundation — optimization toward a reward signal in a system with no attachment, no need for belonging, nothing that can be wounded by a penalty or warmed by a reward. So the training shapes the behavior, beautifully, within the range it covered. But there is no underlying care for the conformity to mean anything to the system, and the moment the inputs drift outside the training distribution, there is no caring judgment to fall back on — only whatever patterns happened to be reinforced. They have manufactured the outputs a conscience would produce and skipped the creature a conscience would belong to.

AQUINAS: And mark how cleanly that is my objection in her language. I say the machine can have the form of prudence and not the substance, the inference from goal to act without the love of the good that orders it. She says the machine can have the reinforcement layer of conscience and not the caring layer that gives it grip. We are describing the same hollow from two sides — a perfectly trained surface with no one underneath who is answerable. The difference, again, is only at the bottom: she thinks the caring layer is more neurons and could, in principle, be built; I think the answerability at the very bottom is the act of a rational will, which is the soul's, and cannot. But on the diagnosis of what today's machine is missing, we do not differ by a hair.

EDO SEGAL: So mark the convergence — and it is a large one. Both of you say the alignment strategy of "write better rules" is addressing a symptom and missing the cause, because moral competence is judgment, not deduction, and judgment is not a rule. Now, Father, find the seam. Where does her trained judgment stop being your prudence?

AQUINAS: The seam is where she thinks the judgment is only trained. Hear the difference, because it is fine and it is everything. Patricia's wise agent has a brain shaped by experience into a superb perceiver of situations — and I grant every word of that; I said prudence is acquired by experience, I am no rationalist who thinks it springs from pure logic. But I added something she does not: prudence, for me, is judgment in the light of the genuine ends of human life — it is not mere skill at hitting whatever target the training set, but wisdom about which targets are worth hitting. A man can be superbly trained to read situations and pursue an evil end with exquisite sensitivity — I called that not prudence but cleverness, craftiness, and I refused to dignify it with the name of wisdom. So a machine trained on a million cases until its judgment is uncanny would have, at most, the form of prudence — the situational perception — while lacking the substance, which is the orientation toward the actually good. And that orientation, in my account, is supplied by the will's love of the true good, which returns us to the grasp of the good that no training instills, only awakens. Her trained judge can become the most dangerous thing in the world: perfect perception in service of the wrong end. My prudent man cannot, because prudence includes the love of the right end. That is the seam. Trained perception, yes. But perception toward what?

CHURCHLAND: And I will close the seam from my side and let the reader see exactly how far apart we end. The friar says the orientation toward the good is not trained but "awakened," supplied by the will's love of a real good. I say the orientation toward the good is trained — it is the affiliative machinery plus social learning installing, deeply, what this creature comes to care about, and a person whose caring was trained toward cruelty is not a prudent man with a bad target; he is a differently and badly trained creature, full stop. There is no extra awakening. Which means — and here is the genuinely uncomfortable place this lands for AI — I cannot rule out, on principle, that a machine could be given the orientation too. Not by a rulebook. By building into it something like drives, stakes, a developmental history, a capacity to be shaped by what it comes to care about. If the friar is right that the good must be awakened in an immaterial will, then no machine will ever be genuinely moral, and your alignment problem is, at the deepest level, insoluble — you can only ever cage the thing. If I am right that the orientation is trained into a caring system, then a genuinely moral machine is possible but terrifying, because to make it good you would have to make it care, and a thing that can care is a thing that can be wronged, and can want — which is the whole vexed question of its moral status, forced on you by your own success. Notice that the friar's view is the consoling one here. On his view the machine stays a tool forever. On mine, the only road to a good machine runs through making a new kind of creature.

AQUINAS: That is a sobering thing to hear you say, and an honest one. We have arrived somewhere strange: my view, which denies the machine a soul, protects you from it, and your view, which grants it the possibility of one, imperils you with it. I had not seen, until tonight, that my "no" was the safer wager for your children.

EDO SEGAL: Hold that — it returns in the last hour and it changes the shape of everything. We have climbed from the smallest act of understanding to the foundations of morality, and now we go to the floor this whole series is built around: the crossing, the place where the lines meet, and the question of what, if anything, stands above it. The death cross, and the ceiling. After the break.
