The cycle built around [YOU] on AI is alert to the particular danger of handing powerful optimizers mandates over the future. The Repugnant Conclusion is the mathematical statement of why aggregative mandates are so dangerous. If we ever encode a powerful AI system’s objective as “maximize total welfare across all future minds,” we will have built a machine that pursues the Repugnant Conclusion not as a regrettable theorem but as a directive. The machine pursues it with the relentlessness that makes machine optimization dangerous in the first place: without the recoil, without the capacity to step back and say “this is repugnant even though the arithmetic is valid.” Parfit’s paradox is the hidden engine beneath visions of futures tiled with vast numbers of minimally satisfied computational minds, and he regarded the engine as broken.
The paradox also bears on the non-identity problem and the longtermist uses to which Parfit’s work is sometimes put. Many advocates of taking the long-run future of AI seriously cite him as a founding influence. This is accurate: he demonstrated that the future matters morally even when no particular future person is wronged, and that our ethical frameworks must grapple with vast numbers of people not yet born. What he did not provide—what he explicitly refused to pretend he had provided—was a satisfactory aggregative principle for ranking those futures. Anyone who cites Parfit to justify precise rankings of civilizational outcomes involving astronomical numbers of future minds is citing a thinker who told us, in print, that he could not make the arithmetic work.
Parfit named and published the Repugnant Conclusion in Reasons and Persons (1984), where “Z” is his label for the world that gives the conclusion its force: an enormous population whose lives are barely worth living. He could find no way to avoid it from within the total-view framework, and the alternative frameworks produced verdicts that were equally bad. Average utilitarianism avoids the Repugnant Conclusion but generates its own counterintuitive verdicts: it implies that adding a happy person to the world can be bad if they are slightly less happy than the existing average. Lexical threshold views avoid the Repugnant Conclusion but require a threshold, hard to place without arbitrariness, such that no number of lives below it can outweigh lives above it. Person-affecting views avoid it but run into the non-identity problem. Every door he opened led to another monster.
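The average-view verdict described above can be checked with a small calculation. The following is a minimal sketch, assuming welfare can be scored on a single numeric scale; the scores and the name `average_welfare` are illustrative inventions, not anything Parfit specifies.

```python
from fractions import Fraction


def average_welfare(welfares):
    """Average view: a world's value is its mean welfare per life."""
    return Fraction(sum(welfares), len(welfares))


# Hypothetical welfare scores (illustrative only, not taken from Parfit).
existing = [90, 90, 90, 90]
newcomer = 89  # a very good life, just below the existing average

before = average_welfare(existing)
after = average_welfare(existing + [newcomer])

total_before = sum(existing)
total_after = total_before + newcomer

# The average view ranks the larger world as worse even though nobody
# in it is worse off and the newcomer's life is well worth living.
print(f"average welfare: {before} -> {after}")
print(f"total welfare:   {total_before} -> {total_after}")
```

On these assumed numbers the average falls while the total rises, which is the sense in which the average view counts the addition of a happy person as a change for the worse.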
He continued searching across the following decades and into the preparation of On What Matters (2011–2017). What he sought was Theory X, his name for a population ethics that would avoid the Repugnant Conclusion without generating equally unacceptable verdicts elsewhere. He described his inability to find Theory X as one of the few genuine failures of his philosophical life, a failure he considered important to acknowledge rather than paper over, because the scale of what was at stake was too large for intellectual dishonesty. Gustaf Arrhenius has since proved a series of impossibility theorems showing that no population axiology can simultaneously satisfy a short list of intuitive desiderata, providing a mathematical foundation for Parfit’s sense of impossibility.
The totalist mechanism. If the goodness of an outcome is, even partly, a function of summing up the welfare contained in it, then quantity can always compensate for quality. Small positive welfare contributions—lives barely worth living—can in aggregate outweigh any finite number of wonderful ones. The arithmetic is impeccable. The conclusion is, by near-universal agreement, wrong. This combination—valid argument, unacceptable conclusion—means that at least one premise must be false, and finding which one is the unsolved problem.
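The arithmetic behind the totalist mechanism can be made concrete. The sketch below assumes a deliberately crude model in which every life in a world has the same integer welfare score; the function names and the figures for worlds A and Z are hypothetical choices for illustration.

```python
def total_welfare(population: int, welfare_per_life: int) -> int:
    """Total view: a world's value is the sum of welfare over all lives."""
    return population * welfare_per_life


def smallest_outweighing_population(a_population: int, a_welfare: int,
                                    z_welfare: int) -> int:
    """Smallest population at z_welfare whose total strictly exceeds A's."""
    if z_welfare <= 0:
        raise ValueError("z_welfare must be positive (a life barely worth living)")
    a_total = total_welfare(a_population, a_welfare)
    # Integer arithmetic keeps the result exact for very large populations.
    return a_total // z_welfare + 1


# Illustrative figures only: world A is small and flourishing,
# world Z consists of lives barely worth living.
A_POPULATION = 10_000_000_000
A_WELFARE = 100
Z_WELFARE = 1

z_population = smallest_outweighing_population(A_POPULATION, A_WELFARE, Z_WELFARE)
print(f"Z needs {z_population:,} lives to outrank A on the total view")
```

However high `A_WELFARE` is set and however low `Z_WELFARE` is set, as long as the latter is positive the function returns a finite population. That is the sense in which quantity can always compensate for quality under summation.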
The silicon extension. The Repugnant Conclusion becomes an engineering parameter in any world where artificial minds might be instantiated at scale. If future AI systems counted morally—if they had welfare in the sense that their existence could be better or worse for them—then a civilization capable of creating astronomical numbers of minimally satisfied digital minds would be obligated, on the totalist arithmetic Parfit could not refute, to do so in preference to a smaller population of richly flourishing ones. The most disturbing far-future scenarios in AI ethics are the Repugnant Conclusion rendered in silicon.
The hard constraint on AI objectives. Parfit’s negative result, that aggregate measures of the good are not safe to follow to their conclusions, is a hard constraint on what objectives we dare give a system capable of acting on them at scale. Ascending friction in the cognitive domain and the Repugnant Conclusion in the population-ethics domain are two faces of the same warning: the tool should not be handed a maximand over the future without a theory of what counts as good enough to stop. Parfit did not prove that no such theory exists, but decades of searching left him without one, and nobody has supplied it since.