PERSON

Jeremy Bentham

The Enlightenment philosopher who proposed that pleasure and pain can be measured, summed, and maximized—and who, two centuries before the GPU, sketched the complete blueprint of optimized society, including every flaw that now runs at planetary scale.

Jeremy Bentham is the philosopher who dreamed of running a calculus on human happiness and then could not find the unit of measurement—a failure that turns out to be the most instructive thing about him for the age of large language models and algorithmic systems. His proposal, advanced with relentless consistency across his long eighteenth-century life, was that the right action is whatever produces the greatest balance of pleasure over pain—and that this balance can, in principle, be calculated. The felicific calculus he designed for the purpose is, stripped of its period vocabulary, an objective function: assign numerical weights to pleasure and pain, aggregate across persons, optimize the total. Every engagement-optimization algorithm, every reward model trained on human feedback, every welfare metric in a public-health system is his calculus finally running on hardware. His second great idea, the Panopticon, was an architecture of control through the permanent possibility of observation—a watchtower design whose core insight, that the watched will self-regulate if they can never know when the watching is real, has been realized more completely in the digital world than any stone building could achieve. Bentham also foresaw the pathology that would attend both projects: he could not make the calculus produce a number because the measurable proxy and the valued thing were never the same, and he could not make the Panopticon reciprocally transparent because the observer and the observed were never equal. He left both warnings in the margins of a blueprint we have since built—and the age that finally built it would do well to read the margins.

In the [YOU] on AI Field Guide

The cycle opened with [YOU] on AI asking what it means to see the technology plainly—not as salvation, not as apocalypse, but as a set of choices with consequences. Bentham enters the cycle as the figure who made those choices first, in theory, and who made them with a clarity his successors have found both admirable and frightening. He is not a villain in this story. He is an early-warning system whose warnings were printed in the 1780s and are only now being fully legible.

Every AI system that optimizes a score is, in the most literal sense, running a Benthamite program. The Goodhart's Law problem that defines the alignment problem—when a measure becomes a target, it ceases to be a good measure—is the problem Bentham encountered first, without the formal name. He noticed that what people believed would make them happy often did not, and he had no mechanism to close the gap. Modern reinforcement learning has the same gap at its center: the reward signal and the thing the reward was supposed to represent diverge as soon as optimization pressure begins. Bentham failed to run his calculus because measurement was impossible. We succeeded in running it, and discovered the impossibility had not gone away—it had merely been papered over by proxies.

The Panopticon's presence in the cycle is equally structural. The surveillance capitalism that shapes the attention economy is the Panopticon with the walls removed and the inmates made to pay for their own observation. Bentham's original design required a building. The digital version requires only a device the users carry voluntarily and would feel bereft without. The asymmetry he designed into the original—total visibility for the inmate, opacity for the inspector—is reproduced exactly: we are visible to systems we cannot inspect, optimized by objectives we are not shown, watched by an inspector who, unlike Bentham's single guard, actually can attend to every cell at once.

His deepest relevance to the cycle's central question—what it means to live as a person rather than a data point—is the legibility doctrine he took to be self-evidently good. Bentham believed that to be made readable by a system was to be improved by it, that obscurity sheltered only vice and misery, that the fully documented life was the well-governed life. The cycle presses the question he never asked: what is lost when a person is rendered entirely legible, when the schema of the measuring system absorbs the whole, and the parts that resist the schema stop being managed and start being invisible? The unmeasured remainder—the qualitative, the singular, the sacred—is Bentham's true legacy: not the calculus, but the proof, by the completeness of its failure, that not everything worth having can be turned into a number.

Origin

Jeremy Bentham was born in London in 1748 to a prosperous attorney who regarded his son as a prodigy and organized his childhood accordingly. He entered Oxford at twelve and Queen's College found him underwhelming, which may have been the most productive disappointment of his life. The law, which he was supposed to practice, struck him as irrational and cruel—a tangle of unwritten precedent and judicial whim that served the lawyers who profited from its obscurity while failing the people it was supposed to govern. He resolved to reform it, and spent the next eighty years doing so, with an energy and systematicity that left his contemporaries exhausted and his manuscripts numbering in the millions of words.

The foundational text, An Introduction to the Principles of Morals and Legislation, was drafted in the 1780s and published in 1789. Its opening sentence announces the entire program: “Nature has placed mankind under the governance of two sovereign masters, pain and pleasure.” The rest follows with the logic of a mathematics done in prose: if pleasure and pain are the only things that ultimately matter, then morality is accounting, government is engineering, and the right policy is the one that maximizes the total. He specified the dimensions along which any pleasure or pain should be weighed—intensity, duration, certainty, propinquity, fecundity, purity, extent—and the result was the felicific calculus, a structure that any contemporary machine-learning engineer would recognize as an expected-value computation aggregated over a population.

The Panopticon came out of the same decade, brought to Bentham through a letter from his brother Samuel, who had used central-observation arrangements to supervise workers in Russia. Bentham turned the principle into a philosophical system: a circular prison in which a single inspector in a central tower could see every cell while himself remaining invisible, so that inmates, never knowing when they were watched, would internalize the watcher and govern themselves. He lobbied Parliament for years to build it, spent a significant portion of his fortune on the project, and was ultimately refused—a refusal that embittered him and, in a twist he could not have intended, probably produced a more important legacy than the building would have. The idea of control through the architecture of possible observation proved far more exportable than any particular prison.

Key Ideas

The felicific calculus. Bentham proposed that the value of any action, law, or policy can be computed by measuring the pleasures and pains it produces along seven dimensions—intensity, duration, certainty, propinquity (nearness in time), fecundity (tendency to generate more of the same), purity, and extent—and then summing across all affected persons. This is, structurally, an objective function, and its direct descendants are the engagement-optimization algorithms and reward models that govern the most powerful software systems ever built. The failure Bentham encountered—that the measurable proxy and the valued thing are not the same, and optimization drives them apart—is now named Goodhart's Law and sits at the center of the AI alignment debate.

The Panopticon. The inspection-house is not primarily a building but a theory of power: control through the possibility of observation, which costs nothing once the architecture is in place, because the inmates regulate themselves. Bentham understood that the genius of the design was psychological—the watched become the principle of their own subjection. The digital world has realized this insight more completely than stone ever could, producing a surveillance capitalism in which observation is total, continuous, and conducted by a system that, unlike Bentham’s single guard, never sleeps and forgets nothing.

Greatest happiness of the greatest number. The principle is additive: what matters is the sum, and the sum is to be maximized. This makes it structurally blind to distribution—the aggregate can rise while particular persons are crushed, and the optimizer reads only the aggregate. Bentham assumed background constraints (that injustice would lower long-run totals) that a machine handed a literal objective will not spontaneously import. The principle also leads, under careful analysis by Derek Parfit, to the Repugnant Conclusion: that a world of billions living barely-worth-living lives is better, by the arithmetic, than a smaller world of flourishing persons, because the marginal positives sum larger. Handed to a sufficiently capable optimizer, “maximize aggregate welfare” points at exactly the world no one would choose.

Legibility as reform. Beneath the calculus and the Panopticon ran a single conviction: that the world should be transparent, documented, classified, and known. Obscurity sheltered abuse; the light of measurement was the friend of the vulnerable. This faith—that datafication is improvement—is the unexamined creed of the digital age. The contemporary push for algorithmic transparency, the right to an explanation, the audit of automated decisions: all of this is Benthamite, and much of it is right. The danger is the same one this book has been circling: to be made legible is to be rendered as data, and a schema that makes you readable makes you readable only in the dimensions it was built to read. Everything else vanishes.

The auto-icon. Bentham had his corpse preserved, dressed in his own clothes, and seated upright in a glass cabinet at University College London, where it remains. He called it the “auto-icon”—a self-image—and argued that everyone's remains should be so preserved, eliminating the waste of burial. The gesture is funny and quietly monstrous, and the monstrousness locates the limit of his philosophy precisely. To treat a human body as a resource to be optimized against waste is to miss something almost everyone feels about a corpse. That near-universal intuition—that the calculus is missing something—is not a superstition to be reformed away. It is the signal that consciousness and meaning are not scalars, and that a philosophy that can only represent them as scalars will, when fully implemented, produce something that is rationally consistent and humanly intolerable.

In the [YOU] on AI Field Guide

Origin

Key Ideas

Related Entries

Further Reading