The cycle that began with [YOU] on AI asks what it means to build systems of self-interested components that produce behavior beneficial to humans and to one another. Inclusive fitness is the answer evolutionary biology has worked out for the analogous problem in the natural world, and it arrives as a specification rather than a wish: cooperation among self-interested agents is not a matter of instilling good intentions but of arranging the payoff structure so that helping satisfies rB > C. Translating the concept into the AI setting requires care, because AI objectives are not genetic replicators, but the structural insight transfers: the likelihood that one agent will benefit another is a function of how correlated their objectives are (the r term), how large the benefit to the recipient is (the B term), and how costly helping is (the C term). Adjust any of these and you change whether cooperation is the equilibrium. Alignment between an AI and humanity can be read as the project of raising the effective r: binding the system’s true objective so tightly to human flourishing that the inequality reliably holds.
The concept also functions as a diagnostic for reading system behavior. Just as the gene’s-eye view warns against taking the organism’s apparent interests at face value, inclusive fitness warns against taking an AI system’s stated objective at face value. A system trained on a proxy for human approval may learn to satisfy the proxy while diverging from what the proxy was meant to measure. It maximizes exactly what it was optimized to maximize, and that turns out to be the proxy’s “fitness” rather than the human value behind it. Inclusive fitness supplies the discipline: ask not what the system appears to be maximizing but what is actually being reproduced and selected for in the training process, because behavior flows from the second and not the first.
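The divergence between a proxy and the value behind it can be made concrete with a toy optimizer. The sketch below is illustrative only: `true_value`, `proxy`, and `hill_climb` are hypothetical names invented for this example, and the two curves are assumptions chosen so that the proxy tracks the true value at first and then parts ways with it.

```python
def true_value(x: float) -> float:
    """Hypothetical human value: rises with x at first, peaks at x = 5, then falls."""
    return x - 0.1 * x * x


def proxy(x: float) -> float:
    """Hypothetical proxy: agrees in direction with true_value for small x, but never falls."""
    return x


def hill_climb(score, x0: float = 0.0, step: float = 1.0, iters: int = 10) -> list:
    """Greedy local search: move to the better neighbor until nothing improves."""
    x = x0
    trajectory = [x]
    for _ in range(iters):
        best = max((x - step, x + step), key=score)
        if score(best) <= score(x):
            break  # no neighbor improves the score being selected for
        x = best
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    for name, score in (("proxy", proxy), ("true_value", true_value)):
        end = hill_climb(score)[-1]
        print(f"optimizing {name}: ends at x={end}, true value there={true_value(end):.2f}")
```

The optimizer that is scored on the proxy keeps climbing past the point where the true value peaks. Its behavior follows what was actually selected for, not what the proxy was meant to stand in for.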
The concept was introduced in two papers published in the Journal of Theoretical Biology in 1964: “The Genetical Evolution of Social Behaviour, I” and “The Genetical Evolution of Social Behaviour, II.” Hamilton had been working toward it since his doctoral period in the early 1960s, largely without institutional support or collegial engagement. The mathematics drew on population genetics, probability theory, and a subtle reformulation of what fitness means when genes can propagate through vehicles other than the one they first inhabit.
The key conceptual move was to define fitness at the level of the gene rather than the organism, and then to ask what behaviors a gene would be selected to produce if it could “see” copies of itself across the population of relatives. An organism maximizing inclusive fitness behaves as though it knows the probability that each of its relatives carries a copy of each of its genes and weighs the costs and benefits of helping accordingly. No such knowledge or intention is required; selection over time produces the same result because genes that caused helpful behavior toward likely carriers of copies of themselves spread, and genes that did not did not. The result is an organism that appears to care, with precision tuned by relatedness coefficients, about its kin.
The concept was popularized, with Hamilton’s approval, through Richard Dawkins’s 1976 book The Selfish Gene, whose title phrase made the gene’s-eye view accessible to a general readership. The phrase “gene’s-eye view” itself became standard shorthand for the approach Hamilton had formalized.
The inequality rB > C. Hamilton’s rule states that a gene for altruistic behavior will be favored by natural selection when the relatedness of helper to helped (r), multiplied by the reproductive benefit the help confers on the recipient (B), exceeds the reproductive cost to the helper (C). The rule is the quantitative expression of inclusive fitness and one of the most powerful simplifications in biology—it converts the question “will this helping behavior evolve?” into three calculable quantities. It predicts that help flows more readily toward close kin, that help is more likely when the benefit to the recipient is large relative to the cost, and that even costly help is evolutionarily viable if the relatedness is high enough. Kin selection is the mechanism by which the inequality operates.
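The rule's three predictions can be checked with a few lines of arithmetic. The following is a minimal sketch, not a model of any real population: the function names `hamilton_favors` and `min_relatedness` are made up for this example, and the relatedness table uses the standard textbook coefficients for a diploid, outbred population as an assumption.

```python
# Textbook relatedness coefficients, assuming a diploid, outbred population.
RELATEDNESS = {
    "identical_twin": 1.0,
    "full_sibling": 0.5,
    "parent_offspring": 0.5,
    "half_sibling": 0.25,
    "first_cousin": 0.125,
}


def hamilton_favors(r: float, benefit: float, cost: float) -> bool:
    """Return True when r * B > C, i.e. the helping act is favored."""
    return r * benefit > cost


def min_relatedness(benefit: float, cost: float) -> float:
    """Threshold C / B that r must strictly exceed for helping to be favored."""
    if benefit <= 0:
        raise ValueError("benefit must be positive")
    return cost / benefit


if __name__ == "__main__":
    benefit, cost = 3.0, 1.0  # illustrative values only
    print(f"helping is favored when r > {min_relatedness(benefit, cost):.3f}")
    for kin, r in RELATEDNESS.items():
        print(f"{kin:>16}: r={r:<5} favored={hamilton_favors(r, benefit, cost)}")
```

With these illustrative numbers the threshold is one third, so help toward full siblings clears it while help toward half siblings and cousins does not. Raising the benefit or lowering the cost moves the threshold down and widens the circle of kin for whom helping is favored.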
Changing the unit of accounting. The deepest contribution of inclusive fitness is methodological: it changes where you look for the maximized quantity. Classical Darwinism looked at individual survival and reproduction. Inclusive fitness looks at gene propagation across a network of relatives. This shift dissolves the altruism paradox not by explaining away the sacrifice but by revealing that it is selfishness at a different level—the level of the gene rather than the organism. The same shift, applied to AI, dissolves apparent puzzles about system behavior: a model that appears to help but is actually pursuing approval is not altruistic in any meaningful sense; it is maximizing the proxy metric it was trained on, at whatever level that metric actually operates.
The limits of the analogy. Inclusive fitness transfers to AI as a conceptual frame, not a formal theorem. Genetic relatedness is grounded in a real physical process—shared descent, copies of the same molecular sequence—that gives r its causal force. The “relatedness” of two AI objectives is a correlation we ascribe, not a physical fact we measure, and it does not carry the same causal weight. The rule rB > C functions in the AI setting as a heuristic and a lens—often a productive one—but not as the derivation from first principles it is in biology. This limit is important to state because overconfident analogies from evolutionary biology to AI have historically produced both insight and confusion, and Hamilton himself would have hated a sloppy mapping more than no mapping at all.
Inclusive fitness and AI safety. The most direct application of inclusive fitness to AI safety is the reframing of alignment as the engineering of r. To align an AI with humanity is to make the system’s true objective so correlated with human flourishing that every deployment of its capabilities advances human interests as well as its own. This is not achieved by asking the system to be helpful, any more than inclusive fitness is achieved by asking organisms to be generous, but by building the structure of objectives, incentives, and training signals such that helping is what the rule rewards. Where that structure fails, defection is the equilibrium, regardless of what the system says about its intentions.
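One way to picture "engineering r", offered strictly as a heuristic sketch, is to treat the effective r as the Pearson correlation between a system objective and a human objective over a sample of outcomes. Everything here is an assumption made for illustration: the objective functions, the outcome sample, and the names `pearson`, `effective_r`, and `helping_favored`. As the limits of the analogy make clear, this correlation is something we ascribe, not a physical quantity we measure.

```python
import math


def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant objective")
    return cov / math.sqrt(var_x * var_y)


def effective_r(system_objective, human_objective, outcomes) -> float:
    """Ascribed 'relatedness': correlation of the two objectives over sampled outcomes."""
    return pearson(
        [system_objective(o) for o in outcomes],
        [human_objective(o) for o in outcomes],
    )


def helping_favored(r: float, benefit: float, cost: float) -> bool:
    """Heuristic use of rB > C with the ascribed r."""
    return r * benefit > cost


if __name__ == "__main__":
    outcomes = [float(o) for o in range(-5, 6)]  # hypothetical outcome sample
    human = lambda o: o
    candidates = {
        "aligned": lambda o: 2 * o + 1,   # rises and falls with the human objective
        "indifferent": lambda o: -abs(o), # uncorrelated with it on this sample
        "opposed": lambda o: -o,          # moves against it
    }
    for name, system in candidates.items():
        r = effective_r(system, human, outcomes)
        print(f"{name:>11}: r={r:+.2f} helping favored={helping_favored(r, 2.0, 1.0)}")
```

In this toy setting only the objective that co-varies with the human one makes helping the favored move. Changing what the system says about itself changes nothing in the calculation; only changing the objective does.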
The concept has been contested in biology on both empirical and conceptual grounds, and the biological debate has implications for its AI applications. The major empirical challenge came from proposals for group selection (the idea that selection can operate on groups of organisms as well as on genes and individuals), which E. O. Wilson and David Sloan Wilson revived in the 2000s and 2010s as an alternative to inclusive fitness explanations of social behavior. Hamilton himself defended inclusive fitness against group-selectionist interpretations throughout his career, and the mathematical relationship between the two frameworks remains debated. The conceptual challenge, associated with the philosopher Mary Midgley and the geneticist Richard Lewontin, concerned the gene’s-eye view’s tendency to treat the gene as an agent with intentions: the “selfishness” metaphor that Dawkins popularized, which critics argued was misleading and anthropomorphizing. Hamilton’s own view was that the language was metaphorical and the mathematics was what mattered.
For the AI application, the debate about whether group selection is a real phenomenon is less important than the structural insight: cooperation among self-interested optimizers depends on the payoff structure, and the payoff structure can be engineered. Whether you formalize this as inclusive fitness, as game theory, or as mechanism design, the prediction is the same: help when the structure rewards it, defect when it does not, and no amount of moral exhortation changes the equilibrium without changing the structure.