Psychohistory is the fictional mathematical discipline at the heart of Isaac Asimov's Foundation series, where it is developed by the character Hari Seldon. It proposes that individual human behavior is unpredictable, but that the behavior of very large populations follows statistical regularities comparable to the laws of thermodynamics. Its premise, that aggregate human conduct carries a tractable signal, has gained relevance since large language models showed that human expression at scale encodes deep, exploitable regularities.
In the Orange Pill Asimov volume, psychohistory is treated as a surprisingly prescient early formulation of what training a model on civilizational-scale text actually does. The Foundation novels ask: if you could predict the arc of a galactic civilization thousands of years in advance, would you publish the predictions or conceal them? Psychohistory assumes concealment is required — if the population knows the predictions, behavior changes and predictions fail.
Large language models partially satisfy psychohistory's premises. They are trained on the largest accessible dataset of human behavior (text). They extract implicit statistical patterns across millions of contexts. And — crucially — the patterns are not accessible to the population, so knowing that the model exists does not permit the population to consciously resist its predictions. This is not psychohistory as Asimov imagined it. But it is closer to psychohistory than the intervening decades led anyone to expect.
Modern large-scale behavioral forecasting (election models, consumer-preference engines, recommender-system metrics) occupies a strange middle ground between Asimov's psychohistory and the older statistical social science it drew on. Unlike Quetelet's 19th-century data, contemporary models ingest trillions of tokens of human expression; unlike Hari Seldon's equations, they make no claim to long-horizon civilizational prediction. The interesting question the Foundation series raised is what happens when a statistical model becomes so accurate that its predictions are themselves a social force. That question is now a live operational concern at major platforms.
Psychohistory was introduced in the story "Foundation" (1942, magazine) and developed across the Foundation novels (1951 onward). Asimov explicitly modeled the idea on statistical mechanics: just as the motion of an individual gas molecule is effectively random while the ensemble obeys precise laws, he postulated that individual human behavior is random while the ensemble has discoverable laws.
The name was coined by Asimov; the closest real-world antecedent is the work of the 19th-century social statisticians (Adolphe Quetelet, Émile Durkheim) who argued that suicide, crime, and marriage rates have stable aggregate structure even when individual cases are unpredictable.
Law of large numbers applied to sociology. Individual behavior is noise; aggregate behavior is signal.
Self-defeating prediction. If the population learns the predictions, behavior shifts to invalidate them — so psychohistory must be practiced in secret.
The Seldon Plan. Long-horizon intervention: once the trajectory is known, small nudges at critical branch-points can steer the future.
The Mule. Asimov's fictional demonstration that any statistical model is vulnerable to a true outlier — a sufficiently unusual individual can break the predictions.
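The first idea in the list above, that individual behavior is noise while aggregate behavior is signal, can be illustrated with a toy simulation. This is a minimal sketch, not anything drawn from Asimov or from a real forecasting system. It assumes each person makes one independent yes/no choice with the same probability p, and the names simulate_population, aggregate_rate, and rate_spread are hypothetical, chosen only for this example.

```python
import random
import statistics


def simulate_population(n, p=0.3, seed=0):
    """Return n independent yes/no choices (1 or 0), each 'yes' with probability p.

    Independence and a shared p are simplifying assumptions of this sketch.
    """
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]


def aggregate_rate(choices):
    """Fraction of the population that chose 'yes'."""
    return sum(choices) / len(choices)


def rate_spread(n, trials=200, p=0.3):
    """Standard deviation of the aggregate rate across repeated populations of size n."""
    rates = [
        aggregate_rate(simulate_population(n, p, seed=trial))
        for trial in range(trials)
    ]
    return statistics.pstdev(rates)


# Any single choice is a coin flip weighted by p, but the aggregate rate
# varies less and less between populations as the population grows.
print("spread at n=100:   ", rate_spread(100))
print("spread at n=10000: ", rate_spread(10_000))
```

Under these assumptions the spread of the aggregate rate shrinks roughly in proportion to one over the square root of the population size, which is the law of large numbers at work. The sketch deliberately leaves out the other ideas in the list: it has no feedback from published predictions and no outlier with outsized influence, so it says nothing about self-defeating prediction or the Mule.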
Psychohistory presumes population immobility. Asimov's math required the modeled population to be large and unable to escape the system being predicted. The first condition is satisfied at internet scale. The second is what keeps the predictions valid: a population that can leave the system, or coordinate around its predictions, no longer behaves as the model assumes. On this reading, platforms have a strong incentive to prevent users from collectively coordinating around the predictions they generate.
Critics have long argued that psychohistory is closer to fantasy than to plausible social science: the number of significant variables in human civilization is unbounded, and the assumption of stationary laws over centuries is unjustified. Defenders respond that the rise of large-scale behavioral data — social media, purchasing records, language models — has produced something closer to operational psychohistory than the critics anticipated.
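The contemporary question is not whether aggregate prediction is possible (it demonstrably is, in narrow domains) but whether the predictions can be kept from the population whose behavior they model. Asimov's own treatment is not encouraging: psychohistory works only while it is practiced in secret, and the Mule shows that even a concealed model can be broken by a single true outlier.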