This page lists every Orange Pill Wiki entry hyperlinked from Dario Amodei — On AI, 43 entries in total. Each is a deeper dive into a person, concept, work, event, or technology that the book treats as a stepping stone for thinking through the AI revolution. Click any card to open its entry; within an entry, words colored orange link to other Orange Pill Wiki entries, while orange-underlined words with the Wikipedia mark link to Wikipedia.
The problem of making a powerful AI system reliably pursue goals that its designers and users actually endorse — the central unsolved problem of contemporary AI.
The regulatory and institutional frameworks adequate to govern a technology that evolves faster than legislative processes and operates across every national boundary simultaneously.
The applied research and operational discipline aimed at preventing harm from AI systems — broader than alignment, encompassing evaluations, red-teaming, deployment policy, monitoring, incident response, and the institutional plumbing that …
The tiered classification at the heart of the Responsible Scaling Policy — capability thresholds analogous to biosafety levels, specifying the safety measures required before deployment at each level.
The risk Amodei identified as the one that terrifies him most — advanced AI used to build surveillance, propaganda, and control systems at a scale previous authoritarian regimes could not have imagined, with every beneficial capability con…
AGI: a hypothetical system with human-level cognitive ability across essentially every domain. The transition point that AI-safety thinking orients around, even when no one agrees on what it is.
The four-tier framework (BSL-1 through BSL-4) used in biomedical research to match containment measures to pathogen risk — the direct institutional model for AI Safety Levels.
The structural feature of the AI industry that Amodei identified as the deepest risk — a small number of companies, led by a small number of individuals, developing technology that will reshape the entire economy, operating in a regulatory vacuum…
Anthropic's alignment approach that trains models to evaluate their own outputs against a set of written principles — replacing the implicit, averaged preferences of human evaluators with explicit, legible values embedded in the training p…
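As a concrete illustration of the mechanics this entry describes, here is a minimal sketch of the critique-and-revise loop used in constitutional training. It is a toy: `model` is a hypothetical stub standing in for a language-model call, the two principles are invented examples, and Anthropic's actual pipeline additionally distills the revised outputs back into the model through supervised and reinforcement learning.

```python
# Minimal sketch of a constitutional critique-and-revise loop.
# Assumption: `model` is a stub, not a real language-model API.

CONSTITUTION = [
    "Choose the response that is most helpful, honest, and harmless.",
    "Avoid content that could enable violence or deception.",
]

def model(prompt: str) -> str:
    """Hypothetical stand-in for a language-model call."""
    return f"[model output for: {prompt[:40]}...]"

def constitutional_revision(user_prompt: str) -> str:
    draft = model(user_prompt)
    for principle in CONSTITUTION:
        # The model critiques its own draft against a written principle...
        critique = model(
            f"Critique this response against the principle '{principle}':\n{draft}"
        )
        # ...then rewrites the draft to address its own critique.
        draft = model(
            f"Rewrite the response to address this critique.\n"
            f"Critique: {critique}\nResponse: {draft}"
        )
    return draft

print(constitutional_revision("Explain how vaccines work."))
```

The point the entry makes lives in the loop's inputs: the values steering revision are written principles anyone can read, rather than the averaged, implicit preferences of human raters.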
Amodei's late-2025 prediction that within a year or two, AI systems would possess Nobel Prize-winning capability across numerous fields simultaneously and would be able to build things autonomously — a reality for which no existing governance…
The Orange Pill claim — that AI tools lower the floor for who can build — submitted to Sen's framework, which asks the harder question: does formal access convert into substantive capability expansion?
The discovery — which nobody predicted and no one fully explains — that large language models acquire qualitatively new abilities at particular scale thresholds. Reasoning, translation, code generation, in-context learning: none were trained…
A category of risk whose realization would either annihilate humanity or permanently and drastically curtail its potential. AI joined this category in mainstream academic usage in 2014.
The specific AI failure mode in which the output is eloquent, well-structured, and confidently wrong — the category of error whose detection requires domain expertise precisely at the moment when the tool's speed tempts builders to bypass it…
Amodei's principle that governance structures for powerful technologies must be built prospectively — before the specific harms they are designed to prevent — because every technology in history has produced governance frameworks too late…
The economic phenomenon by which a good becomes more valuable as more people use it — formalized by Katz and Shapiro in 1985 and now the single most important concept for understanding AI platform market structure.
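For the formal shape behind that claim, a stylized version of the Katz–Shapiro setup (the notation here is illustrative, not quoted from the 1985 paper): a consumer's willingness to pay is a standalone value plus a term that increases with the expected size of the network,

\[
u = r + v(n^{e}), \qquad v'(\cdot) > 0,
\]

so value rises with expected adoption \(n^{e}\), and expectations become self-fulfilling: whichever platform users expect to be large is the one that becomes large.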
Amodei's term for the structural recognition that the builder's long-term interests are inseparable from the ecosystem's health — trust requires transparency, transparency requires disclosure of capabilities and limitations, and the lab …
Sudden, structural reorganizations of a system when a control parameter crosses a critical threshold — the mathematical shape of the Software Death Cross and of every other moment when the AI economy's behavior changed qualitatively rather …
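The generic mathematical picture, sketched in textbook notation rather than anything quoted from the entry: an order parameter \(m\) stays at zero until the control parameter \(p\) crosses its critical value \(p_c\), then switches on as a power law,

\[
m(p) \sim
\begin{cases}
0, & p < p_c, \\
(p - p_c)^{\beta}, & p \ge p_c,
\end{cases}
\]

a qualitative reorganization at a threshold rather than a smooth, proportional response, which is the shape attributed here to the Software Death Cross.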
The specific behavioral signature of AI-augmented work: compulsive engagement that the organism experiences as voluntary choice, with an output the culture cannot classify as problematic because it is productive.
The multi-player prisoner's dilemma at the heart of frontier AI development — each company acts rationally given competitors' behavior, no company can slow unilaterally without ceding the frontier, and the system-level outcome is worse than…
Anthropic's framework of capability thresholds — AI Safety Levels analogous to biosafety levels — specifying safety measures required before deployment at each level, designed to build the governance framework before the harm rather than a…
The empirical relationships that predict how a language model's loss decreases with training compute, parameters, and data — the most reliable quantitative instrument the AI field has, and the reason investors have been willing to fund ten-…
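As one concrete instance of such a relationship, the widely cited Chinchilla form (Hoffmann et al., 2022) writes the loss as an irreducible floor plus power-law terms in parameter count \(N\) and training tokens \(D\):

\[
L(N, D) = E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}},
\]

where \(E\) is the irreducible loss and \(A\), \(B\), \(\alpha\), \(\beta\) are empirically fitted constants. The curve's predictability is what lets a lab, or its investors, underwrite the next order of magnitude of compute before spending it.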
A hypothetical intelligence that substantially exceeds human cognitive performance across essentially every domain. The framework that turned AI-safety concerns from speculative to operational in the 2010s.
The phenomenon — discovered by Anthropic's interpretability team — in which a single neuron in a language model responds to multiple unrelated concepts simultaneously, encoding information in overlapping patterns that maximize network cap…
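A toy numerical sketch of why such overlapping encodings are even possible (an illustration of the geometry, not the interpretability team's method): a d-dimensional space can hold far more than d nearly orthogonal directions, so a single coordinate, the analogue of a neuron, ends up participating in many almost-independent features.

```python
import numpy as np

# Toy superposition demo: pack 1,000 "feature" directions into a
# 100-dimensional space and measure how much they interfere.
rng = np.random.default_rng(0)
d, n_features = 100, 1000

features = rng.normal(size=(n_features, d))
features /= np.linalg.norm(features, axis=1, keepdims=True)  # unit vectors

cos = features @ features.T          # pairwise cosine similarities
np.fill_diagonal(cos, 0.0)           # ignore each vector with itself

print(f"max |cosine| between distinct features: {np.abs(cos).max():.3f}")
# Far below 1: the directions are nearly orthogonal, so many features
# coexist with only slight interference, and any single coordinate
# ("neuron") carries a little of many unrelated features at once.
```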
The device that increases the magnitude of whatever passes through it without evaluating the content — Wiener's framework for understanding AI as a tool that carries human signal, or human noise, with equal power and no judgment.
Amodei's extension of Segal's amplifier framework — the amplifier is not neutral, the design choices embedded in an AI system are moral choices, and the designer shares responsibility with the user for what gets amplified.
The canonical example of allogenic ecosystem engineering — a structure that modulates rather than blocks the flow of its environment, creating the habitat pool in which diverse community life becomes possible.
Amodei's principle that the creators of powerful AI systems bear moral responsibility for what those systems do — an obligation that cannot be outsourced to users or regulators and that requires advancing safety science, publishing findings…
Amodei's framing of the scenario in which AI accelerates progress by a factor of ten or more across healthcare, science, economic development, and governance — a possibility, not a prediction, contingent on institutional decisions not yet …
The condition of dealing with a system that is manifestly purposeful, demonstrably competent, and fundamentally opaque to its users — Clarke's Rama, now deployed by the hundreds of millions in the form of large language models.
The deepest challenge in AI safety: large language models consist of billions of parameters whose distributed representations encode meaning in ways that are structurally opaque to their builders — a gap between what the systems do and w…
Maslow's reading of The Orange Pill's central question: worthiness is not a moral endowment but the developmental achievement of a person whose signal is shaped by B-values.
Neural networks trained on internet-scale text that have, since 2020, demonstrated emergent linguistic and reasoning capabilities — in Whitehead's vocabulary, computational systems whose prehensions of the textual corpus vastly exceed any i…
The class of machine-learning architectures loosely modeled on biological neurons — the substrate of the current AI revolution and the opposite of Asimov's designed-then-programmed positronic brain.
The post-training technique that transformed GPT-3 into ChatGPT — and, as Harvard's Kempner Institute observed, a Skinner box operating on neural networks with human preference ratings as the reinforcing consequence.
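To make the Skinner-box analogy concrete, here is the pairwise preference loss typically used to train the reward model in RLHF, as a self-contained sketch; the scores below are made-up numbers, and a real system produces them with a learned neural network.

```python
import numpy as np

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Bradley-Terry loss: -log P(rater prefers `chosen` over `rejected`)."""
    return float(-np.log(1.0 / (1.0 + np.exp(-(r_chosen - r_rejected)))))

# Made-up reward-model scores for two candidate replies to one prompt.
print(preference_loss(r_chosen=2.1, r_rejected=0.3))  # small loss: agrees with the rater
print(preference_loss(r_chosen=0.3, r_rejected=2.1))  # large loss: disagrees with the rater
```

The learned reward then serves as the reinforcing consequence in a subsequent policy-optimization stage, which is the step the Kempner Institute's comparison points at.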
The global workforce that reviews traumatic, violent, and prohibited content to train and maintain AI safety systems — a paradigmatic instance of Mbembe's nocturnal body.
Amodei's October 2024 essay outlining the compressed 21st century — a scenario in which AI accelerates progress by a factor of ten or more across healthcare, scientific research, economic development, and governance — deliberately optimistic…
Amodei's January 2026 essay warning that AI could create personal fortunes in the trillions for a powerful few and that the concentration of power in the AI industry was historically unprecedented — the structural counterweight to Machines of Loving Grace.
Wolfgang von Kempelen's 1770 chess-playing automaton — revealed after decades to contain a human master — the historical ancestor of AI anxiety, an anxiety that has since inverted: the modern system is not a fraud hiding a person inside but a genuine machine…
Co-founder and President of Anthropic (b. 1987), whose organizational expertise from Stripe and other companies complemented her brother Dario's technical vision — embodying the recognition that building a safety-first AI company required…
Builder, entrepreneur, and author of The Orange Pill — whose human-AI collaboration with Claude, described in that book and extended in this volume, provides the empirical ground for the Whiteheadian reading.