
The cycle built around [YOU] on AI offers the first sustained empirical account of what the AI transition feels like from inside a working organization. Coyle offers the complementary account of what it looks like from the perspective of the systems designed to evaluate it—and her verdict is that those systems cannot see what is happening. The productivity statistics will record the twenty-fold multiplier as an unambiguous gain. They will not record whether the gain was achieved through efficiency (the same output with less cognitive strain) or intensity (twice the output at twice the cognitive cost). They will not record the household production gap—the domestic and relational work displaced when the exhilaration of AI-augmented building colonizes the hours previously reserved for the people at home. They will not record the quality dimension: whether the ten features shipped in the time previously required for one represent ten genuine improvements or ten competent-but-shallow artifacts.
Her principle that counting governs managing is the sharpest formulation of what this means for governance: the policy apparatus responds to what is measured, and what is measured is production. A government guided by production metrics alone will celebrate the AI transition and invest in digital infrastructure to accelerate it. A government guided by metrics that could also see depletion, displacement, and quality erosion might ask different questions. The transition is generating gains that the dashboard can see and costs that it cannot; the gains will be celebrated and the costs will compound until they surface as a burnout wave, a quality crisis, or a political backlash of which the smooth metrics gave no warning.
The invisible surplus argument is where Coyle’s framework most directly engages the Orange Pill’s core observation about the democratization of capability. When AI enables a marketing manager to build a custom tracking tool in an evening that she would previously have paid a developer to build, she creates real value that generates no market transaction and therefore no GDP contribution. When every knowledge worker becomes a potential software producer through the language interface, the production boundary of the economy is being redrawn in real time by millions of individual decisions—and no statistical office has updated the framework to capture the result. The invisible surplus from AI tools is potentially the largest single economic effect of the transition, and the dashboard cannot find it.
Diane Coyle (b. 1961) trained as an economist at Oxford and Harvard and built her career at the intersection of academic research, policy advisory work, and public intellectual writing. She was a senior advisor to the UK Treasury in the 1990s and has since served on the UK’s Migration Advisory Committee, the BBC Trust, and numerous other policy bodies. She co-directs the Bennett Institute for Public Policy at Cambridge, where her research program on the political economy of AI and digital technology has produced some of the most technically rigorous empirical work on AI measurement currently available.
Her 2014 book GDP: A Brief but Affectionate History is the definitive popular account of the metric’s origins, capabilities, and limitations—affectionate because she respects what GDP does, brief because the argument is simple even though its institutional implications are not. Her 2021 Cogs and Monsters: What Economics Is, and What It Should Be expanded the critique to the broader methodological complacency of economics as a discipline. Her 2025 The Measure of Progress synthesizes the measurement reform argument in its most mature form. Her 2025 working paper with Jörden and Poquiz on AI adoption determinants, and her Stanford Digital Economy Lab white paper on measuring AI, represent her most recent empirical engagement with the specific measurement challenges the AI transition creates.
She is unusually honest about the political economy of measurement reform—about the difference between winning the argument intellectually and changing the institutions that embed the old metrics. GDP is not kept in place by bad economists or corrupt politicians; it is kept in place by quarterly reporting cycles, central bank models, international lending conditionality, and the career incentives of every policy analyst who has ever written a brief calibrated to GDP. The same institutional infrastructure that Coyle is trying to reform is the infrastructure she operates within, and she has never pretended the tension away.
GDP as wartime instrument. Simon Kuznets delivered the first national income accounts to the U.S. Senate in 1934, and by 1944 Bretton Woods had adopted the metric as the international standard. Kuznets himself warned Congress that national welfare could scarcely be inferred from a measurement of national income. He was heard, noted, and ignored. An instrument designed to count factory output during mobilization became the universal standard for comparing economies in peacetime, and its architecture has never been updated for a world in which the dominant form of production is cognitive rather than physical.
The invisible surplus. Consumer surplus, the value people derive from a good beyond what they pay for it, has always escaped market measurement; the digital economy made the gap enormous by making the price zero. AI extends the gap further by turning users into producers: the marketing manager who builds a custom tool with Claude is producing real value that generates no market transaction. The invisible surplus from AI-assisted personal production is potentially larger than any measured productivity gain, and it is distributed in ways that income inequality statistics cannot capture, because the capability expansion reaches people regardless of their prior economic position.
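The arithmetic behind the marketing manager example can be made concrete with a small sketch. Every figure here is hypothetical and chosen only for illustration, and the accounting is deliberately simplified: it treats the price paid as the only thing the national accounts record and ignores any subscription fee for the AI tool, which would itself be a counted transaction.

```python
# Illustrative only: all figures are hypothetical, not estimates from Coyle.
TOOL_VALUE_TO_USER = 3000.0  # assumed value of the tracking tool to the manager
DEVELOPER_FEE = 2000.0       # assumed price she would have paid a developer


def measured_output(price_paid: float) -> float:
    """Simplified national-accounts view: only the market transaction counts."""
    return price_paid


def unmeasured_surplus(value_to_user: float, price_paid: float) -> float:
    """Value the user receives beyond what any transaction records."""
    return value_to_user - price_paid


# Same tool, same value to the user; only the price paid differs.
scenarios = {
    "hire a developer": DEVELOPER_FEE,
    "build it herself with AI": 0.0,
}

for name, price in scenarios.items():
    print(
        f"{name}: measured={measured_output(price):.0f}, "
        f"unmeasured={unmeasured_surplus(TOOL_VALUE_TO_USER, price):.0f}"
    )
```

Under these assumptions the tool is worth the same to its user in both scenarios, yet measured output falls from 2000 to zero when she builds it herself, while the unrecorded surplus rises from 1000 to 3000.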
The household production gap. The systematic exclusion of unpaid domestic labor from national accounts is a structural failure Marilyn Waring documented in 1988; Coyle has argued for decades that it must be corrected. The AI transition supercharges the problem: when AI-augmented work is supernormally stimulating and the flow state it produces competes with domestic engagement for the same hours, the household production that is displaced does not appear in any metric, and the policies shaped by those metrics will not address the displacement. The household production gap is not a rounding error; it is where the parenting, the caregiving, and the relational maintenance that sustain human capability are recorded—which is to say, nowhere.
Efficiency vs. intensity. The productivity statistic divides output by labor input but cannot distinguish an efficiency gain (the same output with less effort) from an intensity gain (more output with proportionally more cognitive strain). The Berkeley study on AI-augmented knowledge workers that the cycle documents—where AI colonized pauses, expanded task scope, and produced what the researchers called “task seepage”—shows intensity masquerading as efficiency. The cognitive intensity metric Coyle proposes would attempt to distinguish them; without it, the policy conversation proceeds as though the sustainability of the gain has been established when it has only been assumed.
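A minimal sketch shows why the ratio cannot tell the two cases apart. The `strain_per_hour` index is a hypothetical stand-in for the kind of cognitive intensity measure described above, not a metric Coyle has specified, and all numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Period:
    output_units: float     # e.g. features shipped
    hours_worked: float     # labor input as the statistics record it
    strain_per_hour: float  # hypothetical cognitive-strain index (1.0 = baseline)


def measured_productivity(p: Period) -> float:
    """Output per hour: the only ratio the standard statistic sees."""
    return p.output_units / p.hours_worked


def strain_adjusted_productivity(p: Period) -> float:
    """Output per strain-weighted hour (an illustrative adjustment only)."""
    return p.output_units / (p.hours_worked * p.strain_per_hour)


baseline = Period(output_units=10, hours_worked=40, strain_per_hour=1.0)
# Efficiency: the same output in half the hours, at baseline strain.
efficiency_gain = Period(output_units=10, hours_worked=20, strain_per_hour=1.0)
# Intensity: twice the output in the same hours, at twice the strain.
intensity_gain = Period(output_units=20, hours_worked=40, strain_per_hour=2.0)

for name, p in [
    ("baseline", baseline),
    ("efficiency", efficiency_gain),
    ("intensity", intensity_gain),
]:
    print(
        f"{name}: measured={measured_productivity(p):.2f}, "
        f"strain-adjusted={strain_adjusted_productivity(p):.2f}"
    )
```

With these invented numbers both scenarios double measured productivity from 0.25 to 0.50 units per hour. Only the strain-adjusted ratio separates them: the efficiency case stays at 0.50, while the intensity case falls back to the baseline 0.25.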
Counting governs managing. Measurement systems are not neutral recording devices; they are incentive structures. What they measure, they reward; what they cannot measure, they penalize by neglect. A measurement system that counts output without assessing quality does not merely fail to capture quality; it actively discourages the investments—in time, attention, and the friction of genuine understanding—that quality requires. The AI transition, measured by the current system, will look like an unambiguous triumph. Measured by a system that could also assess the sustainability of the gains, the quality of the outputs, and the wellbeing of the people producing them, it might look like something more complicated.
Coyle’s measurement reform agenda faces three consistent lines of resistance. The first is institutional: GDP has been integrated into quarterly reporting cycles, central bank models, and international lending conditionality for eighty years, and the political economy of displacing it—even with supplements—is formidable. No new metric has yet achieved the institutional weight that would make it a genuine input to economic governance rather than supplementary information. The second is methodological: critics argue that composite or multi-dimensional wellbeing indices are inherently political—that the weights assigned to different components reflect value choices that cannot be made by statisticians and should not be made by economists. Coyle’s response is that GDP is equally political in its architecture (it values market transactions and not household production, quantity and not quality) but conceals this politics behind the appearance of objectivity. The third is AI-specific: some economists argue that the productivity J-curve—the pattern in which transformative technologies produce apparent stagnation before gains materialize—explains the current measurement gap, and that the statistics will catch up to the reality eventually without methodological reform. Coyle’s counter-argument, developed in her October 2025 essay on measuring AI’s economic impact, is that the J-curve explanation applies to technologies that eventually produce measurable physical output, whereas AI’s primary impact operates through decision quality and organizational transformation—neither of which will appear in a system designed to count widgets, however long we wait. The measurement dashboard is not merely lagging; it is constitutionally blind to the relevant signals.