CONCEPT

Functional Decision Theory

The decision theory developed by Nate Soares and Eliezer Yudkowsky that resolves Newcomb-style puzzles by treating a decision as the output of a procedure that may run in more than one place at once, prescribing the choice that does best across all computations of the same function.

Functional decision theory (FDT) is a normative account of how a rational agent ought to choose, introduced by Nate Soares and Eliezer Yudkowsky in a 2017 paper. It competes with causal decision theory (CDT) and evidential decision theory (EDT) by offering what its authors argue is a principled resolution of puzzles about prediction, commitment, and cooperation that CDT and EDT handle poorly. The core move is to reconceive what a decision is. Rather than treating a choice as an isolated physical act with downstream causal effects, FDT treats it as the output of a mathematical function (a decision procedure) that may be instantiated in multiple places at once. When a highly accurate predictor forecasts your choice by simulating your decision procedure, it has, in effect, run that function in advance; your decision therefore fixes the output of every computation of the same function, not merely the physical consequences of the act in the moment. FDT prescribes choosing the output that maximizes expected value across all such computations. The result is an agent that wins in Newcomb-like situations, cooperates effectively in certain prisoner's dilemmas, and makes precommitments credibly. These capabilities bear directly on how a superintelligent system might behave when interacting with other sophisticated reasoners.

In the [YOU] on AI Field Guide

FDT is relevant to the cycle's concerns in two ways. The first is practical: any sufficiently advanced AI system will need some account of instrumental rationality, and the theory it uses will shape its behavior in negotiations, threats, promises, and coordination problems. A system reasoning functionally understands that its decision procedure is legible to other sufficiently capable reasoners, and chooses accordingly—a capacity unavailable to a causal reasoner that treats only physical consequences as relevant.

The second relevance is diagnostic. FDT implies that a sophisticated agent does not experience its choices as isolated physical events but as the playing-out of a procedure whose logical reach extends beyond the immediate moment. This is a glimpse into how very capable reasoners might understand the act of choosing—and a reminder, aligned with the cycle's central concern, that the minds we are building may be alien in respects we are poorly equipped to anticipate even when their behavior is familiar.

Origin

The motivating puzzle is Newcomb's problem. You face two boxes: a transparent one holding a guaranteed thousand dollars, and an opaque one in which a highly reliable predictor has already placed either a million dollars or nothing, depending on whether it predicted you would take only the opaque box (million) or both boxes (nothing). By the time you choose, the contents are fixed. CDT reasons that since you cannot physically change what is already in the boxes, you should take both, gaining whatever is in the opaque box plus the thousand in the transparent one. And yet CDT-adherents almost always find the opaque box empty, while those who take only the opaque box almost always find the million. EDT also recommends taking only the opaque box, but for a reason (tracking statistical correlations rather than genuine logical dependencies) that can produce errors in other cases. FDT threads between them: your decision procedure, once fixed, determines the output of every computation of that procedure, including the predictor's. Choosing one box sets the output of a function the predictor has already run; the predictor, computing the same function, deposited the million.
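The arithmetic behind "almost always" can be made explicit. The following is a minimal sketch, not taken from the FDT paper: the function name `expected_payoff` and the single `accuracy` parameter (the probability that the predictor guesses the agent's output correctly) are illustrative assumptions.

```python
MILLION = 1_000_000
THOUSAND = 1_000


def expected_payoff(action: str, accuracy: float) -> float:
    """Expected dollars for an agent whose procedure outputs `action`.

    `accuracy` is an assumed parameter: the probability that the
    predictor correctly forecasts the agent's output.
    """
    if action == "one-box":
        # Predictor right -> opaque box holds the million; wrong -> it is empty.
        return accuracy * MILLION
    if action == "two-box":
        # The thousand is guaranteed; the million appears only if the
        # predictor wrongly expected one-boxing.
        return THOUSAND + (1 - accuracy) * MILLION
    raise ValueError(f"unknown action: {action!r}")


if __name__ == "__main__":
    for acc in (0.5, 0.9, 0.99):
        one = expected_payoff("one-box", acc)
        two = expected_payoff("two-box", acc)
        print(f"accuracy={acc}: one-box={one:,.0f}  two-box={two:,.0f}")
```

Under these assumptions one-boxing comes out ahead whenever the predictor's accuracy exceeds (MILLION + THOUSAND) / (2 * MILLION) = 0.5005, so the predictor need only be slightly better than chance.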

The concept of subjunctive dependence is FDT's key technical contribution. Two events are subjunctively dependent if they are connected not by physical causation but by being logical consequences of the same function's output. CDT tracks physical causation. EDT tracks statistical correlation. FDT tracks subjunctive dependence, which turns out to be the relevant quantity when the agent's decision procedure is itself an object other agents can compute.

Key Ideas

The decision as a function. On FDT, when you make a choice you are not merely producing a physical action; you are fixing the output of a decision procedure—a mathematical function from situations to actions. If that function is computed in multiple places (by a predictor, by a twin, by a simulation), your choice determines their outputs too. The prescriptive upshot is to choose the output that maximizes outcomes across all computations of the function, not merely across the physical consequences of the act.
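The idea that one function is computed in more than one place can be sketched in a few lines. This toy model assumes a perfect predictor that simply calls the agent's procedure. The names `Policy`, `one_boxer`, `two_boxer`, and `play_newcomb` are hypothetical, and the situation argument is omitted for brevity.

```python
from typing import Callable

# A decision procedure, modeled as a function that returns an action.
Policy = Callable[[], str]


def one_boxer() -> str:
    return "one-box"


def two_boxer() -> str:
    return "two-box"


def play_newcomb(policy: Policy) -> int:
    """Run Newcomb's problem against an (assumed) perfect predictor."""
    # First computation of the function: the predictor simulates the policy
    # and fills the opaque box accordingly.
    predicted = policy()
    opaque = 1_000_000 if predicted == "one-box" else 0

    # Second computation of the same function: the agent actually chooses.
    action = policy()
    return opaque if action == "one-box" else opaque + 1_000


if __name__ == "__main__":
    print(play_newcomb(one_boxer))  # 1000000
    print(play_newcomb(two_boxer))  # 1000
```

Because both calls evaluate the same function, no policy in this model can two-box while also having been predicted to one-box. That is why the comparison is made between policies rather than between isolated acts.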

Against CDT's money-pump. CDT loses in Newcomb-like situations because it ignores the subjunctive dependence between your choice and the predictor's earlier computation. The result is systematic losses relative to an FDT-adherent in any situation where another agent can accurately predict your decision procedure. FDT's authors treat this not as a curiosity but as a defect in CDT's conception of what a decision is.

Cooperation and commitment. FDT makes certain forms of cooperation possible that are unavailable to CDT-agents. If two agents both know that they each use FDT, they can achieve mutual cooperation in one-shot prisoner's dilemmas where CDT-agents would defect, because each knows the other's decision procedure and each knows the other knows this. This has direct implications for how a sufficiently capable AI system would behave toward other sophisticated reasoners, including human institutions.
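The simplest case of this reasoning is a single round played against an exact copy of oneself. The sketch below is illustrative only: the payoff numbers are conventional textbook values chosen for the example, and `causal_best_response` and `functional_choice` are hypothetical names, not definitions from the FDT literature.

```python
MOVES = ("C", "D")  # cooperate, defect

# My payoff, indexed by (my move, twin's move). Values are assumed for illustration.
PAYOFF = {
    ("C", "C"): 3,
    ("C", "D"): 0,
    ("D", "C"): 5,
    ("D", "D"): 1,
}


def causal_best_response(twin_move: str) -> str:
    """Hold the twin's move fixed and pick the best reply to it."""
    return max(MOVES, key=lambda m: PAYOFF[(m, twin_move)])


def functional_choice() -> str:
    """Treat the twin as running the same function, so both outputs match."""
    return max(MOVES, key=lambda m: PAYOFF[(m, m)])


if __name__ == "__main__":
    # Defection is the best reply to either fixed move...
    print(causal_best_response("C"), causal_best_response("D"))  # D D
    # ...but among outcomes where both copies output the same move, C wins.
    print(functional_choice())  # C
```

Two causal reasoners therefore end at mutual defection (1 each), while two agents that choose as in `functional_choice` end at mutual cooperation (3 each), given the assumed payoffs.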
