The evaluative space is the dimension in which an assessment is conducted — the units, the currency, the metric by which progress or regress is measured. Sen's central methodological insight is that the choice of evaluative space determines what is visible and what is invisible in any welfare assessment. Choose income, and a society in which average income rises has progressed even if distribution worsens. Choose utility, and adaptive preferences can mask deprivation. Choose output, and AI's power can be celebrated while its effects on human freedom go unmeasured. The evaluative space problem is the analytical engine behind Sen's call for migration from output-based to capability-based evaluation of technology.
Sen's insight operates at a depth that most methodological debates miss. Arguments about which metric is best — GDP versus HDI, user satisfaction versus capability, output versus outcome — are typically framed as technical disputes about measurement precision. Sen's argument is deeper: the choice of evaluative space is not a technical decision but a value decision. It determines what counts as success, what counts as failure, and what counts as evidence for either. The metric is not a neutral instrument; it is a normative claim embedded in measurement methodology.
The AI application is specific. The technology industry evaluates AI in the output space — parameters, benchmarks, revenue, adoption, productivity. These metrics measure real phenomena but produce systematic blindness to phenomena they were not designed to capture. Worker burnout is invisible to productivity metrics. Capability set contraction is invisible to functioning metrics. Adaptive preferences render capability losses invisible to satisfaction metrics. Each blindness is a structural feature of the chosen evaluative space, not a failure of measurement precision within it.
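The structural character of this blindness can be made concrete with a toy sketch. All figures below are invented for illustration, not drawn from any study: one hypothetical workplace is scored before and after an AI rollout in three evaluative spaces, corresponding to the productivity, capability, and satisfaction metrics discussed above.

```python
# Toy illustration of structural blindness. All figures are hypothetical:
# one workplace before and after an AI rollout, scored in three spaces.
before = {
    "output_per_hour": 10.0,       # the output space (productivity)
    "reachable_options": 8,        # the capability space (size of option set)
    "reported_satisfaction": 0.70, # the utility space (satisfaction)
}
after = {
    "output_per_hour": 14.0,       # productivity rises...
    "reachable_options": 3,        # ...while real options contract...
    "reported_satisfaction": 0.75, # ...and adaptation lifts satisfaction too
}

def delta(metric):
    """Progress or regress as seen from a single evaluative space."""
    return after[metric] - before[metric]

# The output and utility spaces both report progress; only the
# capability space registers the contraction.
for metric in before:
    verdict = "progress" if delta(metric) > 0 else "regress"
    print(f"{metric}: {verdict}")
```

An assessor who consults only the first or third dictionary key sees unambiguous improvement; the contraction is not mismeasured but simply absent from those spaces.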
Sen's prescription is migration: a move from output to capability as the primary evaluative space for welfare assessment. The migration is not additive — it is not sufficient to add capability metrics alongside output metrics, because the two will produce conflicting signals and institutional decision-makers will tend to optimize for whichever signal is more immediately monetizable. The migration requires that capability metrics become primary, with output metrics treated as instrumental means rather than terminal ends.
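The difference between addition and migration can be sketched as two decision rules over invented data. The option names and scores below are hypothetical; the point is only that blending the two signals lets the monetizable one dominate, whereas a lexicographic rule makes capability primary and treats output as a tie-breaker.

```python
# A minimal sketch of "migration, not addition", under invented data.
# Two hypothetical deployment options, each scored in both spaces.
options = {
    "deploy_A": {"capability_score": 0.9, "revenue": 1.0},
    "deploy_B": {"capability_score": 0.6, "revenue": 2.5},
}

def additive(opts, w=0.5):
    """Addition: blend the two signals; the larger-scaled one dominates."""
    return max(opts, key=lambda o: w * opts[o]["capability_score"]
                                   + (1 - w) * opts[o]["revenue"])

def migrated(opts):
    """Migration: capability is primary; output only breaks ties."""
    return max(opts, key=lambda o: (opts[o]["capability_score"],
                                    opts[o]["revenue"]))

print(additive(options))  # deploy_B: revenue swamps the capability signal
print(migrated(options))  # deploy_A: capability decides, revenue is secondary
```

The lexicographic rule is one simple way to encode primacy; it treats output metrics as instrumental, consulted only when capability scores are equal.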
The evaluative space problem also operates at the level of policy. Regulatory frameworks have evaluative spaces built into their design. The EU AI Act evaluates systems by risk category. Corporate AI governance frameworks evaluate by compliance with principles. User research evaluates by satisfaction. Each evaluative space makes certain phenomena visible and others invisible. Capability-sensitive metrics — such as the Capability-Coverage Ratio and Life-Plan Alignment Score proposed by Saptasomabuddha and colleagues — represent specific proposals to migrate regulatory evaluation into the capability space.
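The Capability-Coverage Ratio is named here but not defined; the sketch below is a hypothetical illustration of the general idea (the fraction of capabilities in regulatory scope that a system leaves open to users), not the metric as its authors specify it. The capability lists are likewise invented.

```python
# Hypothetical coverage-ratio sketch. This is NOT the published
# Capability-Coverage Ratio, only an illustration of the general idea:
# the share of in-scope capabilities a system supports rather than
# forecloses.
def coverage_ratio(in_scope: set, supported: set) -> float:
    """Share of the capabilities in regulatory scope that remain open."""
    if not in_scope:
        raise ValueError("capability scope must be non-empty")
    return len(in_scope & supported) / len(in_scope)

# Invented capability lists for a hypothetical system under review.
scope = {"mobility", "expression", "work", "privacy", "association"}
supported = {"mobility", "expression", "work"}
print(coverage_ratio(scope, supported))  # 3 of 5 capabilities: 0.6
```

Even a crude ratio of this kind evaluates in the capability space: what it renders visible is the option set, not output or satisfaction.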
Sen introduced the evaluative space framework in his 1979 Tanner Lecture 'Equality of What?', delivered at Stanford University, and developed it across subsequent work, including Inequality Reexamined (1992).
What you measure is what you manage. The evaluative space determines which phenomena are visible to institutional decision-making.
Metrics are normative. The choice of evaluative space is a value decision, not a neutral technical choice.
Structural blindness. Every evaluative space produces systematic blindness to phenomena outside its scope.
Migration, not addition. Moving to capability-based evaluation requires making capability metrics primary, not merely supplementary.