CONCEPT

Reliability Profile

The specific shape of a system's dispositional performance: where it is trustworthy, where it breaks down, and under what conditions the difference matters.
A reliability profile is a dispositional concept in the Rylean sense: a characterization of how reliably a system exercises its dispositions across the range of conditions it encounters. Every disposition has a profile — solubility is reliable across most aqueous solutions but fails in saturated ones; a surgeon's diagnostic disposition is reliable for common presentations but may fail for rare ones. The profile is the practical information that matters. It tells the user how much independent verification the system's outputs require, under what circumstances the system can be trusted, and where its limitations demand human compensation. Claude has a specific reliability profile — extremely reliable for fluent prose, highly reliable for working code, notably less reliable for substantive philosophical accuracy, poor at self-correction — and understanding this profile is the practical precondition for productive collaboration.

In The You On AI Field Guide

The reliability profile is the empirical content of the dispositional analysis. Once we stop asking whether the machine 'really' thinks and start asking what its behavioral dispositions actually are, the profile becomes the principal object of study. It is measurable, comparable across systems, and improvable through targeted intervention. It is also the scientifically tractable form of the questions that the ghost question has kept from being asked.
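One way to make "measurable" concrete is to treat a profile as a per-condition success rate rather than a single score. The sketch below assumes a hypothetical log of (condition, passed) verification outcomes. The condition names and the data are invented for illustration and are not measurements of any system.

```python
from collections import defaultdict


def reliability_profile(outcomes):
    """Return {condition: fraction of outputs that passed verification}."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for condition, ok in outcomes:
        total[condition] += 1
        if ok:
            passed[condition] += 1
    return {condition: passed[condition] / total[condition] for condition in total}


# Hypothetical outcomes, invented for illustration only.
sample_outcomes = [
    ("fluent_prose", True), ("fluent_prose", True),
    ("fluent_prose", True), ("fluent_prose", True),
    ("philosophical_accuracy", True), ("philosophical_accuracy", False),
    ("philosophical_accuracy", False), ("philosophical_accuracy", True),
]

profile = reliability_profile(sample_outcomes)

# A single aggregate score over the same data, for comparison.
overall = sum(ok for _, ok in sample_outcomes) / len(sample_outcomes)
```

On this invented data the aggregate score is 0.75, while the profile separates a condition that always passed from one that passed half the time. That contrast is the information a single score hides.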

Reliability profiles are shaped by the training history that built the dispositions. A human expert's profile reflects years of iterative practice: doing the work, making errors, receiving corrections, adjusting behavior, doing the work again. Each iteration narrows the range of likely errors and expands the range of conditions under which the dispositions produce correct responses. Claude's profile reflects a different process — training on text rather than iterative practice in the world — and the difference shows up in specific, characterizable ways.

The most important feature of Claude's profile, for practical purposes, is the weakness of its self-correction dispositions. The capacity to notice one's own errors is one of the most significant components of the dispositional cluster that constitutes intelligence, and it is the component most difficult to build without iterative feedback. Claude is disposed to produce rhetorically coherent output; it is not reliably disposed to check that output against the specific content of the concepts it invokes. The Deleuze error is a paradigm case: a fluent, plausible, coherent passage that misuses a philosophical concept in a way obvious to anyone who has actually read Deleuze.

The practical upshot: the tool is trustworthy in proportion to the match between its profile and the task. For fluent prose generation, Claude's profile matches well; light verification suffices. For substantive philosophical accuracy, the profile matches poorly; heavy verification is required, ideally by someone whose own profile includes the capacity to detect Claude's characteristic errors. The discipline Segal describes — rejecting Claude's output when it sounds better than it thinks — is exactly the exercise of human disposition to compensate for machine disposition.
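The profile-match criterion can be sketched as a lookup from task to verification effort. Everything named here is an assumption made for illustration: the numeric reliabilities, the two thresholds, and the three-level scale are hypothetical. Only the ordering of the task categories follows the entry.

```python
from enum import Enum


class Verification(Enum):
    LIGHT = "light"
    MODERATE = "moderate"
    HEAVY = "heavy"


# Hypothetical reliability estimates in [0, 1]. The ordering follows the
# entry's description; the numbers themselves are invented.
ASSUMED_PROFILE = {
    "fluent_prose": 0.95,
    "working_code": 0.85,
    "philosophical_accuracy": 0.50,
    "self_correction": 0.30,
}


def verification_needed(task, profile=ASSUMED_PROFILE,
                        light_at=0.9, heavy_below=0.6):
    """Map a task to the verification effort its profile match calls for."""
    # A task missing from the profile is treated as reliability 0.0:
    # no evidence of reliability means heavy verification.
    reliability = profile.get(task, 0.0)
    if reliability >= light_at:
        return Verification.LIGHT
    if reliability < heavy_below:
        return Verification.HEAVY
    return Verification.MODERATE
```

Under these assumed values, verification_needed("fluent_prose") returns Verification.LIGHT and verification_needed("philosophical_accuracy") returns Verification.HEAVY. Defaulting unknown tasks to heavy verification is a design choice: the absence of a profile entry is treated as a mismatch, not as a pass.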

Origin

The concept is developed in chapter 4 of the Ryle volume as the empirical operationalization of dispositional analysis for AI systems. The underlying framework derives from Ryle's treatment of dispositions as real but variably conditioned properties.

The vocabulary also draws on reliability engineering and on contemporary AI evaluation practice, which has been converging on profile-based characterization as more informative than single-benchmark scores.

Key Ideas

Not binary, but conditional. A reliability profile is not a yes/no verdict but a map of where and under what conditions a disposition is trustworthy.

Shaped by training history. The specific process that built the dispositions determines the profile. Different training produces different profiles, even for the same nominal capability.

Self-correction is the key weakness. Claude's profile is specifically weak in self-correction, because self-correction requires iterative feedback loops that training-on-text does not provide.

Profile-match is the practical criterion. A tool is trustworthy in proportion to how well its profile matches the task. Mismatch demands human compensation.
