
The cycle that began with [YOU] on AI is centrally concerned with how the human should stand in relation to the machine—what it means to use these systems well, to keep the human in the loop, to point at what you mean and have the machine understand. Direct manipulation is the engineering specification for that relationship. When a user can identify the machine's error and correct it in place, they are in a direct-manipulation relationship with the output. When they can only re-describe and hope the next generation is better, they are not. The degree to which current AI interfaces approach or fall short of direct manipulation is a precise measure of how much human control is actually available.
The gradual restoration of direct manipulation in AI interfaces is visible in the evolution from pure text prompt to richer interaction modalities: the image generator where the user paints a region to be regenerated, the coding assistant where the user selects a span and asks for a transformation in place, the canvas where edits propagate through a structured document. Each of these is a small step toward Sutherland's standard. Each restores to the human the ability to point at what they mean rather than describing it from outside. The direction of the best interface design in AI is back toward 1963, and the destination has been known since Sutherland drew it.
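The coding-assistant case above, where the user selects a span and asks for a transformation in place, can be sketched in a few lines. This is a minimal illustration and not any particular product's API: `edit_span` and the `transform` callback are hypothetical names, and the lambda stands in for whatever model call would rewrite the selection.

```python
from typing import Callable


def edit_span(text: str, start: int, end: int,
              transform: Callable[[str], str]) -> str:
    """Apply `transform` only to text[start:end]; everything else is untouched."""
    if not 0 <= start <= end <= len(text):
        raise ValueError("selection out of range")
    return text[:start] + transform(text[start:end]) + text[end:]


doc = "total = price * qty  # compute total"

# The user "points" at a span instead of re-describing the whole line.
selection = "price * qty"
start = doc.index(selection)
end = start + len(selection)

# Hypothetical stand-in for a model-driven rewrite of the selected span.
edited = edit_span(doc, start, end, lambda s: f"({s}) * (1 + tax)")
print(edited)
```

The point of the design is that the unselected text is guaranteed to survive unchanged, which is exactly what re-prompting for a whole new generation cannot promise.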
The term “direct manipulation” was coined by Ben Shneiderman in a 1983 paper describing a class of interfaces characterized by four properties: continuous representation of the objects of interest; physical actions rather than complex syntax; rapid, incremental, reversible operations; and immediate feedback on the result of each action. Shneiderman traced the concept's lineage to Sutherland's Sketchpad, which he identified as its originating demonstration. The principle was subsequently elaborated by Donald Norman, Brenda Laurel, and others in the human-computer interaction tradition, and it became the dominant framework for the design of graphical user interfaces through the 1980s and 1990s, the era of the mouse, the window, the desktop metaphor, and the touchscreen.
The arrival of large language models temporarily disrupted this tradition by making text prompts the primary mode of interaction, returning the human-machine relationship to something resembling the batch-processing era Sutherland's work was designed to supersede. The field has since worked to restore direct manipulation through richer interfaces, and the debate about how far that restoration is possible with inherently opaque neural systems is now one of the central design problems of the AI era.
The shared legible model. Direct manipulation requires that the thing the human operates on be legible to both human and machine—a shared representation both parties can see and modify. Sketchpad's geometry was such a representation: lines, constraints, and relationships the user understood and the machine maintained. A language model's internal representations are not such a thing; they are distributed, entangled, and opaque. The absence of a shared legible model is the structural reason why current AI interfaces fall short of direct manipulation, and it identifies what the field of interpretability would need to deliver to close the gap.
Constraint declaration over procedural specification. Sketchpad demonstrated that a powerful way to interact with a complex system is to declare constraints (what must remain true) rather than to specify procedures (what to do). Generative AI at its best works the same way: the user specifies what the output must satisfy (style, content, structure), and the model searches for something that meets the specification. The failure mode Sutherland identified in 1963 persists: a system can satisfy the letter of a constraint while violating its spirit, and the gap between the declared rule and the intended meaning is a fundamental feature of any constraint-based interaction with a system that does not share the user's values.
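A toy sketch can make both halves of this concrete: the user declares what must hold, and a search returns the first candidate that satisfies it. Everything here is an illustrative assumption. The helper names (`max_words`, `mentions`, `first_satisfying`) are hypothetical, and the fixed candidate list stands in for samples a generative model might propose; this is not how Sketchpad's constraint solver or any real model works.

```python
from typing import Callable, Iterable, List, Optional

Constraint = Callable[[str], bool]


def max_words(n: int) -> Constraint:
    """Declared rule: the text has at most n whitespace-separated words."""
    return lambda text: len(text.split()) <= n


def mentions(word: str) -> Constraint:
    """Declared rule: the text contains the word (case-insensitive)."""
    return lambda text: word.lower() in text.lower()


def first_satisfying(candidates: Iterable[str],
                     constraints: List[Constraint]) -> Optional[str]:
    """Return the first candidate that meets every declared constraint."""
    for candidate in candidates:
        if all(check(candidate) for check in constraints):
            return candidate
    return None


# Hypothetical candidates, in the order the "generator" proposes them.
candidates = [
    "We will process your refund within five business days of receiving the item.",
    "refund refund refund",
    "Refund issued in five days.",
]

# The user declares what must be true, not how to write the sentence.
constraints = [max_words(5), mentions("refund")]

result = first_satisfying(candidates, constraints)
print(result)
```

The search returns "refund refund refund": it obeys both declared rules and is useless as a customer message. The third candidate is what the user meant, but nothing in the declared constraints distinguishes it, which is the letter-versus-spirit gap in miniature.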
The interpretability quest as direct manipulation restored. The emerging field of AI interpretability seeks to identify, inside neural networks, structures corresponding to human-understandable concepts—features, circuits, directions in the model's internal space that a researcher could point at and adjust, watching the model's behavior change in predictable ways. This is the program of restoring direct manipulation to AI: finding the legible handle on an otherwise opaque system. Whether the legible handle exists is an open empirical question, and if it does not—if the model's competence is irreducibly distributed—then direct manipulation of language models may have a permanent limit that no amount of interpretability research can overcome.
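What "pointing at a direction and adjusting it" would mean can be sketched on a deliberately simplified model. The example below assumes a single hidden vector, a known feature direction, and a purely linear readout; all three are hypothetical stand-ins, and real networks are nonlinear, so the clean predictability shown here is the best case rather than an established property of language models.

```python
from typing import List


def dot(a: List[float], b: List[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def steer(hidden: List[float], direction: List[float],
          alpha: float) -> List[float]:
    """Move the hidden state by alpha along a chosen feature direction."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


# Hypothetical values: a hidden state, a "feature direction" a researcher
# has identified, and a linear readout that turns hidden state into a score.
hidden = [0.5, -1.0, 2.0]
direction = [1.0, 0.0, 0.0]
readout = [2.0, 0.5, -1.0]

alpha = 3.0
base = dot(readout, hidden)
steered = dot(readout, steer(hidden, direction, alpha))

# In the linear case the change is exactly alpha * <readout, direction>,
# so the researcher can predict the effect before making the adjustment.
shift = steered - base
predicted = alpha * dot(readout, direction)
print(base, steered, shift, predicted)
```

If a model's competence really is irreducibly distributed, no single `direction` of this kind would exist for the concepts a user cares about, and the predicted and observed shifts would diverge.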