CONCEPT

The Autonomy Slider

Andrej Karpathy’s instrument for thinking about human-AI collaboration: a continuum from full human control to full machine autonomy, calibrated not by ideology but by the cost of verifying that the machine did the right thing in each specific domain—a practical answer to the binary that dominates public debate.

The public debate about AI autonomy tends toward binaries: either humans control the machine or the machine controls itself, either we trust AI or we do not, either agents are safe or they are dangerous. Andrej Karpathy’s autonomy slider replaces the binary with a continuum, and replaces ideology with a concrete calibration principle. Rather than a fixed level of human oversight for all tasks, he proposes a slider the human moves depending on the task, the stakes, and the trustworthiness of the system in the specific domain at hand. At one end, the AI suggests and the human approves each step. In the middle, the AI executes larger chunks while the human supervises and verifies. At the far end, the AI runs autonomously and the human checks only the final result. The key insight is that the correct position on the slider is not a matter of general trust in the technology but of the cost of verification in that particular context: where checking the machine’s output is cheap and mistakes are recoverable, autonomy can be high; where verification is expensive and a confident error causes real harm, the human must stay close. The slider connects directly to Karpathy’s concept of jagged intelligence—because the capability profile of large language models has no smooth surface, the correct slider position differs across domains and cannot be determined by testing performance in one area and generalizing to others.

In the [YOU] on AI Field Guide

The autonomy slider is the cycle’s most practical answer to the question [YOU] on AI keeps circling: how does the human stay at the center when the machine is doing more and more of the work? The slider’s calibration principle—match autonomy to the cost of verification—operationalizes the orange pill’s insistence that the human’s role is not to compete with the machine but to direct it, verify it, and hold the judgment the machine cannot supply.

The Trivandrum training session Edo Segal describes is the autonomy slider moved thoughtfully upward for a specific team in a specific domain: the engineers moved from full ownership of every line of code (slider at zero) to supervision of architecturally significant decisions while delegating mechanical implementation (slider in the middle). The week of working alongside them was precisely the calibration work the slider demands: establishing where the system could be trusted and where human judgment remained essential.

The concept also reframes vibe coding, Karpathy’s 2025 term for the experience of building software by talking to an AI assistant and barely looking at the code being produced. Vibe coding is the slider pushed all the way to the autonomous end for a specific context: a throwaway project, a weekend experiment, a prototype where the stakes are low and the cost of a bug is an annoyance rather than a catastrophe. It is not how production-grade or safety-critical software should be built. The viral spread of the phrase flattened this nuance; the autonomy slider restores it. Wisdom consists in knowing which slider position a given context requires.

Origin

Karpathy introduced the autonomy slider concept to recover the nuance that his 2025 vibe coding coinage had lost in circulation. The term vibe coding had been taken up as an uncomplicated celebration of AI-assisted development—a justification for accepting AI-generated code without review in all contexts. Karpathy insisted on the distinction between play and production, between exploration and engineering, and the autonomy slider was his instrument for making that distinction precise.

The concept extends Karpathy’s long-standing argument that verification is the central bottleneck of AI deployment. The reason you cannot simply hand everything to the machine is that you cannot yet trust the machine, and the reason you cannot fully trust it is that verifying its output is often as hard as producing it. The slider, properly understood, is calibrated by the cost of checking: where verification is cheap, the slider can be high; where verification is expensive and mistakes are hard to detect and harmful when missed, the slider must be low. This is engineering wisdom dressed in a memorable phrase.

Key Ideas

Verification is the calibration principle. The correct position on the autonomy slider is not determined by the general capability of the AI system or by abstract trust in the technology. It is determined by how hard it is to verify that the machine did the right thing in the specific domain, and by what happens if it did not. A system that writes a flawed paragraph can be asked to try again. A system that deletes the wrong files, sends the wrong message, or spends money it should not have spent has taken an action that cannot be reversed as easily. The cost of an undetected mistake rises as we hand the system more autonomy, which is why the slider position must be set conservatively in any domain where verification is expensive or errors are consequential.
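The calibration principle can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than anything Karpathy has specified: the names `Mode`, `TaskContext`, and `recommend_mode` are hypothetical, the 0-to-1 scores are stand-ins for a judgment a practitioner would make informally, and the thresholds are arbitrary. It only shows the shape of the reasoning: take the worse of verification cost and error severity, and never recommend full autonomy for actions that cannot be undone.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    """The three slider regions described in this entry."""
    SUGGEST = "AI suggests, human approves each step"
    SUPERVISE = "AI executes larger chunks, human supervises and verifies"
    AUTONOMOUS = "AI runs autonomously, human checks only the final result"


@dataclass
class TaskContext:
    # All three fields are hypothetical inputs chosen for illustration.
    verification_cost: float  # 0.0 = trivial to check, 1.0 = as hard as doing the task
    error_severity: float     # 0.0 = an annoyance, 1.0 = serious harm
    reversible: bool          # can a mistake be undone after the fact?


def recommend_mode(ctx: TaskContext) -> Mode:
    """Return a conservative slider position for one task in one domain."""
    # Expensive verification OR consequential errors is enough to stay cautious,
    # so the risk score is the larger of the two, not their average.
    risk = max(ctx.verification_cost, ctx.error_severity)
    if risk >= 0.7:  # assumed threshold
        return Mode.SUGGEST
    # Irreversible actions are capped at supervision, however low the risk looks.
    if risk >= 0.3 or not ctx.reversible:  # assumed threshold
        return Mode.SUPERVISE
    return Mode.AUTONOMOUS


if __name__ == "__main__":
    weekend_prototype = TaskContext(0.1, 0.1, reversible=True)
    file_deletion = TaskContext(0.2, 0.8, reversible=False)
    print(recommend_mode(weekend_prototype).name)
    print(recommend_mode(file_deletion).name)
```

Under these assumed scores, the weekend prototype lands at the autonomous end and the file-deleting agent at the suggest-and-approve end, which mirrors the contrast drawn in the paragraph above.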

The centaur model. At the slider’s productive middle positions, the image is of a centaur: human judgment fused with machine speed and scale, each contributing what it does best. The human provides direction, goal-setting, and verification; the machine provides tireless execution and breadth. This is more demanding than full automation, because it requires designing the interfaces and workflows that let human and machine collaborate effectively. But it is more honest about where the technology actually is, and more aligned with the structural fact that human judgment becomes more valuable, not less, as the cost of execution falls.

The decade of agents. Karpathy calls the coming period the decade of agents, choosing a decade deliberately as a rebuke to those announcing that the age of autonomous AI agents has already arrived. The demonstrations are impressive; the gap between demonstration and reliable deployment in consequential settings is the same gap he learned to respect at Tesla Autopilot. An agent acting in the world inherits all the jaggedness of the underlying model and adds the higher stakes of action. Progress during the decade of agents will be measured in how much autonomy can be responsibly extended, domain by domain, as the systems earn trust. Progress, in other words, will be measured by how thoughtfully humans manage the slider.

Partial autonomy as the target state. The path forward is not a leap to fully autonomous systems that operate without oversight but a gradual advance of the autonomy slider as systems earn trust in domain after domain, with humans supervising and verifying along the way. Karpathy calls this partial autonomy, and he treats it not as a transitional arrangement but as the appropriate permanent relationship between humans and AI systems in any domain where the cost of error is significant. The fantasy of fully autonomous agents that need no oversight is, in his view, premature given how unreliable the systems remain and how expensive verification becomes in exactly the domains where full autonomy would be most appealing.
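One way to picture trust being earned domain by domain is a separate slider per domain that moves up only after a run of verified results and drops back on any detected error. The sketch below is a hypothetical bookkeeping scheme, not a method from Karpathy: the class name `DomainSliders`, the promotion streak of 20, the reset-on-error rule, and the per-domain cap are all assumptions made for illustration. The cap models the idea that some domains stay at partial autonomy permanently.

```python
from collections import defaultdict

LEVELS = ["suggest", "supervise", "autonomous"]
PROMOTE_AFTER = 20  # assumed number of consecutive verified successes per step


class DomainSliders:
    """Track one slider position per domain; trust does not transfer across domains."""

    def __init__(self, max_level=None):
        self.level = defaultdict(int)    # domain -> index into LEVELS, starting at 0
        self.streak = defaultdict(int)   # domain -> consecutive verified successes
        self.max_level = max_level or {}  # domain -> highest index ever allowed

    def record(self, domain, verified_ok):
        """Record the outcome of a human verification for one task in `domain`."""
        if not verified_ok:
            # Any detected error sends the domain back to step-by-step approval.
            self.level[domain] = 0
            self.streak[domain] = 0
            return
        self.streak[domain] += 1
        cap = self.max_level.get(domain, len(LEVELS) - 1)
        if self.streak[domain] >= PROMOTE_AFTER and self.level[domain] < cap:
            self.level[domain] += 1
            self.streak[domain] = 0

    def position(self, domain):
        return LEVELS[self.level[domain]]


if __name__ == "__main__":
    # "payments" is capped at supervision: partial autonomy as the permanent state.
    sliders = DomainSliders(max_level={"payments": 1})
    for _ in range(60):
        sliders.record("unit_tests", True)
        sliders.record("payments", True)
    print(sliders.position("unit_tests"), sliders.position("payments"))
```

Because each domain keeps its own streak, strong results in one area never raise the slider in another, which is the practical consequence of jagged intelligence described earlier in the entry.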

Debates & Critiques

The sharpest debate around the autonomy slider concerns whether “calibration by verification cost” is as tractable as it sounds. In many consequential domains, the difficulty is that you cannot know whether the machine did the right thing without investing the same expertise that the machine was supposed to free you from. A radiologist who delegates a scan to an AI still needs enough expertise to know when the AI got it wrong. If verifying the output costs as much as doing the work, the slider offers no efficiency gain in exactly the domains where stakes are highest. Proponents respond that verification is typically cheaper than generation even when expertise is required: checking a suggested diagnosis is faster than generating one from scratch, and the AI catches errors the human would have missed while the human catches errors the AI would have missed.

A second debate concerns whether the slider metaphor understates the collective dimension of the autonomy question. The slider as framed is a tool for individual decision-making, but the deployment of AI agents in hospitals, legal systems, and financial markets creates collective autonomy decisions that no individual slider-setter can fully control. Karpathy’s framework is most useful at the level of the individual practitioner or team; the governance of AI autonomy at societal scale requires democratic institutions of the kind that Feenberg’s framework addresses.
