CONCEPT

AI Alignment

The problem of making a powerful AI system reliably pursue goals that its designers and users actually endorse — the central unsolved problem of contemporary AI.
AI alignment is the research program concerned with ensuring that AI systems' behavior matches the intentions of their developers and deployers, especially as systems become more capable. It encompasses technical work (reward modeling, interpretability, oversight) and conceptual work (what "alignment" means when humans disagree). The Three Laws of Robotics and the Zeroth Law are the ur-examples of alignment-by-rule, an approach the field has largely moved past.

In The You On AI Field Guide

Alignment is the contemporary name for the problem Asimov was exploring in 1942. His approach was specification: write the rules, hard-code them, rely on the substrate to execute. The modern approach is different in nearly every dimension — the substrate is learned, the rules are implicit, the values are extracted from behavior rather than specified in advance.

You On AI Cycle returns to alignment repeatedly because the problem has no settled solution and because every thinker in the cycle has a perspective on what alignment requires. Asimov's view, read forward, is that alignment-by-rule is a structural dead end
