You On AI Field Guide · Reward Prediction Error The You On AI Field Guide Home
Txt Low Med High
CONCEPT

Reward Prediction Error

Wolfram Schultz's 1990s discovery that dopamine neurons fire not at the reward itself but at the difference between expected and actual reward. The neural signal is a correction term, a learning update — and the mathematical template that seeded modern reinforcement learning.
Reward prediction error (RPE) is the quantity encoded by dopamine neurons in the ventral tegmental area: the difference between the reward an organism expected and the reward it actually received. Schultz's recordings in monkeys established the canonical pattern. On early trials, dopamine neurons fire when the reward arrives. As the monkey learns that a cue predicts the reward, the firing migrates backward in time — onto the cue — and goes silent at the reward itself. When a cue predicts reward but the reward fails to arrive, the neurons produce a brief dip below baseline at the exact moment the reward was expected. The neurons are encoding a prediction-error signal that drives learning. The discovery reshaped neuroscience and, through its mathematical identity with temporal difference learning algorithms, supplied the computational architecture of modern AI.
Reward Prediction Error
Reward Prediction Error

In The You On AI Field Guide

The RPE framework was greeted as an elegant unification: the

← Home 0%
CONCEPT Book →

Keep reading with YOU ON AI

Unlock the full book, 10,000+ field-guide entries, and a 1000+ thinker library. If you have a book code, register now — it takes a minute.

Register with book code Sign in