The You On AI Field Guide · Scaling Laws
CONCEPT

Scaling Laws

The empirical relationships that predict how a language model's loss decreases with training compute, parameters, and data — the most reliable quantitative instrument the AI field has, and the reason investors have been willing to fund ten-figure training runs.
Scaling laws are empirical power-law relationships between a transformer language model's training loss and the three inputs most under practitioner control: compute, parameters, and training tokens. They were discovered by Hestness et al. (2017) and formalized for language models by Kaplan et al. (2020) and Hoffmann et al. (2022, the "Chinchilla" paper). The relationships hold across five orders of magnitude and have been the most reliable forecasting instrument in the field for the past five years. They predict that doubling compute reduces loss by a fixed, predictable fraction, that the compute-optimal ratio of parameters to training tokens scales predictably, and, most consequentially, that continued investment in scale will keep producing capability gains until something structural breaks.
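To make the "doubling compute" and "optimal ratio" claims concrete, here is a minimal Python sketch. It rests on assumptions that are not stated in this entry: a pure power law L(C) = a * C^(-alpha) with no irreducible-loss term, an illustrative exponent of 0.05, the common approximation that training compute is about 6 * N * D FLOPs for N parameters and D tokens, and the often-quoted rule of thumb of roughly 20 tokens per parameter. The function names are hypothetical and the numbers are for illustration only, not fitted values.

```python
import math


def loss_ratio_after_scaling(compute_multiplier: float, alpha: float) -> float:
    """Factor by which loss changes when compute is multiplied.

    Assumes a pure power law L(C) = a * C**(-alpha), so the ratio
    L(k*C) / L(C) = k**(-alpha) does not depend on the starting compute C.
    """
    return compute_multiplier ** (-alpha)


def compute_optimal_split(compute_flops: float, tokens_per_param: float = 20.0):
    """Split a FLOP budget into (parameters, tokens).

    Assumes C ~= 6 * N * D and a fixed ratio D = tokens_per_param * N.
    Both are rules of thumb used here for illustration only.
    """
    n_params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


ALPHA = 0.05  # illustrative exponent, an assumption rather than a fitted value

ratio = loss_ratio_after_scaling(2.0, ALPHA)
print(f"Doubling compute multiplies loss by {ratio:.4f}")

n_params, n_tokens = compute_optimal_split(1e21)
print(f"Budget 1e21 FLOPs -> {n_params:.3e} parameters, {n_tokens:.3e} tokens")
```

Because the exponent is small, each doubling of compute buys only a few percent of loss reduction under these assumptions, which is why large gains require many doublings.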

In The You On AI Field Guide

The Kaplan paper's headline finding was that language-model loss decreases as a power law in compute, parameters, and dataset size — and that the exponents of these power laws are stable across model sizes and architectures.
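A power law appears as a straight line on log-log axes, so its exponent can be estimated with ordinary least squares on the logarithms. The sketch below shows that idea on synthetic, noise-free data generated from a made-up power law. The constants `TRUE_A` and `TRUE_ALPHA` and the helper `fit_power_law` are hypothetical; they are not values or code from the Kaplan paper, and real fits involve noisy measurements and usually an irreducible-loss term.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = log(a) - alpha * log(x).

    Returns (a, alpha) for the model y = a * x**(-alpha).
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    covariance = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    variance = sum((u - mean_x) ** 2 for u in log_x)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic data from a made-up power law (assumed values, for illustration).
TRUE_A, TRUE_ALPHA = 10.0, 0.07
compute = [10.0 ** k for k in range(15, 21)]  # six budgets spanning five orders of magnitude
loss = [TRUE_A * c ** (-TRUE_ALPHA) for c in compute]

a_hat, alpha_hat = fit_power_law(compute, loss)
print(f"recovered a = {a_hat:.4f}, alpha = {alpha_hat:.4f}")
```

In practice the same fit is run on measured losses from small training runs and then extrapolated to larger budgets, which is what makes a stable exponent useful for forecasting.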
