TECHNOLOGY

Transformer Architecture

The 2017 neural network architecture, built around self-attention, that replaced recurrent networks for sequence modeling and became the substrate of every major large language model since.
The transformer is a neural network architecture introduced in the 2017 paper "Attention Is All You Need" by Vaswani and colleagues, most of them at Google. Its distinguishing mechanism, self-attention, lets each position in a sequence directly weight its relationship to every other position, replacing the step-by-step processing of earlier recurrent neural networks. Every major LLM of the current era (GPT, Claude, Gemini, LLaMA) is a transformer variant.
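To make the self-attention idea concrete, here is a minimal sketch of scaled dot-product attention in plain Python. It is an illustration under simplifying assumptions, not the paper's full layer: a real transformer first maps each input vector through learned projection matrices to get queries, keys, and values, and runs several attention heads in parallel. Here the same toy vectors serve as all three, and the function names (`softmax`, `dot`, `self_attention`) and input numbers are made up for the example.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def self_attention(queries, keys, values):
    """Scaled dot-product attention over one sequence.

    Each argument is a list of vectors, one per position.
    Returns the output vectors and the attention weights per position.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this position against every position in the sequence.
        scores = [dot(q, k) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy sequence: 3 positions, 2-dimensional vectors (illustrative numbers).
# Assumption: queries, keys, and values are the raw inputs, with no
# learned projections.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(x, x, x)
```

Each row of `weights` shows how strongly one position attends to every position, including itself. No position has to wait for an earlier one to be processed, which is what allows the computation to run in parallel across the sequence.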

In The You On AI Field Guide

The transformer is the enabling technical precondition of the You On AI Cycle's subject matter. The jump from pre-2017 language models (recurrent, slower to train, shorter-context) to post-2017 (transformer-based, parallelizable, scalable to internet-sized corpora) was what made the current LLM era possible.

The Orange Pill Asimov volume uses the transformer sparingly (Asimov's framing is more about neural networks as a category), but the implicit comparison is between the positronic brain's rule-following and the transformer's distribution-learning. The transformer is the thing that is neither designed nor inspectable but nevertheless works.
