CONCEPT

Synaptic Plasticity vs. Weight Update

The foundational parallel and the foundational divergence of the AI age: both biological memory and artificial learning work by changing the strengths of connections among units, but the mechanism, locality, sample-efficiency, and biological stakes of the change are so different that sharing the vocabulary of “learning” papers over a chasm that runs through every serious question about what machines can know.

The vocabulary of artificial intelligence is, at its core, borrowed from neuroscience. The artificial “neuron” sums its inputs; the artificial “synapse” is a weight, a number scaling one unit’s influence on another; “learning” is a rule that nudges those weights so the system’s behavior improves. The vocabulary predates Eric Kandel, but his biology gave it its firmest ground: the discovery that a memory is a change in the strength of a synaptic connection, and that the pattern of those altered connection-strengths is what the animal has learned. The artificial network and the biological brain are thus genuine cousins at the level of abstract description: both are connectionist systems that store what they know in patterns of connection-strengths rather than in a library of explicit rules, and both can generalize because the connection pattern captures regularities rather than memorizing instances. But the moment you look at how the connection changes, the cousins turn out to be strangers. Biological synaptic plasticity is local (driven by the activity of the cells the synapse connects, by timing, by chemistry at the junction) and gated by biological significance, having evolved in the service of a living body’s survival. The machine’s weight update is global: it is driven by a gradient signal computed from the system’s total output error and propagated backward through every layer, a signal no biological neuron is known to produce or receive. The parallel is real and the divergence is real, and holding both in mind simultaneously is the first requirement of honest thinking about what machines actually do when they “learn.”
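The borrowed vocabulary can be made concrete with a minimal sketch. It assumes a single linear unit trained with the delta rule on one example; the function names, learning rate, and numbers are illustrative choices, not taken from any particular system.

```python
def neuron_output(weights, inputs):
    """The artificial 'neuron': a weighted sum of its inputs.
    Each weight (the artificial 'synapse') scales one input's influence."""
    return sum(w * x for w, x in zip(weights, inputs))


def nudge_weights(weights, inputs, target, learning_rate=0.1):
    """One delta-rule step: shift each weight so the output error shrinks.
    This is 'learning' in the artificial sense of the word."""
    error = target - neuron_output(weights, inputs)
    return [w + learning_rate * error * x for w, x in zip(weights, inputs)]


# Illustrative values (assumptions): two inputs, one desired output.
weights = [0.0, 0.0]
inputs, target = [1.0, 2.0], 1.0

error_before = abs(target - neuron_output(weights, inputs))
for _ in range(20):
    weights = nudge_weights(weights, inputs, target)
error_after = abs(target - neuron_output(weights, inputs))

print(error_before, error_after)
```

Nothing in the sketch has a stake in the target. The weights move only because an arithmetic rule says they should.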

In the [YOU] on AI Field Guide

The cycle that began with [YOU] on AI asks what it would mean to see the machine clearly. The comparison between synaptic plasticity and weight update is one of the clearest places where the vocabulary we use to describe machines actively prevents clear seeing. When a model is said to have “learned” a language, the literal fact underneath the claim is that an array of weight values has been tuned by exposure to training data. Calling this “learning” imports the biological implication that something has occurred like what Kandel’s sea slug does when it learns to avoid a noxious stimulus. The import is not wholly wrong: both processes result in improved performance on future inputs. But it is not wholly right either: the mechanisms are different, the data requirements are radically different, and the question of whether the change is “about” anything (whether the machine that learned a language has any stake in that language or any model of the world the language refers to) is precisely the question the vocabulary forecloses.

The distinction matters most when assessing what machines can and cannot do reliably. The brain’s synaptic changes are gated by biological significance: a frightening, nourishing, or painful experience triggers stronger consolidation than a neutral one. Standard training, by default, weights every training example equally in the weight update, which is part of why machines memorize indiscriminately, require vastly more data than any animal, and may encode spurious correlations with as much confidence as genuine regularities. Judea Pearl’s curve-fitting critique and Kandel’s biological analysis converge here: both are saying that the machine’s version of “learning” is a real but limited thing, different from the learning of a creature with a life at stake in what it learns.

Origin

The parallel was implicit from the earliest days of artificial neural network research. Warren McCulloch and Walter Pitts’s 1943 paper modeled the neuron as a threshold logic unit; Donald Hebb’s 1949 principle, later summarized as “cells that fire together wire together,” proposed the first explicit rule for synaptic change, a rule that looked like learning and that inspired the first generation of artificial learning rules. The perceptron’s learning rule, the delta rule, and eventually backpropagation all descend from this tradition of borrowing the biological vocabulary while simplifying away the biological mechanism.

Kandel’s molecular work, beginning in the 1960s and culminating in the 2000 Nobel Prize in Physiology or Medicine, gave the parallel its most precise biological ground: it showed exactly what synaptic plasticity is in molecular detail, and thereby made it possible to see exactly where the artificial version diverges. The most consequential divergence Kandel’s work reveals is in the direction of the teaching signal. Backpropagation requires a global error signal computed at the output and propagated backward through every layer. No neuron in any nervous system Kandel ever studied receives such a signal. The gap between the biological mechanism and the artificial mechanism is not merely a simplification; it is a difference of kind in how the learning rule is structured.

Key Ideas

The genuine parallel. Both biological memory and artificial learning store knowledge in patterns of connection-strengths rather than explicit rules. Both generalize: rather than merely memorizing instances, each extracts regularities from which new instances can be handled. Both are distributed: no single synapse and no single weight holds a memory by itself; the memory is in the configuration of the whole. These parallels are genuine, important, and partially vindicate the connectionist idea that the biological template was pointing at something real about the structure of intelligence.

The locality gap. Biological synaptic plasticity is local: the synapse changes based on the activity of the pre-synaptic and post-synaptic cells, the timing of their firing, the neurotransmitters released at the junction. No global error signal is required or computed. Backpropagation—the dominant method for training artificial networks—requires the opposite: a global error signal, computed at the output of the entire network, propagated backward through every layer so each weight can be adjusted in proportion to its contribution to the total error. This global backward pass has no biological analog known to neuroscience. The machine learns by a process the brain, as far as anyone can tell, does not use.
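The locality gap can be illustrated with a deliberately tiny sketch. It assumes a chain of two scalar weights (input, hidden unit, output), a plain Hebbian rule as a stand-in for local plasticity, and a squared-error loss for backpropagation. Real synapses also depend on timing and chemistry, which this sketch leaves out, and all names and numbers are illustrative.

```python
def forward(w1, w2, x):
    """Two-step chain: input x -> hidden activity h -> output y."""
    h = w1 * x
    y = w2 * h
    return h, y


def hebbian_update(x, h, lr=0.1):
    """Local rule: the change to w1 uses only the activity on either
    side of that connection (pre-synaptic x, post-synaptic h)."""
    return lr * x * h


def backprop_update(w1, w2, x, target, lr=0.1):
    """Gradient rule for L = 0.5 * (y - target)**2. The change to w1
    needs the error at the output and the downstream weight w2."""
    h, y = forward(w1, w2, x)
    output_error = y - target
    grad_w1 = output_error * w2 * x  # chain rule: dL/dy * dy/dh * dh/dw1
    return -lr * grad_w1


# Illustrative values (assumptions).
w1, w2, x = 0.5, 2.0, 1.0
h, y = forward(w1, w2, x)

local_change = hebbian_update(x, h)                     # never sees a target
global_change_a = backprop_update(w1, w2, x, target=0.0)
global_change_b = backprop_update(w1, w2, x, target=3.0)

print(local_change, global_change_a, global_change_b)
```

The Hebbian change is the same whatever the network was supposed to output, because the rule never receives that information. The backpropagated change to the same weight flips sign when the target moves, since it is computed from an error measured at the far end of the network.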

The sample-efficiency gap. A single encounter with a predator can produce a lasting memory in an animal, because the biological stakes of failing to consolidate the lesson are survival. The machine requires thousands or millions of examples to encode a pattern reliably, because data, not survival, is its environment. This gap is not merely quantitative; it reflects the difference between a learning system embedded in a living body with something at stake in what it learns and a learning system optimizing a mathematical loss function with no stake of its own.

Catastrophic forgetting. The brain protects its long-term memories (the grown synaptic structures are stable and resistant to overwriting) while it continues to consolidate new experience throughout life. The artificial network’s weights are perpetually vulnerable to the next gradient update; new training can obliterate old competence wholesale. Memory consolidation in the brain is a commitment of structure that protects what is known while allowing new learning. Standard gradient training has no built-in equivalent protection, which is why AI systems trained on new tasks frequently suffer dramatic loss of prior capabilities unless safeguards are added deliberately.
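A minimal sketch of the effect, assuming a two-weight linear unit trained by plain gradient descent on one task and then on another. The tasks, step count, and learning rate are illustrative assumptions. A single weight setting that solves both tasks exists, but sequential training does not find it.

```python
def predict(w, x):
    """Linear unit: weighted sum of the inputs."""
    return sum(wi * xi for wi, xi in zip(w, x))


def loss(w, example):
    """Squared error of the unit on one (inputs, target) example."""
    x, target = example
    return 0.5 * (predict(w, x) - target) ** 2


def train(w, example, steps=200, lr=0.1):
    """Plain gradient descent on a single example. Nothing here
    protects what earlier training stored in the weights."""
    x, target = example
    for _ in range(steps):
        error = predict(w, x) - target
        w = [wi - lr * error * xi for wi, xi in zip(w, x)]
    return w


# Two illustrative tasks (assumptions) that share the first weight.
task_a = ([1.0, 0.0], 1.0)
task_b = ([1.0, 1.0], 0.0)

w = train([0.0, 0.0], task_a)
loss_a_after_a = loss(w, task_a)   # task A is learned

w = train(w, task_b)
loss_a_after_b = loss(w, task_a)   # task A is degraded by training on B
loss_b_after_b = loss(w, task_b)

# The weights [1.0, -1.0] would solve both tasks at once.
joint_solution = [1.0, -1.0]

print(loss_a_after_a, loss_a_after_b, loss_b_after_b)
```

Training on the second task drags the shared weight away from the value the first task needed, even though a compatible solution was available. Real networks show the same pattern across millions of weights.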
