
The cycle asks what structure lies beneath the surface of AI capability, and geometric deep learning is one of the clearest answers: the capability of the most successful architectures is not an accident of scale or an emergent miracle of gradient descent. It is a consequence of building in the symmetries of the problem domain, so that the machine does not waste its capacity learning what is already known about the structure of the world. The protein-folding systems that can predict molecular structures with biological precision are not simply larger or better-trained versions of earlier networks; they are architectures that have internalized the rotational symmetry of physical space as a design principle, so that the prediction is guaranteed to be the same however you orient the molecule. The capability is a consequence of respecting a symmetry. The symmetry is a group. The theory of groups begins with Galois.
The framework also illuminates the cycle’s distinction between scale-driven and structure-driven approaches to AI capability. The dominant strategy of the last decade has been scale: larger models, more data, more compute, trained on larger and less structured corpora. Geometric deep learning represents a different strategy: understand the structure of the problem, identify its symmetries, and build an architecture that respects them by construction. The two strategies are not mutually exclusive, but they are in genuine tension, and the question of which produces more reliable generalization is one of the live arguments at the frontier. Galois, whose work showed that the structure of an equation’s symmetry group, rather than computational persistence, decides whether the equation can be solved by radicals, sits squarely on the structure side.
The framework was proposed in its mature form in a 2021 paper by Michael Bronstein, Joan Bruna, Taco Cohen, and Petar Veličković, titled “Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges.” The paper explicitly invokes Klein’s Erlangen Programme and traces the lineage of each major architecture type to a specific symmetry group. Convolutional neural networks, the engine of the computer vision revolution, are described as the consequence of demanding translation equivariance on a grid: the same pattern detector is applied at every location, enforcing by design the insight that a feature in the corner is the same kind of feature as one in the center. The breakthrough was not that convolutional networks learned this from data; it was that the architecture encoded it as a constraint, and the constraint made the learning vastly more efficient.
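
The translation equivariance described above can be checked on a toy case. The following is a minimal sketch, not code from the paper: it assumes a one-dimensional periodic signal in place of an image grid, and the names `circular_correlate` and `shift`, along with the signal and kernel values, are illustrative inventions. It shows that sliding one shared kernel over every position commutes with translating the input.

```python
def circular_correlate(signal, kernel):
    """Apply one shared kernel at every position of a periodic 1-D signal."""
    n = len(signal)
    return [
        sum(kernel[j] * signal[(i + j) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


def shift(signal, k):
    """Cyclically translate a signal k steps to the right."""
    n = len(signal)
    return [signal[(i - k) % n] for i in range(n)]


# Illustrative values only (assumptions, not data from any source).
signal = [0, 1, 3, 0, 0, 2, 0, 0]
kernel = [1, -1, 2]

# Translate the input and then filter it...
shift_then_filter = circular_correlate(shift(signal, 3), kernel)
# ...or filter first and translate the feature map afterwards.
filter_then_shift = shift(circular_correlate(signal, kernel), 3)

# Equivariance: both routes give the same feature map.
assert shift_then_filter == filter_then_shift
```

Nothing here is learned. The equality holds for any kernel weights, which is the sense in which the symmetry is a property of the architecture rather than of the training data.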
The connection to Galois is not merely genealogical. The intellectual move is identical: instead of asking what the data says, ask what the problem’s symmetries are; identify the group; let the architecture follow from the group. This is Galois’s reorientation applied to machine learning as a design philosophy. And the practical payoff is the same as Galois’s: understanding the structure of the problem buys you more than attacking the surface with brute force. The architectures that respect symmetry learn faster, generalize better, and need less data than architectures that must learn the symmetry blindly from examples.
Symmetry before architecture. The design principle of geometric deep learning is to ask what symmetries the problem has before asking what architecture to use. Specify the group; the architecture follows from it. This inverts the conventional approach, which chooses an architecture and then trains it to discover whatever structure it can. The inversion is powerful because a built-in symmetry is a guarantee, while a learned symmetry is only as reliable as the data that taught it.
Equivariance vs. invariance. A network is invariant to a symmetry if applying the symmetry to the input does not change the output (a rotation-invariant classifier gives the same label however you rotate the image). A network is equivariant if applying the symmetry to the input transforms the output in the corresponding way (a rotation-equivariant detector rotates its predicted locations along with the image). In symbols, invariance means f(g·x) = f(x) and equivariance means f(g·x) = g·f(x) for every element g of the group. Most practical architectures need equivariance rather than invariance: the prediction should move with the symmetry, preserving its relationship to the input. The mathematical precision of this distinction is Galois’s gift to the field: the group formalism makes it possible to specify exactly what “respecting the symmetry” means.
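
The distinction can be made concrete with the simplest symmetry available, reordering the elements of a list. This is a small sketch under stated assumptions: the group is the permutations of four items, and the helper names `permute`, `total`, `square_each`, and `first` are hypothetical, chosen only for illustration.

```python
def permute(items, order):
    """Apply a permutation g: position i of the result takes items[order[i]]."""
    return [items[j] for j in order]


def total(xs):
    """Sum pooling, an invariant map: f(g.x) == f(x)."""
    return sum(xs)


def square_each(xs):
    """Elementwise map, an equivariant map: f(g.x) == g.f(x)."""
    return [x * x for x in xs]


def first(xs):
    """Reads a fixed position, so it respects the symmetry in neither sense."""
    return xs[0]


xs = [3, -1, 4, 2]
order = [2, 0, 3, 1]  # one arbitrary permutation, chosen for illustration

# Invariance: the output ignores the reordering.
assert total(permute(xs, order)) == total(xs)

# Equivariance: the output is reordered in the same way as the input.
assert square_each(permute(xs, order)) == permute(square_each(xs), order)

# A map that depends on position breaks the symmetry.
assert first(permute(xs, order)) != first(xs)
```

Invariance is the special case of equivariance in which the group acts trivially on the output, which is why an equivariant stack followed by a pooling step yields an invariant prediction.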
The zoo is one thing. Convolutional networks for images, graph neural networks for relational data, transformers for sequences, spherical networks for three-dimensional structures: these are not unrelated inventions. They are the same principle, build in the symmetry group of the domain, applied to different groups. The unification is Klein’s kind of insight: the apparently disparate is really one thing seen through the lens of a general mathematical framework. And the framework is group theory, whose foundations were laid by a young man in Paris in the few years before his death at twenty.
The limit of the framework. Geometric deep learning, like Galois’s own mathematics, has a constructive and a critical dimension. The constructive dimension is the framework for building symmetry-respecting architectures. The critical dimension—less often discussed—is the reminder that the wrong symmetries are as dangerous as the right ones are valuable. Declaring that a system should be invariant to a transformation is declaring that the transformation does not matter. If the declaration is wrong, the error is built into the foundation and cannot be outgrown by training. Galois would have recognized this as the discipline of impossibility applied in reverse: just as proving something cannot be done is valuable, so is proving that a claimed invariance is misspecified.
The live debate within geometric deep learning is whether symmetry-aware design or scale-driven generality will prove more productive over time. The scale camp argues that large enough models trained on sufficiently diverse data will learn the relevant symmetries implicitly, making explicit symmetry specification unnecessary and potentially limiting. The symmetry camp argues that learning a symmetry implicitly is always less efficient than building it in, and that learned symmetries fail silently on inputs outside the training distribution in a way that built-in symmetries cannot. Galois’s instinct sides with structure, but the empirical evidence from the last decade of large-scale training is sufficiently mixed that neither camp has decisively prevailed. A second debate concerns the scope of the framework: not all successful architectures map cleanly onto a single symmetry group, and the transformer architecture in particular sits awkwardly in the geometric deep learning taxonomy. The framework’s proponents argue that the transformer’s permutation equivariance is the relevant symmetry; critics argue that the framework’s explanatory power diminishes as architectures become more general and their symmetry properties less constraining.
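
The permutation equivariance that proponents point to can be verified directly on a stripped-down attention layer. This is a minimal sketch under explicit assumptions: a single attention head, no positional encoding, no causal mask, and small hand-picked weight matrices. The names `self_attention`, `permute_rows`, and `max_abs_diff` are illustrative, not any library's API.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    norm = sum(exps)
    return [e / norm for e in exps]


def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product self-attention with no positional encoding."""
    q, k, v = matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv)
    k_transposed = [list(col) for col in zip(*k)]
    scale = math.sqrt(len(wk[0]))
    scores = matmul(q, k_transposed)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v)


def permute_rows(rows, order):
    """Reorder tokens: position i of the result takes rows[order[i]]."""
    return [rows[j] for j in order]


def max_abs_diff(a, b):
    """Largest elementwise gap between two equally shaped matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Illustrative inputs and weights (assumptions, not trained values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
wq = [[0.5, -0.2], [0.1, 0.3]]
wk = [[0.4, 0.0], [-0.3, 0.2]]
wv = [[1.0, 2.0], [0.0, -1.0]]
order = [2, 0, 1]

permute_then_attend = self_attention(permute_rows(tokens, order), wq, wk, wv)
attend_then_permute = permute_rows(self_attention(tokens, wq, wk, wv), order)

# Equal up to floating-point rounding: reordering tokens reorders the outputs.
assert max_abs_diff(permute_then_attend, attend_then_permute) < 1e-9
```

The check also hints at the critics' point. Permuting all tokens is a very large symmetry that says little about the content of a sequence, and a practical transformer typically adds positional information precisely to break it.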