CONCEPT

The Central Dogma

Francis Crick’s 1957 principle that biological sequence information flows from nucleic acid to protein and never back—a precise, falsifiable constraint on the direction of information in living systems that AI is now dissolving at the level of design.

Crick named it badly. He later admitted he had misunderstood the word “dogma”: he took it to mean a bold assertion held without sufficient evidence, which is what he intended, rather than an established belief beyond dispute. The misnomer mattered because the central dogma was a bet, and bets can be lost. What the dogma actually stated is that detailed sequence information cannot flow out of protein back into nucleic acid. Information passes from DNA to RNA to protein; it does not travel from protein back. In 1970, the discovery of reverse transcriptase showed that information can flow from RNA to DNA, against the expected direction. Crick’s formulation survived because it was precise enough to be repaired: the deep prohibition held even as one of the forward arrows proved reversible. The episode teaches the AI age its most durable methodological lesson: directionality in information systems is a property of a particular mechanism, not a law of information as such. Generative models now apply this lesson to Crick’s own field, running the dogma’s arrow backward in protein design by specifying a desired protein structure and computing a sequence that would produce it. They can do so not because the molecular prohibition has been repealed but because a layer of computation now sits above biology, where the direction of inference is whatever the designer chooses.

In the [YOU] on AI Field Guide

The cycle that began with [YOU] on AI finds in the central dogma an exact model for the kind of clarity it demands. Crick stated a constraint precisely enough to be falsifiable, admitted when the constraint needed repair, and produced a formulation that remained useful across the repair. This is the intellectual discipline the cycle asks of anyone thinking about AI: state what you believe precisely enough to be wrong, and update when you are.

The dogma’s second lesson for the cycle concerns the relationship between understanding and prediction. The protein folding problem, in the form of predicting a protein’s three-dimensional structure from its sequence, was largely solved by a neural network that learned the statistical regularities without understanding the physics. The machine predicts better than any explanation we possess. Has the problem been solved or merely circumvented? The pragmatic answer is: solved, because the predictions are accurate enough to use. The Crickian answer is: not yet, because to understand is to know why, and the machine cannot say why a sequence folds as it does. The cycle navigates this tension throughout.

Origin

Crick articulated the central dogma in a 1957 lecture and refined it in a 1958 paper, “On Protein Synthesis.” The formal statement distinguished three classes of transfers: those that were possible (DNA to DNA, DNA to RNA, RNA to RNA, RNA to protein, DNA to protein), those that had been observed, and those that were, in his view, forbidden (protein to protein, protein to RNA, protein to DNA). The forbidden transfers were the dogma’s core: the claim that the sequential information content of a protein cannot be used to specify the sequence of a nucleic acid.

The discovery of reverse transcriptase by Howard Temin and David Baltimore in 1970 showed that RNA could specify DNA, adding an arrow to the permitted list. Crick absorbed the finding without abandoning the dogma, because the deep prohibition—protein back to nucleic acid—survived intact. His ability to accept the correction without collapsing the framework was a demonstration of scientific maturity, and it is the feature of the episode most relevant to the present moment: a principle can be partially wrong and still organize a field productively.

Key Ideas

Directionality as mechanism. The dogma’s one-way arrow was not a law of information but a fact about the specific molecular machinery of the cell. The ribosome reads nucleotides and adds amino acids; it has no mechanism for reading amino acids and specifying nucleotides. When you build a different mechanism—a trained generative model—the arrow can run the other way. This is the lesson AI is teaching Crick’s own field.
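
The point about mechanism can be made concrete with a toy, sequence-level sketch. It is not a model of the ribosome or of real protein design: the function names `translate` and `back_translate` are hypothetical, and the table is a small hand-picked subset of the standard genetic code. The forward function is a plain lookup and many codons collapse onto one amino acid; the reverse direction has to be built as a separate mechanism, and it returns many candidate sequences rather than one.

```python
from itertools import product

# A small subset of the standard genetic code (mRNA codons), chosen for illustration.
CODON_TO_AA = {
    "AUG": "M",                                       # methionine
    "UUU": "F", "UUC": "F",                           # phenylalanine
    "AAA": "K", "AAG": "K",                           # lysine
    "GGU": "G", "GGC": "G", "GGA": "G", "GGG": "G",   # glycine
}

def translate(mrna):
    """Forward arrow: read three bases at a time and emit one amino acid per codon."""
    codons = (mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3))
    return "".join(CODON_TO_AA[c] for c in codons)

def back_translate(peptide):
    """Reverse arrow, built separately: list every mRNA in this toy table that encodes the peptide."""
    aa_to_codons = {}
    for codon, aa in CODON_TO_AA.items():
        aa_to_codons.setdefault(aa, []).append(codon)
    choices = [aa_to_codons[aa] for aa in peptide]
    return ["".join(combo) for combo in product(*choices)]

peptide = translate("AUGUUUAAAGGG")
candidates = back_translate(peptide)
print(peptide)           # MFKG
print(len(candidates))   # 16 = 1 * 2 * 2 * 4 codon choices
```

Every one of the candidates translates forward to the same peptide, which is why the reverse step is a design choice among alternatives rather than a readout. The cell has no counterpart to `back_translate`; here it exists only because someone wrote it.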

Fidelity and its costs. The dogma is not only about direction but about faithful copying. DNA is replicated with extraordinary accuracy, with elaborate proofreading machinery, because an unfaithful copy is a mutation and most mutations are harmful. A large language model is optimized to be plausible rather than faithful: it produces fluent text and, when its training is thin, confabulates with full confidence. Biology spent billions of years evolving error correction. We have built information systems without equivalent mechanisms and are discovering, expensively, why life invested so heavily in proofreading.

The code is arbitrary. The mapping from DNA triplets to amino acids is a frozen accident: there is no deep chemical necessity that methionine’s codon should be what it is. The code is a symbol system—symbols standing for things by convention rather than resemblance—and the ribosome runs it without any understander inside. This is the clearest natural example of meaning-without-understanding, and it should make us cautious about granting understanding to any system merely because it handles symbols well.
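
One way to see the conventional character of the code is to run the same reading procedure with two different tables. The sketch below is illustrative only: `translate`, `STANDARD`, and `VERT_MITO` are hypothetical names, and the tables hold just a handful of codons. The differences shown (AUA, UGA, AGA, AGG) are the ones listed for the vertebrate mitochondrial code in the NCBI translation tables.

```python
# A few codons from the standard genetic code; "*" marks a stop codon.
STANDARD = {
    "AUG": "M", "AUA": "I", "UGG": "W", "UGA": "*",
    "AGA": "R", "AGG": "R", "UUU": "F",
}

# Same codons under the vertebrate mitochondrial code: four assignments differ.
VERT_MITO = dict(STANDARD, AUA="M", UGA="W", AGA="*", AGG="*")

def translate(mrna, table):
    """Read codons in order, look each one up in the given table, and stop at a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = table[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)

message = "AUGAUAUGAUUUAGA"
print(translate(message, STANDARD))    # MI
print(translate(message, VERT_MITO))   # MMWF
```

The reading procedure is identical in both calls; only the lookup table changes, and with it the meaning of the message. Nothing in `translate` knows what methionine is.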

Engineering the reverse arrow. Inverse protein design—specifying a desired function and computing the gene that would produce it—is the central dogma inverted as engineering. Generative models now design entirely novel proteins, never seen in nature, from a desired structure backward to a sequence. This is the informational vision Crick inaugurated being fulfilled by the least mechanistic method he would have imagined: black-box pattern recognition that works without explaining itself.
