The fossil metaphor is diagnostic. A fossil preserves the surface structure of a living organism but none of its metabolism: none of the continuous biochemical exchange through which the organism participated in its ecosystem. Similarly, the training corpus preserves the surface structure of human imitative acts (the words, the sentences, the logical structures) but none of the biographical specificity through which those acts participated in the living imitative flow. The fossil is the shape without the life. The model trained on the fossil can reproduce shapes but cannot participate in the living flow except through the biographical specificity that humans introduce downstream.
This reframing has consequences for how the imitative process operates in AI-collaborative work. The builder who receives the model's output receives patterns processed from the fossil record, not patterns flowing from the living current. The builder's biographical specificity — her position in the living flow, her participation in ongoing relationships, her stake in specific futures — is what reconnects the fossilized patterns to the current. Without this reconnection, AI-generated output floats in a kind of temporal limbo: competent, fluent, but disconnected from the living concerns that make text meaningful to readers who are themselves participating in the living flow.
The opacity of provenance that characterizes AI output is, in this framework, the structural consequence of fossil compression. A human author's prose carries its imitative lineage in traceable form: you can follow the influences, identify the sources, reconstruct the chain of modification. AI output compresses the entire corpus into statistical regularities that render provenance untraceable. The output sounds like many sources and like no source because it is, statistically, a blend of all of them rather than a derivation from any one. The loss of traceable provenance is not a technical problem to be solved by better citation. It is a structural feature of statistical aggregation applied to the fossilized record of imitative activity.
The framework extends Tarde's general observation that the present always rests on the sedimented weight of past imitative acts. Tarde noted that every institution, every linguistic form, every technological practice carries the traces of its lineage — the modifications introduced at each link of the imitative chain that produced it. The model's training corpus is simply this observation made mechanically operational: the entire fossil record compressed into a processable form.
The corpus is a fossil record, not a dataset. It preserves the surface structure of prior imitative acts but not their living metabolism — their participation in ongoing relationships, stakes, and flows.
Statistical aggregation loses biographical specificity. When the model processes the corpus's regularities, it smooths away the specific modifications that gave individual texts their distinctive character.
Provenance becomes opaque. The output sounds like many sources and like no source because it is, structurally, a blend of all sources, each weighted by how regularly its patterns recur in the corpus.
The builder reconnects fossil to flow. Human biographical specificity — participation in the living current — is what transforms fossilized patterns into contributions to the living imitative flow.
The corpus is always partial. What gets preserved as fossil depends on what gets written down, digitized, and included in training — making the corpus a specific selection, not a neutral representation of human thought.
The framework raises difficult questions about the politics of corpus composition. If the corpus is the fossil record of the imitative flow, whose imitative acts get fossilized? Critics working within frameworks of epistemic decolonization have argued that training corpora systematically underrepresent non-Western, non-English, non-dominant imitative traditions, producing models that reflect a selected slice of human imitative activity rather than its totality. The Tardean framework accommodates this critique: what feeds the corpus is not the totality of the imitative flow but the portion that dominant institutions have recorded, preserved, and included. The fossil record bears the marks of the selection process itself, not just of what was selected.