Heteroglossia (Russian: raznorechie, literally 'different-speech-ness') is Bakhtin's term for the stratification of any living language into multiple social varieties. A language is never unitary; it is always a collection of languages — the speech of doctors differs from that of lawyers, the language of the young from the old, the dialect of one region from another. Each variety carries an implicit worldview, a set of values, a way of organizing experience. The novelist's art, in Bakhtin's analysis, consists not in writing in language but in orchestrating languages — allowing characters to speak in their own registers, letting social dialects collide, refusing to subordinate this diversity to a single authoritative voice. The AI tool introduces a novel form of heteroglossia: the statistically smoothed voice of the training corpus, which has its own tendencies, rhythms, and ideological colorations that the builder must recognize and navigate.
Bakhtin developed the concept of heteroglossia most fully in his essay 'Discourse in the Novel' (written mid-1930s, published in Russian 1975, in English 1981). He was responding to the formalist literary theories dominant in early Soviet criticism, which treated language as a unitary system and literary style as deviation from a norm. Bakhtin insisted that this picture falsified linguistic reality: no one actually speaks 'the language.' Everyone speaks a socially situated variety — shaped by profession, class, region, generation, and ideological commitment. The great realist novel, from Cervantes through Tolstoy, succeeds precisely by capturing this linguistic diversity, giving voice to multiple social worlds without flattening them into a single perspective. The polyphonic novel perfects this technique by granting different voices equal ontological weight.
In the AI writing context, heteroglossia operates at a new level of complexity. The machine's output draws on an unprecedented range of registers — it has ingested scientific papers, legal briefs, social media posts, classic literature, technical documentation, and everything between. Its default tendency is toward a statistically average register: clear, competent, stylistically neutral, ideologically centrist. This is not neutrality but a specific social language, the language of educated professional discourse, carrying its own unacknowledged assumptions about what counts as clear explanation, valid argument, appropriate tone. The builder working with AI must navigate this heteroglossia consciously, distinguishing her own voice from the machine's preferred register, maintaining the roughness and particularity that mark genuine individual expression. The Orange Pill documents moments when Edo Segal recognized that Claude's prose was smoother than his thinking — a heteroglossic subordination he had to resist by returning to handwritten notes and rebuilding the argument in his own voice.
The concept connects to social construction of technology frameworks: every technology embeds the social languages of its designers. Large language models, trained predominantly on English-language internet text from developed-world sources, carry the heteroglossic biases of that corpus. They are more fluent in corporate-speak than in working-class vernacular, more comfortable with Western philosophical vocabularies than with non-Western epistemologies. This is not a correctable bug but a structural feature: the training data is heteroglossic, stratified by power, and the model reproduces that stratification in its outputs. Builders from marginalized linguistic communities face an additional burden — they must translate their questions into the machine's preferred register to receive useful responses, then translate the responses back into their own community's language. The asymmetry is real and consequential.
The prescriptive implication is clear: builders must develop what we might call heteroglossic literacy — the capacity to hear the multiple social languages operating in any text, to recognize when the machine's register is overriding their own, to consciously choose which voices to amplify and which to resist. This is not a skill that comes automatically. It requires the kind of sustained critical attention to language that literary education traditionally provided but that productivity-oriented AI training systematically ignores. The danger is not that AI will impose a single voice (it won't; it's too statistically diverse) but that it will impose a narrow band of acceptable voices — the smooth, professional, authoritative registers that dominate its training data — while marginalizing the rough, the vernacular, the resistant, the genuinely other.
Bakhtin introduced raznorechie in his essays of the 1930s, particularly 'Discourse in the Novel,' as part of his broader project to rescue the novel from formalist reduction. Russian formalists like Viktor Shklovsky treated literary language as a deviation from ordinary language; Bakhtin inverted the framework, arguing that there is no 'ordinary language' — only a heteroglot field of competing social varieties, and literature's achievement is to organize this diversity into aesthetic form. The concept was barely known in the West until the translations of the 1970s and 1980s, when it became foundational to sociolinguistics, discourse analysis, and poststructural literary theory.
The term's application to AI is new but structurally apt. Large language models are the first artifacts in human history to operationalize heteroglossia at scale — to compress the linguistic diversity of millions of speakers into a single interactive system. The resulting outputs are heteroglossic by construction: statistically weighted averages of how different communities use words differently, presented in a register the model has learned maximizes user acceptance across diverse audiences.
Language is never unitary. Every national language contains multiple social languages stratified by class, profession, ideology, and historical moment.
The novel orchestrates heteroglossia. Great fiction captures the diversity of social speech without subordinating it to a single authorial perspective.
AI introduces statistical heteroglossia. The machine's voice is a weighted composite of all the social languages in its training data, with a bias toward educated professional registers.
Builders must resist heteroglossic subordination. Maintaining an authentic voice in AI collaboration requires conscious refusal of the machine's statistically smoothed defaults.
Heteroglossic literacy is a new essential skill. Recognizing which social language is speaking — and whose interests it serves — becomes critical when machines generate prose.