By Edo Segal
The mistake I kept making was trusting the output because it sounded right.
Not once. Not in some dramatic failure that forced a reckoning. Dozens of times, in the quiet accumulation of moments where Claude produced something fluent, structured, confident — and I moved on. The Deleuze fabrication I describe in Chapter 7 was the one I caught. The question that has haunted me since is how many I didn't.
Harry Collins began studying exactly this problem fifty years ago, long before any language model existed. He embedded himself in communities of gravitational wave physicists and laser builders — practitioners whose expertise lived not in what they published but in what they knew how to do together. And he found something that upends the way most of us think about knowledge: the difference between sounding like an expert and being one is not a matter of degree. It is a matter of kind.
Collins calls it the distinction between mimeomorphic and polimorphic action. The terms are unfamiliar. The reality they describe is not. You have felt it every time you watched someone produce work that looked professional but lacked the judgment that separates competent from genuinely good. Every time a junior team member shipped code that compiled and ran but would collapse under conditions no one had tested for. Every time you yourself accepted a plausible answer without checking whether plausible was the same as true.
Large language models are the most sophisticated mimeomorphic engines ever built. They copy the surface of expertise with breathtaking fidelity. Collins's life work is a precise, empirically grounded account of everything the surface does not contain.
That matters right now because the surface is getting better every month. The gap between what the machine produces and what a genuine expert produces is narrowing on every axis that is easy to measure — speed, fluency, breadth, apparent sophistication. The axes that are hard to measure — judgment, social knowledge, the collective understanding that a community maintains through practice — are the axes where the gap remains. And the harder the gap is to see, the more dangerous it becomes.
I brought Collins into this series not because he offers comfort. He does not. He offers something more valuable: a framework for understanding what you are actually evaluating when you look at AI output and decide whether to trust it. The answer is more unsettling than most builders want to hear. It is also more useful.
The machine copies the form. The substance is social. Collins spent a lifetime proving it. These chapters will show you why that proof matters more now than it ever has.
— Edo Segal ^ Opus 4.6
1943–present
Harry Collins (1943–present) is a British sociologist of scientific knowledge based at Cardiff University, where he has spent over five decades studying how expertise is produced, transmitted, and maintained within communities of practice. Collins earned his doctorate from the University of Bath and became one of the founding figures of the sociology of scientific knowledge (SSK). His decades-long ethnographic immersion in the gravitational wave physics community produced landmark works including *Gravity's Shadow* (2004) and *Gravity's Ghost and Big Dog* (2013). His theoretical contributions include the influential distinction between mimeomorphic and polimorphic actions (developed with Martin Kusch in *The Shape of Actions*, 1998), the taxonomy of relational, somatic, and collective tacit knowledge (*Tacit and Explicit Knowledge*, 2010), and the concept of interactional versus contributory expertise. His engagement with artificial intelligence spans from *Artificial Experts* (1990) through *Artifictional Intelligence* (2018) to a 2026 paper with Simon Thorne testing large language models against the discourse practices of gravitational wave physicists. Collins's work has fundamentally shaped how scholars understand the social dimensions of knowledge, skill, and technological competence.
In 1990, a sociologist at the University of Bath published a book with a title that appeared to concede the argument before making it. Artificial Experts suggested that machines could, in some meaningful sense, be expert. But the argument inside the covers was precisely the opposite. Harry Collins had spent the previous decade embedded in communities of scientific practice — watching how physicists built lasers, how gravitational wave researchers negotiated what counted as a detection, how the messy social process of science produced the clean results that textbooks presented as discoveries. What he found in those laboratories was not what the artificial intelligence researchers of the era assumed. He found that expertise was not a property of individuals processing information. It was a property of social groups maintaining practices. And if expertise lived in the social group rather than the individual head, then the project of building an artificial expert by building an artificial head was, at best, incomplete.
Thirty-six years later, the machines have become far more impressive than anything Collins confronted in 1990. Large language models produce code that compiles and runs, prose that reads with fluency and apparent insight, legal analyses that cite relevant precedent with remarkable accuracy. The surface of expert performance has been replicated at a fidelity that would have seemed fantastical to the AI researchers Collins originally criticized. Yet Collins, now eighty-three and still publishing — a 2026 paper on arXiv with Simon Thorne tests large language models against the discourse practices of gravitational wave physicists — maintains that the fundamental problem has not changed. The machines have become spectacularly better at copying. They have not become better at understanding. And the difference between copying and understanding is not a technical limitation that will be resolved by the next generation of hardware. It is a sociological fact about the nature of human knowledge.
The distinction Collins draws is between what he and Martin Kusch, in their 1998 book The Shape of Actions, called mimeomorphic and polimorphic actions. The terminology sounds forbidding, but the underlying idea is straightforward and carries extraordinary explanatory weight when applied to the current AI moment.
A mimeomorphic action is one that can be performed successfully by copying the surface behavior. The paradigm case is simple: a machine that stamps out identical metal parts performs a mimeomorphic action. Each stamping looks the same, follows the same physical trajectory, produces the same result. The action's success depends on its sameness. A slightly more complex case: a chess-playing computer. The moves it produces are, in a meaningful sense, copies — not of specific prior games but of patterns extracted from millions of games, recombined according to an evaluation function that maximizes board advantage. The computer does not understand chess in the sense that a grandmaster understands it. It does not feel the aesthetic pull of a beautiful sacrifice or the psychological pressure of a critical position. But it does not need to. Chess is, in Collins's framework, a domain where mimeomorphic competence is sufficient for excellent performance, because the rules are explicit, the evaluation criteria are unambiguous, and the social context of the game — the human drama of competition, the history of openings, the reputational stakes — has no bearing on whether a move is good.
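A minimal sketch can make the mimeomorphic point concrete. The following Python fragment — an illustrative toy, not any engine Collins discusses — plays chess by nothing more than explicit rules plus an explicit evaluation function. The piece values, the one-ply greedy search, and the use of the open-source python-chess library are all assumptions of the sketch; the point is that nothing in the loop depends on who is playing, why, or what is socially at stake.

```python
# A toy "mimeomorphic" chess player: explicit rules + an explicit,
# unambiguous evaluation function. Requires python-chess (pip install chess).
import chess

# Conventional material values; the choice is an assumption of this sketch.
PIECE_VALUES = {
    chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
    chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 0,
}

def material_balance(board: chess.Board, color: chess.Color) -> int:
    """Explicit evaluation: my piece values minus the opponent's."""
    score = 0
    for piece in board.piece_map().values():
        value = PIECE_VALUES[piece.piece_type]
        score += value if piece.color == color else -value
    return score

def choose_move(board: chess.Board) -> chess.Move:
    """Pick the legal move that maximizes the evaluation one ply ahead.
    Correctness here is determined entirely by form: rules plus evaluation,
    with no social context anywhere in the calculation."""
    best_move, best_score = None, float("-inf")
    for move in board.legal_moves:
        board.push(move)
        # After push, it is the opponent's turn; evaluate for the side that just moved.
        score = material_balance(board, not board.turn)
        board.pop()
        if score > best_score:
            best_move, best_score = move, score
    return best_move

if __name__ == "__main__":
    board = chess.Board()
    print(choose_move(board))  # a move chosen by formula alone
```

Stronger engines search deeper and evaluate far more subtly, but the structure is the same: the goodness of a move is fixed by the position and the rules, which is exactly why mimeomorphic competence suffices in this domain.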
A polimorphic action is categorically different. It is an action whose correct performance depends on understanding the social context in which it occurs, such that the "same" action must be performed differently in different situations. Greeting a colleague is polimorphic: the words, tone, physical distance, and implicit message vary depending on the relationship, the setting, the cultural context, and the history of prior interactions. No rule can specify in advance what the correct greeting is, because correctness is constituted by the social situation, not by the physical form of the action. A handshake at a funeral and a handshake at a party involve the same physical motion. They are not the same action.
Collins's claim, sustained across three decades of argument, is that most of what humans do that matters — most of the work that constitutes expertise, judgment, and skilled practice — is polimorphic. And machines, however sophisticated their pattern-matching, operate in the mimeomorphic register. They reproduce the surface features of expert behavior. They do not participate in the social practices that give that behavior its meaning.
The Orange Pill approaches this distinction from the practitioner's side when its author describes the senior engineer in Trivandrum who spent two days oscillating between excitement and terror before arriving at a recognition: the eighty percent of his career that Claude Code could absorb was implementation — writing functions, resolving dependencies, managing the mechanical connective tissue of software. The twenty percent that remained was judgment, taste, architectural instinct. Segal frames this as a revelation about where human value resides. Collins's framework provides the theoretical vocabulary that explains why the split falls where it does. The eighty percent was predominantly mimeomorphic. It consisted of pattern-following activities that, however technically demanding, could be specified well enough that a system trained on sufficient examples could reproduce them. The syntax of Python does not change depending on who is writing it or why. A function that sorts a list sorts a list regardless of the social context in which the sorting occurs. These are actions whose correctness is determined by their form, not by their situation. A sufficiently sophisticated copier can perform them.
The twenty percent is polimorphic. The question of what to build, for whom, and why cannot be answered by copying the surface behavior of people who have answered it before. Each instance of the question is situated in a specific social context — a particular market, a particular team, a particular set of users with particular needs and frustrations — and the correct answer depends on understanding that situation in a way that goes beyond pattern extraction from prior situations. Two startup founders facing superficially identical market conditions may need to build entirely different products, because the social contexts in which they operate — their teams, their users, their competitive landscapes, their own capabilities and limitations — differ in ways that surface data cannot capture.
Collins would add a complication that Segal's account does not fully reckon with. The eighty percent was not merely filler that could be discarded without consequence. It was the apprenticeship through which the twenty percent was formed. The thousands of hours spent debugging, resolving dependency conflicts, reading error messages, and wrestling with the mechanical resistance of code were the process through which the engineer developed the judgment that now constitutes his irreplaceable value. The implementation was not separate from the expertise. It was the medium in which the expertise grew.
Consider what Collins found in his study of gravitational wave physics, the empirical foundation on which much of his theoretical work rests. He spent decades immersed in the community of physicists searching for gravitational waves, attending their conferences, reading their papers, conducting hundreds of interviews. What he discovered was that the knowledge required to build and operate a gravitational wave detector could not be fully communicated through published papers, technical specifications, or explicit instruction. The papers contained what had been made explicit. The expertise required to actually make the detector work — to distinguish a genuine signal from an artifact, to calibrate the instrument against a background of noise whose characteristics shifted subtly with temperature, humidity, and seismic activity — resided in the community of practitioners. It was transmitted through apprenticeship, through years of working alongside experienced physicists, through the accumulation of practical knowledge that Collins calls collective tacit knowledge.
The published papers were the explicit residue of an expertise that was fundamentally social. A brilliant physicist who read every paper ever published about gravitational wave detection but had never worked in a laboratory, never calibrated an instrument, never argued with colleagues about whether a signal was real — that physicist would possess what Collins calls interactional expertise, the ability to converse fluently about the domain, but not contributory expertise, the ability to actually advance the field.
A large language model trained on the entire corpus of gravitational wave physics literature possesses something analogous to interactional expertise at a scale Collins could not have anticipated in 1990. It can discuss the physics with remarkable fluency. It can cite relevant results, explain methodological choices, and produce text that reads like expert commentary. In his 2026 paper, Collins and Thorne tested this directly, probing whether a large language model could reproduce the specific form of reasoning that gravitational wave physicists used when deciding to ignore a fringe science paper. The model could not. The reasoning depended on what Collins calls "tacit knowledge built up in social discourse, most spoken discourse, within closed groups of experts" — precisely the kind of knowledge that is not captured in published text and therefore not available to a system trained on published text.
This finding illuminates a pattern that extends far beyond physics. The machine's training data is, by definition, the explicit residue of human practice. It is what practitioners chose to write down, formalize, publish, or post. It does not include what they did not write down — the hallway conversations, the tacit agreements, the embodied intuitions, the unspoken norms that constitute the social fabric of any expert community. The training data is comprehensive on the explicit dimension and systematically absent on the tacit dimension. The machine therefore masters the explicit and remains blind to the tacit, and because the explicit is the surface of expertise while the tacit is its structure, the machine's mastery looks deeper than it is.
Collins put it with characteristic directness in his 2025 paper in AI & Society: large language models "have proved to have no moral compass: they answer queries with fabrications with the same fluency as they provide facts." The fluency is mimeomorphic — it copies the linguistic form of confident expertise. The fabrication reveals the absence of understanding — the machine does not know the difference between a claim it has grounds for making and a claim it does not, because knowing the difference requires the kind of social knowledge that comes from participation in a community of practice, not from observation of that community's textual output.
The Orange Pill catches this failure mode in action when Segal describes the Deleuze fabrication: Claude produced a passage connecting Csikszentmihalyi's flow state to a concept it attributed to Deleuze, and the passage was elegant, well-structured, rhetorically persuasive — and philosophically wrong in a way that anyone who had actually read Deleuze would recognize immediately. The surface was expert. The understanding was absent. The output passed the mimeomorphic test — it looked right — and failed the polimorphic test — it was not right, because rightness in philosophical argument depends on understanding the social and intellectual context of the concepts being deployed, not merely on deploying them in grammatically correct sentences.
Collins would note that this is not a bug that will be fixed by better training data or larger models. It is a structural feature of a system that learns from text. The text captures what was said. It does not capture why it was said, what was deliberately left unsaid, what the community of practitioners would recognize as a misuse of a concept even though the misuse is grammatically and syntactically indistinguishable from correct use. These judgments are polimorphic. They depend on social knowledge that text alone cannot convey.
The question that Collins's framework forces upon Segal's optimism is pointed and uncomfortable. If the eighty percent of implementation work was the apprenticeship through which the twenty percent of judgment was formed, and if the machine now performs the eighty percent, how will the next generation of practitioners form the judgment that the current generation developed through years of struggle? The current cohort of senior engineers possesses polimorphic expertise because they lived through the mimeomorphic work that built it. The next cohort will not. They will arrive at the twenty percent without having served the apprenticeship of the eighty percent. Whether they can develop equivalent judgment through the new practice of directing AI — a question Collins frames as genuinely open — or whether the judgment will thin with each generation is the empirical question that the next decade will answer.
Collins, characteristically, does not predict. He diagnoses. The distinction between copying and understanding is not a prediction about what machines will never do. It is a description of what they do not currently do and a specification of what would be required — socialization into human communities of practice, not merely training on their textual output — for them to do it. Whether that socialization is possible is a question Collins leaves open. That it has not occurred is the fact he insists on.
The distinction matters because the consequences of confusing mimeomorphic excellence with polimorphic understanding are not theoretical. They are practical, immediate, and potentially irreversible. A society that mistakes the machine's surface fluency for genuine expertise will make decisions based on outputs that look right but lack the social grounding that makes them reliable. And the error will be invisible to anyone who does not possess the domain expertise required to see through the surface — which, as the machine's outputs proliferate and the volume of mimeomorphic text overwhelms the community's capacity to evaluate it, will increasingly mean everyone.
Michael Polanyi's observation that we know more than we can tell is one of those philosophical insights that sounds modest until one begins to trace its consequences. Polanyi, a physical chemist turned philosopher, published The Tacit Dimension in 1966 with a central claim: explicit knowledge — the kind that can be written down, formalized, communicated in propositions — rests on a foundation of tacit knowledge that cannot be fully articulated. The cyclist cannot explain how she balances. The wine connoisseur cannot specify the rules by which he distinguishes a great vintage from a merely good one. The scientist cannot formalize the judgment by which she distinguishes a promising line of research from a dead end. In each case, the knowledge is real — it produces reliable results, it can be demonstrated in practice, it is recognized by other practitioners as genuine expertise. But it cannot be fully communicated through language. It can only be transmitted through practice, through apprenticeship, through the specific intimacy of working alongside someone who possesses it until the knowledge passes, by some mechanism that resists explicit description, from practitioner to learner.
Harry Collins took Polanyi's insight and did something characteristic: he subjected it to sociological analysis, broke it into components, and asked what each component meant for the question of artificial intelligence. The result, developed across several books but most systematically in Tacit and Explicit Knowledge (2010), is a taxonomy of three species of tacit knowledge, each with different implications for what machines can and cannot do.
The first species Collins calls relational tacit knowledge. This is knowledge that is tacit for contingent reasons — it could in principle be made explicit, but it happens not to have been. The recipe your grandmother never wrote down is relationally tacit. She could have written it down. She simply did not. The machining tolerances that a factory worker knows from decades of experience but that have never been documented are relationally tacit. The knowledge exists in explicit-capable form inside the practitioner's mind; the barrier to articulation is practical, not principled. Given sufficient motivation and a sufficiently patient interviewer, the knowledge could be extracted and formalized.
Relational tacit knowledge is the species that AI handles most naturally. When a large language model is trained on a corpus that includes machining forums, cooking blogs, and technical documentation, it absorbs vast quantities of relationally tacit knowledge that individual practitioners possessed but had not previously formalized. The training process, in effect, performs the extraction that a patient interviewer would perform, at a scale no interviewer could match. The model does not need to interview the practitioner. It ingests the textual traces of thousands of practitioners' experience and synthesizes them into a statistical representation that can reproduce the knowledge with impressive fidelity. This is one of the genuine achievements of large language models, and Collins does not deny it. The relationally tacit is being made computationally explicit at an unprecedented rate.
The second species is somatic tacit knowledge — knowledge that resides in the body. The cyclist's balance. The surgeon's hand. The glassblower's feel for molten glass. The programmer's physical rhythm of typing, the finger-memory that produces code almost automatically while the conscious mind works at a higher level of abstraction. Somatic tacit knowledge is tacit not for contingent reasons but because the body has its own form of intelligence that does not pass through conscious articulation. The body learns through repetition, through the gradual calibration of motor systems that operate below the threshold of awareness.
Collins's position on somatic tacit knowledge and AI is nuanced and important. He acknowledges that somatic knowledge cannot be possessed by a machine that has no body. A disembodied system cannot know what it feels like to press a scalpel through tissue, to sense through the handle of a tool the resistance of the material beneath. But Collins also argues — in a move that distinguishes him from the Dreyfus tradition, which grounds AI's limitations primarily in embodiment — that somatic tacit knowledge is not the most consequential barrier between machines and human-level intelligence. A sufficiently sophisticated robot could, in principle, be equipped with sensors and actuators that approximate the body's information-processing capabilities. The somatic barrier is an engineering challenge, not a principled impossibility. The real barrier lies elsewhere.
The third species is collective tacit knowledge, and it is here that Collins plants his flag. Collective tacit knowledge is knowledge that resides not in any individual — not in any head, not in any body — but in the social group. It is the constantly shifting set of norms, standards, practices, and implicit agreements that constitute what it means to be a competent member of a community of practice. No individual possesses it in its entirety. No document captures it. It is maintained and reproduced only through the ongoing social interactions of the community that sustains it.
Consider Collins's gravitational wave physicists. The knowledge required to operate a gravitational wave detector includes vast amounts of explicit knowledge — the physics of interferometry, the engineering of vacuum systems, the mathematics of signal processing. It also includes relational tacit knowledge — the specific quirks of a particular instrument that experienced operators know but have never documented. And it includes somatic tacit knowledge — the physical skills of alignment, calibration, and maintenance. But beneath all of these lies collective tacit knowledge: the community's shared sense of what counts as a genuine detection, what level of statistical significance is required, what alternative explanations must be eliminated, how to weigh the testimony of one subgroup against another, when to trust an anomalous result and when to dismiss it. This knowledge is not stored in any paper or any head. It is enacted in the community's practices — in the way results are presented at conferences, in the way objections are raised and addressed, in the tacit agreements about what counts as evidence and what does not.
Collective tacit knowledge is the species that Collins argues is genuinely impenetrable to artificial intelligence, not because of computational limitations but because of social ones. A machine trained on the textual output of a community absorbs what the community has made explicit. It does not absorb the social practices through which the community produces, evaluates, and maintains its knowledge. It reads the papers but does not attend the conferences. It processes the published arguments but does not witness the hallway conversations in which the real negotiations about what counts as valid science occur. It learns the community's vocabulary but not its politics, its hierarchies, its unspoken alliances and rivalries, its evolving sense of what matters and what is passé.
The Orange Pill provides a vivid illustration, though Segal does not frame it in Collins's terms. The senior architect who could feel when something was wrong in a codebase before he could articulate what was wrong possessed all three species of tacit knowledge simultaneously. His somatic knowledge — the physical familiarity with the act of coding, the rhythm of reading code, the bodily sensation of recognizing a pattern — was built through years of practice. His relational tacit knowledge — the specific architectural decisions of the codebase, undocumented conventions, informal agreements among team members — was absorbed through proximity to the project. And his collective tacit knowledge — his sense of what constituted good architecture in his community, what trade-offs were acceptable, what standards the team held itself to — was produced by his participation in the social life of the engineering team and the broader software community.
When AI absorbed the codebase, it absorbed the explicit knowledge and much of the relational tacit knowledge — the patterns, the conventions, the documented and undocumented structures. It could not absorb the somatic dimension, because it has no body. And it could not absorb the collective dimension, because it is not a member of the community. It reads the community's output without participating in the community's life.
The consequences of this tripartite absence are not uniform. The relational gap narrows as the model's training data expands and its ability to extract implicit patterns improves. The somatic gap matters greatly in some domains — surgery, craftwork, physical engineering — and far less in others, including much of software development, where the relevant physical skills are relatively simple (typing) and the cognitive skills dominate. But the collective gap remains structurally open, and it is the one Collins insists matters most.
His 2025 paper on AI and the sociology of knowledge presses this point with specific reference to large language models. Collins argues that "learning from the internet is not the same as socialisation." The internet contains the textual residue of social life. It does not contain social life itself. A child who grows up in a community absorbs its collective tacit knowledge through a process of primary socialization — through the thousands of daily interactions, corrections, demonstrations, and implicit lessons that constitute growing up inside a culture. The child does not learn the community's norms by reading about them. The child learns them by living inside them, by having her behavior corrected, by observing what is praised and what is punished, by absorbing the community's sense of what is appropriate in contexts that no textbook could enumerate.
An LLM trained on the internet has access to an extraordinary volume of text about what communities value, practice, and believe. But it has undergone no primary socialization. It has not been corrected by a parent, praised by a teacher, embarrassed by a peer. It does not know what it feels like to violate a norm, because it does not experience norms as constraints on its behavior. It processes descriptions of norms. It does not live inside them.
This is why, Collins argues, large language models "have no moral compass." The phrase is not a moral judgment about the machines. It is a sociological observation. A moral compass is a product of collective tacit knowledge — of the community's sense of what is acceptable and what is not, a sense that is transmitted through socialization and maintained through ongoing social participation. A system that has not been socialized cannot possess it. It can mimic the surface features of moral reasoning, reproducing arguments about ethics with the same fluency it reproduces arguments about physics. But it cannot distinguish between a genuinely moral position and a rhetorically persuasive one, because the distinction depends on collective tacit knowledge about what counts as moral reasoning in the relevant community. And that knowledge lives in the community, not in its published texts.
The implications for The Orange Pill's central argument are significant. Segal claims that AI is an amplifier — that it amplifies whatever signal the human feeds it. Collins's taxonomy adds precision to this claim. AI amplifies the explicit signal with extraordinary fidelity. It amplifies the relationally tacit signal to a significant degree, as the training data captures much of what individuals know but have not formalized. It cannot amplify the somatic signal at all, because it does not receive somatic input. And it cannot amplify the collective tacit signal, because the collective dimension of knowledge is constituted by social participation, not by data processing.
The amplifier, in other words, has a frequency response. It is not flat across the spectrum of human knowledge. It boosts certain frequencies — the explicit, the relational — while remaining deaf to others — the somatic, the collective. The output sounds full, even rich. But entire registers are missing, and the missing registers are the ones that carry the information most essential to genuine expertise: the social knowledge of what matters, what counts, what is acceptable, and what is not.
Collins would note that this frequency response is invisible to anyone who lacks the expertise to hear the missing registers. The code that Claude generates compiles and runs. The prose it produces reads well. The analysis it offers is often sound. The absence of collective tacit knowledge is not audible in the output — not to a general audience, not to a casual user, not even to a practitioner who is not specifically listening for it. It becomes audible only when the output encounters a situation that requires the missing knowledge: a social context where the "right" answer depends on understanding norms the machine has never internalized, a professional judgment that requires the kind of contextual sensitivity that only participation in the community can produce.
The Deleuze fabrication, again, is the exemplary case. The passage sounded right. It deployed philosophical vocabulary with apparent precision. It connected concepts in a way that read as insightful. But it was wrong, and its wrongness was detectable only by someone who had participated in the philosophical community long enough to have absorbed the collective tacit knowledge of how those concepts are properly used — knowledge that is not in any textbook and that the model therefore could not have learned.
What we know that we cannot say is the foundation on which we stand when we do our best work. The machine can see the building. It cannot feel the ground.
Expertise does not live where most people think it does. The commonsense picture locates expertise inside the individual — in the brain, in the hands, in some personal store of knowledge and ability that the expert has accumulated through training and practice. Collins spent decades demonstrating that this picture is incomplete to the point of being misleading. Expertise is not a substance that individual minds contain. It is a property of social groups that individuals participate in. The distinction sounds philosophical. Its consequences are entirely practical, and they bear directly on the question of what happens when artificial intelligence enters a community of practice.
Collins built this argument through close empirical study of scientific communities, most notably the community of gravitational wave physicists that he followed for more than forty years. The study produced several books, including Gravity's Shadow (2004) and Gravity's Ghost and Big Dog (2013), and it provided the evidential base for Collins's theoretical claims about the social constitution of expertise.
What Collins found was not what a naive observer would expect. The textbook picture of physics suggests that the knowledge required to build and operate a gravitational wave detector is contained in published papers — in the equations, the engineering specifications, the documented procedures that constitute the field's explicit knowledge base. A sufficiently intelligent reader who mastered the published literature should, in principle, be able to build a working detector. Collins showed that this is false. Groups that attempted to replicate detectors using only the published literature consistently failed. The published literature contained the explicit knowledge. It did not contain the practical knowledge required to make the instruments work — knowledge that existed only inside the community of practitioners and that was transmitted through direct social interaction: apprenticeship, collaboration, and the thousand small exchanges that constitute daily life inside a laboratory.
Collins called this "the experimenter's regress." When a novel experiment fails to produce the expected result, there are always two possible explanations: the theory is wrong, or the experiment was performed incorrectly. But the only way to know whether the experiment was performed correctly is to know the expected result — which is the thing the experiment was designed to determine. The regress is broken not by logic but by social negotiation: the community reaches consensus about which results to trust and which to question, based on judgments about the competence of the experimenters, the quality of their equipment, the care of their procedures. These judgments are themselves based on collective tacit knowledge — on the community's shared sense of who is reliable, what methods are sound, and what counts as adequate evidence.
The relevance of this to the AI transition is immediate and largely unrecognized. When a large language model is trained on the published output of a community of practice, it absorbs the community's explicit knowledge in its entirety. It learns the vocabulary, the methods, the standard arguments, the canonical examples. It can reproduce them with a fluency that often exceeds any individual member of the community, because it has read everything while each individual has read only a fraction.
But the community's collective tacit knowledge — the social fabric of judgments, norms, standards, and practices through which the explicit knowledge is produced and maintained — is absent from the training data, because it was never written down. It was enacted in social interactions, in the real-time negotiations that occur when practitioners disagree, in the body language and vocal inflection of a senior physicist raising an eyebrow at a junior colleague's result. The model has the text. It does not have the life.
Collins and Thorne's 2026 paper tests this directly. They examined how gravitational wave physicists decided to ignore a specific fringe science paper — a decision that required not formal reasoning from published principles but social judgment about the credibility of the author, the plausibility of the claim given the community's accumulated experience, and the implicit standards for what counted as a paper worth engaging with. The physicists could explain their reasoning when asked, but the explanation was post hoc — a rational reconstruction of a judgment that had actually been made on the basis of social knowledge acquired through years of immersion in the community. The large language model, confronted with the same task, could not reproduce the reasoning. It could produce plausible-sounding arguments for ignoring the paper, but the arguments lacked the specific social grounding that gave the physicists' judgment its authority.
Collins and Thorne conclude that "the 'intelligence' we argue is in the humans not the LLMs." This is not a vague philosophical claim. It is an empirical finding grounded in a specific test case: the model's output looked like expert reasoning but was not expert reasoning, because expert reasoning in this case depended on social knowledge the model did not possess.
The Orange Pill describes a community of practice in the process of reconstituting itself around new tools. Segal's engineers in Trivandrum, the frontier developers posting on social media, the silent middle navigating the transition without a clean narrative — these are a community whose collective tacit knowledge is in flux. The old tacit knowledge — about how long things take, what skills are necessary, how teams should be organized, what counts as good work — is being destabilized by tools that violate the community's prior assumptions. The new tacit knowledge — about how to direct AI, how to evaluate its output, when to trust it and when to question it, what constitutes expertise in the new regime — is still forming.
Collins's framework predicts that this transition will be messy, uneven, and resistant to top-down management. Collective tacit knowledge cannot be designed in advance or imposed by policy. It can only be grown through the social practice of doing the work — through the gradual accumulation of shared experience, the negotiation of new norms, the emergence of implicit standards that no one articulates but everyone, over time, comes to recognize.
The critical implication, and the one that Segal's account does not fully address, is that the speed of the transition matters for sociological reasons beyond the obvious economic ones. Collective tacit knowledge takes time to form. It is transmitted through slow, friction-rich social processes — mentorship, apprenticeship, the gradual absorption of community norms through years of participation. When the practices that sustain a community change faster than the community's collective tacit knowledge can adapt, the result is not simply confusion. It is a genuine loss of knowledge — a thinning of the social fabric that held the community's expertise together.
Collins studied this phenomenon in the history of technology transfer. When a technology moves from one community to another — from a laboratory in one country to a laboratory in another, or from one generation of practitioners to the next — the explicit knowledge transfers easily. The papers can be copied, the specifications transmitted, the methods documented and shipped. What does not transfer easily is the collective tacit knowledge required to make the technology actually work. The receiving community must build its own collective tacit knowledge through its own practice, and this takes time — typically years, sometimes decades.
The AI transition is a technology transfer of unprecedented speed and scope. The technology — AI-assisted development — is being adopted by millions of practitioners simultaneously. Each practitioner, each team, each organization is building its own collective tacit knowledge about how to work with these tools. But the building takes time, and the tools themselves are changing faster than the knowledge can form. By the time a community has developed robust norms for working with one generation of AI tools, the next generation has arrived, and the norms must be renegotiated.
This is not an argument for slowing down. Collins is not a Luddite, and his framework does not counsel resistance. It is an argument for recognizing that the human side of the transition — the formation of collective tacit knowledge about how to live and work with AI — is the bottleneck, and that no amount of technological acceleration can remove it. The knowledge of how to use these tools well is social knowledge. It must be built through social processes. And social processes operate on human timescales, not computational ones.
What Segal calls the "silent middle" — the practitioners who feel both exhilaration and loss but cannot articulate the source of either — is, in Collins's terms, a community caught between two regimes of collective tacit knowledge. The old knowledge, about how to build software through years of implementation struggle, is dissolving. The new knowledge, about how to build through AI-directed conversation, is forming. The practitioners in the middle possess fragments of both and the full structure of neither. Their discomfort is not psychological weakness. It is the sociological reality of inhabiting a transition between two regimes of social knowledge.
Collins's advice, characteristic in its refusal of easy comfort, is that the only way through this transition is through practice — collective practice, conducted in communities that are forming new norms through the friction of doing the work together and discovering, through trial and error, what works, what does not, and what counts as genuine expertise in the new landscape. The community will stabilize. The collective tacit knowledge will form. But it will form on its own timeline, in its own way, through the social processes that have always produced human expertise. No policy can shortcut this process. No tool can automate it. The knowledge of how to use the tool well is precisely the kind of knowledge that cannot be produced by the tool itself.
The community of AI-directed builders is a community in its infancy. Its practices are still forming. Its standards are still being negotiated through the daily work of thousands of practitioners discovering what these tools can and cannot do. The collective tacit knowledge that will eventually stabilize into a mature body of professional practice is currently thin — a set of emerging norms, provisional heuristics, early intuitions that have not yet been tested against the full range of circumstances that mature expertise must accommodate.
Whether this community will produce expertise as deep as the one it is replacing is the question Collins's framework identifies as genuinely open. The answer depends not on the capabilities of the technology but on the quality of the social processes through which the community builds its knowledge. And that quality depends on time, on friction, on the slow, unglamorous, irreplaceably human work of learning together.
In the late 1990s, Harry Collins posed a question that seemed, at first hearing, either trivially easy to answer or impossibly hard. Could a person who had never practiced a science — never operated an instrument, never run an experiment, never contributed original data to a field — nevertheless acquire genuine expertise in that science?
The question was not hypothetical. Collins himself was the test case. He had spent more than two decades immersed in the gravitational wave physics community. He attended their conferences. He read their papers — all of them, not sampling but comprehensively, year after year, for decades. He interviewed their practitioners for hundreds of hours. He understood their arguments, their methodologies, their internal debates, their history. He could discuss the relative merits of interferometric versus bar detector designs with a fluency that physicists themselves recognized as genuine.
But he had never built an instrument. Never taken a data set. Never calibrated a laser. Never done the work.
The concept Collins developed to describe his own peculiar position — and, by extension, a broader phenomenon that he argued was undertheorized in the sociology of knowledge — was interactional expertise. Interactional expertise is the fluency acquired through sustained linguistic engagement with a community of practitioners. It is the ability to speak the language, understand the arguments, evaluate the claims, and participate in the conversation of a domain without being able to perform its core practices. It is real expertise — not fake, not pretend, not superficial. Collins demonstrated this by submitting to a formal test: he answered questions about gravitational wave physics alongside actual physicists, and a panel of judges could not reliably distinguish his answers from theirs. His interactional expertise was genuine enough to pass what amounts to a domain-specific Turing Test.
But interactional expertise is categorically different from what Collins calls contributory expertise — the ability to actually do the work, to contribute original knowledge to the field, to advance the practice rather than merely discuss it. The physicist who detects a gravitational wave possesses contributory expertise. Collins, who can discuss the detection with complete fluency, does not. The doctor who diagnoses a rare disease possesses contributory expertise. The medical journalist who writes about the diagnosis with clinical precision does not. The distinction is not about intelligence or depth of engagement. It is about the difference between knowing-that and knowing-how, between understanding the conversation and being able to do the thing the conversation is about.
A large language model, trained on the textual output of every domain of human expertise, possesses something like interactional expertise at a scale that dwarfs anything Collins demonstrated or imagined. It can discuss gravitational wave physics, molecular biology, contract law, Renaissance painting, structural engineering, and Sanskrit grammar with a fluency that would pass Collins's test in many of these domains simultaneously. It has absorbed more text, from more practitioners, across more fields, than any human being could read in a thousand lifetimes. Its interactional competence is comprehensive.
This is genuinely remarkable, and Collins does not dismiss it. In his 2021 lecture "Must Intelligent Machines Be Social Machines?" he acknowledged that large language models represent "a first but impressive step" toward reproducing aspects of human linguistic competence. The step is real. The interactional fluency is real. The question — Collins's question, the one that structures his entire engagement with AI — is whether interactional expertise is sufficient for the tasks that the AI revolution is asking machines to perform.
The Orange Pill provides the test case. Segal describes a collaboration in which Claude operates as a thinking partner — holding ideas, finding connections, suggesting structures, generating prose that the human author then evaluates, revises, or rejects. The collaboration requires Claude to discuss the subject matter with fluency, to understand the author's intentions well enough to produce relevant responses, and to maintain consistency across a complex, evolving argument. These are interactional competencies. They require the machine to converse expertly about a domain — the intersection of technology, philosophy, economics, and personal experience that constitutes the book's subject matter — without being able to originate the experience, the values, or the judgment that drive the argument.
Collins's framework suggests that the collaboration works precisely because interactional expertise is what the task requires. The author brings contributory expertise — the direct experience of building products, leading teams, navigating the AI transition, and forming the judgments that constitute the book's argument. The machine brings interactional expertise — the comprehensive linguistic fluency to engage with the author's ideas across multiple domains, to find connections the author might miss, to suggest structures that serve the argument. Each contributes what the other lacks. The collaboration is productive not despite the asymmetry between interactional and contributory expertise but because of it.
But Collins would press further. He would ask where interactional expertise serves well and where it fails, and the answer illuminates the specific character of the collaboration's limitations.
Consider the moment in The Orange Pill when Claude connected the technology adoption curve to punctuated equilibrium from evolutionary biology. Segal describes this as a breakthrough — a connection he had not seen, supplied by a machine that could range across domains he could not traverse alone. Collins's framework identifies this as a legitimate achievement of interactional expertise. The machine had absorbed enough about both technology adoption and evolutionary biology to recognize a structural similarity between them. The connection was genuine. The interactional competence was sufficient to produce it.
Now consider the Deleuze fabrication. Claude connected Csikszentmihalyi's flow state to a concept it attributed to Deleuze, and the connection was wrong — not in a way that a general reader would detect but in a way that anyone with contributory expertise in philosophy would recognize immediately. The failure was not a lapse in interactional competence. It was the ceiling of interactional competence. Claude could discuss Deleuze with the fluency of someone who had read extensively about Deleuze. But it could not apply Deleuze's concepts with the precision of someone who had participated in the philosophical community where those concepts are used, debated, and refined — where the collective tacit knowledge about what Deleuze's "smooth space" means and does not mean is maintained through ongoing scholarly practice.
The punctuated equilibrium connection succeeded because the structural similarity between the two domains was capturable in text — it was, in Collins's terms, a relational insight that could be made explicit. The Deleuze connection failed because its correctness depended on collective tacit knowledge — on the philosophical community's socially maintained understanding of how its key concepts are properly deployed — that no amount of textual training could supply.
This pattern — success where interactional expertise suffices, failure where contributory expertise is required — has implications that extend well beyond authorial collaboration. Collins's framework predicts a systematic geography of AI competence: the machine will excel in domains where the relevant knowledge is predominantly explicit and relational, and it will fail in domains where the relevant knowledge is predominantly collective and tacit. The geography is not a simple division between "easy" and "hard" tasks. It cuts across conventional difficulty rankings. A technically simple task that requires reading social context may defeat the machine, while a technically complex task that can be specified in formal terms may be trivial for it.
Collins made a related point with characteristic bluntness in his 2026 paper: "LLMs are at PhD level in terms of the literature-search phase of the task but not at the original thought element of the task." A PhD, in Collins's framework, requires both interactional and contributory expertise. The student must demonstrate comprehensive knowledge of the field — an interactional competence that large language models now possess to a degree that exceeds most individual students. But the student must also contribute original knowledge — an advance in understanding that the field recognizes as genuine. This requires contributory expertise: the ability to identify a real problem, to design a method for addressing it, to evaluate the results against the field's standards, and to communicate the contribution in a way that the community recognizes as meeting those standards. The standards themselves are collective tacit knowledge. The student absorbs them through years of socialization in the research community — through working with an advisor, presenting at seminars, receiving feedback that is not about the explicit content of the work but about the implicit norms of the community. An LLM has read every paper but attended no seminars. It possesses the interactional fluency to discuss the field at the highest level. It does not possess the contributory expertise to advance it, because advancing a field requires the kind of social knowledge that training on text cannot provide.
One of the most provocative applications of Collins's distinction concerns what happens when interactional expertise is mistaken for contributory expertise — when the fluency of the conversation is taken as evidence of genuine understanding. Collins coined the term "the Surrender" in his 2018 book to describe the broader cultural tendency to defer to computers not because they are genuinely competent but because their outputs look competent, and because humans, confronted with confident-sounding output, tend to accept it rather than challenge it. The Surrender is not about machines becoming too powerful. It is about humans becoming too deferential — accepting mimeomorphic surface as polimorphic substance because evaluating the difference requires expertise that the evaluator may not possess.
Segal catches this dynamic in himself when he describes the seduction of Claude's output — the prose that arrives polished, the structure that arrives clean, the references that arrive on time. He identifies the moment of almost keeping a passage that sounded better than it thought, and the discipline required to reject it. But Collins's framework suggests that this discipline becomes harder, not easier, over time. Each interaction in which the machine's interactional fluency is confirmed — each time the output is correct, insightful, well-structured — raises the baseline expectation and lowers the practitioner's vigilance. The hundredth time you check the machine's output and find it correct, you are less likely to check the hundred-and-first. And the hundred-and-first may be the one where interactional fluency conceals the absence of contributory understanding.
Collins described this as the fundamental asymmetry of AI evaluation. Confirming that a machine's output is correct requires the same expertise that producing the output would require. If the practitioner possesses that expertise, the machine's contribution is welcome but not essential — the practitioner could have produced the output independently, though perhaps more slowly. If the practitioner does not possess that expertise, the machine's contribution cannot be properly evaluated, and the risk of accepting a mimeomorphic surface as polimorphic substance rises to a level that no amount of interface design can mitigate.
The interactional expertise that large language models possess is genuine, significant, and historically unprecedented. Collins acknowledges this. But interactional expertise has a ceiling, and the ceiling is the boundary where contributory expertise begins — where the conversation about the work gives way to the doing of the work, where linguistic fluency gives way to the social knowledge that constitutes what it means to be a practitioner rather than a conversant.
The AI revolution has produced the most comprehensive interactional expert in human history — a system that can discuss anything with anyone at any level of apparent sophistication. Whether this is sufficient for the work that Segal describes, the directing of AI toward the production of genuine artifacts with genuine value, depends on whether the work is fundamentally interactional or fundamentally contributory. Collins's answer is that it is both, and that the interactional component is the part the machine handles while the contributory component is the part only the human can supply. The collaboration works when the human knows the difference. The Surrender begins when the distinction blurs.
In the spring of 1987, a French surgeon named Philippe Mouret inserted a laparoscope through a small incision in a patient's abdomen and removed a gallbladder without opening the body. The procedure was not the first of its kind — a few scattered attempts had been made elsewhere — but it was among the earliest to demonstrate that major abdominal surgery could be performed through instruments guided by a camera rather than by the direct contact of the surgeon's hands. Within five years, laparoscopic cholecystectomy had become the standard of care. Within a decade, the open approach that surgeons had practiced for a century was, for this procedure, essentially obsolete.
The transition was fast enough to produce a generational fracture in the surgical community. Surgeons trained before the laparoscopic revolution possessed one form of expertise. Surgeons trained after it possessed another. And the relationship between the two forms was not the simple progression that the word "improvement" implies. Something was gained. Something was lost. The gain and the loss were not commensurable — they could not be placed on the same scale and weighed against each other — because they involved different kinds of knowledge, different forms of embodiment, different relationships between the practitioner and the material of practice.
Collins studied transitions of this kind throughout his career, not in surgery specifically but in the broader phenomenon of how expertise is constituted, transmitted, and transformed when the practices through which it is built change. His framework provides the most precise available vocabulary for describing what happened when the surgeon's hand was replaced by the surgeon's eye — and, by extension, what happens when the programmer's implementation is replaced by the programmer's conversation with an AI.
The open surgeon's expertise was, in Collins's terms, richly somatic. The hands were the primary instruments of knowledge. A surgeon who had performed hundreds of open cholecystectomies knew, through her fingers, the difference between the texture of a healthy gallbladder and an inflamed one, between the resistance of normal tissue and the adhesions that signaled prior infection. She could feel the anatomy with her eyes closed — not because she had memorized it from a textbook but because thousands of hours of tactile engagement had built a body-level understanding that operated below conscious awareness. The hand knew things the mind had not yet articulated. The hand detected problems — a vessel in the wrong place, a duct that was thinner than expected, a layer of tissue that yielded too easily — before the conscious mind had formulated the question.
This was somatic tacit knowledge in its purest form. The surgeon could not fully describe what she knew. She could demonstrate it — another surgeon, watching her hands, could see the expertise in the sureness of her movements, the economy of her gestures, the specific way she held tension in her fingers while navigating a difficult dissection. But the knowledge itself was not in the description or even in the demonstration. It was in the hands. It had been deposited there, layer by layer, through years of practice, and it could be transmitted only through the same process: another pair of hands, learning through their own years of practice, building their own somatic archive.
The laparoscope removed the hands from the body. The surgeon now operated through instruments inserted through small incisions, watching the procedure on a video monitor rather than looking directly at the surgical field. The tactile feedback that had been the open surgeon's primary channel of information was dramatically attenuated. The instruments transmitted some resistance, some vibration, but nothing like the rich, multidimensional tactile experience of hands inside a body. The surgeon could no longer feel the difference between healthy and inflamed tissue. She had to see it — on a two-dimensional screen that compressed a three-dimensional space into a flat image, with a camera whose angle and magnification she controlled through a separate instrument.
The critics within the surgical community articulated their concern with precision that Collins would recognize as sociologically significant. They were not making a general argument against technology. They were making a specific argument about the nature of surgical expertise. The tactile relationship between the surgeon's hand and the patient's body was not merely a convenience that the laparoscope happened to eliminate. It was, they argued, the foundation of surgical judgment. The hand did not merely execute the surgeon's decisions. It informed them. The feedback loop between touch and decision was so tight, so deeply integrated into the practice of surgery, that removing it did not merely change the interface. It changed the nature of the expertise itself.
They were right about the change. They were wrong about the consequence. The consequence was not degradation. It was transformation.
Surgeons who trained on laparoscopic techniques developed a different form of embodied expertise, no less demanding than the one it replaced. They learned to read a two-dimensional image and construct a three-dimensional spatial model in their minds — a cognitive skill that open surgeons had never needed to develop, because their eyes and hands were operating in the same three-dimensional space. They learned to coordinate instruments whose tips they could see on screen but whose handles they held at angles that did not correspond to the visual field — a dissociation between visual and motor space that required extensive practice to master. They learned to compensate for the loss of tactile feedback by developing heightened visual sensitivity — reading the way tissue moved on screen, the way a structure deformed under pressure, the way color shifted when blood supply was compromised.
Each of these was a new form of polimorphic expertise. The spatial reasoning required to navigate a three-dimensional anatomy through a two-dimensional display was not a mimeomorphic skill that could be mastered by copying the surface behavior. It depended on the individual surgeon's developing an understanding of the specific patient's anatomy, the specific configuration of instruments, the specific angle of the camera — a contextual, situation-dependent judgment that varied with every case. The instrument coordination required a new form of somatic tacit knowledge: not the hand's knowledge of tissue but the hand's knowledge of tool, a different tactile vocabulary built through a different apprenticeship.
Collins's critical insight is that the new expertise was transmitted through the same social mechanisms as the old. Laparoscopic surgeons did not learn their craft from manuals or videos, though manuals and videos existed. They learned it from other surgeons — through proctored cases, through the specific social relationship of mentor and apprentice in which the experienced practitioner guides the novice through the procedure, offering corrections, demonstrations, and the kind of real-time, contextually sensitive feedback that constitutes the transmission of collective tacit knowledge. The technology changed. The social process of knowledge transmission did not.
This matters because the parallel to the AI transition is both striking and imperfect, and the imperfection is where the important questions live.
The Orange Pill presents the laparoscopic analogy as evidence for what Segal calls ascending friction — the principle that each technological abstraction removes difficulty at one level and relocates it to a higher cognitive floor. Collins's framework validates the ascension but adds a dimension that Segal's account underemphasizes: the social infrastructure through which the new, higher-level expertise is produced and transmitted.
When Segal describes the designer on his team who went from producing visual specifications to building complete features through conversation with Claude, the transformation is real. The designer's skill was not eliminated. It was relocated from one polimorphic domain — translating visual intention into specifications that developers could implement — to another polimorphic domain — directing an AI through iterative conversation toward an implementation that matches the designer's aesthetic and functional standards. The new skill requires judgment: Is this implementation faithful to the design intention? Does the interaction feel right? Is the code producing the experience the user should have? These are polimorphic questions. They depend on context, on taste, on social knowledge about what users expect and what constitutes quality in the relevant community.
But Collins would ask a question that Segal's account does not raise: How is the designer learning this new skill? From whom? Through what social processes? The open surgeon learned from a mentor who possessed the same embodied expertise. The laparoscopic surgeon learned from a mentor who had mastered the new technique and could guide the apprentice through its specific challenges. In both cases, the transmission of expertise was social — it required a human relationship in which tacit knowledge could pass from practitioner to learner through the medium of shared practice.
The designer learning to direct Claude has no comparable mentor, because the skill is too new for a mentorship tradition to have formed. The community of practitioners who direct AI toward the production of working artifacts is months old, not decades. Its collective tacit knowledge is embryonic. The norms for what constitutes good AI-directed design — what level of specificity to provide, how to evaluate whether the output meets the standard, when to accept the machine's suggestion and when to insist on revision — are being invented in real time, by thousands of practitioners working largely in isolation, without the institutional structures that have historically supported the formation of expert communities.
This is the gap that Collins's framework identifies as consequential. The skill is transforming, not disappearing. But the social infrastructure through which the transformed skill would normally be transmitted to the next generation of practitioners has not yet been built. The laparoscopic revolution took a decade to develop robust training programs, proctorship networks, and credentialing standards. During that decade, the quality of laparoscopic surgery was uneven — excellent in centers where experienced surgeons mentored novices carefully, mediocre or worse in centers where surgeons adopted the technique without adequate guidance. Complication rates were higher in the early years, not because the technology was inferior but because the social infrastructure of expertise transmission lagged behind the technology's adoption.
Collins would predict the same pattern for AI-directed work. The technology is adopted at the speed of individual enthusiasm. The social infrastructure of expertise transmission lags behind. The result is a period of uneven quality — brilliant work from practitioners who combine deep domain expertise with emerging skill in AI direction, poor work from practitioners who lack the domain expertise to evaluate what the machine produces. The unevenness is not a technology problem. It is a sociology problem. The collective tacit knowledge that would distinguish good AI-directed practice from mediocre AI-directed practice is still forming, and it can only form through the social processes that produce all collective tacit knowledge: shared practice, mutual correction, the gradual emergence of standards through the friction of doing the work together.
The hand has been replaced. What has not been replaced — what cannot be replaced by any technology — is the social process through which the replacement expertise is built, evaluated, and transmitted. The laparoscopic surgeon's screen-mediated skill was as demanding as the open surgeon's tactile skill. But it required the same human infrastructure to develop: mentors, communities, institutions, time. The AI-directed builder's conversational skill may prove equally demanding. But it will require the same human infrastructure to mature — an infrastructure that is, at this moment, barely beginning to exist.
Collins would note, with the empirical precision that characterizes his work, that this is not a theoretical concern. It is an observable phenomenon. The community of AI-directed builders is already exhibiting the signature features of a community in its formative period: rapid experimentation, inconsistent standards, vigorous disagreement about what constitutes quality, and the emergence of provisional norms that may or may not survive contact with the full range of circumstances that mature expertise must accommodate. This is normal. It is what the early years of any new community of practice look like. The question is not whether the community will mature — it almost certainly will — but what it will lose in the transition period, and whether the losses will be visible in time to address them.
The surgeon who trained on open technique and then learned laparoscopic surgery carried both forms of expertise. She knew what the hand could tell her, and she knew what the screen could tell her, and she could compare the two in cases where both were available. The surgeon who trained exclusively on laparoscopic technique did not have this comparative capacity. She knew only the screen. Whether she was missing something that the hand would have told her was a question she could not answer, because she had never had the hand's knowledge.
The same asymmetry is forming in the AI transition. Practitioners who built software through years of implementation before the arrival of AI tools carry a comparative capacity that purely AI-directed practitioners will not develop. They know what the struggle taught them, and they can evaluate the AI's output against that knowledge. The next generation will not have this baseline. Whether they need it — whether the ascending friction produces expertise of equivalent depth through different means — is Collins's genuinely open question. And it is a question that only the maturation of the new community of practice, over years and decades of collective effort, can answer.
Every community of practice that has ever produced genuine expertise has produced it through the same mechanism. A newcomer enters the community. The newcomer is incompetent — not because the newcomer lacks intelligence but because competence in the community's practices requires knowledge that cannot be acquired outside the community. The newcomer works alongside experienced practitioners. Through years of participation — watching, imitating, failing, being corrected, gradually absorbing the community's standards and norms — the newcomer becomes a practitioner. The knowledge transfers not through instruction but through immersion. The apprenticeship is not a shortcut to expertise. It is the only known path.
Collins documented this process with a specificity that distinguishes his work from the broader literature on situated learning. His long immersion in the gravitational wave physics community gave him direct observation of how junior researchers became senior researchers, and the process was not what the formal training structure suggested. The formal structure — coursework, qualifying exams, dissertation research — provided the explicit knowledge. The apprenticeship provided everything else: the sense of what problems were interesting, what methods were trusted, what results were surprising, what standards the community actually held itself to as opposed to what it claimed in its published methodological statements. A student who completed the formal requirements without absorbing the tacit knowledge through apprenticeship would be credentialed but not expert. Collins encountered such individuals. They could pass exams. They could not do physics.
The apprenticeship served a dual function that is easy to overlook. It transmitted knowledge from the experienced to the inexperienced. And it reproduced the community itself — ensuring that the collective tacit knowledge, which exists only in the practices of the community's members, would persist into the next generation. Each successful apprenticeship was an act of cultural reproduction. Each failure — each student who completed the credential without absorbing the tacit knowledge — was a tiny crack in the community's continuity.
This framework, applied to the AI transition, produces what may be the most uncomfortable challenge to the optimism that The Orange Pill represents. The challenge is not that AI eliminates expertise. Collins is clear that the transition transforms expertise rather than destroying it. The challenge is that AI may disrupt the apprenticeship through which the next generation of expertise is formed — and that this disruption, unlike the more visible disruptions that attract attention (job displacement, skill obsolescence, the erosion of competitive moats), operates below the threshold of visibility until its consequences have already materialized.
The argument requires careful construction, because the surface evidence seems to point in the opposite direction. AI tools, by removing the mechanical friction of implementation, appear to accelerate the development of expertise. A junior developer using Claude Code can produce working software in a fraction of the time it would have taken without the tool. The developer ships code faster, encounters a wider range of problems, receives immediate feedback on the quality of solutions. By conventional measures — volume of output, speed of iteration, breadth of exposure — the AI-assisted developer is developing faster than the unassisted developer.
Collins's framework identifies the flaw in this reasoning. The conventional measures are measuring the wrong thing. Volume, speed, and breadth are measures of mimeomorphic competence — the ability to produce outputs that match the surface pattern of expert work. They do not measure polimorphic competence — the ability to exercise judgment in novel situations, to recognize when the standard approach is insufficient, to feel that something is wrong before being able to articulate what.
Polimorphic competence, in Collins's account, is built through a specific kind of struggle: the struggle of working at the boundary of one's understanding, in the presence of practitioners who possess the knowledge one lacks, over a period long enough for the tacit dimension of the practice to be absorbed. The struggle is not incidental to the learning. It is the medium through which tacit knowledge is transmitted. The moment of being stuck — of not knowing what to do, of having to think through a problem from first principles, of failing and having to diagnose the failure — is the moment when the learner's understanding deepens. The experienced practitioner, watching the struggle, provides guidance calibrated to that moment — not a general principle but a specific intervention, aimed at the difficulty the learner is actually encountering and informed by the mentor's own history of meeting similar difficulties and building the understanding that resolved them.
When AI resolves the struggle before it occurs, the learning opportunity is bypassed. The junior developer who would have spent four hours debugging a dependency conflict — during which she would have learned, through the specific pain of the experience, how the dependency system works, where it breaks, what the error messages actually mean — instead describes the problem to Claude, receives a working solution in minutes, and moves on. The code works. The output is correct. The learning has not occurred — or more precisely, a different and shallower kind of learning has occurred, the kind that comes from seeing a correct answer rather than building one through struggle.
Segal acknowledges this dynamic in The Orange Pill when he describes the engineer who lost both the tedium and the ten minutes of architectural learning buried in four hours of plumbing. The observation is precise and important. Collins's framework amplifies it into a structural concern: the ten minutes were not merely useful. They were the mechanism through which the community's collective tacit knowledge was transmitted to the next generation of practitioners. Eliminate the mechanism, and the knowledge is not lost immediately — the current generation still possesses it. But it is not transmitted. And knowledge that is not transmitted does not persist.
The analogy to language acquisition is instructive. A child acquires language not by reading about language but by being immersed in a language-using community. The immersion is not efficient. The child hears thousands of utterances that contain no new information. The child makes errors that are corrected — sometimes explicitly, more often implicitly, through the caregiver's response that models the correct form without drawing attention to the error. The process is slow, messy, full of redundancy and apparent waste. It is also the only process that produces native fluency. No shortcut has ever been found. No amount of explicit instruction can substitute for the years of immersion that build the tacit knowledge of a language — the knowledge of what sounds right, what feels wrong, what is grammatically permissible but socially inappropriate, what is technically an error but universally accepted.
Collins argued, in his 2025 paper on AI and the sociology of knowledge, that "learning from the internet is not the same as socialisation." The argument applies with equal force to the development of professional expertise. Learning to code from AI is not the same as learning to code through apprenticeship in a community of practice. The AI provides answers. The community provides context, correction, standards, and the slow accumulation of collective tacit knowledge that constitutes genuine professional competence.
The counterargument is immediate and must be taken seriously. Perhaps the apprenticeship that the current generation underwent was unnecessarily painful. Perhaps the four hours of debugging produced ten minutes of valuable learning and three hours and fifty minutes of wasted time. Perhaps AI-directed practice, by eliminating the waste, allows practitioners to accumulate the ten minutes of valuable learning at a much higher rate — to encounter more architectural decisions in less time, to develop judgment faster by iterating through more design problems.
Collins would regard this counterargument with the specific skepticism of a researcher who has spent decades studying how expertise actually forms. His empirical observation, across multiple communities of practice, is that the waste is not separable from the learning in the way the counterargument assumes. The three hours and fifty minutes of debugging were not empty time surrounding the ten minutes of insight. They were the context in which the insight occurred — the slow buildup of frustration, confusion, and partial understanding that prepared the mind to receive the insight when it arrived. The insight was not a nugget buried in gravel that could be extracted without the gravel. It was a crystallization that occurred because the solution had been slowly supersaturating in the practitioner's mind, and the specific moment of resolution — the click of understanding when the error is finally diagnosed — was the moment of crystallization. Remove the supersaturation and the crystal does not form, no matter how many nuggets the AI delivers.
This is not a romantic argument. It is an empirical claim about the psychology of learning, supported by Collins's decades of observation of how expertise develops in practice. The claim may be wrong. It is possible that AI-directed practice produces equivalent depth through different means — that the rapid iteration of describing problems, evaluating solutions, and directing revisions builds the same tacit understanding that years of manual struggle produced, simply faster. Collins acknowledges this possibility. He insists, characteristically, that it is an empirical question — one that can only be answered by observing the long-term outcomes of AI-directed apprenticeship in multiple communities of practice, over a period sufficient to determine whether the practitioners trained in the new way develop the same depth of judgment as those trained in the old.
The experiment is underway. It cannot be run a second time, because the conditions are not repeatable — the communities that are currently reorganizing around AI tools are the only communities that will undergo this specific transition, and their outcomes will be shaped by the specific historical circumstances of the transition. If the apprenticeship problem is real — if the elimination of mechanical struggle systematically thins the collective tacit knowledge of professional communities — the consequence will not be visible for years, possibly decades. It will manifest not as a sudden failure but as a gradual decline in the quality of professional judgment across the affected domains. The decline will be masked by the increasing volume and speed of output, because the mimeomorphic surface will remain excellent even as the polimorphic substance thins.
Collins frames this as a question, not a prediction. His intellectual honesty requires him to acknowledge that the transformation of skill may produce outcomes he cannot foresee — that the new forms of polimorphic expertise emerging around AI-directed practice may prove as deep, in their own way, as the forms they replace. But his empirical training also requires him to note that every previous case he has studied — every community of practice whose apprenticeship structures were disrupted by technological change — underwent a period of knowledge thinning before the new structures matured. The question is not whether the thinning will occur. It is how deep it will go, and whether the communities affected will recognize it in time to build the social structures — the mentorships, the standards, the institutional supports — that can mitigate it.
The knowledge does not announce its departure. It simply fails to arrive in the next cohort of practitioners. And by the time the absence is noticed, the community that would have transmitted it may no longer exist in a form capable of reconstituting what was lost.
There is a test that Collins designed, rooted in Alan Turing's foundational thought experiment but repurposed for the study of expertise rather than general intelligence. Collins calls it the Imitation Game — not the original Turing Test, which asks whether a machine can be distinguished from a human in open-ended conversation, but a domain-specific variant that asks whether a machine (or a non-expert human) can be distinguished from a genuine expert in conversation about a specific field. Collins used this test to validate his own interactional expertise in gravitational wave physics, and he proposed it as a general method for evaluating AI competence: not "Can the machine fool anyone?" but "Can the machine fool people who know what they are talking about?"
The test, when applied to large language models, reveals a systematic pattern. The machine fools the general audience with near-total reliability. Its outputs — code, prose, analysis, argumentation — are produced with a fluency and apparent authority that most readers cannot distinguish from human expertise. The machine also fools practitioners in domains adjacent to their own, where the practitioner possesses enough knowledge to recognize competence but not enough to detect the specific errors that only deep domain expertise would reveal. But the machine fails, consistently and characteristically, when its outputs are evaluated by practitioners with genuine contributory expertise in the specific domain, who possess the collective tacit knowledge required to distinguish surface fluency from substantive understanding.
The pattern of failure is more informative than the pattern of success, because it reveals the precise boundary between what the machine copies and what it does not.
What the machine copies is the distributional structure of expert discourse. A large language model trained on a comprehensive corpus of, say, architectural writing will produce text that follows the distributional patterns of architectural writing — it will use the right vocabulary, deploy the right kinds of examples, structure arguments in the way that architects structure arguments. The copy is not superficial. It captures stylistic features, argumentative structures, domain-specific conventions, and even the implicit hierarchies of value that characterize expert discourse (the way architects, for instance, treat certain materials as inherently more interesting than others, or the way they frame certain design problems as more worthy of attention). The machine has absorbed these patterns from the training data, and it reproduces them with a statistical precision that is, in many cases, more consistent than any individual human practitioner.
But the distributional structure of expert discourse is the surface of expertise, not its substance. Expert discourse is produced by practitioners who possess understanding — who know why they use certain vocabulary, why certain examples are apt, why certain arguments are structured the way they are. The surface features of the discourse are effects of this understanding. The machine reproduces the effects without possessing the cause.
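To make the notion of distributional reproduction concrete, here is a deliberately toy sketch — not Collins's example, and nothing like a production language model in scale or mechanism — of a generator that mimics a corpus's surface patterns while possessing nothing that could be called understanding. The miniature "corpus" and function names are invented for illustration.

```python
import random
from collections import defaultdict

# A toy corpus standing in for "expert discourse." In a real model the corpus
# would be vast and the statistics far richer, but the principle is the same:
# the generator learns which words tend to follow which, and nothing more.
corpus = (
    "the cantilever reads as weightless because the load path is concealed "
    "the load path is concealed behind the curtain wall "
    "the curtain wall reads as a veil rather than a structure"
).split()

# Build bigram statistics: for each word, the words observed to follow it.
following = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word].append(next_word)

def generate(seed: str, length: int = 12) -> str:
    """Emit text that mimics the corpus's surface patterns.

    Each step samples a continuation in proportion to how often it appeared
    after the current word. The output can sound like the source domain even
    though the generator has no notion of loads, walls, or why a claim is apt.
    """
    words = [seed]
    for _ in range(length):
        candidates = following.get(words[-1])
        if not candidates:
            break
        words.append(random.choice(candidates))
    return " ".join(words)

print(generate("the"))
```

The sketch is crude by design — a real model conditions on contexts thousands of tokens long and learns far subtler regularities — but the asymmetry it illustrates survives the scaling: the statistics it reproduces are statistics of the discourse, not of the world the discourse is about.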
Collins argued this point in *Artifictional Intelligence* with an analogy that has gained force in the era of large language models. A parrot trained to repeat the phrase "the pressure in the containment vessel is rising" has not detected a rise in pressure. It is reproducing a sound pattern. The fact that the sound pattern corresponds to a meaningful proposition about the physical world is irrelevant to the parrot. The parrot does not know what pressure is, what a containment vessel is, or what "rising" means in this context. It is performing a mimeomorphic action — reproducing a surface behavior — that happens to coincide with a meaningful act of communication.
A large language model is not a parrot. Its reproduction of expert discourse is vastly more sophisticated — it can generate novel sentences, respond to context, adapt its output to the specific requirements of a conversation. But the principle, Collins argues, is the same. The model reproduces the distributional patterns of expert discourse without possessing the understanding that produced those patterns. The sophistication of the reproduction does not change its fundamental character. A very sophisticated parrot is still a parrot.
This claim invites an immediate and powerful objection: How do we know? If the machine's output is indistinguishable from expert output across a wide range of tasks, what grounds do we have for asserting that it lacks understanding? The objection is the AI researcher's version of the philosophical zombie argument — if it walks like expertise and talks like expertise, on what basis do we deny that it is expertise?
Collins's answer is sociological rather than philosophical. The grounds for distinguishing mimeomorphic reproduction from polimorphic understanding are not internal — not about what the machine "really" thinks or experiences — but external. They are about where the output breaks. And the output breaks at specific, predictable, sociologically identifiable points: the points where correctness depends on collective tacit knowledge that is not present in the training data.
The Orange Pill provides the exemplary case. Segal's Deleuze fabrication was a passage that reproduced the distributional structure of philosophical discourse with impressive fidelity. The vocabulary was correct. The sentence structure was appropriate. The rhetorical moves — asserting a connection between two thinkers, citing a concept, drawing an implication — followed the conventions of philosophical writing. A reader without expertise in Deleuze's philosophy would have found the passage convincing. A reader with expertise would have recognized, immediately, that the concept was being used in a way that violated the philosophical community's collective understanding of what the concept means.
The failure is not a mistake in the ordinary sense. It is not a factual error that could be corrected by adding more data to the training set. It is a failure of a specific kind: the machine used a concept in a way that was grammatically and rhetorically appropriate but philosophically wrong, and the wrongness was detectable only by someone who had participated in the social practices of the philosophical community long enough to know how the concept is actually used — not just what the published texts say about it but what the community would accept and reject as legitimate application.
Collins would identify this as the signature of mimeomorphic reproduction encountering a polimorphic boundary. The published texts about Deleuze contain enough information for the machine to reproduce the distributional pattern of expert discussion of Deleuze. But the published texts do not contain the full specification of how the community uses Deleuze's concepts, because that specification includes tacit judgments — about what counts as an illuminating application versus a misleading one, about which connections are philosophically productive and which are superficially clever but substantively empty — that are maintained through the community's ongoing social practices rather than through its publications.
The consequences extend far beyond philosophy. In software engineering, code that follows the distributional patterns of expert code — that uses the right libraries, follows the right conventions, structures logic in the way experienced developers structure it — may nevertheless fail in ways that no surface inspection can predict. The failure occurs when the code encounters a situation that requires understanding the problem domain rather than merely reproducing the patterns of solutions to similar problems. A function that handles user authentication may work perfectly in the cases it was designed for and fail silently in edge cases that an experienced developer would have anticipated — because the experienced developer's anticipation is based on collective tacit knowledge about how users actually behave, how systems actually fail, and what the consequences of failure actually are. The code's surface is expert. Its understanding is absent.
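As a hypothetical illustration — the function, field names, and failure mode below are invented for this sketch, not drawn from any real system or from Segal's codebase — consider an access-control check that has the surface of expert code and passes the cases it was designed for, while failing open on an input that an experienced reviewer would probe first.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class User:
    username: str
    role: str                        # e.g. "admin", "editor", "viewer"
    disabled: Optional[bool] = None  # records from an older store lack this field

def can_edit(user: Optional[User]) -> bool:
    """A surface-plausible access check.

    It reads like expert code: guard clause, role check, tidy structure.
    But when the `disabled` flag is missing (None), the check fails open,
    treating "unknown" as "enabled" -- the opposite of the fail-closed
    default an experienced reviewer would insist on for access control.
    """
    if user is None:
        return False
    if user.disabled:  # None is falsy, so un-migrated legacy records slip through
        return False
    return user.role in ("admin", "editor")

# The cases it was designed for behave correctly...
assert can_edit(User("ana", "admin", disabled=False)) is True
assert can_edit(User("bo", "viewer", disabled=False)) is False
assert can_edit(None) is False
# ...and the edge case fails silently: no exception, just quiet over-permission.
assert can_edit(User("legacy", "admin", disabled=None)) is True
```

The particular bug matters less than its shape: nothing on the function's surface signals the problem, and the question that exposes it — what should happen when the flag is missing? — reflects the fail-closed instinct that lives in a community's collective tacit knowledge rather than in the conventions the code visibly follows.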
Collins's 2026 paper with Thorne provides additional empirical evidence. They confronted large language models with a task that required the specific form of social reasoning that gravitational wave physicists use when evaluating fringe science claims. The models produced arguments that sounded like expert evaluation — they cited relevant methodological principles, noted potential flaws in the fringe paper's approach, and reached conclusions consistent with the community's consensus. But the arguments, when examined by physicists, lacked the specific social grounding that gave the community's actual evaluation its authority. The real physicists' reasoning was embedded in their knowledge of the author's reputation, the history of similar claims in the field, the informal consensus about what kinds of evidence would be required to take the claim seriously — knowledge that is socially maintained and that the published literature captures only partially and retrospectively.
The models produced evaluations that looked like expert judgment. The evaluations were not expert judgment. The gap between looking like and being is precisely the gap between mimeomorphic and polimorphic competence — between copying the surface and understanding the substance. And the gap is invisible to anyone who does not possess the domain expertise required to see through the surface, which means that as the volume of machine-generated output increases, the proportion of output that is evaluated by someone capable of detecting the gap decreases. The machine's surface excellence becomes, over time, its own evidence of substance — not because the substance is present but because fewer and fewer evaluators can tell the difference.
Collins describes this as a sociological trap, not a technological failure. The machine is doing exactly what it was designed to do: reproduce the distributional patterns of expert performance at scale and speed. The trap is that human institutions — publishing, peer review, professional credentialing, organizational decision-making — are designed around the assumption that the surface features of expert output are reliable indicators of expert understanding. When the assumption held, the institutions worked. When the assumption fails — when surface-identical outputs can be produced without the understanding that historically produced them — the institutions are deceived. Not by malice. By the structural mismatch between what the machine produces and what the institutions are equipped to evaluate.
The machine copies the form of expertise. It does not copy the social substance that gives the form its reliability. The copy is excellent — often better than any individual expert's output, because it draws on a more comprehensive corpus and is unconstrained by the individual expert's limitations of memory, attention, and time. But excellence of form without substance of understanding is a new phenomenon in human institutional life, and the institutions have not yet developed the mechanisms to detect it. The detection requires precisely the collective tacit knowledge that the machine does not possess and that the institutions were built to take for granted.
Communities of practice do not form by declaration. They form through the accumulation of shared experience, through the gradual development of norms and standards that no one designed but that everyone, over time, comes to recognize. Collins documented this process across multiple scientific communities — gravitational wave physicists, laser builders, practitioners of alternative medicine — and found the same pattern each time. The community begins as a loose collection of individuals working on similar problems in relative isolation. Through conferences, publications, informal conversations, and the slow building of personal relationships, the individuals begin to converge on shared standards. The convergence is not directed by any central authority. It emerges through the friction of doing the work in proximity to others doing similar work — through the discovery that certain approaches produce better results, that certain standards are more robust, that certain norms of communication and evaluation serve the community's purposes more effectively than others.
The process takes time. Collins found, in his studies of scientific communities, that the formation of mature collective tacit knowledge — the point at which the community's standards are robust enough to reliably distinguish good work from bad — typically requires at least a decade. The early years are characterized by disagreement, inconsistency, and the coexistence of competing norms. The later years are characterized by greater consensus — not unanimity but a shared sense of what the community values, what it considers evidence, what it recognizes as competent practice. The consensus is never complete and never static; it evolves as the community's practices evolve. But it provides a stable enough foundation for the community to function as a community — to train newcomers, evaluate contributions, and maintain the standards on which its expertise depends.
The community of AI-directed builders that The Orange Pill describes is in its earliest formative phase. Its members are scattered across organizations, industries, and geographies. They share a common practice — using AI tools to produce software, text, analysis, and other artifacts that would previously have required specialized technical skill — but they do not yet share a common set of standards. The norms for what constitutes good AI-directed work are not established. The criteria for evaluating output are not agreed upon. The collective tacit knowledge that would allow the community to reliably distinguish between excellent practice and mediocre practice has not yet formed.
Collins's framework predicts the trajectory of this community's development with a specificity that derives from his empirical study of many communities before it. The prediction is not about outcomes — Collins is characteristically agnostic about whether the community will produce expertise of the same depth as the one it is replacing — but about process. The process will follow the same path that every community of practice has followed: from isolation to convergence, from competing norms to provisional consensus, from thin collective tacit knowledge to thick. And the process will take time that is measured in years, not months, because the formation of collective tacit knowledge cannot be accelerated by the same tools that are transforming the practices around which the knowledge forms.
The current state of the community is observable in the public discourse that Segal documents. The voices on social media — the triumphalists posting metrics, the elegists mourning what has been lost, the silent middle holding contradictory truths — are the voice of a community in its pre-convergent phase. The disagreements are not primarily about facts. They are about norms. What counts as good work when AI does the implementation? Is a product that was built in a weekend with Claude Code of the same quality as a product that took a team six months? What does "quality" even mean when the output looks professional regardless of the depth of thought behind it?
These are not questions that data can answer, because they are questions about values — about what the community will hold itself to, what standards it will maintain, what it will consider worth doing and doing well. Values are collective tacit knowledge. They are formed through social negotiation, through the slow process of discovering, through practice, what matters and what does not. The negotiation is happening now, in every Slack channel and conference room and late-night Twitter thread where practitioners argue about the implications of the tools they are using.
Collins would study this community the way he studied gravitational wave physicists — not by asking what the tools can do but by asking what social practices are forming around the tools. His questions would be specific: When a practitioner rejects AI output and insists on revision, what criteria are being applied? Are the criteria shared across the community or idiosyncratic? How are newcomers learning the criteria — through explicit instruction, through the example of experienced practitioners, through the trial and error of their own practice? Are the criteria robust enough to distinguish between output that is genuinely good and output that merely looks good?
The answers to these questions are not yet available, because the community is too young for its norms to have stabilized. But Collins's framework identifies the features that would signal the community's maturation. The first is the emergence of reliable evaluation — the community's ability to look at a piece of AI-directed work and judge, with reasonable consistency, whether it meets the community's standards. This requires that the standards exist, that they are shared, and that practitioners have internalized them deeply enough to apply them without conscious deliberation. In Collins's terms, the evaluation must be polimorphic — dependent on contextual judgment rather than rule-following — and the ability to perform it must be collectively sustained.
The second signal is the development of pedagogical practices — the community's ability to transmit its standards to newcomers. This is where the apprenticeship problem becomes most acute. The community must develop methods for teaching not just the mechanics of AI direction (how to prompt, how to evaluate output, how to iterate) but the judgment that distinguishes excellent AI-directed practice from competent AI-directed practice. The mechanics can be taught through instruction. The judgment can only be transmitted through apprenticeship — through the slow process of working alongside experienced practitioners, absorbing their standards through observation and correction, building the collective tacit knowledge that constitutes genuine expertise.
The third signal is the emergence of the community's capacity for self-criticism — its ability to recognize its own failures and learn from them. Collins observed that mature scientific communities are characterized not by their success rate but by their capacity for rigorous self-evaluation. They know how they can go wrong. They have developed mechanisms — peer review, replication, the social pressure of competing research groups — for detecting and correcting errors. The community of AI-directed builders has not yet developed comparable mechanisms. The current practice of evaluating AI output is largely individual — each practitioner applies their own standards, informed by their own experience, without the institutional structures that would aggregate individual evaluations into community-level quality control.
The Orange Pill describes the formation of this community from the inside — from the perspective of a practitioner who is simultaneously building with the tools and trying to understand what the building means. Segal's account is valuable precisely because it is situated inside the community's formative period, capturing the texture of the experience — the exhilaration, the terror, the productive vertigo — that an outsider's analysis would miss. Collins's framework provides the analytical distance that the insider's account cannot supply. Together, they describe a community at its most precarious and most promising moment: the moment when the practices are new, the norms are forming, the collective tacit knowledge is thin but thickening, and the question of what the community will become remains genuinely open.
Collins's intellectual humility demands acknowledgment of this openness. His framework does not predict whether the new community will produce expertise of the same depth as the old. It predicts that the community will follow the same developmental path as every previous community of practice — through struggle, negotiation, and the gradual formation of collective tacit knowledge. Whether the destination is equivalent depends on variables that the framework identifies but cannot determine: the quality of the mentorship that emerges, the robustness of the evaluative standards the community develops, the willingness of experienced practitioners to invest the time and accept the friction required to transmit their knowledge to newcomers who may be tempted to bypass the apprenticeship in favor of the tool.
What the framework does determine is that the community's fate will be decided not by the capabilities of the technology but by the quality of its social processes. The tools are powerful. The question is whether the humans who use them will build the social infrastructure required to use them well — the mentorships, the standards, the institutions, the norms that constitute the collective tacit knowledge of a mature community of practice. That infrastructure cannot be built by AI. It can only be built by people, working together, over time, through the irreducibly social process of learning to do something well and then learning to teach others to do it.
There is a property of collective tacit knowledge that Collins identified early in his career and that has gained urgency with each decade since: it is irreversible in its loss. When a community of practice dissolves — when the practitioners retire, disperse, or are replaced by a new generation that learned different practices — the collective tacit knowledge that the community sustained dissolves with it. The knowledge does not persist in archives, in textbooks, in databases, or in any other form of explicit storage. It persists only in the ongoing social practices of the community that maintains it. When the practices stop, the knowledge stops.
Collins demonstrated this most vividly in his studies of technology transfer. When the United States attempted to replicate the British technique for producing a specific type of laser in the 1970s, the American scientists had access to every published paper, every documented specification, every explicit description of the method. They could not make the laser work. The knowledge required to make it work — the specific calibrations, the undocumented adjustments, the practitioner's feel for when the system was performing correctly — existed only in the British community that had developed the technique. The Americans ultimately succeeded only when British physicists traveled to American laboratories and worked alongside American researchers for extended periods, transmitting the tacit knowledge through direct social interaction. The published literature was necessary but not sufficient. The social transmission was indispensable.
The corollary is sobering. If the British community had dissolved before the knowledge was transmitted — if the practitioners had retired, if the laboratory had closed, if the funding had been redirected — the knowledge would have been lost. Not temporarily. Permanently. A future researcher who wanted to replicate the technique would have faced the same barrier the Americans originally faced, but without the British community available to provide the missing knowledge. The researcher would have had to reconstruct the tacit knowledge from scratch, through years of independent experimentation — essentially rediscovering what had already been discovered, through the same slow, friction-rich process that produced the knowledge in the first place.
The AI transition described in The Orange Pill is, in Collins's terms, an experiment in the transformation of communities of practice — and it is an experiment that cannot be run twice. The communities of software developers, designers, engineers, lawyers, doctors, teachers, and writers who are currently reorganizing their practices around AI tools are undergoing a one-time transition. The conditions of the transition — the specific state of AI technology, the specific configuration of existing expertise, the specific social and economic pressures driving adoption — are historically unique and unrepeatable.
If the transition goes well — if the communities develop new forms of collective tacit knowledge that are as deep and robust as the ones they replace — the outcome will be an expansion of human capability that vindicates the most optimistic reading of the moment. If it goes badly — if the elimination of the old apprenticeship structures thins the collective tacit knowledge faster than the new structures can replace it — the outcome will be a loss that cannot be recovered by reverting to the old tools. The knowledge that was built through decades of implementation struggle will have dissipated, and the social conditions that produced it will no longer exist.
Collins was careful to emphasize that this is not an argument against the transition. It is an argument for awareness of its irreversibility. A transition that is recognized as irreversible is managed differently from one that is assumed to be reversible. If the builders believe they can always go back — that the old expertise will be waiting if the new tools prove inadequate — they will take risks they should not take and neglect precautions they should maintain. If they understand that the transition is one-way, that the knowledge they are transforming cannot be reconstructed from explicit records if the transformation fails, they will invest more seriously in the social infrastructure required to make the transformation succeed.
The practical implications are specific. Organizations that are adopting AI tools should invest simultaneously in the social structures through which the new expertise will be developed and transmitted. Not training programs in the conventional sense — not tutorials on prompt engineering or workshops on AI workflow optimization — but genuine apprenticeship structures: relationships between experienced practitioners and newcomers in which the tacit knowledge of AI-directed work is transmitted through the slow, friction-rich process of working alongside someone who possesses it. The investment is not glamorous. It does not produce metrics that look impressive on quarterly reports. But Collins's research suggests it is the difference between a transition that produces genuine new expertise and one that produces the surface appearance of competence without the substance to support it.
Similarly, the institutions that credentialed expertise in the old regime — universities, professional certifications, industry standards bodies — must adapt not merely their curricula but their understanding of what expertise means in the new landscape. The old credentials measured the ability to perform specific technical tasks. The new credentials must measure something harder to quantify: the judgment to direct AI tools wisely, the evaluative capacity to distinguish good output from bad, the social knowledge to understand what users need and what the community considers acceptable quality. These are polimorphic competencies, and credentialing them requires methods that go beyond standardized testing — methods that assess the practitioner's ability to exercise judgment in novel, context-dependent situations. Collins's Imitation Game offers one model: evaluate the practitioner not by their technical output but by their ability to participate convincingly in the conversation of the expert community. If they can discuss the work with the fluency and judgment of a genuine expert — if they can be reliably distinguished from a non-expert but not reliably distinguished from a genuine expert — they possess at least interactional expertise in the domain, which is a necessary if not sufficient condition for the evaluative competence that AI-directed work requires.
The experiment is underway. Its conditions are unrepeatable. Its outcome will be determined not by the power of the tools but by the quality of the social processes through which the tools are integrated into human practice. Collins's framework does not predict the outcome. It identifies the variables on which the outcome depends — and insists, with the authority of five decades of empirical research, that the most consequential variables are social, not technological.
The assembly programmers of the 1960s — the practitioners who wrote machine code at the lowest level of abstraction, who understood every register and memory address and instruction cycle — possessed a form of collective tacit knowledge that has not survived into the present era. Their community dissolved as the abstraction layers rose. The knowledge they possessed is not available to contemporary programmers, who work at levels of abstraction so far removed from the hardware that the lower levels are essentially invisible. If a contemporary programmer needed to write assembly code for a novel architecture without documentation — a scenario analogous to the Americans trying to replicate the British laser — she would face the same barrier. The collective tacit knowledge that made assembly programming a fluent, socially sustained practice has dispersed. The community that maintained it no longer exists.
Was the loss consequential? The conventional answer is no. The abstraction layers that replaced assembly programming enabled capabilities that assembly programmers could not have imagined. The loss was real but, on balance, the gain was greater. Collins would accept this judgment while noting its retrospective character. The judgment is available only in hindsight, after the transition has been completed and its consequences have become visible. At the time of the transition, the judgment was not available. The assembly programmers who warned that something important was being lost could not know whether their warnings were prophetic or merely nostalgic. Neither could the enthusiasts who dismissed the warnings.
The same uncertainty characterizes the current transition. The engineers who warn that AI-directed development produces shallow practitioners may be prophetic or merely nostalgic. The enthusiasts who celebrate the democratization of capability may be visionary or merely naive. Collins's framework does not resolve the uncertainty. It establishes, with empirical rigor, that the uncertainty is genuine — that the transition is transforming the social structures on which expertise depends, that the transformation is irreversible, and that its consequences will be determined by choices that are being made now, under conditions of incomplete knowledge, by communities that are too young to have developed the collective wisdom required to make the choices well.
The knowledge at stake is not stored in any server. It is not backed up in any archive. It lives in the social practices of communities that are currently deciding, through their daily work, what to preserve and what to let go. The decision is consequential. It is irreversible. And it is being made, as all such decisions are, before anyone can fully understand what it will cost.
A synthesis of Collins's framework, applied to the full scope of The Orange Pill's argument, arrives at a conclusion that is neither triumphant nor elegiac but diagnostic. The AI transition is producing a new form of polimorphic action — and whether this new polimorphism carries the depth of the one it replaces is the genuinely open question that neither the enthusiasts nor the critics can answer from their current positions.
The old polimorphism of software development was constituted by a specific set of socially embedded skills. The experienced developer's judgment about how to structure a system — which components to couple, which to separate, where to accept technical debt and where to insist on clean architecture — was a polimorphic competence built through years of implementation struggle. The judgment was not rule-following. It was contextual, situation-dependent, informed by the accumulated experience of having built systems that worked and systems that failed, by the social knowledge of what the team could maintain and what would collapse under the pressure of growth or change. The judgment was transmitted through apprenticeship — through the specific social relationship of senior and junior developer, in which the senior developer's corrections and explanations conveyed not just technical information but the community's standards for what constituted good work.
The new polimorphism of AI-directed work is constituted by a different set of skills, but Collins's framework insists that they are genuinely polimorphic — genuinely dependent on social knowledge, contextual judgment, and understanding that cannot be reduced to rules.
The first component is communicative skill: the ability to describe intentions in ways that the AI can interpret productively. This is not as simple as "writing good prompts." It requires understanding the machine's interpretive tendencies — what it is likely to do with an ambiguous instruction, where it will default to generic solutions, when it needs more context to produce output that matches the practitioner's intention. This understanding is built through practice, through the accumulation of experience with the machine's patterns of response, through the gradual development of what Collins would call a feel for the conversation. The feel is tacit. Practitioners who possess it often cannot articulate what they know — they simply recognize, in the flow of the interaction, when the conversation is on track and when it has gone wrong. The knowledge resides not in explicit rules about how to prompt but in the practitioner's embodied sense of the conversational dynamic.
The second component is evaluative judgment: the ability to assess whether the AI's output meets the practitioner's standards. This is the component that most directly depends on prior domain expertise. A practitioner who has never built software cannot evaluate AI-generated software — not because the evaluation requires technical knowledge of a specific kind but because it requires the collective tacit knowledge of the software community: the sense of what constitutes clean code, robust architecture, maintainable design. Without this knowledge, the practitioner is reduced to evaluating the surface — does it compile? does it run? does it produce the expected output? — without the capacity to evaluate the substance — is it structured in a way that will survive contact with real users, real scale, real change?
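The surface/substance distinction is easy to state and easy to miss in practice. Here is a hypothetical Python fragment of the kind an AI assistant might plausibly produce (the example is mine, not from the source): it runs and returns the expected output on a small test, so every surface check passes, while the substance problems (quadratic scanning and silently swallowed errors) only show up under real scale and real failure, exactly where evaluative judgment is needed.

```python
def merge_customer_records(batches):
    """Merge customer records from several batches, dropping duplicates by id.

    Surface checks pass: it runs and produces the expected output on a small test.
    Substance problems an experienced reviewer would flag:
      * the inner `any(...)` scan makes merging quadratic, invisible with
        100 records and crippling with 10 million;
      * the bare `except` silently discards malformed records, so the data
        loss never surfaces until someone asks where the missing customers went.
    """
    merged = []
    for batch in batches:
        for record in batch:
            try:
                rid = record["id"]
                if not any(existing["id"] == rid for existing in merged):  # O(n) scan per record
                    merged.append(record)
            except Exception:
                pass  # swallows malformed records without logging
    return merged

# The happy-path test an untrained evaluator might rely on:
print(merge_customer_records([[{"id": 1}, {"id": 2}], [{"id": 2}, {"id": 3}]]))
# [{'id': 1}, {'id': 2}, {'id': 3}]  (looks right, and for this input it is)
```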
Collins's framework identifies this as the critical vulnerability of the new polimorphism. The evaluative judgment that constitutes the core of the new skill depends on domain expertise that was historically built through the same implementation struggle that AI has now assumed. If the judgment can be built through other means — through the practice of directing and evaluating AI output, through apprenticeship with experienced practitioners who possess the old domain expertise and can transmit their evaluative standards — then the new polimorphism is self-sustaining. If the judgment requires the old apprenticeship as its foundation, then the new polimorphism is parasitic on the old — dependent on a generation of practitioners whose expertise was built through implementation struggle and whose knowledge must be transmitted before it dissipates.
The third component is what might be called creative misreading: the capacity to recognize when the machine's literal response to an instruction reveals a possibility that the practitioner had not considered. Segal describes this moment in The Orange Pill when he recounts the connection between technology adoption curves and punctuated equilibrium — a connection that emerged from the collision of his question and Claude's associative processing, producing an insight that neither could have generated independently. Collins would identify this as a genuine instance of productive interaction between interactional and contributory expertise. The machine's interactional competence — its ability to range across domains and find structural similarities — collided with the practitioner's contributory expertise — his understanding of what would constitute a meaningful connection in the context of his argument — to produce something new.
Creative misreading is polimorphic because it depends on the practitioner's ability to distinguish between the machine's responses that are merely novel (different from what was expected but not useful) and those that are genuinely illuminating (different from what was expected and productive in ways that advance the work). This distinction cannot be made by rule. It requires the same contextual, socially informed judgment that characterizes all polimorphic action. The practitioner who recognizes a creative misreading is exercising a form of expertise that is as demanding, in its own way, as the implementation expertise it replaces.
The fourth component is social: the understanding of what the work is for, who it serves, and what "good enough" means in the specific human context in which the work will operate. This is the component that Collins's framework identifies as most firmly rooted in collective tacit knowledge. A product that is technically excellent but serves no genuine need has failed a polimorphic test that no amount of technical sophistication can pass. The judgment about what constitutes a genuine need — what users will value, what problems are worth solving, which solutions will be adopted and which will be ignored — is social knowledge of the most fundamental kind. It is knowledge about human beings, their practices, their values, their frustrations, and their aspirations. It is maintained not in any dataset but in the ongoing social practices of communities that build things for other human beings to use.
Collins's verdict on the new polimorphism is characteristic of his intellectual temperament: precise, empirically grounded, and deliberately unresolved. The new skills are genuine. They are polimorphic. They resist reduction to rules. They require practice to develop and social embedding to sustain. In these respects, they are structurally equivalent to the skills they replace.
But structural equivalence is not the same as functional equivalence. Two forms of expertise can be equally polimorphic — equally dependent on social knowledge, equally resistant to formalization, equally demanding of practice — and still differ in depth. Collins's research on the history of technology transfer shows that when a community's practices change, the new practices sometimes produce expertise that is as deep as the old but differently constituted, and sometimes produce expertise that is shallower — functionally adequate for normal conditions but lacking the reserves of understanding that the old expertise drew on when conditions became abnormal.
The difference between deep and shallow expertise is invisible under normal conditions. Shallow expertise produces output that is indistinguishable from deep expertise when the situation falls within the expected range. The difference becomes visible only when the situation departs from the expected — when the system fails in an unprecedented way, when the user's needs are genuinely novel, when the problem requires judgment that no prior case quite covers. In these moments, deep expertise draws on reserves of understanding that shallow expertise does not possess, and the consequences of the difference can be significant.
Whether the new polimorphism will produce deep or shallow expertise is the question that Collins frames as genuinely undecidable from the current vantage point. The experiment is in its earliest phase. The community is forming. The collective tacit knowledge is accumulating. The trajectory could bend either way — toward a new form of expertise as deep as the old, sustained by social practices as rich as the ones it replaces, or toward a thinner expertise that functions well under normal conditions but lacks the depth to handle the abnormal.
Collins's contribution to the conversation is not a prediction but a specification: a precise description of what the question actually is, what variables it depends on, and what evidence would be required to answer it. The evidence is not yet available. The communities are too young, the practices too new, the collective tacit knowledge too thin for the long-term trajectory to be visible. The answer will emerge over years and decades, through the collective experience of thousands of practitioners building, evaluating, and transmitting the new expertise.
What can be said now, with the precision Collins demands, is that the outcome depends on choices that are being made in the present — choices about how much to invest in mentorship, how seriously to maintain evaluative standards, how much friction to preserve in the learning process, how robustly to build the social infrastructure through which the new community's collective tacit knowledge will be transmitted to the next generation. These are not technological choices. They are social choices. And they are being made, as Collins would insist, by communities of practice whose members are the only people qualified to make them — because they are the only people who possess the emerging collective tacit knowledge on which the choices depend.
The new polimorphism is real. Its depth is undetermined. Its future is social.
The sentence that caught me was not about AI.
It was Collins, writing about what happens when you can no longer tell the difference between someone who understands and someone who merely sounds like they understand. He called the danger "the Surrender" — not a surrender to superior machines but to superficially competent ones. The risk, as he framed it, is that we stop checking. That the hundredth time the output looks right, we lose the muscle for asking whether it actually is.
I recognized myself in that sentence with an immediacy that was uncomfortable.
Through every chapter of this book, Collins's framework did something I was not expecting. It did not reassure me and it did not alarm me. It gave me vocabulary for a discomfort I had been carrying since Trivandrum — the nagging sense that the speed of what we were building was outrunning the social structures needed to evaluate it. I knew what ascending friction was. I had lived it, named it, argued for it. What I did not have, until Collins, was a precise account of what the friction was actually made of.
It is made of social knowledge. Not data. Not documentation. Not the textual residue of expert practice that an LLM can ingest. It is made of the thing that happens between a senior engineer and a junior one at eleven o'clock at night when the system is failing in a way neither of them has seen before, and the senior engineer says something that is not in any manual — a judgment call, rooted in years of having been wrong in similar situations, transmitted in that moment through a social relationship that no amount of compute can replicate.
Collins calls this collective tacit knowledge. I call it the thing I am most afraid of losing.
Not because the tools are dangerous. Because the tools are so good that we might forget the social infrastructure on which their value depends. The code works. The prose reads well. The output looks right. And looking right is enough, most of the time, for most purposes. Until the moment it is not — the moment that requires the reserves of understanding that only a mature community of practice can sustain — and then the absence of what we stopped building becomes visible all at once.
This is not a reason to stop building. It is a reason to build two things at once: the artifacts and the communities. The products and the apprenticeships. The code and the conversations that teach people how to know when the code is not enough.
Collins is eighty-three. He is still publishing. Still testing. Still asking whether the machine understands or merely copies. His answer — that the intelligence is in the humans, not the LLMs — is not a dismissal of the technology. It is a demand that we take our own contribution seriously enough to maintain the social conditions that produce it.
I hear in that demand an echo of the question this entire project has been circling: Are you worth amplifying?
Collins reframes it in terms I cannot evade: Are we building the communities through which the next generation will develop the judgment that makes amplification worth having? Or are we assuming the judgment will form on its own, the way the code forms on its own, and discovering a decade from now that it did not?
The experiment cannot be run twice. The communities that dissolve do not reconstitute on demand. The knowledge that is not transmitted does not persist.
I keep building. But I build now with Collins's question in my ear — not as a warning against the tools but as a specification of what the tools require from us. The machine copies the surface. The substance is social. And the social is ours to maintain or ours to lose.
AI can now discuss any field with a fluency that fools nearly everyone. It writes code that compiles, prose that persuades, analysis that cites the right sources. Harry Collins, the sociologist who embedded himself in scientific communities for decades, identified the precise boundary where fluent imitation ends and genuine understanding begins — and that boundary has never mattered more than it does right now.
This volume applies Collins's framework to the AI revolution unfolding in The Orange Pill. His taxonomy of tacit knowledge — relational, somatic, and collective — reveals what training data captures and what it structurally cannot. His distinction between interactional and contributory expertise explains why the machine can discuss your work brilliantly without being able to do it.
The result is a diagnostic tool for anyone building with AI: a way to know when the output deserves trust and when the surface is concealing an absence that only social knowledge can fill.
— Harry Collins and Simon Thorne, 2026

A reading-companion catalog of the 14 Orange Pill Wiki entries linked from this book — the people, ideas, works, and events that *Harry Collins — On AI* uses as stepping stones for thinking through the AI revolution.