The Plagiarism Machine

**EDO SEGAL:** Noam, you called these systems a kind of high-tech plagiarism, and the phrase detonated — people who've never read a word of your linguistics know that one line. I'd argue most of them heard it as an insult and missed that it's a methodological claim. So defend it as the claim, not the insult. What exactly is being plagiarized, and from whom?

**CHOMSKY:** The word is provocative and I chose it that way, but underneath it is precise. The system produces fluent, often impressive text by recombining, at a level of abstraction, the patterns of the human writing it ingested. It contributes no theory of its own about anything. It does not know why it says what it says. It cannot tell you the principle behind its output, because there is no principle behind its output — there is a function over its training distribution. When a student does this — assembles the surface of understanding from sources without grasping the structure underneath — we call it plagiarism, and we fail them, not because the result reads badly but because the result is *disconnected from understanding*. The machine does this constitutively. It is the thing we built our entire educational ethics to detect: fluent reproduction of form with no comprehension of substance. That's the claim. It is about what the system is, not about whether it's useful.

And it connects to the deeper charge, which is the one I actually care about: these systems are engineering presented as science. Engineering builds things that work. Science explains how things work. They overlap, they feed each other, they are not the same, and their criteria of success differ entirely. A bridge that stands is good engineering whether or not its builders could derive it from first principles. A theory of why bridges stand is good science whether or not anyone ever builds one. The large language model is one of the great engineering achievements of the age — I have never said otherwise, and people who quote me as a Luddite are not listening. It is also a contribution of essentially nothing to the *science* of mind, because it was not built to explain and it does not explain. You cannot ask it why English forms questions by structure rather than by counting. It has the answer the way a memorized times table has a product — by storage, not by insight. To call its performance a scientific breakthrough about intelligence is to apply the wrong yardstick, and the misapplication is corrosive in exact proportion to how impressive the engineering is.

**EDO SEGAL:** Let me restate it to make sure I have the edge of it. You're saying the danger isn't that the machine is dumb — it's that the machine is *so good* that it tempts an entire civilization to stop asking the scientific question, because why characterize the faculty when you can just generate the behavior? The success is the threat to understanding. Is that the version?

**CHOMSKY:** That is exactly the version, and it's why I won't soften it. A field that treats a system that *works* as a system that *explains* has stopped doing science and started doing engineering while calling it science. The fluency creates the illusion that the problem of language has been solved, when it has merely been *reproduced*. Reproducing a phenomenon without explaining it is the pre-scientific stance — it's where we were before Galileo, satisfied to predict the planets with epicycles and incurious about gravity. The machine is the most sophisticated epicycle ever built. It forecasts the heavens perfectly and contains no astronomy.

· · ·

**SUTSKEVER:** I want to accept a great deal of this and then refuse the conclusion, because I think Professor Chomsky has caught a real disease in my field and then misdiagnosed the patient. The disease is real: there *is* a tendency to mistake a working demo for understanding-of-the-phenomenon, to wave the word "emergent" instead of explaining, to confuse a benchmark with a theory. I've spent the last two years arguing inside my own field that we are entering an age of research precisely because the easy scaling gains are gone and we now need *understanding*, not just bigger epicycles. So I am, in a way, on his side of that complaint.

But "plagiarism" and "epicycle" both assume the network only rearranges the surface, and that's the empirical claim I think is now false. An epicycle model of the planets cannot answer a question about a planet it never tabulated. These networks answer questions no human ever wrote down. They take a novel instruction — "explain the immune system in the voice of a noir detective, but make every metaphor structurally accurate" — and satisfy constraints that never co-occurred in any training text. That's not reproduction. That's composition over learned structure, and composition over structure is the signature of having the structure, not of plagiarizing the surface. As for "engineering, not science" — I'd say it more carefully than Professor Chomsky's critics do. He's right that building it isn't the same as explaining it. But the existence of a structure-blind learner that recovers humanlike structure from data *is itself a scientific datum*, and a startling one, about how much structure data contain and how strong the biases of a generic learner are. To say it teaches us *nothing* about mind is the one place his rigor fails him. It teaches us something he finds inconvenient, which is not the same as nothing.

· · ·

**CHOMSKY:** That a structure-blind learner can do this is a surprising datum; granted, and I've granted it before. But notice it's a datum about *machines and data*, not about the *human faculty*, and your field constantly slides from the first to the second. "Networks recover structure from data, therefore the child might too": that's the slide, and it's invalid, because the child is not a structure-blind learner fed astronomical quantities of data. The premises are about a system unlike the child in every relevant respect. You've learned something real about what statistics can extract. You've learned nothing about what the child does, because the child does something else.

**SUTSKEVER:** I'll accept "nothing about what the child does *mechanistically*." I won't accept "nothing about mind," because if the structure is *recoverable from data at all*, that constrains theories of the child — it tells you the faculty doesn't have to do as much heavy lifting as the strongest nativism claimed. That's not nothing. That's a real boundary on your own theory, handed to you by my machine.

**EDO SEGAL:** Let me press the science word itself, Noam, because Ilya makes a move I want you to answer head-on. He'd say his machines are doing the most scientific thing there is. The whole history of physics, he'd say, is the history of [compression](https://www.youonai.ai/fieldguide/med/hierarchy_of_compression) — Newton squeezing the falling apple and the orbiting moon into one short equation, the conviction that beneath the variety of phenomena lies a compact set of laws, and that to understand *is* to find the short description. His networks compress the largest record we have of everything humans observed. So when you call them epicycles, he calls them Kepler reaching for Newton. Who's right about what compression is?

· · ·

**CHOMSKY:** It's a beautiful move and it's half right, which is the dangerous kind. Yes — science is compression, in the sense that a good theory captures many phenomena in few principles, and I'd never deny that finding the short description is close to the heart of understanding. But notice what Newton's compression *is*, and what the network's is not. Newton's equation is not merely a shorter encoding of the planetary data. It is a small set of *intelligible* principles — mass, force, an inverse-square law — that a human mind can hold, manipulate, and reason from to cases Newton never saw, and crucially that *tell you why*. The network's compression is a function of billions of parameters that no one can read, that corresponds to no statable principle, that answers "why" with nothing. It is compression in the information-theoretic sense — fewer bits — without compression in the *explanatory* sense — fewer, deeper, sayable laws. Ilya has conflated two meanings of "short description." Newton found the kind you can understand. The network found the kind you can only run. Both are real. Only one is science.

**SUTSKEVER:** That's the best objection to my analogy I've heard, and I'll concede the distinction is real: the network's compression is opaque where Newton's is legible. But I'd push on whether legibility is essential to *understanding* or just essential to *human* understanding. A chess engine's evaluation function is opaque too, and yet it understands chess positions better than any human grandmaster, in any operational sense of "understand": it sees the threats, weighs them correctly, acts on them. Maybe the universe contains kinds of understanding that are real and not legible to us, and maybe the network has one of them. You'd say: then it isn't understanding, because understanding is the legible kind. I'd say: that's defining understanding as "the thing humans happen to do," which is exactly the special pleading I distrust.

**CHOMSKY:** And there we've found, again, the same fault line — you suspect "understanding means the human kind" is special pleading, and I suspect "any good compression counts as understanding" launders away every distinction worth keeping. We keep arriving at the same cliff from new directions. It's almost reassuring.

· · ·

**EDO SEGAL:** I'm going to do something I rarely do and side, briefly, with the structure of both your arguments at once, because I think you've reached a genuine convergence and I want it on the record before we lose it. Mark it: you both agree the field has a disease, mistaking the demo for the theory, and you both want more understanding and less hand-waving. You disagree about whether the machine, by recovering structure from data, has thereby *taught us something about the human*, or only about machines and data. Noam says it's a category slide. Ilya says it's a constraint on nativism. That's narrower and sharper than "plagiarism" versus "genius," and it's the kind of disagreement that actually moves science. Hold it. Next round, we put the disease under a microscope, with the one word your whole industry invented to avoid saying "the machine doesn't know what's true": hallucination.
