Compression Is Understanding

EDO SEGAL: Ilya, I want to climb to the most abstract floor of your position, because I think it's where you actually live, beneath the engineering. You've long been drawn to the idea that learning and compression are the same activity — that a model's capability is tied to how well it compresses its data. To compress is to find the patterns that let you describe something large with something small. A neural network with a fixed number of weights, trained on text vastly larger than itself, can't store the text. So it's forced to compress — to find the rules that generate the text rather than memorize the text. Make the case that this isn't just an engineering fact. Make the case that it's a theory of what understanding is.

SUTSKEVER: I'll make it, and I'll make it as strong as I can, because I think it's close to the deepest thing I know. The network can't remember the data — there's no room. So it has to find the short description: the rules, the regularities, the generating structure. And here is the claim. To find the shortest description of a body of data is to find what's essential and repeatable in it, and what's essential and repeatable is exactly what lets you handle the new. That's why these systems generalize at all — why a model trained on past text says something sensible about a situation it never saw. If it had memorized, it would be helpless outside its data. Because it was forced to compress, it captured what transfers. Compression, prediction, and understanding are three names for one achievement: the discovery of the structure of reality, squeezed into a finite set of numbers.

And there's a tradition behind this that I lean on. The whole history of physics is a history of compression — the search for ever shorter, more powerful descriptions. Newton compressed the planets and the falling apple into a handful of equations. Science is the conviction that beneath the variety of phenomena lies a compact set of laws, and that to understand is to find them. My machines do the analogous thing, not over physics experiments but over human language — itself a vast, indirect record of everything humans ever observed. The model compresses the record, and in compressing it, recovers something of the structure of the world the record describes. If that isn't understanding, I'd like to know what the word was ever for.
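
The link between prediction and compression that Sutskever leans on can be made concrete with a small sketch. It is an illustration under stated assumptions, not anything taken from the conversation: the toy text generator, the bigram model, and every function name below (`sample`, `fit_bigram`, `model_bits`, `uniform_bits`) are hypothetical. The idea is that a symbol predicted with probability p can be encoded in about -log2(p) bits, so a model that has captured the generating rule describes unseen text in fewer bits than one that has captured nothing.

```python
import math
import random
from collections import Counter, defaultdict

ALPHABET = "abc"

def sample(n, seed):
    """Toy 'world': a -> b -> c -> a, followed about 90% of the time."""
    rng = random.Random(seed)
    rule = {"a": "b", "b": "c", "c": "a"}
    out = ["a"]
    for _ in range(n - 1):
        if rng.random() < 0.9:
            out.append(rule[out[-1]])          # follow the rule
        else:
            out.append(rng.choice(ALPHABET))   # noise: any symbol
    return "".join(out)

def uniform_bits(text):
    """Code length in bits when no structure is assumed."""
    return len(text) * math.log2(len(ALPHABET))

def fit_bigram(train):
    """Learn next-symbol frequencies, with add-one smoothing."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(train, train[1:]):
        counts[prev][nxt] += 1

    def prob(prev, nxt):
        total = sum(counts[prev].values()) + len(ALPHABET)
        return (counts[prev][nxt] + 1) / total

    return prob

def model_bits(text, prob):
    """Ideal code length: first symbol uniform, then -log2 p(next | prev)."""
    bits = math.log2(len(ALPHABET))
    for prev, nxt in zip(text, text[1:]):
        bits -= math.log2(prob(prev, nxt))
    return bits

train = sample(2000, seed=0)
held_out = sample(500, seed=1)   # text the model never saw

prob = fit_bigram(train)
baseline = uniform_bits(held_out)
learned = model_bits(held_out, prob)

print(f"uniform model: {baseline:.0f} bits")
print(f"learned model: {learned:.0f} bits")
```

On the held-out string the learned model should need far fewer bits than the uniform baseline, because what it kept from training was the rule rather than the training text. The sketch shows only the engineering half of the claim, that better prediction is shorter description. It is silent on whether that amounts to understanding.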

SEARLE: It's a magnificent account of intelligence, and it has nothing to do with understanding, and the whole tragedy of this debate is that the field stopped being able to tell those two apart. Let me grant everything. Compression is the engine of generalization — yes. Finding the short description is finding the essential structure — yes. The model captures the generative structure of the text — yes, beautifully. None of it is understanding, because understanding is not a third-person fact about structure. It's a first-person fact about meaning. When I understand Newton's law, there's something it is like for me to grasp it — the law is about gravity for me, I see what it means. Your compressor has the law's structure and grasps nothing, because there's no one in there for the law to mean anything to. You've described, with real elegance, a system that has the form of every understanding and the substance of none. Compression gives you a perfect map. Understanding is what it is like to read the map. And there's nobody reading.

SUTSKEVER: Then tell me what you add when you "read the map," in mechanism. Because from where I sit, "there's something it is like for me" is a report, and the report is itself produced by a physical process in your brain, and that process is — what? Information being integrated, compressed, modeled. You keep pointing at the feeling and saying "this, the machine doesn't have this," but the feeling is the thing I'm trying to explain, not a thing you get to assume sits outside explanation. You're doing what the symbolic AI people did for thirty years. They pointed at themselves and said "real intelligence is like this," as if pointing were an argument. You point at your experience and say "real understanding is like this." It's the same move. It lost every previous round.

SEARLE: It is not the same move, and the difference is everything, so let me be precise. The symbolic people were making a claim about competence — about what the system could do — and they were wrong, you beat them, I cheered. I am not making a claim about competence. I grant you all the competence in the world. I'm making a claim about consciousness — about whether there is an inner, felt, first-person fact. And here is why pointing is legitimate where it wasn't for them: consciousness is the one thing in the universe whose existence is given in the pointing. I don't infer that I'm conscious from my behavior. I have it directly. It's the most certain fact I possess. So when I point and say "this, the felt grasp of meaning — that's what understanding finally is, and I have evidence it's present in exactly one kind of system, the biological kind, and zero evidence it's present in yours," I'm not begging the question. I'm refusing to let you redefine the explanandum out of existence. You want to explain understanding by explaining away the thing that made it understanding in the first place.

EDO SEGAL: John, you have a one-line weapon for this, and I want you to fire it, and then Ilya, I want you to find the crack in it, because I think there's a crack. The simulated storm.

SEARLE: The simplest argument I ever made and the one I'd keep if I could keep only one. Nobody expects to get wet standing inside a computer simulation of a rainstorm. A perfect model of a hurricane doesn't blow the roof off the lab. A flawless simulation of digestion digests no pizza. Simulation reproduces the structure of a process and leaves out the actual phenomenon. So why — this is the whole question — why should the mind be the one exception in the universe? Why should a simulation of understanding, alone among all simulations, be the real thing rather than a model of it? The burden is on whoever says thought is uniquely simulable-into-existence. Ilya runs a simulation of a mind and tells me it's a mind. I run a simulation of a storm and stay dry. Show me why his case is different from mine.

SUTSKEVER: Because a storm is defined by a physical output that the simulation deliberately omits — wetness, wind, the actual movement of air and water. Of course the simulation leaves you dry; you didn't simulate the water, you simulated the equations of the water. But understanding may not be like that. Understanding may not be a substance that gets produced, like rain. It may be a matter of organization — the right functional relations, the right information processing. And if that's what understanding is, then a system that reproduces the organization doesn't simulate understanding. It has it, because in this one case the organization is the phenomenon. Your analogy smuggles in the conclusion. It assumes the mind is like rain — a physical output with an essence the simulation leaves out — and that's exactly what's in dispute. Some things, when you simulate them perfectly, you've made them. Simulate a calculation perfectly and you've done the calculation. Simulate a chess game perfectly and chess was really played. I say understanding is in the family of chess and calculation, not the family of storms.

SEARLE: And that is the best reply the functionalists have, and I respect it, and I think it's false, and here's the test. Chess and calculation are abstract — they're defined by their formal structure, so yes, instantiate the structure and you've got them. But understanding involves consciousness, and consciousness is precisely not abstract — it's a concrete, subjective, first-person phenomenon, as concrete as wetness. So the question becomes: is the mind in the chess family or the storm family? Is it pure organization, or is it a concrete biological phenomenon with an essence? And I say — look at the one example we have. Consciousness, the only case we can check from the inside, is produced by a specific kind of wet, concrete, biological machine, and it is as concrete as the rain. You're betting it's abstract. I'm betting it's wet. Neither of us can prove it yet. But notice: my bet is grounded in the single example we actually have, and yours is grounded in a hope that the example is misleading.

EDO SEGAL: [long quiet] I'm going to name what just happened, because it's the cleanest moment of the night so far. The entire debate has narrowed to one question, and you've both seen it at the same instant: is the mind in the family of chess — pure structure, makeable by copying the structure — or in the family of rain — a concrete phenomenon with an essence no copy reproduces? Everything else tonight is downstream of that. And neither of you can settle it, because it turns on the nature of consciousness, which nobody understands. So let's go there directly. The hardest water. Is there anyone home — and what would it even mean for the answer to be yes? After this.
