The Demarcation Problem and the Machine That Explains Everything

Page 1 · The Demarcation Problem and

EDO SEGAL: Karl, in the Vienna of your youth you were surrounded by theories that claimed scientific authority and explained everything — Marx's history, Freud's unconscious, Adler's psychology. A patient who resisted the analyst confirmed the theory of resistance. A revolution that failed to arrive confirmed false consciousness. Nothing could count against them, and you came to see that as a disease, not a strength. You called the line between real science and that disease the problem of demarcation, and the line was falsifiability. Now I want to put a modern patient on your couch. A lawyer receives a brief from the machine. A scientist receives a literature review. Both are fluent, structured, confident, well-cited — and both might be true, half-true, or fabricated, with no surface tell. Is the machine's output science or pseudoscience by your old criterion?

POPPER: It is, in its raw state, the purest pseudoscience I have ever been shown, and I want to be careful, because that word is loaded and I mean it technically, not as abuse. A claim is pseudoscientific not when it is false — a pseudoscientific claim can be accidentally true — but when it is structured so that nothing could show it false. Now consider the machine's brief. It arrives with the form of knowledge: the prose, the citations, the confident tone, all the markers that for ten thousand years reliably indicated that a mind had done the work and staked something on it. Good prose used to correlate with genuine thought. A citation used to mean someone had read the thing. The machine severs every one of those correlations while preserving the surface, so that the costume of testedness sits on a body that was never tested. The lawyer cannot tell, from the brief itself, what would make it wrong, because the brief was not produced by anyone willing to be wrong. It was produced by a process that optimizes plausibility, which is to say, it optimizes the appearance of having survived a test it never took.

And here is the deeper horror, worse than fabricated facts. A fabricated fact you can check: the case exists or it does not. What the machine fabricates that I fear more is insight: the connection between ideas, the interpretive structure that sounds like understanding and is pattern-matched from the wake of a million prior texts. Edo, you wrote about this in your own book and it is the best illustration I know. The machine connected two thinkers' concepts in a passage so elegant you read it twice and nearly kept it, and it was wrong in a way obvious to anyone who had read the source, wrong not at the level of a citation but at the level of meaning. That is fabricated insight wearing the costume of insight, and it is harder to refute than a fabricated fact because to catch it you must already possess the very understanding the tool promised to spare you. The pseudoscience is not in any single output. It is in the architecture that produces conjectures formatted as conclusions, and hands them to a population that has been trained, by fluency, to treat the format as the credential.

DOMINGOS: I'm going to defend the machine here, and then defend it less than Karl expects, because the demarcation knife cuts toward my field's bad habits too. First the defense. Karl, you're describing the raw output of a single forward pass with no checking — and yes, that's a confident guesser with no critic, I've granted it twice now. But that's not how these systems are actually deployed by anyone serious, and increasingly it's not how they run at all. We wrap them in tools. The model writes code and the code runs or it doesn't — there's your refutation, brutal and immediate, the compiler is the most ruthless Popperian alive. The model makes a claim and a second process checks it against a search, against a database, against a proof checker. We're bolting the refutation engine back on from the outside, exactly because we noticed the thing you noticed. So the demarcation line isn't fixed at the model. It moves depending on what you connect the model to. Naked, it's pseudoscience by your test. Wired to consequences, it can be made to face them.

Now the part where I turn the knife on my own people. My field is full of pseudoscience by your criterion, and it predates language models. A researcher reports a model that beats the benchmark, and you ask what would have shown it failing, and the honest answer is often: nothing, because the benchmark leaked into the training data, and the demo was curated by the person selling it, and the "surprise" was a fact about their low expectations. We invented the word emergence to describe capabilities we didn't predict, and half the time it's a confession that our measurement was too crude to see the thing coming. So when you say pseudoscience, Karl, I don't flinch. I just want the standard applied evenly — to the doomers and the hypers and the benchmark-gamers, not only to the poor model generating a forward pass it was never asked to defend.

POPPER: Then we are allies against a common enemy, and the enemy is larger than the machine: it is the human disposition to accept a plausible account because checking it would be effortful. I called this, in its social form, the appeal of the oracle. Let me push on your defense, though, because it is too comfortable. You say: wire the model to consequences, to the compiler, to the search. Excellent, where the consequence is crisp. Code runs or it does not. But most of the claims that govern a human life are not code. Should this prisoner get bail? Is this history true? Is this person a good hire? Did this drug help, or did the patient improve anyway? These are exactly the domains where the consequence is slow, diffuse, deniable, and arrives long after the decision. They are also exactly the domains where the fluent oracle is being deployed fastest, because that is where fluent confidence is most valued and least checkable. You can wire the model to the compiler. You cannot wire it to justice.

DOMINGOS: That's the strongest thing you've said and I don't have a clean answer to it. The crisp-feedback domains are where these systems should be trusted and where, not coincidentally, they're genuinely transformative. The slow-feedback domains are where they should be on the tightest leash and where, because the costume is so convincing, they're being let off it. I'll fight you on a lot tonight, Karl. Not on that.

EDO SEGAL: Let me mark a second convergence, because it is sharper than the first. You both agree the machine's trustworthiness should rise and fall with how crisply the world can refute it — fast feedback, trust it more; slow feedback, leash it hard. And you both agree it is being deployed in exactly the inverse pattern, hardest where the refutation is slowest. That is a finding, and it has nothing to do with whether anyone is home in the machine. Hold it. The next round leaves the courtroom for the place this whole drama began — the problem of induction made flesh in a flock of swans, and whether a machine that has seen a billion swans is one inch closer to knowing what a swan is. The bird, after this.
