Stepwise refinement is the operational procedure behind the principle that elegance and correctness should be built into a program from the first line. The programmer does not write a complex system and then try to verify it. She starts with an abstract specification, derives a first version that is as simple as the specification permits and provably correct at that level of abstraction, and then refines the version step by step — each step small enough to be verified, each step preserving the properties the previous steps established. The finished program is complex only to the extent the problem required complexity; every piece of the complexity was introduced deliberately, and every introduction preserved the correctness of the whole.
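The procedure can be sketched with a small, hypothetical example (integer square root; the example is mine, not taken from Wirth or Dijkstra). Step one is the simplest version the specification permits; step two refines it for efficiency while preserving the postcondition established at step one.

```python
# Specification: return the largest r with r*r <= n, for n >= 0.

def isqrt_v1(n: int) -> int:
    """Step 1: the simplest version the spec permits -- try each
    candidate in order. Trivially correct at this level."""
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r

def isqrt_v2(n: int) -> int:
    """Step 2: refine the search, preserving the postcondition.
    Loop invariant: lo*lo <= n < hi*hi."""
    lo, hi = 0, n + 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid * mid <= n:
            lo = mid
        else:
            hi = mid
    return lo

# Each refinement preserves what the previous step established:
for n in range(200):
    assert isqrt_v1(n) == isqrt_v2(n)
```

The complexity of the second version (the bisection invariant) was introduced deliberately, for a specific reason, and its correctness is argued against the version before it, not against the whole problem at once.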
Stepwise refinement was articulated most clearly in Niklaus Wirth's 1971 paper "Program Development by Stepwise Refinement" and Dijkstra's 1976 A Discipline of Programming, which formalized the approach using guarded commands and predicate transformers. The method was an operational answer to the question of how structured programming and provable correctness could be combined in actual practice.
The characteristic feature of the method is its monotonicity: each step preserves what the previous steps established. The program grows, but it grows by refining abstract specifications into concrete implementations in small, verifiable increments. The programmer never has to think about the whole system at once, because the whole system is constructed by local moves each of which is individually simple. This is separation of concerns extended through time: the programmer addresses one refinement at a time, in isolation, and trusts that the refinements compose.
AI code generation is, structurally, the antithesis of stepwise refinement. The AI does not start with the simplest correct solution and add complexity only when required. It starts with a plausible solution — one that resembles the solutions in its training data — and the plausible solution is almost never the simplest one. Plausibility and simplicity are different optimization targets. The plausible solution looks like code that exists. The simple solution reveals why the code is correct. These sometimes coincide. More often they do not.
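The distinction can be made concrete with an invented example: two answers to the specification "remove duplicates from a list, preserving first-occurrence order." Both are correct; only one reveals its correctness at a glance.

```python
# Plausible: resembles the deduplication code that exists in the wild --
# explicit bookkeeping, two containers, more moving parts.
def dedup_plausible(items):
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

# Simple: the minimal form the spec permits. Dict keys are unique and
# insertion-ordered (Python 3.7+), so the spec is satisfied directly.
def dedup_simple(items):
    return list(dict.fromkeys(items))

assert dedup_plausible([3, 1, 3, 2, 1]) == dedup_simple([3, 1, 3, 2, 1]) == [3, 1, 2]
```

Here the two targets happen to be close; in larger programs the plausible solution carries layers of machinery that no requirement demanded, and nothing in the generation process notices.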
Each regeneration compounds the divergence. When the builder asks the AI to fix a bug or add a feature, the AI does not simplify; it extends. It produces code that addresses the new requirement by building on the existing code, inheriting whatever complexity the existing code already contained and adding the complexity needed to accommodate the new behavior. There is no refinement, because there is no reasoning about the specification to be preserved. There is only statistical continuation.
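A hypothetical sketch of the pattern (the pricing rules and function names are invented for illustration): the first function shows accretion, where each new requirement and bug fix is patched onto the previous code; the second shows what a refinement step would have produced, revisiting the structure so the requirement is absorbed rather than appended.

```python
# Accretion: each request adds a branch on top of the last one.
def price_extended(qty, member, coupon):
    total = qty * 10
    if member:
        total = total * 0.9
    if coupon:              # added later, for a new requirement
        total = total - 5
        if total < 0:       # added later again, to patch a bug
            total = 0
    return total

# Refinement: the new requirement prompts restructuring, and the
# non-negativity property is stated once, where it belongs.
def price_refined(qty, member, coupon):
    discount_rate = 0.9 if member else 1.0
    coupon_value = 5 if coupon else 0
    return max(0, qty * 10 * discount_rate - coupon_value)

# Behaviorally identical; structurally, only one of them can absorb
# the next requirement without another patch.
assert price_extended(2, True, True) == price_refined(2, True, True) == 13.0
```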
Wirth's 1971 paper gave the method its name, but the underlying methodology was being developed simultaneously by Dijkstra, Hoare, and others. Dijkstra's guarded command language, introduced in his 1975 paper on nondeterminacy, was designed explicitly to support refinement with formal verification at each step.
The method has a vigorous descendant in the B-Method, Event-B, and TLA+ formalisms used in safety-critical software today, though its influence on mainstream programming practice has been filtered through agile methods, which share some of its iterative character while rejecting its insistence on provable correctness at each step.
Simplest first. Begin with the simplest solution the specification permits and add complexity only when requirements demand it.
Each step preserves properties. Refinements are monotonic: the correctness established at one step is not lost in the next. Verification is cumulative.
Complexity must earn its place. Every piece of complexity in the finished program corresponds to a specific requirement that demanded it. No complexity arrives by accident.
Plausibility is not simplicity. AI generation optimizes for what resembles existing code, not for what minimally satisfies the specification. These are different targets and they rarely coincide.
Regeneration is not refinement. Asking the AI to fix a bug is not a refinement step; it is a statistical continuation that extends the existing code without revisiting its structure.
The practical critique of stepwise refinement is that it scales poorly: real software is built incrementally by large teams over years, and the discipline of formal refinement at every step is incompatible with the way such teams actually work. The Dijkstrian reply, implicit throughout his later writing, is that the incompatibility is a symptom rather than a refutation — the real difficulty is that the industry has chosen to organize work in ways that make discipline impossible, and the choice has costs that show up downstream as unmaintainable systems.