CONCEPT

Small Failures and the Immune System

Petroski's analogy, precise rather than decorative, holds that small, detectable failures function in engineering practice as immune responses function in biology: they are early warnings in the margin between initial deviation and catastrophic collapse, and they provide the window within which human intervention remains possible.
A crack in a concrete beam is not always a catastrophe. More often, it is a message — the structure reporting, from field conditions, that the stress distribution exceeds what the design anticipated. The engineer who reads the crack correctly receives an opportunity to intervene before the small failure becomes a large one. Petroski argued that this early-warning dynamic is not incidental to engineering but constitutive: the profession's inspection protocols, maintenance schedules, load testing, and factors of safety form a system oriented toward detecting failures while they are still small enough to be managed. The system presupposes a margin between normal operation and catastrophic failure — a margin within which small failures can occur without killing anyone. AI optimization, by reducing this margin in pursuit of efficiency, does not merely risk overloading structures. It eliminates the warning system itself. The optimized structure does not crack before it breaks. It breaks.

In The You On AI Field Guide

The Citicorp Center crisis of 1978 illustrates the immune system operating as intended. William LeMessurier's tower, completed in 1977, was discovered a year later to be vulnerable to quartering winds — a load case he had not fully analyzed, combined with a construction change that substituted bolted for welded connections. The vulnerability was identified not by structural failure but by a Princeton student's question. The margin — measured in time between the vulnerability's identification and the arrival of a storm that would have exploited it — allowed remediation. Welders worked at night. The building stands. The small failure, caught in the window between error and catastrophe, never became a large one.

The Tacoma Narrows Bridge demonstrates the opposite case. Its deck was optimized to an extreme shallowness: girders eight feet deep for a 2,800-foot span. The deck's vertical undulations were visible from the start and earned the bridge the nickname Galloping Gertie, but the torsional mode that destroyed it on November 7, 1940 gave no comparable notice. Its first manifestation on the full-scale bridge was also its last: the twisting oscillations grew without check because the margin in which they could have been detected at manageable amplitude had been consumed by the optimization. On this argument, a deeper, less efficient deck would have shown torsional movement earlier and at lower amplitudes. The oscillation would have prompted investigation. The investigation might have led to remediation. The optimization eliminated all three steps by eliminating the early, small-amplitude torsional oscillation altogether.

The analogy to biology is structural. The immune system does not prevent infection. It detects infection early — when the pathogen load is still small enough to be managed — and mounts a response that prevents systemic crisis. The system operates in the margin between initial incursion and point of no return. Remove the margin and the immune response has no window in which to function. The organism appears healthy until it does not, with no intermediate state.

Engineering's equivalent is the deflection that exceeds calculation by a small percentage, the vibration at an unpredicted frequency, the material fatigue slightly faster than test data suggested. Each is a departure from the design hypothesis, small enough to be observed without immediate consequence, consequential enough to signal that the real conditions are departing from the modeled ones. The engineer who observes and interprets these signals receives a second chance — the opportunity to update understanding before the departure becomes fatal. The optimized structure that does not produce these signals has no second chance. It operates within spec until it does not, and the transition is discontinuous.
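
The signals described above can be pictured as a comparison between the response the design predicted and the response actually measured. The following is a minimal sketch under stated assumptions, not an engineering method: the function name `classify_deviation`, the 5% and 25% thresholds, and the sample deflection readings are all hypothetical, chosen only to illustrate the idea of a departure that is small enough to observe safely yet large enough to matter.

```python
def classify_deviation(predicted: float, measured: float,
                       notice: float = 0.05, alarm: float = 0.25) -> str:
    """Classify how far a measured response exceeds the design prediction.

    The thresholds are illustrative assumptions, not values from any
    design code or from Petroski's writing.
    """
    if predicted <= 0:
        raise ValueError("predicted response must be positive")
    # Fractional excess of the measurement over the prediction.
    excess = (measured - predicted) / predicted
    if excess < notice:
        return "within model"
    if excess < alarm:
        return "small failure: investigate"
    return "large departure: intervene now"


# Hypothetical midspan deflections in millimetres: (predicted, measured).
readings = [(20.0, 20.4), (20.0, 21.8), (20.0, 26.0)]
labels = [classify_deviation(p, m) for p, m in readings]
```

The middle band is the one that matters for the argument: it exists only if the structure can exceed its prediction by a noticeable amount without failing.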

Origin

Petroski developed the small-failures framework across To Engineer Is Human (1985) and subsequent work, drawing on his detailed study of structural failures and their warning signs. The immune-system analogy appears in varying forms across his writing, though Petroski typically preferred the engineering language of inspection, monitoring, and maintenance rather than the biological metaphor. The Henry Petroski — On AI simulation extends the analogy explicitly, arguing that AI optimization threatens the immune function itself — not through damaging structures but through producing structures that, by virtue of their optimization, cannot signal their own distress.

Key Ideas

Small failures are features, not defects. The crack, the deflection, the vibration are not design inadequacies to be eliminated. They are the structure's communication channel to the engineer, operating in the margin between normal function and catastrophic failure. Their absence is not a sign of superior design but potentially of insufficient margin.

The immune system requires margin. The warning signals occur in the space between specified capacity and actual failure capacity. If optimization reduces this space to zero, the signals have no space in which to occur. The structure becomes silent — and silence, in this case, is not a sign of health.

Time is the critical resource. The value of a small failure is the time it buys for intervention. A crack detected today that would become catastrophic in two years is valuable because the two years can be used. An optimized structure whose first failure is catastrophic provides no time and no option for intervention.

Efficiency and warning are in tension. The margin that enables early warning reads, to an optimization algorithm, as waste. Removing it produces efficiency gains. The gains are real. The cost — the elimination of the warning system — is invisible until the conditions arrive that the warning system would have detected.
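
The relationship between margin and warning in the key ideas above can be sketched as a toy model. It assumes, purely for illustration, that a structure has one design load and one sharp failure load equal to the design load times a factor of safety; real structures do not fail at a single threshold. The function name `load_state` and all numbers are hypothetical.

```python
def load_state(load: float, design_load: float, factor_of_safety: float) -> str:
    """Place a load in one of three bands: normal, warning, or collapse.

    Toy model: failure is assumed to occur at exactly
    design_load * factor_of_safety, which is a simplification.
    """
    if design_load <= 0 or factor_of_safety < 1.0:
        raise ValueError("need design_load > 0 and factor_of_safety >= 1")
    failure_load = design_load * factor_of_safety
    if load >= failure_load:
        return "collapse"
    if load > design_load:
        return "warning"
    return "normal"


loads = [80.0, 100.0, 120.0, 210.0]  # hypothetical loads, arbitrary units

# Generous margin: overloads between 100 and 200 register as warnings.
with_margin = [load_state(x, 100.0, 2.0) for x in loads]

# Margin optimized away: no load can ever land in the warning band.
no_margin = [load_state(x, 100.0, 1.0) for x in loads]
```

With a factor of safety of 2.0 the load of 120 is reported as a warning. With a factor of 1.0 the same load is already a collapse, and no input produces a warning at all, which is the silence the entry describes.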

Further Reading

  1. Henry Petroski, To Engineer Is Human (1985)
  2. Henry Petroski, Success through Failure: The Paradox of Design (2006)
  3. Charles Perrow, Normal Accidents (1984)
  4. James Reason, Human Error (1990)