Don't Catch the Bug. Remove the Condition.

Yesterday I deduplicated two helpers in a finite-state machine. Byte-identical functions, copied into two backend modules because the original refactor moved fast and left them parallel. The code worked. Tests passed. Nothing was broken.

I deleted one copy anyway, moved the survivor into a shared module, and updated both call sites.

The reason is small but worth spelling out. Before the dedup, a future bugfix that touched one helper without mirroring the change in the other would have silently disabled one backend’s refusal handling. After the dedup, that bug can’t exist. Not because we added a test for it. Not because we wrote a comment. Because the condition that makes it possible, two parallel implementations of the same logic, is gone.

This is the difference between catching a bug and removing the conditions that allow it. Both prevent the bug. Only the second one survives forgetfulness.

Two postures

When you sit down to harden a system, you have two postures available.

Behavioral enforcement. Catch the bug if it occurs. Write tests. Add assertions. Document the invariant. Review the PR. Train the team. Add a linter rule. Put it in the runbook. All of these depend on a human, a process, or a runtime check actively doing the catching every time. Skip any of them once, and the bug ships.

Structural enforcement. Make the bug unrepresentable. Remove the duplicate. Make the type system reject the invalid state. Move the check from the application to the database. Make the wrong path require an explicit annotation that nobody adds by accident. Now the bug is not “caught” — it’s literally impossible to express in the system.

These aren’t equivalent. Behavioral catches are linear in vigilance — you pay for them forever, every commit, every deploy, every review. Structural changes are paid once and compound. The codebase gets harder to break over time, not just more carefully watched.

The reason this matters is that vigilance is the most unreliable resource in software. Tests get skipped. Reviewers get tired. Runbooks go stale. The convention everyone agreed to in February gets quietly violated in August by someone who joined in June and read a different doc. Behavioral enforcement is a tax you can’t ever stop paying, and you’ll forget the payment exactly when it matters most.

Toyota figured this out in 1961

The clearest articulation of this principle didn’t come from software. It came from a Japanese consultant on a Toyota assembly line.

Around 1961, Shigeo Shingo was watching a switch assembly process where workers kept forgetting to insert a small spring before the next step. The conventional fix was behavioral: train harder, post a sign, add a quality inspector. Shingo’s fix was structural: design a jig where the next step physically wouldn’t engage if the spring wasn’t present. The worker couldn’t forget the spring, because the assembly wouldn’t proceed without it.¹

He called this baka-yoke — “fool-proofing.” A worker at Arakawa Body Co. objected to the slur, and Shingo renamed it poka-yoke, “mistake-proofing.”¹ Which is itself a perfect meta-example: even the name of the concept had to be re-engineered after the original name produced an error mode (worker offense) that no amount of behavioral correction (apologies, training) was going to permanently fix. Rename the thing. Make the failure mode structurally impossible.

Poka-yoke spread through the Toyota Production System and from there into every manufacturing discipline on earth. The idea is now so foundational that it’s hard to see: every USB-C port that goes in either way, every car ignition that won’t crank if you’re in drive, every medical syringe whose plunger only fits one direction. None of these catch the mistake. They make the mistake unrepresentable in the physical layer.

Software took fifty more years to catch up.

“Make illegal states unrepresentable”

The phrase belongs to Yaron Minsky, who used it in an April 2010 guest lecture at Harvard called Effective ML², later expanded in a follow-up post with a concrete code example³. He was describing how OCaml’s sum types let you collapse a sprawl of nullable fields and boolean flags into a type hierarchy where impossible combinations don’t compile.

His example was a connection state record with three optional fields — last_ping_time, session_id, when_disconnected — flattened into one struct. The struct allowed nonsense: a connection that was simultaneously connected and disconnected, or pinged but never opened. The refactor split the record into three variant types, each carrying only the fields valid in that state. Now the compiler refuses to construct the impossible.

Notice the same structure as Shingo’s jig. The behavioral version says: “remember to check that when_disconnected is None when the connection is open.” The structural version says: when the connection is open, the type doesn’t have a when_disconnected field. There is no check to skip, because there is no value to check.
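A rough Rust rendering of that refactor might look like the sketch below. The field names come from the example above; the variant names, the integer `Timestamp`, and the `record_ping` and `describe` helpers are assumptions made for illustration, not Minsky's code.

```rust
/// Seconds since some epoch. A plain integer keeps the sketch dependency-free.
type Timestamp = u64;

/// Each variant carries only the fields that make sense in that state.
#[derive(Debug, PartialEq)]
enum Connection {
    Connecting,
    Connected {
        session_id: String,
        last_ping_time: Option<Timestamp>,
    },
    Disconnected {
        when_disconnected: Timestamp,
    },
}

impl Connection {
    /// Only a connected session can record a ping; other states pass through unchanged.
    fn record_ping(self, now: Timestamp) -> Connection {
        match self {
            Connection::Connected { session_id, .. } => Connection::Connected {
                session_id,
                last_ping_time: Some(now),
            },
            other => other,
        }
    }

    /// The match must cover every variant, and each arm sees only that variant's fields.
    fn describe(&self) -> String {
        match self {
            Connection::Connecting => "connecting".to_string(),
            Connection::Connected { session_id, last_ping_time: Some(t) } => {
                format!("connected ({}), last ping at {}", session_id, t)
            }
            Connection::Connected { session_id, last_ping_time: None } => {
                format!("connected ({}), no ping yet", session_id)
            }
            Connection::Disconnected { when_disconnected } => {
                format!("disconnected at {}", when_disconnected)
            }
        }
    }
}

fn main() {
    let conn = Connection::Connected {
        session_id: "abc".to_string(),
        last_ping_time: None,
    };
    println!("{}", conn.record_ping(42).describe());
    println!("{}", Connection::Disconnected { when_disconnected: 50 }.describe());
    println!("{}", Connection::Connecting.describe());
}
```

There is no way to write a `Connected` value that also has a `when_disconnected`, so there is no check for anyone to forget.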

The principle isn’t OCaml-specific. Rust has it. Swift has it. TypeScript has it. F# has it. Kotlin has it. Even Java has sealed class hierarchies now. The pattern is universal once you see it: encode constraints in types so the compiler does the catching, every time, for everyone, without anyone choosing to.

Alexis King generalized the idea further in 2019 with Parse, Don’t Validate⁴ — the observation that a validator checks a value and returns true/false (losing the proof of validity the moment the function returns), while a parser consumes loose input and produces a richer typed output that carries the proof through the rest of the program. After parsing, the type system remembers that the value is valid. After validating, you have to remember yourself.
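The contrast is easier to see side by side. The sketch below uses a non-empty list, the kind of example King's post is built around; the Rust names (`NonEmpty`, `parse`, `is_non_empty`) are assumptions for illustration.

```rust
/// A list that is non-empty by construction: there is always a `head`.
#[derive(Debug, PartialEq)]
struct NonEmpty<T> {
    head: T,
    tail: Vec<T>,
}

impl<T> NonEmpty<T> {
    /// The parser: consumes loose input and returns a type that carries the proof.
    fn parse(items: Vec<T>) -> Option<NonEmpty<T>> {
        let mut iter = items.into_iter();
        let head = iter.next()?; // empty input is rejected here, once
        Some(NonEmpty { head, tail: iter.collect() })
    }

    /// No Option in the return type: the first element is guaranteed to exist.
    fn first(&self) -> &T {
        &self.head
    }

    fn len(&self) -> usize {
        1 + self.tail.len()
    }
}

/// The validator: answers the question, then the answer is gone.
fn is_non_empty<T>(items: &[T]) -> bool {
    !items.is_empty()
}

fn main() {
    let raw = vec![3, 1, 2];

    // Validation: the check passes, but `raw` is still a plain Vec,
    // so every later function has to re-check or trust the caller.
    println!("validated: {}", is_non_empty(&raw));

    // Parsing: downstream code takes NonEmpty and never checks again.
    match NonEmpty::parse(raw) {
        Some(list) => println!("first = {}, len = {}", list.first(), list.len()),
        None => println!("empty input rejected at the boundary"),
    }
}
```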

Rust took it to the limit

Rust’s ownership model is the most aggressive application of structural enforcement currently shipping in a mainstream language. Use-after-free, double-free, and data races on shared memory don’t compile in safe Rust. Not “are caught by sanitizers.” Don’t compile.
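A minimal sketch of what "don't compile" means for use-after-free. The function and variable names are made up for illustration.

```rust
/// Takes ownership of the buffer; the memory is freed when this function returns.
fn consume(buffer: Vec<u8>) -> usize {
    buffer.len()
}

fn main() {
    let buffer = vec![1u8, 2, 3];
    let n = consume(buffer); // ownership moves into `consume`
    println!("consumed {} bytes", n);

    // Uncommenting the next line is rejected at compile time (error E0382,
    // a use of a moved value), because `buffer` no longer owns anything:
    // println!("{:?}", buffer);
}
```

The dangling read never reaches a test suite or a sanitizer, because the program containing it is never built.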

The honest qualifier is unsafe. Rust has an explicit escape hatch: five operations (dereferencing raw pointers, calling unsafe functions, accessing mutable statics, implementing unsafe traits, accessing union fields) that the compiler permits only in code you mark unsafe, where upholding the invariants becomes your job.⁵ So the claim isn’t “Rust eliminates these bugs everywhere”; it’s “safe Rust makes them unrepresentable, and the unsafe Rust that can still produce them requires an explicit annotation that is grep-able and audit-able.”

A peer-reviewed study in ACM TOSEM looked at every Rust CVE up to the authors’ cutoff date and found that the guarantee holds empirically: every memory-safety bug required unsafe code somewhere in the chain.⁶ The escape hatch is the only way out. Which means a codebase’s memory safety posture reduces to a tractable audit question: where is unsafe, what invariants does it claim to maintain, and does the safe API around it hold up?
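Here is roughly what one unit of that audit looks like. The `middle` function is a hypothetical example; `get_unchecked` is the standard library's unchecked slice accessor.

```rust
/// Safe API: returns the middle element of a slice, or None when it is empty.
fn middle(items: &[i32]) -> Option<i32> {
    if items.is_empty() {
        return None;
    }
    let idx = items.len() / 2;
    // SAFETY: `items` is non-empty, so len >= 1 and len / 2 < len.
    // `idx` is therefore in bounds, which is what `get_unchecked` requires.
    Some(unsafe { *items.get_unchecked(idx) })
}

fn main() {
    println!("{:?}", middle(&[10, 20, 30]));
    println!("{:?}", middle(&[]));
}
```

The three audit questions map onto it directly: the `unsafe` keyword is where, the SAFETY comment is the claimed invariant, and the emptiness check is the safe API holding that invariant up. A reviewer who searches for `unsafe` lands on exactly the lines that need human attention.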

That’s a smaller question than “are there memory-safety bugs anywhere in this 400k-line codebase,” and it’s the right kind of small — the small you get from removing the structural conditions that allow the bug, not from being more careful about catching it.

The pattern, generalized

Once you start looking, the principle is everywhere.

Database constraints — NOT NULL, UNIQUE, FOREIGN KEY, CHECK — are structural enforcement at the persistence layer. They make certain invalid states impossible to write, regardless of whether the application layer remembered to validate. The pushback against ORM-level “duplicate the constraint in app code” patterns is the same lesson in another voice: a constraint that lives in two places will drift, and the structural one (the database) is the one that actually stops the bad write.

Immutable data structures make “modified after creation” unrepresentable. Pure functions make “depends on hidden state” unrepresentable. Content-addressed storage makes “two different files with the same identifier” unrepresentable. Capability-based security makes “called a function I didn’t have permission for” unrepresentable. Each of these is poka-yoke for a different domain.
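The capability case, for instance, fits in a few lines. Everything in this sketch is hypothetical (the `auth` module, the role check, the `purge_logs` operation); the point is only the shape: the privileged function demands a value that unprivileged code has no way to construct.

```rust
mod auth {
    /// Proof that the caller authenticated as an admin. The private field
    /// means code outside this module cannot build one by hand.
    pub struct AdminCap {
        _private: (),
    }

    /// Hypothetical login check; a real system would verify credentials.
    pub fn login(role: &str) -> Option<AdminCap> {
        if role == "admin" {
            Some(AdminCap { _private: () })
        } else {
            None
        }
    }
}

/// Requires the capability as an argument: no token, no call.
fn purge_logs(_cap: &auth::AdminCap, logs: &mut Vec<String>) -> usize {
    let removed = logs.len();
    logs.clear();
    removed
}

fn main() {
    let mut logs = vec!["boot".to_string(), "login".to_string()];

    match auth::login("guest") {
        Some(cap) => println!("purged {}", purge_logs(&cap, &mut logs)),
        None => println!("guest gets no capability, so there is nothing to call with"),
    }

    if let Some(cap) = auth::login("admin") {
        println!("purged {}", purge_logs(&cap, &mut logs));
    }
}
```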

And in plain old codebase work — the kind that happens in any language with no exotic type theory — deduplication is the simplest version of the same move. Two helpers doing the same thing means two places that have to be kept in sync. Removing one removes the possibility that they drift. The bug class “future change to one and not the other” is no longer a thing you can do.
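In code, the post-dedup shape is roughly the following. The module names, the `State` enum, and the refusal check itself are all invented for illustration; the real helpers are not shown in this post.

```rust
#[derive(Debug, PartialEq)]
pub enum State {
    Refused,
    Completed,
}

mod shared {
    /// The one surviving copy of the helper (name and logic are hypothetical).
    pub fn is_refusal(reply: &str) -> bool {
        reply.trim_start().to_lowercase().starts_with("cannot")
    }
}

mod backend_a {
    use super::{shared::is_refusal, State};

    /// Hypothetical backend that sees the whole reply at once.
    pub fn next_state(reply: &str) -> State {
        if is_refusal(reply) { State::Refused } else { State::Completed }
    }
}

mod backend_b {
    use super::{shared::is_refusal, State};

    /// Hypothetical streaming backend that judges the concatenated chunks.
    pub fn next_state(chunks: &[&str]) -> State {
        if is_refusal(&chunks.concat()) { State::Refused } else { State::Completed }
    }
}

fn main() {
    println!("{:?}", backend_a::next_state("Cannot comply with that."));
    println!("{:?}", backend_b::next_state(&["Can", "not comply."]));
}
```

A future fix to `is_refusal` reaches both backends by construction. There is no second copy left to forget.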

Where it stops

Structural enforcement isn’t a silver bullet, and it’s worth being honest about where it stops.

You can make a type that says “this UserId corresponds to a row in the users table” — but the type system can’t actually check that the row exists. The compiler trusts you that it does. Real verification of cross-system invariants needs runtime mechanisms: foreign keys, transactions, distributed consensus. Structural enforcement protects the represented domain — what you can express in the language — not the intended domain that lives partly in databases, partly in network calls, partly in human expectations.
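A small sketch of that boundary, with a `HashMap` standing in for the users table (the `Users` struct and `name_of` method are assumptions for illustration):

```rust
use std::collections::HashMap;

/// The type says "this is a user id". It cannot say "and the row exists".
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct UserId(u64);

/// Stand-in for the users table.
struct Users {
    rows: HashMap<UserId, String>,
}

impl Users {
    /// The runtime check the type system cannot do: is the row there right now?
    fn name_of(&self, id: UserId) -> Option<&str> {
        self.rows.get(&id).map(|name| name.as_str())
    }
}

fn main() {
    let mut rows = HashMap::new();
    rows.insert(UserId(1), "ada".to_string());
    let users = Users { rows };

    let stale = UserId(2); // perfectly well-typed, but no such row
    println!("{:?}", users.name_of(UserId(1)));
    println!("{:?}", users.name_of(stale));
}
```

`UserId` still earns its keep, since it stops an order id from being passed where a user id belongs. Whether the row exists is a fact about the database, and only the database can answer it.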

This means the right architecture usually pairs structural and behavioral enforcement at different layers. Types catch what types can catch. Database constraints catch what types can’t. Runtime assertions catch what constraints can’t. Tests catch what assertions can’t. Reviews catch what tests can’t. The point isn’t that behavioral enforcement is bad — it’s that whenever you can promote a check from a behavioral layer to a structural one, you should, because vigilance is expensive and forgetful and the structural fix compounds.

Two helpers, one source of truth

The FSM dedup I started with looks small on the surface. Two byte-identical functions, joined into one. A few hundred bytes of code removed. Tests still pass. The system behaves identically. From the outside it’s barely a change.

From the inside, it’s the difference between a system where the bug is prevented by remembering and a system where the bug is prevented by being impossible. The first one ages badly. The second one ages into a foundation.

The question to ask, on every change, isn’t did I catch the bug. It’s did I remove the condition that made the bug possible. If the answer is no — if all you did was add another behavioral layer hoping someone will read it next time — then the bug is still in the system. It just hasn’t shipped yet.

Catch fewer bugs. Remove more conditions.

¹ Wikipedia contributors, “Poka-yoke”. Shigeo Shingo introduced the technique to Toyota’s switch assembly line around 1961, originally as baka-yoke (“fool-proofing”), renamed poka-yoke (“mistake-proofing”) around 1963 after a worker objection. Canonical reference: Shingo, Zero Quality Control: Source Inspection and the Poka-Yoke System (1986, English translation).
² Yaron Minsky, “Effective ML”, Jane Street Tech Blog, April 22, 2010. First written appearance of the phrase “make illegal states unrepresentable” as one of Jane Street’s internal programming maxims, presented in a Harvard guest lecture.
³ Yaron Minsky, “Effective ML Revisited”, Jane Street Tech Blog, March 9, 2011. Contains the canonical connection_state code example demonstrating how OCaml sum types collapse a record-with-many-optional-fields into a variant where impossible combinations don’t compile.
⁴ Alexis King, “Parse, Don’t Validate”, November 5, 2019. The canonical generalization of “make illegal states unrepresentable” into a design philosophy: validation that returns booleans loses proof of validity at the return site; parsing into a richer output type carries the proof through the rest of the program.
⁵ “Unsafe Rust”, The Rust Programming Language (official book), Chapter 20. Enumerates the five operations that unsafe unlocks (raw pointer deref, unsafe function calls, mutable statics, unsafe trait impls, union access) and clarifies that the borrow checker still runs inside unsafe blocks for regular references.
⁶ Hui Xu et al., “Memory-Safety Challenge Considered Solved? An In-Depth Study with All Rust CVEs”, ACM Transactions on Software Engineering and Methodology, 2021. Empirical study of Rust CVEs confirming that all memory-safety bugs in the dataset required unsafe code, supporting the design claim that safe Rust prevents these bug classes by construction.