There’s a strange pattern in aviation safety work that you only really notice after a while.
The catastrophic events, the ones that end up in investigation reports, board reviews, and headlines, almost never come from something truly unknown. In hindsight, they’re usually traceable. The warning signs were there. The system “knew” in some sense.
So the real question is not: why did we miss it?
It’s:
Why do we keep letting small, almost boring weaknesses accumulate until they matter?
We’re very good at designing against big failures
Modern safety engineering is excellent at obvious things:
- Engine failures
- Structural overloads
- Electrical faults
- Software crashes
- Single-point failures
We have tools, models, redundancy, certification regimes, and decades of lessons learned.
If something is dramatic and clearly dangerous, we usually have a strategy for it.
And yet…
That’s not where most risk lives anymore.
The real risk is the stuff that “sort of works”
The uncomfortable truth is that the hardest safety problems today are not binary failures.
They’re degraded behaviours like:
- A sensor that’s slightly unreliable, but “good enough”
- A procedure that still works, but only if people shortcut it
- A maintenance step that gets quietly compressed under time pressure
- A warning system that’s technically functional, but routinely ignored
- A design assumption that is almost true in real operations
Individually, none of these are catastrophic.
Together, they become normalised drift.
And normalised drift is where safety quietly erodes.
Safety systems don’t usually break — they relax
One of the most misleading mental models in safety engineering is that systems “fail.”
In reality, most systems don’t fail suddenly. They relax.
They move from:
- tightly designed behaviour
- to loosely followed intent
- to informal practice
- to “this is how we actually do it”
By the time something serious happens, the system hasn’t “broken” in a technical sense.
It has simply drifted far enough from its safety assumptions that those assumptions no longer apply.
The dangerous word in safety engineering: “acceptable”
A lot of safety work quietly depends on this idea:
“This level of risk is acceptable.”
That’s not wrong. It’s necessary. You can’t design a zero-risk aviation system.
But here’s the subtle issue:
“Acceptable” is not a fixed property. It is a contextual agreement.
And context changes:
- Staffing pressure changes
- Operational tempo changes
- Technology gets layered over old assumptions
- Experience levels shift
- Maintenance realities evolve
What was acceptable at design time may become fragile in operation.
Not because anyone failed — but because the world moved.
Most safety systems are optimised for certainty, not ambiguity
Engineering tools are strongest when the system is:
- defined
- bounded
- predictable
But real operations are:
- messy
- time-pressured
- partially visible
- constantly adapting
So we end up with a gap:
- Design assumes clarity
- Operation lives in ambiguity
And in that gap, people do what they always do in real systems:
They adapt.
Adaptation is both the solution and the risk
This is where it gets uncomfortable.
Operational staff are not “deviating from design” in a careless sense.
They are:
- making the system work in real conditions
- closing gaps the design didn’t fully anticipate
- keeping operations moving under constraints
In many cases, adaptation is exactly what prevents incidents.
But adaptation has a cost:
It slowly replaces designed safety margins with lived experience.
And lived experience is not always visible to engineers, analysts, or regulators.
So what actually keeps systems safe?
Not perfection. Not elimination of error. Not rigid adherence.
It’s something more boring and more important:
Continuous visibility of how the system is actually behaving, not just how it was designed to behave.
That means:
- listening to weak signals, not just incidents
- treating “workarounds” as data, not discipline problems
- updating assumptions at least as fast as operational reality changes
- designing for drift, not just nominal cases
In other words:
Safety engineering is less about preventing failure
and more about preventing invisibility.
A final uncomfortable thought
If a system only looks safe in reports, models, and certification documents…
That’s not evidence of safety.
That’s evidence of alignment between documentation and design intent.
The real question is always:
What is the system doing when no one is formally checking it?
Because that is where safety actually lives.