Why averages are hiding your operational risk
David Anderson · Jan 5 · 3 min read · Updated: Jan 29
Most operational reporting looks reasonable on the surface.
Average delivery time looks fine.
Average delay is only a few days.
Average schedule slippage feels manageable.
Yet teams are still firefighting. Customers are still frustrated. Inventory buffers keep creeping up. Everyone knows something isn’t lining up, even though the numbers say things are “mostly OK”.
The problem usually isn’t the data. It’s the metric.
The problem with averages in the real world
An average tells you what usually happens. That’s useful, but it’s incomplete.
In practice, most operational pain does not come from being a little bit late. It comes from the small number of cases that go very wrong.
Think about it this way:
Nine deliveries arrive on time.
One delivery arrives 120 days late.
On average, performance still looks acceptable. Operationally, that one late delivery can trigger stock-outs, emergency freight, lost production time, and some very uncomfortable conversations.
The average smooths that pain away. The business still has to deal with it.
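Using the numbers from that example, the arithmetic is easy to check:

```python
# Using the numbers from the example above.
delays = [0] * 9 + [120]          # nine on-time deliveries, one 120 days late

average = sum(delays) / len(delays)
print(average)                    # 12.0 days: "only 12 days late on average"
```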
This pattern shows up everywhere:
Freight and logistics.
Production scheduling.
Project delivery.
Field service and maintenance.
Professional services work.
The details change, but the behaviour doesn’t.
Why worst-case operational risk matters more than the typical case
Most teams already plan for things to be slightly late. Buffers exist for a reason.
What’s harder to deal with are the rare but severe delays:
The shipment that arrives months late.
The job that blows out far beyond expectations.
The schedule that collapses under one bad dependency.
These are the events that drive:
Excess safety stock.
Manual workarounds.
Escalation and blame.
Loss of trust between teams and customers.
If your reporting does not make these outcomes visible, you are managing by surprise.
A more useful question to ask
Instead of asking:
“What is our average delay?”
A more practical question is:
“How bad does it get when things go wrong?”
This is where percentile-based metrics become useful.
What P90 actually means (without the maths)
P90 is short for “90th percentile”.
In plain terms:
90 percent of outcomes are at or better than this number.
10 percent are worse.
So if P90 delivery lateness is 40 days:
Most deliveries are much better than that.
But one in ten is worse.
That one in ten is where most operational stress lives.
P90 does not replace averages. It complements them.
Averages tell you what usually happens.
P90 tells you what you need to be ready for.
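If you want to see the difference on actual numbers, here is a minimal sketch using numpy and a made-up set of lateness values (illustrative figures, not a real dataset):

```python
import numpy as np

# Twenty hypothetical lateness values in days (illustrative, not real data).
# Most deliveries are a few days late; the last three are the painful tail.
delays = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2,
          2, 3, 3, 3, 4, 4, 6, 35, 50, 90]

print(f"Average lateness: {np.mean(delays):.1f} days")            # 10.5 days: looks manageable
print(f"P90 lateness:     {np.percentile(delays, 90):.1f} days")  # 36.5 days: the real exposure
```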
What changes when you look at P90
When teams start looking at P90 instead of just averages, a few things usually happen.
First, differences between suppliers, carriers, or processes become obvious. Two options that look identical on average often have very different worst-case behaviour.
Second, conversations become more grounded. Instead of arguing about whether performance is “good enough”, teams can talk about risk, predictability, and exposure.
Third, planning decisions improve. Buffers, safety stock, and contingencies can be sized based on real variability, not gut feel.
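To make the first of those effects concrete, here is a small sketch comparing two hypothetical carriers whose made-up lateness figures share the same average but not the same tail:

```python
import numpy as np

# Hypothetical lateness figures (days) for two carriers with identical averages.
carrier_a = [5] * 6 + [6] * 8 + [7] * 6   # steady and predictable
carrier_b = [2] * 17 + [25, 28, 33]       # usually fast, occasionally terrible

for name, delays in [("Carrier A", carrier_a), ("Carrier B", carrier_b)]:
    print(f"{name}: mean = {np.mean(delays):.1f} days, "
          f"P90 = {np.percentile(delays, 90):.1f} days")

# Carrier A: mean = 6.0 days, P90 = 7.0 days
# Carrier B: mean = 6.0 days, P90 = 25.3 days
```

On an average-only report the two carriers are indistinguishable. The P90 line shows that one of them carries three to four weeks of tail exposure.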
This is especially useful when:
Service levels matter more than cost alone.
Downstream impact is high.
Customers feel delays more than early arrivals.
This applies beyond logistics
Although this comes up a lot in freight and supply chain work, the same logic applies almost everywhere.
If you run projects:
Average task duration hides the risk of critical overruns.
If you manage production:
Average lead times hide line-stopping events.
If you run a service operation:
Average response times hide customer-breaking delays.
Any system with variability benefits from understanding its extremes.
How to use this in practice
You don’t need a complex model to start.
A practical approach looks like this:
Keep your existing averages.
Add P90 alongside them.
Review both together.
When averages look good but P90 looks ugly, you’ve found risk. When both improve, performance is genuinely becoming more reliable.
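In a typical tabular workflow, adding P90 can be as simple as one extra aggregation. A minimal pandas sketch, assuming a hypothetical table with supplier and delay_days columns (the names and values are placeholders):

```python
import pandas as pd

# Hypothetical shipment data: one row per delivery.
df = pd.DataFrame({
    "supplier":   ["A", "A", "A", "A", "B", "B", "B", "B"],
    "delay_days": [1, 2, 1, 2, 0, 0, 1, 45],
})

def p90(s):
    """90th percentile of a series of delays."""
    return s.quantile(0.9)

# The average and the P90, side by side, per supplier.
report = df.groupby("supplier")["delay_days"].agg(["mean", p90])
print(report)
```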
Over time, this also helps identify where improvement effort actually pays off. Reducing variability often delivers more value than chasing marginal average gains.
The takeaway
Averages are not wrong. They’re just incomplete.
If your reporting only tells you what usually happens, it is not telling you what actually causes stress, cost, and disruption in the business.
Metrics like P90 make that risk visible. Once you can see it, you can manage it.
And that’s when reporting starts to support real decision-making, not just reassurance.
Want to see this applied to your data?
If your reports feel reassuring but day-to-day operations still feel reactive, there may be risk in the data that isn’t being surfaced.
Data Wranglers work with operational teams to take existing data and turn it into practical metrics that expose variability, risk, and where attention is actually needed. That usually starts with a short, focused piece of analysis to understand what’s really going on.
If that sounds useful, feel free to get in touch.

