
An enterprise media operation · Enterprise delivery operations

Lifting quality from 95% to 99% across 2,000+ campaigns

The last few points of quality are the hardest and the most valuable. Moving an already-good operation from 95% to 99% — at the scale of two thousand campaigns and hundreds of clients — is a systems problem, not an effort problem.

At 95%, an operation looks healthy. That number is exactly what makes the last few points so easy to ignore — and so expensive to leave alone. Multiply a five percent defect rate across more than 2,000 campaigns and 450 clients and it stops being a rounding error. It becomes thousands of imperfect deliveries a year, each one a small risk to a client relationship.

My brief was to move an already-good operation from 95% to 99% and hold it there, at the scale of two thousand campaigns and hundreds of clients. This is not a harder-work problem. The failure modes that survive at 95% are the systemic ones — the defects that careful people cannot prevent because the system, not their attention, is producing them. Only a systemic response removes them.

The score moved from 95% to 99% and held, because the gain was built into the controls rather than carried by extra effort. At that scale, the final four points represented thousands of better outcomes a year. What follows is how the last, hardest points were actually closed.

01 The challenge

At 95%, an operation looks healthy — until you multiply the remaining five percent across more than 2,000 campaigns and 450 clients. That residual defect rate represented thousands of imperfect deliveries a year, each one a risk to a client relationship. Closing it could not come from working harder; the failure modes that survive at 95% are systemic, and only a systemic response removes them.

02 The intervention

What I actually did.

  1. Segmented the residual defects by type and root cause, separating the few systemic failure modes from the long tail of one-offs.

  2. Targeted the systemic causes with redesigned controls and in-line checks, so the most common defects were designed out rather than inspected out.

  3. Standardised quality measurement across all 450 clients, making the score consistent, comparable and trustworthy at the board level.

  4. Established the governance cadence that held the improvement, catching drift before it became regression.

03 The outcome

The quality score moved from 95% to 99% across more than 2,000 campaigns and 450 clients — and held there, because the gain was built into the controls rather than carried by extra effort. At that scale, the final four points represented thousands of better outcomes a year.

95% → 99%

measured quality score

2,000+

campaigns under the standard

450

clients on a single quality measure

Client identity withheld under confidentiality. The figures are real and were measured at the time of the engagement.

In depth

The operating reasoning behind the result.

Why the last four points are the hardest

The first ninety-five points of quality are won by competent people doing careful work. The last few cannot be — and that is the whole difficulty. The defects that survive in a good operation are not the ones a careful person catches; they are the ones the system itself produces, regardless of how careful anyone is. They persist precisely because effort has already been maximised. This is why telling a good team to try harder does nothing for the residual defect rate: the people are not the cause. Closing the gap from 95% to 99% meant accepting that the remaining failures were a property of the operating model and going after them as systems problems, not performance problems.

Separating the systemic few from the long tail

The first move was to segment the residual defects by type and root cause. Not all of the remaining five percent is alike. A small number of systemic failure modes recur across many campaigns; a long tail of one-offs happens once and never again. These demand opposite responses, and conflating them wastes effort. Chasing every one-off with a new control buries the operation in process while leaving the systemic causes — the ones actually generating most of the defects — untouched. Segmentation made the distinction visible, so remediation could concentrate on the handful of systemic modes that, once removed, would close most of the gap. This is the diagnosis the whole improvement rested on.
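As an illustration only, the segmentation described above can be sketched as a Pareto-style pass over a defect log. The record shape, the function name `systemic_modes`, and the thresholds `min_campaigns` and `coverage` are all assumptions made for this sketch; the engagement's actual data and cut-offs are not stated here.

```python
from collections import Counter


def systemic_modes(defects, min_campaigns=3, coverage=0.8):
    """Return root causes that recur across campaigns and cover most defects.

    defects: iterable of (campaign_id, root_cause) pairs (hypothetical shape).
    min_campaigns, coverage: illustrative thresholds, not engagement values.
    """
    defects = list(defects)
    counts = Counter(cause for _, cause in defects)

    # Track which distinct campaigns each root cause appeared in.
    campaigns = {}
    for campaign_id, cause in defects:
        campaigns.setdefault(cause, set()).add(campaign_id)

    total = len(defects)
    selected, covered = [], 0
    for cause, n in counts.most_common():
        # Stop once the selected causes explain the target share of defects.
        if covered / total >= coverage:
            break
        # A cause seen in only one or two campaigns is treated as a one-off.
        if len(campaigns[cause]) >= min_campaigns:
            selected.append(cause)
            covered += n
    return selected
```

Anything the function does not return is the long tail: logged and watched, but not given a new control of its own.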

Designing the defects out, not inspecting them out

With the systemic causes identified, I targeted them with redesigned controls and in-line checks, so the most common defects were designed out rather than inspected out. The distinction is decisive. Inspection catches a defect after it has been created and depends on someone catching it every time; design prevents the defect from being created at all. At the scale of two thousand campaigns, you cannot inspect your way to 99% — there is too much work and inspection is itself fallible. The durable gains came from changing the points in the process where the common defects originated, so the operation stopped producing them rather than getting better at catching them.

One quality measure across 450 clients

A score is only as useful as it is consistent. I standardised quality measurement across all 450 clients, so the number meant the same thing everywhere and could be trusted at the board level. Before standardisation, a quality figure is an average of incompatible definitions — impossible to compare across clients and easy to argue with. Afterward, the score becomes a single, comparable measure: you can see where defects concentrate, tell whether a change worked, and show a board or a client evidence rather than assurance. At this scale, a trustworthy measure is not administrative tidiness; it is the instrument that made the improvement steerable and provable in the first place.
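A minimal sketch of what a single measure means in practice: one pass/fail definition and one formula, applied identically to every client. The `Delivery` record and both function names are hypothetical, and the real standard's criteria are not given in this case study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delivery:
    client_id: str
    campaign_id: str
    defect_free: bool  # judged against the one agreed standard (assumed field)


def quality_score(deliveries):
    """Share of deliveries that met the standard, as a percentage."""
    deliveries = list(deliveries)
    if not deliveries:
        raise ValueError("no deliveries to score")
    passed = sum(d.defect_free for d in deliveries)
    return 100.0 * passed / len(deliveries)


def score_by_client(deliveries):
    """Apply the same formula per client so the figures are comparable."""
    grouped = {}
    for d in deliveries:
        grouped.setdefault(d.client_id, []).append(d)
    return {client: quality_score(items) for client, items in grouped.items()}
```

Because every client's figure comes from the same function, a gap between two clients reflects a difference in outcomes rather than a difference in definitions.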

Holding 99% with governance

Lifting a score is one task; keeping it lifted is a different and harder one. Quality regresses the moment attention moves on, so I established the governance cadence that held the improvement — catching drift before it became regression. The cadence put the standardised score in front of the people accountable for it on a regular rhythm, so a slip showed up as a small, early movement rather than a rediscovered crisis months later. Without that, every quality programme becomes an annual clean-up of the same problems. The cadence is what converted a one-time lift into a level the operation sustained, because the gain was watched and owned rather than assumed.
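One way to make "catching drift early" concrete is a lower control limit on the pass rate for each review period. The sketch below uses a standard p-chart limit as a stand-in; the case study does not say which method the cadence used, and the function name, the tuple shape, and the `target` and `sigmas` defaults are assumptions.

```python
import math


def drift_alerts(periods, target=0.99, sigmas=3.0):
    """Flag review periods whose pass rate falls below a p-chart lower limit.

    periods: list of (label, passed, total) tuples, one per review period.
    target, sigmas: illustrative defaults, not values from the engagement.
    """
    alerts = []
    for label, passed, total in periods:
        if total <= 0:
            continue  # nothing delivered in this period, nothing to judge
        rate = passed / total
        # Lower control limit for a proportion with sample size `total`.
        lower_limit = target - sigmas * math.sqrt(target * (1 - target) / total)
        if rate < lower_limit:
            alerts.append(label)
    return alerts
```

The limit widens for small periods and tightens for large ones, so ordinary variation is not escalated while a genuine slip shows up in the period it occurs.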

The transferable principle

The principle applies to any mature, high-volume operation trying to close its final points of quality. Past a certain level, defects are systemic, not human, and the response that works is not more effort but better design: segment the residual failures to separate the systemic few from the one-off tail, design out the common causes instead of inspecting for them, standardise the measure so it is trusted and comparable, and hold the result with a governance cadence that catches drift early. The numbers here are specific to one operation, but the method — treat the last points as a systems problem — transfers to any business where good is no longer good enough.

Questions

Common questions.

The sequence matters more than the calendar. The work begins with segmenting the residual defects to find the systemic causes, then redesigning the controls that produce them, then standardising the measure, then holding the result with a governance cadence. The diagnosis and redesign are where the gain is made; the cadence is what makes it last. A lift like this is a programme rather than a single sprint, and the sustaining phase is open-ended by nature, because a quality level is held continuously, not reached once.

Against an agreed standard, applied consistently across every campaign and client so the number means the same thing everywhere. The headline movement was from 95% to 99% across more than 2,000 campaigns and 450 clients. The standardisation is what makes the score trustworthy — before it, a quality figure is an average of incompatible local definitions and cannot be compared or defended. Afterward it is a single, board-credible measure you can track over time, which is the precondition for proving that a remediation programme actually worked rather than asserting that it did.

Because the residual defects at 95% are not the kind inspection reliably catches. They are produced by the system, and adding inspectors means catching more of them after the fact while depending on a fallible human step to catch them every time. At the scale of two thousand campaigns, you cannot inspect your way to 99% — the volume is too large and inspection itself is imperfect. The durable route is to design the common defects out at the point where they originate, so the operation stops producing them. Inspection has a role, but as the last line, not the strategy.

At scale, decisively yes. The intuition that four points is marginal collapses once you multiply it across the work. A five percent defect rate across more than 2,000 campaigns and 450 clients means thousands of imperfect deliveries a year, each a small risk to a client relationship; cutting that rate to one percent removes four in five of them. The value is not the abstract number but the thousands of better client outcomes a year it represents, and the reduced risk of the relationship-damaging failure that a residual defect rate eventually produces. The last points are the hardest precisely because they are the most valuable.

If you run a mature, high-volume operation that is already good and trying to get better, yes. The method is not specific to this sector: segment the residual defects to separate the systemic few from the one-off many, design out the systemic causes rather than inspecting for them, standardise the measure so it is trusted and comparable, and hold the gain with a governance cadence. Any business where quality is already high and the remaining failures resist effort is facing the same systems problem. The specific numbers here are one operation’s; the approach to the last, hardest points is general.

Through governance. Quality drifts the moment attention moves elsewhere, so the standardised score was reviewed on a regular cadence by the people accountable for it. That turned a slip into a small, early signal rather than a regression discovered months later, when it had already cost client outcomes. Equally important, the gains were built into redesigned controls rather than carried by extra effort, so holding the level did not depend on sustained heroics. The combination — defects designed out, plus a cadence that catches drift early — is what made 99% a level the operation sustained rather than a peak it touched once.

