Framework · Reference standard

The 3-Step Makegoods QA Framework

Ashish Kumar Agnihotri·Updated 11 March 2026·15 min read

This is a reference description of the 3-Step Makegoods QA Framework — a method for staging quality assurance across a delivery cycle so that delivery shortfalls are caught and corrected before they become makegoods liabilities. It is written to be applied, not admired.

Definition

The 3-Step Makegoods QA Framework is a staged verification method for high-volume delivery operations. Rather than inspecting finished output at a single end-of-line gate, it places three lightweight checks at the points in the delivery cycle where specific failure modes are both most likely to occur and still cheap to correct. Verification effort is weighted by financial exposure, and every defect caught is traced to its root cause so that recurring failures are removed at source.

It was developed inside a global advertising operation, where it protected more than USD 20 million in client billings, and subsequently became the standard QA approach for that function.

The governing principle

Quality is determined by where you place the check, not by how hard you look. A modest verification at the right point in the flow — while an error is still correctable — is worth more than an exhaustive inspection at the end, where the only remaining options are to absorb the cost or issue a makegood. Every design decision in the framework follows from this principle.

The three steps

Step 1 — Verify at the point of commitment

Purpose: catch mismatches between what is being promised and what can actually be delivered, before delivery begins.

What it verifies: that the commitment made to the client is internally consistent and deliverable as specified — that nothing has been sold that the delivery system cannot fulfil.

Why here: an error caught at commitment costs almost nothing to correct; the same error discovered after delivery is a makegood. This is the highest-leverage check in the framework.

Step 2 — Verify in flight

Purpose: catch drift between what was committed and what is actually being delivered, while the delivery is still in progress and can be adjusted.

What it verifies: that delivery is tracking against the commitment on the dimensions that, if they slip, produce a shortfall — measured against a defined, standardised specification so that two reviewers reach the same verdict.

Why here: in-flight is the last point at which a shortfall can be corrected rather than compensated. A check here converts a would-be makegood into a routine adjustment.

Step 3 — Verify before reconciliation

Purpose: confirm that what was delivered matches what will be invoiced, and catch any residual discrepancy before it reaches the client.

What it verifies: the final reconciliation between commitment, delivery, and billing — the last line of defence.

Why here: anything that reaches this step and fails is a defect that escaped the two upstream checks, which makes it a signal as much as a catch: a recurring Step 3 failure points to a gap in Step 1 or Step 2 that should be closed.

Each step is placed where a class of error is most likely to occur and still cheap to correct. The placement is the framework; the checking is just execution.

Risk-weighting: not all output is checked equally

The framework deliberately does not treat every unit of work identically. Verification depth is weighted by financial exposure — the campaigns, accounts, or transactions that carry the greatest liability if they slip receive the deepest scrutiny. This is what allows the framework to run at the true volume of the operation without becoming the bottleneck.

Equal checking is itself a choice: it spends the same scrutiny on trivial and on high-stakes work. Risk-weighting simply makes the allocation deliberate, directing finite verification capacity to where it protects the most value.
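One way to make that allocation explicit is a small exposure model that ranks work by liability and maps each item to a verification depth. The sketch below is illustrative only: the `WorkItem` type, the `exposure_usd` field, the depth labels, and the two dollar thresholds are all assumptions, not values defined by the framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkItem:
    name: str
    exposure_usd: float  # assumed metric: liability if this item slips


def assign_depth(items, deep_at=100_000, standard_at=10_000):
    """Rank items by exposure and attach a verification depth to each.

    The thresholds are hypothetical defaults; a real operation would
    set them from its own exposure model.
    """
    ranked = sorted(items, key=lambda item: item.exposure_usd, reverse=True)
    plan = []
    for item in ranked:
        if item.exposure_usd >= deep_at:
            depth = "deep"
        elif item.exposure_usd >= standard_at:
            depth = "standard"
        else:
            depth = "light"
        plan.append((item.name, depth))
    return plan


items = [
    WorkItem("small-renewal", 2_000),
    WorkItem("flagship-campaign", 500_000),
    WorkItem("mid-tier-account", 40_000),
]
plan = assign_depth(items)
```

The highest-exposure item appears first in `plan` and receives the deepest check; everything below the lower threshold receives only a light one, which is what keeps the checks off the critical path at volume.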

The feedback loop: catches must remove causes

A QA framework that only catches defects is a permanent tax. The 3-Step framework requires that every defect caught is traced to its cause in the delivery process, and that recurring causes are removed at source. Over time this makes the framework lighter, because the failure modes it was built to catch stop occurring.

This loop is what distinguishes a quality system from a quality net. A net catches the same fish forever; a system stops the fish from getting in.
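A minimal sketch of the catch-to-cause route, assuming each catch is logged as a `(step, root_cause)` pair. The log format, the cause labels, and the `min_occurrences` cut-off are hypothetical; the framework itself only requires that recurring causes be identified and removed.

```python
from collections import Counter


def recurring_causes(catches, min_occurrences=2):
    """Return (cause, count) pairs for causes seen at least
    `min_occurrences` times, most frequent first.

    `catches` is an iterable of (step, root_cause) pairs.
    """
    counts = Counter(cause for _step, cause in catches)
    return [
        (cause, n)
        for cause, n in counts.most_common()
        if n >= min_occurrences
    ]


# Hypothetical catch log for one review period.
catch_log = [
    (3, "rate card out of date"),
    (2, "capacity not reserved"),
    (3, "rate card out of date"),
    (1, "ambiguous specification"),
]
to_fix_at_source = recurring_causes(catch_log)
```

Here only the cause that appeared twice is returned, so it becomes the candidate for removal at source; one-off causes stay in the log until they recur.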

What you need to implement it

A standard precise enough to score. Each step requires a defined specification of "delivered as committed" against which a reviewer can reach a binary, repeatable verdict.
A map of the delivery cycle. You must know the real flow — including the points of commitment, active delivery, and reconciliation — to place the three checks correctly.
An exposure model. A way to rank work by financial liability, so verification can be weighted rather than uniform.
A route from catch to cause. A defined mechanism for tracing defects to their origin and a cadence for removing recurring causes.

Result

Applied as designed, the framework converts an invisible, recurring revenue leak into a controlled, measured process. It answers, with evidence, the question most delivery operations cannot: are we catching the failures that cost us the most? In its originating context it protected more than USD 20 million in client billings and became the function's standard approach. That is the truest measure of a framework's success: it stops being a project and becomes simply how the work is done.

A worked example

The framework is easiest to see in a generic delivery operation: something is sold against a specification, produced over a period, and invoiced against what was actually delivered. The mechanics differ by industry; the shape does not. Walk through one.

The operation sells a defined volume of output to a client — so much of something, to an agreed specification, by an agreed date. Today, quality assurance happens at the end: someone checks the finished delivery before the invoice goes out. By then the work is done. If it falls short, the only options left are to absorb the cost or issue a makegood. The leak is invisible until reconciliation, and by reconciliation it is already a liability. This is exactly the situation the framework is built for.

Step 1 — at the point of commitment. Before anything is produced, a check confirms the commitment is internally consistent and deliverable: that the volume sold can actually be produced in the window agreed, against the specification recorded, with the capacity available. Most end-of-line shortfalls are born here, in a promise the delivery system was never able to keep. Catching it now costs a conversation. Catching it at reconciliation costs a makegood. This is the highest-leverage check in the framework, which is why it comes first.
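The Step 1 check can be sketched as a simple deliverability test. This is an illustration under stated assumptions: capacity is modelled as a flat number of units per day, and the function and parameter names are hypothetical rather than part of the framework.

```python
def commitment_issues(committed_units, window_days, daily_capacity_units):
    """Return the reasons a commitment cannot be delivered as specified.

    An empty list means the commitment passes this (simplified) check.
    Capacity is assumed constant across the window.
    """
    issues = []
    if committed_units <= 0:
        issues.append("committed volume must be positive")
    if window_days <= 0:
        issues.append("delivery window must be at least one day")
    max_deliverable = max(window_days, 0) * daily_capacity_units
    if committed_units > max_deliverable:
        issues.append(
            f"committed {committed_units} units exceeds "
            f"deliverable maximum of {max_deliverable}"
        )
    return issues


feasible = commitment_issues(1000, window_days=10, daily_capacity_units=120)
oversold = commitment_issues(1500, window_days=10, daily_capacity_units=120)
```

In this example the first commitment passes, while the second asks for 1500 units against a maximum of 1200 and is flagged before any production starts.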

Step 2 — in flight. Production is under way. A check tracks actual delivery against the commitment on the dimensions that produce a shortfall if they slip — pace, volume, conformance to specification — measured against a defined standard so two reviewers reach the same verdict. If delivery is running behind the pace needed to hit the committed volume, that is visible now, while there is still time to add capacity or adjust. In flight is the last moment a shortfall can be corrected rather than compensated. A check here turns a would-be makegood into a routine adjustment.
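The pace check described above might look like the following sketch. It assumes delivery continues at the average pace observed so far (a straight-line projection), which is a simplification; the names are hypothetical.

```python
def projected_shortfall(committed_units, delivered_so_far,
                        days_elapsed, window_days):
    """Project end-of-window delivery at the current average pace.

    Returns the projected shortfall in units, or 0.0 when delivery is
    on track or ahead. Linear extrapolation is an assumption here.
    """
    if not 0 < days_elapsed <= window_days:
        raise ValueError("days_elapsed must be between 1 and window_days")
    pace = delivered_so_far / days_elapsed
    projected_total = pace * window_days
    return max(0.0, committed_units - projected_total)


behind = projected_shortfall(1000, delivered_so_far=300,
                             days_elapsed=4, window_days=10)
on_track = projected_shortfall(1000, delivered_so_far=500,
                               days_elapsed=5, window_days=10)
```

A non-zero result on day 4 is the point of the check: there are still six days in which to add capacity or adjust, rather than discovering the gap at reconciliation.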

Step 3 — before reconciliation. Delivery is complete and the invoice is about to go out. A final check reconciles what was committed, what was delivered, and what is about to be billed, and catches any residual discrepancy before it reaches the client. Crucially, anything caught here is also a signal: it escaped Steps 1 and 2, so a recurring Step 3 catch points to a gap upstream that should be closed. The last line of defence doubles as the smoke detector for the two checks before it.
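As a rough illustration of the three-way reconciliation, the sketch below compares committed, delivered, and billed volumes and names each discrepancy it finds. The discrepancy labels and the choice to compare unit counts only are assumptions; a real reconciliation would likely cover more dimensions.

```python
def reconcile(committed_units, delivered_units, billed_units):
    """Compare commitment, delivery, and billing.

    Returns a dict of discrepancies in units; empty when all three agree.
    """
    discrepancies = {}
    if delivered_units < committed_units:
        discrepancies["under_delivery"] = committed_units - delivered_units
    if billed_units > delivered_units:
        discrepancies["over_billing"] = billed_units - delivered_units
    billable = min(delivered_units, committed_units)
    if billed_units < billable:
        discrepancies["under_billing"] = billable - billed_units
    return discrepancies


clean = reconcile(1000, 1000, 1000)
short = reconcile(1000, 950, 1000)
```

In the second case the delivery is 50 units short and the invoice still bills the full commitment, so both problems surface before the invoice reaches the client.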

How to know it is working

The framework earns its keep by changing where and when failures are caught. Watch for these signals.

Catches move upstream over time. Early on, Step 3 catches the most, because the upstream checks are new and the process is still leaky. As the feedback loop does its work, the centre of gravity shifts: more is caught at Step 1 and Step 2, less reaches Step 3. A rising share of catches at the point of commitment is the strongest sign the framework is maturing.
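One way to watch this shift is to compute, for each review period, the share of catches made at each step. The sketch assumes catches are logged with the step number at which they were made; the sample data is invented for illustration.

```python
from collections import Counter


def catch_share_by_step(catch_steps):
    """Return the share of catches made at Step 1, 2 and 3.

    `catch_steps` holds one step number (1, 2 or 3) per defect caught.
    """
    if any(step not in (1, 2, 3) for step in catch_steps):
        raise ValueError("step numbers must be 1, 2 or 3")
    counts = Counter(catch_steps)
    total = len(catch_steps)
    if total == 0:
        return {1: 0.0, 2: 0.0, 3: 0.0}
    return {step: counts[step] / total for step in (1, 2, 3)}


# Hypothetical catch logs from an early and a later period.
early = catch_share_by_step([3, 3, 3, 2, 1])
later = catch_share_by_step([1, 1, 1, 2, 3])
```

Comparing `early[1]` with `later[1]` shows the Step 1 share rising while the Step 3 share falls, which is the pattern a maturing framework should produce.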

Makegoods fall. The outcome the framework exists to produce. Shortfalls that used to surface after invoicing — as absorbed cost or client-facing makegoods — are increasingly caught while they are still correctable. If makegoods are not falling, the checks are either misplaced or not feeding back.

Step 3 catches trend toward zero. Because Step 3 is the last line of defence, a recurring catch there is a defect that escaped two upstream checks. As those gaps are closed, Step 3 should go quiet — not because failures stopped being possible, but because they are being caught earlier. A noisy Step 3 that stays noisy means the loop is not closing.

The checks get lighter. As recurring causes are removed at source, the failure modes the framework was built to catch stop occurring, and the verification needed to guard against them shrinks. A framework that demands ever more inspection effort is catching defects but not removing them — a net, not a system.

Clients stop finding the failures first. The clearest external signal. When shortfalls are caught and corrected before the client sees them, the framework is doing its job. When the client is still the one raising the alarm, the checks are landing too late.

A maturing framework is a quietening one. When the deepest, earliest check catches the most and the last-line check catches almost nothing, the system is working as designed.

Common failure modes

The framework fails in recognisable ways, and most trace back to abandoning its governing principle — that placement, not effort, determines quality.

Three inspections at the end. The most common failure: treating the three steps as three end-of-line checks rather than three staged ones. If all three happen near reconciliation, the inspection cost has tripled and nothing has been gained, because every catch still arrives too late to do anything but compensate. The value is entirely in the placement; cluster the checks at the end and you have thrown it away.

Catching without feeding back. The checks run, defects are caught and corrected, and nobody traces them to cause. The same failures recur cycle after cycle, and the framework becomes a permanent tax — forever catching the same fish. Without the feedback loop it is a net, not a system, and it never gets lighter.

Uniform checking. Every unit verified to the same depth, regardless of exposure. This either makes the framework the bottleneck — too slow to run at the real volume of the operation — or forces the depth down everywhere until the high-stakes work is under-checked along with the trivial. Equal checking is itself a choice, and usually the wrong one; risk-weighting is what lets the framework run at true volume.

An unscoreable standard. The checks exist but "delivered as committed" is not defined precisely enough for two reviewers to agree. Verdicts swing with who is checking, and the catches become unreliable. Each step depends on a standard precise enough to score; without it the framework rests on opinion.

A wrong map of the flow. The checks are placed against an assumed delivery cycle rather than the real one, so they sit at the wrong points and miss the moments where errors actually occur. Mapping the true flow — the real points of commitment, active delivery, and reconciliation — is a precondition, not a detail.

Adapting the framework to your context

The three steps are defined by their role in the flow, not by any particular industry, which is what lets the framework travel.

Locate the three moments in your cycle. Every delivery operation has a point where a commitment is made, a period where it is fulfilled, and a point where it is reconciled and billed. The labels differ; the moments are always there. The work of adapting the framework is identifying those three points in your specific flow and placing a check at each. Get the points right and the framework fits; guess at them and the checks miss.

Tune the depth to your exposure model. What counts as high-liability work is domain-specific. Build the exposure model in your own terms — whatever ranks work by the financial consequence of its slipping — and weight verification depth accordingly. The principle is constant; the ranking is yours.

Match the standard to the work. Step 2 in particular needs a defined specification of "on track" that fits what you deliver. The criteria differ by domain; the requirement that they be scoreable — binary and repeatable — does not.

Scale the checks to volume. In a very high-volume operation the checks must be light and fast, leaning hard on risk-weighting to stay off the critical path. In a lower-volume, higher-value operation each check can be deeper. The framework accommodates both, because it specifies where to check, not how heavily — the weight is yours to set.

What never changes: three staged checks, placed where a class of error is most likely and still cheap to correct; depth weighted by exposure; and every catch fed back to remove its cause. Those are the invariants. The rest is fitted to the operation.

How to put this to work

Start by mapping the real delivery cycle — not the idealised one in a process document, the one that actually happens. Find the three moments: where the commitment to the client is made, where it is being fulfilled, and where it is reconciled against what will be billed. These are where the three checks go. Everything else in the framework depends on getting these three points right, so spend the time to get them right.

Next, write the standard each check applies — a definition of "delivered as committed" precise enough that two reviewers reach the same verdict. Without it the checks produce opinions, and opinions cannot be trusted, trended, or defended to a client. If you have not defined a scoreable standard, that is the prerequisite; build it first.
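A scoreable standard can be as small as a set of explicit tolerances and a function that returns a yes-or-no verdict. The sketch below is one hypothetical shape for it: the two criteria (volume ratio and lateness) and their default tolerances are assumptions chosen for illustration, not criteria prescribed by the framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Standard:
    min_volume_ratio: float = 0.98  # assumed tolerance: delivered / committed
    max_days_late: int = 0          # assumed tolerance: no late delivery


def delivered_as_committed(committed_units, delivered_units, days_late,
                           standard=Standard()):
    """Binary verdict: the same inputs always give the same answer."""
    if committed_units <= 0:
        raise ValueError("committed_units must be positive")
    volume_ok = delivered_units / committed_units >= standard.min_volume_ratio
    on_time = days_late <= standard.max_days_late
    return volume_ok and on_time


verdict_ok = delivered_as_committed(1000, 990, days_late=0)
verdict_short = delivered_as_committed(1000, 970, days_late=0)
verdict_late = delivered_as_committed(1000, 1000, days_late=1)
```

Because the tolerances live in one place and the verdict is a boolean, two reviewers applying it to the same delivery cannot disagree, and the results can be trended over time.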

Then build a simple exposure model — a way to rank work by the financial consequence of its slipping — and weight the depth of each check accordingly. The aim is not to check everything equally. It is to direct your finite verification capacity at the work that would cost the most if it failed, so the framework protects the most value without becoming the bottleneck.

Finally, close the loop, because this is what makes it a system rather than a net. Every catch gets traced to its cause in the delivery process, and recurring causes get removed at source. Watch the catches migrate upstream over time and the checks grow lighter as the failure modes they were built for stop occurring. That migration is the framework working exactly as intended.

Placed well, weighted by exposure, and fed back to source, three light checks do what an exhaustive end-of-line inspection cannot: they catch the failures that cost the most while those failures are still correctable. That is the whole proposition — and in the operation where it was built, it protected more than USD 20 million in billings and stopped being a project, becoming simply how the work was done.
