Test Estimation & Planning · Lesson 3

Test Strategy & Planning

A test plan is not a forty-page template nobody reads. It is a short, sharp statement of what you will test, how, and how everyone will know it is safe to release. The plan that earns its place is the one a delivery manager actually opens before the go/no-go call.

Lead · Test Estimation & Planning · Lesson 3 of 3 · ~30 min read · ~70 min with exercises

1 The Hook

A test lead on a fictional NZ insurer’s policy-renewal release did what she had been taught: she wrote a thorough test plan. It was forty pages. It opened with the company mission, a glossary, a list of testing types copied from a standard, an org chart, and six pages on tools. The actual content — what would be tested for this release, and how the team would know it was done — was a thin paragraph buried on page 31.

Nobody read it. The delivery manager skimmed the first page and filed it. The developers never opened it. When go-live week arrived and someone asked “are we ready?”, there was no shared answer, because the document that should have defined “ready” had buried that definition where no one found it. The team argued about whether they could release while the clock ran down, each person working from a different idea of what “done” meant.

The release went ahead on a gut call. A known defect in the renewal-premium calculation — logged, but never escalated as a release blocker because there was no agreed rule for what blocks a release — went to production and over-charged a batch of customers. The plan had said nothing useful about exit criteria or what would stop a go-live, so nothing did.

Here is the lesson. A forty-page plan that nobody reads provides exactly as much protection as no plan at all — arguably less, because it creates the illusion of control. The job of a test plan is not to be comprehensive; it is to be read, agreed, and used to make the go/no-go decision. A one-page plan that defines scope, exit criteria, and what blocks a release, and that everyone has actually read, beats a forty-page template every time.

2 The Rule

A test plan exists to be read, agreed, and used — not to be comprehensive. Make it fit for purpose: short enough that the people who matter read it, sharp enough that it defines scope, exit criteria, and what blocks a release. A one-page plan everyone has read and agreed beats a forty-page template nobody opens, every single time.

3 The Analogy

A pre-flight checklist versus the full aircraft manual.

A pilot does not read the thousand-page aircraft manual before every flight. They run a one-page checklist: the specific things that must be true for this aircraft to take off safely, each one checked and confirmed out loud. It is short precisely so it gets done, every time, without skipping. The full manual exists, but it is a reference, not the thing you act from at the gate.

A fit-for-purpose test plan is the pre-flight checklist, not the manual. It is the short, sharp list of what must be true for this release to be safe to ship — scope covered, exit criteria met, no open blockers. The insurer’s forty-page plan was a manual handed over as if it were a checklist, so nobody ran it, and the aircraft took off with a known fault. The plan’s value is in being short enough to actually run before go-live.

4 Test Strategy versus Test Plan

Two words get used interchangeably and should not be. Knowing the difference stops you writing the wrong document.

A test strategy is the organisation’s standing approach to testing — how this bank or this agency tests, across all its releases. It covers the levels of testing it runs, its general approach to automation and environments, its standards and roles. It is written once and changes rarely. You usually inherit it; you do not write a new one per release.

A test plan is for one specific release or project. It says what this release will test, how, by when, and how everyone will know it is done. It is written fresh each time because the scope, risks, and timeline are different each time.

The forty-page failure usually comes from confusing the two: the lead copies the standing strategy — all the general, evergreen material about how the organisation tests — into what should have been a lean, release-specific plan. The strategy content is real, but it belongs in the strategy document, referenced by a link, not re-typed into every plan. The plan should contain only what is specific to this release.

Pro tip: Test for the difference with one question: “would this sentence be true for the next release too?” If yes, it is strategy — link to it, do not repeat it. If it is only true for this release (this scope, these risks, this date, these exit criteria), it belongs in the plan. That single test strips a forty-page plan down to the few pages that matter.

5 The Fit-for-Purpose Test Plan

A release test plan that people will read and use needs only a handful of sections. Here is the skeleton — aim for one to three pages, not forty.

RELEASE TEST PLAN: [release name, fixed go-live date]

 1. What this release is     One or two lines. The change and why it matters.
 2. Scope (in)               What WILL be tested, ranked by risk (from Lesson 2).
 3. Scope (out)              What will NOT be tested, and the residual risk of each.
 4. Approach                 Levels and types for THIS release; what is automated vs manual.
 5. Environments & data      Which environments and test data, and when they're needed.
 6. Entry criteria           What must be true before test execution starts.
 7. Exit criteria            What must be true to call testing done: the definition of "ready".
 8. Go/no-go inputs          What testing will hand the go/no-go decision, and what blocks a release.
 9. Risks & dependencies     The few real threats to the plan and who owns each.
10. Escalation               Who hears about a blocker, and how fast.

Notice what is absent: no mission statement, no copied list of testing types, no org chart, no six pages on tools. Everything evergreen is in the standing strategy, linked from the top. What remains is specific to this release and small enough to read in five minutes. The two sections that do the heavy lifting — and that the insurer’s plan lacked — are exit criteria and go/no-go inputs, which the next sections take in turn.

6 Entry and Exit Criteria

Entry and exit criteria are the conditions that bracket test execution, and they are where a vague plan becomes a useful one.

Entry criteria are what must be true before you start executing. They protect the testing window from being wasted: if you start before the build is stable or the environment is ready, you burn days re-doing work. Good entry criteria for an NZ release: the build is deployed to the test environment and passes a smoke test, the test data is loaded, the interfaces you depend on are available, and the scope is locked. If entry criteria are not met, that is itself a signal to escalate — the testing window is being eaten before it starts.

Exit criteria are what must be true to call testing done — the written definition of “ready” that the insurer’s team never had. They must be measurable, not vague. Compare:

Vague (useless): “Testing is complete when the system has been thoroughly tested and is working well.”
Measurable: “Testing is complete when: all score-9 and score-6 risk areas are fully executed; zero open Critical or High defects; any open Medium/Low defects are logged and accepted by the business; the agreed cut scope is recorded with residual risk; and the regression of the high-impact existing payment path has passed.”

The measurable version can be checked off in the go/no-go meeting by anyone, with no judgement call about what “thorough” means. That is the entire point: exit criteria turn “are we ready?” from an argument into a checklist. Agree them at the start of the release, with the business, so the rule is set before anyone is under pressure to bend it.
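One way to see what "measurable" buys you is to write the criteria as plain pass/fail checks. The sketch below is illustrative only: the `ReleaseMetrics` record and its field names are hypothetical, chosen to mirror the measurable example above, and are not taken from any particular test-management tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ReleaseMetrics:
    """Hypothetical snapshot of release test status (illustrative field names)."""
    high_risk_areas_total: int       # score-9 and score-6 areas in scope
    high_risk_areas_executed: int
    open_critical: int
    open_high: int
    open_medium_low: int
    medium_low_accepted: int         # open Medium/Low defects accepted by the business
    cut_scope_recorded: bool         # cut scope logged with its residual risk
    payment_regression_passed: bool


def evaluate_exit_criteria(m: ReleaseMetrics) -> dict[str, bool]:
    """Return each exit criterion with a plain pass/fail, no judgement call."""
    return {
        "All score-9 and score-6 areas executed":
            m.high_risk_areas_executed == m.high_risk_areas_total,
        "Zero open Critical defects": m.open_critical == 0,
        "Zero open High defects": m.open_high == 0,
        "Open Medium/Low defects accepted by the business":
            m.medium_low_accepted == m.open_medium_low,
        "Cut scope recorded with residual risk": m.cut_scope_recorded,
        "Payment-path regression passed": m.payment_regression_passed,
    }


def is_ready(m: ReleaseMetrics) -> bool:
    """Testing is 'done' only when every agreed criterion passes."""
    return all(evaluate_exit_criteria(m).values())


if __name__ == "__main__":
    # Made-up numbers for illustration: one High defect is still open.
    metrics = ReleaseMetrics(
        high_risk_areas_total=5, high_risk_areas_executed=5,
        open_critical=0, open_high=1,
        open_medium_low=4, medium_low_accepted=4,
        cut_scope_recorded=True, payment_regression_passed=True,
    )
    for criterion, passed in evaluate_exit_criteria(metrics).items():
        print(f"{'PASS' if passed else 'FAIL'}  {criterion}")
    print("Ready:", is_ready(metrics))
```

Notice that the vague criterion ("thoroughly tested and working well") cannot be written this way at all: there is no field to compare and no threshold to check, which is exactly why it fails in a go/no-go meeting.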

Pro tip: Define exit criteria before execution starts, never during go-live week. Criteria agreed under deadline pressure get written to be passable rather than safe. Agreed early, with the business in the room, they are an honest bar everyone signed up to — and they are far harder to quietly lower at 9pm the night before the date.

7 Stakeholder Escalation

A plan that only describes happy-path testing is incomplete. Things go wrong — a blocker appears, an environment falls over, a serious defect surfaces late — and the plan must say what happens then. That is escalation, and a lead who has not pre-agreed it ends up improvising under pressure.

Good escalation in a release plan answers three questions in advance:

  • What triggers an escalation? Name the events that cannot wait for the next stand-up — a Critical defect in a high-impact area, an entry criterion not met on day one, a test environment down for more than a set time, a dependency that has slipped. Define them so anyone can recognise one.
  • Who hears about it, and how fast? A Critical defect goes to the delivery manager and the relevant business owner the same day, not in a weekly report. A slipped dependency goes to whoever owns that dependency. Map trigger to audience so no one has to work out who to call.
  • What does the lead bring to the escalation? Not just “there’s a problem” but the facts and the options: what the defect is, what it puts at risk in the business’s terms, and the realistic choices (fix and re-test, accept the risk, move the date, cut scope). Escalating with options is the difference between raising an alarm and giving the business a decision it can make.

The throughline of this whole track shows up here: a lead escalates with the same clear-eyed framing used for estimation and prioritisation — the facts, the risk in the stakeholder’s language, and the options — so the business can decide rather than panic.
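The three questions above can be sketched as a small lookup: each pre-agreed trigger maps to an audience and a deadline, and an escalation is rejected unless it carries facts, risk, and options. The trigger names, roles, and hour limits here are assumptions for illustration (the lesson only says "same day" and "a set time"); a real plan would use the names and limits agreed for the release.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationRule:
    """One pre-agreed rule: who hears about a trigger, and how fast."""
    audience: tuple[str, ...]
    within_hours: int


# Hypothetical triggers, roles, and time limits; agree the real ones up front.
ESCALATION_RULES = {
    "critical_defect_high_impact": EscalationRule(
        ("delivery manager", "business owner"), within_hours=8),
    "entry_criterion_unmet_day_one": EscalationRule(
        ("delivery manager",), within_hours=8),
    "environment_down_over_limit": EscalationRule(
        ("delivery manager", "environment owner"), within_hours=4),
    "dependency_slipped": EscalationRule(
        ("dependency owner",), within_hours=24),
}


def build_escalation(trigger: str, facts: str, risk: str,
                     options: list[str]) -> str:
    """Compose an escalation message carrying facts, risk, and options."""
    if trigger not in ESCALATION_RULES:
        raise ValueError(f"No pre-agreed escalation rule for trigger: {trigger}")
    if not options:
        raise ValueError("An escalation must offer at least one option")
    rule = ESCALATION_RULES[trigger]
    lines = [
        f"To: {', '.join(rule.audience)} (within {rule.within_hours}h)",
        f"Facts: {facts}",
        f"Risk: {risk}",
        "Options:",
    ]
    # Letter the options (a), (b), (c) so the business can pick one by name.
    lines += [f"  ({chr(ord('a') + i)}) {opt}" for i, opt in enumerate(options)]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_escalation(
        "critical_defect_high_impact",
        facts="Renewal premium is calculated incorrectly for some policies.",
        risk="Affected customers would be over-charged at renewal.",
        options=["fix and re-test", "accept the risk", "move the date", "cut scope"],
    ))
```

The useful property is the refusal: an unknown trigger or an empty options list raises an error, which mirrors the rule that no one should have to work out who to call, and that an alarm without options is not yet an escalation.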

8 Go / No-Go Inputs

The go/no-go meeting is where the release is decided, and testing is one of its most important inputs. The lead’s job is not to make the go/no-go call — that is the business’s — but to give it an honest, structured input that lets the business decide well.

A strong testing input to a go/no-go is not “testing’s done, we’re good” or “I’m not comfortable.” It is a clear statement against the agreed exit criteria:

TESTING INPUT TO GO/NO-GO — [release], [date]

Exit criteria status:
  High-risk scope (score 9 & 6) executed      PASS
  Open Critical defects: 0                    PASS
  Open High defects: 1                        FAIL (see below)
  Payment-path regression passed              PASS
  Cut scope recorded & accepted               PASS

Open High defect: renewal-premium rounds up by 1c on ~2% of policies.
Risk if shipped: minor over-charge, correctable, low reputational impact.
Options: (a) fix & re-test — needs 1 day, misses the date; (b) ship with a
    known-issue note and a fix in the next release; (c) hold the date.
Testing recommendation: option (b) is defensible; the call is the business's.

This input does the lead’s whole job at once: it reports status against criteria everyone agreed, it surfaces the one failing item with its real-world risk, and it offers options rather than a verdict. The business makes the go/no-go — but it makes it on evidence, not on a gut feeling, and that is what the insurer’s release never had.
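As a rough sketch of the same structure, the input can be generated from the criteria so that a failing item can never appear without its risk and options. The `CriterionStatus` type and `render_go_no_go_input` function are hypothetical names invented for this example, and the closing line simply restates the lesson's rule that the call belongs to the business.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CriterionStatus:
    """One exit criterion and, if it failed, what the business needs to decide."""
    name: str
    passed: bool
    risk_if_shipped: str = ""                          # required when passed is False
    options: list[str] = field(default_factory=list)   # required when passed is False


def render_go_no_go_input(release: str, statuses: list[CriterionStatus]) -> str:
    """Render status against exit criteria; reject a bare FAIL with no risk or options."""
    if not statuses:
        raise ValueError("At least one exit criterion is required")
    lines = [f"TESTING INPUT TO GO/NO-GO: {release}", "", "Exit criteria status:"]
    width = max(len(s.name) for s in statuses)
    for s in statuses:
        lines.append(f"  {s.name.ljust(width)}  {'PASS' if s.passed else 'FAIL'}")
    for s in statuses:
        if s.passed:
            continue
        if not s.risk_if_shipped or not s.options:
            raise ValueError(f"Failing criterion '{s.name}' needs a risk and options")
        lines += ["", f"Failing item: {s.name}",
                  f"Risk if shipped: {s.risk_if_shipped}", "Options:"]
        lines += [f"  ({chr(ord('a') + i)}) {o}" for i, o in enumerate(s.options)]
    # Evidence and options only; no verdict is generated.
    lines += ["", "The go/no-go call is the business's."]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_go_no_go_input("policy-renewal release", [
        CriterionStatus("Open Critical defects: 0", True),
        CriterionStatus(
            "Open High defects: 1", False,
            risk_if_shipped="minor over-charge, correctable",
            options=["fix and re-test (misses the date)",
                     "ship with a known-issue note, fix next release"],
        ),
    ]))
```

The design choice worth copying, whatever tool you use, is that the report has no field for "I'm comfortable" or "we're good": the only things it can contain are status, risk, and options.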

Pro tip: The exit criteria you agreed at the start of the release become the go/no-go checklist at the end. That is why they are worth getting right and agreeing early: they are not bureaucracy, they are the spine of the release decision. A lead who sets clear exit criteria in week one has already written most of the go/no-go input before testing even begins.

9 Common Mistakes

🚫 Writing a forty-page plan nobody reads

Why it happens: Length feels like rigour, and templates encourage filling every section.
The fix: An unread plan protects no one and creates a false sense of control — the insurer trap. Strip out everything evergreen (link to the standing strategy instead) and keep only what is specific to this release. A one-to-three-page plan people actually read beats a long one they file.

🚫 Vague exit criteria like “thoroughly tested and working well”

Why it happens: Vague criteria are easy to write and avoid hard conversations up front.
The fix: “Thorough” cannot be checked, so “are we ready?” becomes an argument under pressure. Make exit criteria measurable — specific risk areas executed, zero open Critical/High, regression of the key path passed — so the go/no-go is a checklist, not a debate.

🚫 Agreeing exit criteria during go-live week

Why it happens: The release date drives everything and criteria feel like something to settle “when we get there.”
The fix: Criteria set under deadline pressure get written to be passable, not safe. Agree them at the start of the release, with the business, so the bar is honest and far harder to quietly lower the night before the date.

🚫 Giving a go/no-go verdict instead of structured evidence

Why it happens: “We’re good” or “I’m not comfortable” feels like a clear answer.
The fix: The go/no-go is the business’s call, not the lead’s. Give status against the agreed exit criteria, surface any failing item with its real-world risk, and offer options — so the business decides on evidence, not on your comfort level.

10 Now You Try

Three graded exercises: critique a plan, fix weak exit criteria, then build a release plan. Write your own answer first, then compare it to the model answer.

🔍 Exercise 1 of 3 — Critique This Test Plan

A lead produced the test plan described below for a fixed-date release. Identify 3 things that make it not fit for purpose and say what a fit-for-purpose plan would do instead.

The plan, for an IRD online-services release:
Thirty-eight pages. Pages 1–6: company background, glossary, and a copied list of every testing type that exists. Pages 7–20: a generic description of how IRD tests in general (levels, tools, roles) — identical to the last three releases’ plans. Pages 21–30: tooling. The release-specific content (what’s changing, what will be tested) is one paragraph on page 33. Exit criteria read: “testing is complete when the system is working as expected.” There is no section on what blocks a release or who gets escalated to. The delivery manager filed it after page 2.

List 3 problems and the fit-for-purpose fix for each:

Model answer
There are at least four real problems here; any three, well explained, earn full marks.

1. Evergreen strategy content copied into a release plan — pages 7-20 (how IRD tests in general) are identical to past releases, so they are strategy, not plan. Fix: link to the standing test strategy and delete the copy; keep only what is specific to this release. Apply the test "would this be true for the next release too?" — if yes, it's strategy.

2. Release-specific content buried / the plan is unread — the one thing that matters (what's changing, what will be tested) is a paragraph on page 33, and the delivery manager stopped at page 2. Fix: a one-to-three-page plan leading with scope (in and out, ranked by risk) so the people who matter actually read it.

3. Vague exit criteria — "working as expected" cannot be checked, so "are we ready?" will be an argument under pressure. Fix: measurable exit criteria — high-risk scope executed, zero open Critical/High, regression of the key path passed, cut scope recorded and accepted.

Bonus: no escalation or "what blocks a release" section — a blocker or late Critical defect would be improvised. Fix: define escalation triggers, who hears about them and how fast, and what testing hands the go/no-go.

The pattern: the plan was optimised for looking comprehensive rather than for being read and used, and it lacked the two sections that actually drive a release decision, namely measurable exit criteria and go/no-go inputs.

🔧 Exercise 2 of 3 — Fix the Exit Criteria

Rewrite the vague exit criteria below into measurable exit criteria that could be checked off in a go/no-go meeting with no judgement call. Use a fictional Te Whatu Ora e-prescribing release as the context, and include at least one criterion for the high-impact regression and one for the recorded cut scope.

Original (vague):
“Testing is complete when the system has been thoroughly tested, most defects are fixed, and the team is confident it’s ready to go live.”

Rewrite as measurable exit criteria:

Model answer
Exit criterion 1: All score-9 and score-6 risk areas (the prescribing-decision and dose-calculation logic, and the pharmacy-system integration) are fully executed, with results recorded.

Exit criterion 2: Zero open Critical defects and zero open High defects. (For an e-prescribing system, a Critical is anything affecting prescribing accuracy or patient safety.)

Exit criterion 3: Any open Medium/Low defects are logged, triaged, and explicitly accepted by the clinical business owner, with a workaround noted where one exists.

Exit criterion 4: Regression of the high-impact existing function the change touches — the existing medication-record and allergy-check path — has been executed and passed. (This is the core safety function a nearby change could break.)

Exit criterion 5: The agreed cut scope is recorded, with the residual risk of each cut named and accepted by the business in writing.

When and with whom: agreed at the START of the release (not go-live week), with the clinical business owner and delivery manager in the room, so the bar is set before anyone is under pressure to lower it.

What makes this strong: every criterion is checkable with no judgement about "thorough"; Critical/High is defined in patient-safety terms; the high-impact regression (allergy/medication-record path) is called out explicitly; the cut scope and its residual risk are captured and accepted; and the criteria are agreed early with the business. The original could not be checked off by anyone: "thoroughly", "most", and "confident" are all judgement calls that collapse under deadline pressure.

🏗️ Exercise 3 of 3 — Build a Fit-for-Purpose Test Plan

Build a one-page release test plan for a fictional bank term-deposit renewal release with a fixed go-live. Cover at least: what the release is, scope in (risk-ranked) and out (with residual risk), entry criteria, measurable exit criteria, go/no-go inputs, and escalation. Keep it sharp — this is a checklist, not a manual.

Model answer
1. What this release is: Automated term-deposit renewal — at maturity, deposits roll over at the current rate unless the customer opts out. Fixed go-live tied to a rate-change date.

2. Scope IN (risk-ranked): (9) renewal interest-rate selection and the renewed-amount calculation; (9) the maturity-trigger that decides which deposits renew on the date; (6) customer opt-out handling; (6) regression of the existing interest-posting path the renewal shares code with; (4) renewal confirmation notice content.

3. Scope OUT (with residual risk): manual early-withdrawal flow (unchanged, change cannot reach it — low residual risk); marketing content on the renewal page (cosmetic — low residual risk). Both recorded and accepted by the business.

4. Entry criteria: build deployed to the test environment and passing smoke test; rate-table test data loaded; the core-banking interface available; scope locked.

5. Exit criteria (measurable): all score-9 and score-6 areas executed; 0 open Critical or High defects; the existing interest-posting regression passed; open Medium/Low defects accepted by the business; cut scope recorded with residual risk.

6. Go/no-go inputs: testing reports status against the exit criteria as a pass/fail checklist; any failing criterion is surfaced with its risk in the bank's terms and options (fix & re-test / ship with known issue / hold date). A release is blocked by any open Critical or High defect, or any score-9 area not executed. The go/no-go call is the business's.

7. Escalation: trigger = a Critical defect in renewal calculation, an entry criterion unmet on day one, or the core-banking interface down >4 hours → delivery manager + term-deposit product owner, same day, brought with facts + risk + options.

What makes it strong: it fits a page, leads with scope ranked by risk, states what's out and the residual risk, has measurable exit criteria, names what blocks a release, and pre-agrees escalation — and it carries no mission statement, glossary, or copied testing-type list. Weak answers reproduce a template, leave exit criteria vague, or omit what blocks a release.

11 Self-Check

Q1: Why does a forty-page test plan often protect a release less than a one-page plan?

Because a plan only protects a release if it is read, agreed, and used — and a forty-page plan rarely is. Worse, it creates a false sense of control: everyone assumes the long document covers things, so no one checks. A one-page plan that defines scope, exit criteria, and what blocks a release, and that people actually read, drives the go/no-go decision the long one never does.

Q2: What is the difference between a test strategy and a test plan?

A test strategy is the organisation’s standing approach to testing across all releases — written once, changed rarely, usually inherited. A test plan is for one specific release: what it will test, how, by when, and how everyone knows it is done — written fresh each time. The forty-page failure comes from copying evergreen strategy content into what should be a lean, release-specific plan.

Q3: What makes exit criteria useful, and when should they be agreed?

They must be measurable — specific risk areas executed, zero open Critical/High, the key regression passed — so “are we ready?” becomes a checklist anyone can verify rather than an argument about what “thorough” means. Agree them at the start of the release, with the business, so the bar is honest and not quietly lowered under deadline pressure in go-live week.

Q4: What should a lead bring to an escalation, beyond “there’s a problem”?

The facts, the risk in the stakeholder’s own language, and the realistic options — fix and re-test, accept the risk, move the date, or cut scope. Escalating with options turns an alarm into a decision the business can make, and it is the same clear-eyed framing used for estimation and prioritisation throughout the track.

Q5: What is the lead’s role in the go/no-go decision?

Not to make the call — that is the business’s — but to give it an honest, structured input: status against the agreed exit criteria as a pass/fail checklist, any failing item surfaced with its real-world risk, and options rather than a verdict. The exit criteria agreed at the start of the release become the go/no-go checklist at the end.

12 Interview Prep

Real questions asked in NZ QA interviews for lead and test-manager roles. Read the model answers, then practise your own version.

“What goes into a test plan, and how long should it be?”

As short as it can be while still being read and used — usually one to three pages for a release. It should cover what this release is, scope in (ranked by risk) and out (with the residual risk of each cut), the approach for this release, entry and exit criteria, go/no-go inputs, key risks and dependencies, and escalation. What it should not contain is everything evergreen — the mission statement, the glossary, the general description of how the organisation tests. That lives in the standing test strategy and gets linked, not re-typed. My test for each line is “would this be true for the next release too?” If yes, it’s strategy; if it’s only true for this release, it belongs in the plan.

“How do you write exit criteria that actually mean something?”

I make them measurable, so they can be checked off in a go/no-go meeting with no judgement call. Not “thoroughly tested and working well” — that’s an argument waiting to happen — but specific things: all the high-risk areas executed, zero open Critical or High defects, the regression of the high-impact existing path passed, open lower-severity defects logged and accepted by the business, and the cut scope recorded with its residual risk. And I agree them at the start of the release with the business in the room, not in go-live week, because criteria written under deadline pressure get set to be passable rather than safe.

“It’s the go/no-go meeting and there’s one open High defect. What do you do?”

I give the business what it needs to decide, not a verdict. I report status against the exit criteria as a checklist, so everyone can see this is the one item failing. Then I surface that defect properly: what it is and what it puts at risk in the business’s terms (say, a minor over-charge that’s correctable versus a safety issue that isn’t). Then I lay out the realistic options: fix and re-test and miss the date, ship with a known-issue note and fix next release, or hold the date. I’ll give a testing recommendation if asked, but the go/no-go is the business’s call. The point is that they decide on evidence against criteria we agreed up front, not on anyone’s gut feeling at the table.