Requirements & Acceptance-Criteria Review
The cheapest defect to fix is the one that never gets built. Most testers wait for working software and then test it — but the highest-leverage testing happens before a single line of code exists, by finding the holes in the requirement itself.
1 The Hook
A fictional NZ council, Harbourview District, commissioned an online rates-payment portal. One user story read, in full: “As a ratepayer, I want to pay my rates online so that I don’t have to visit the council office.” Acceptance criteria: “User can pay their rates. Payment is fast and secure.” The team estimated it, built it, and demoed a clean credit-card flow. Everyone was happy.
Then it met the real world. What about a ratepayer paying part of their rates — is partial payment allowed, and how is the remaining balance shown? What about someone who owns three properties on one account? What happens when the payment gateway times out but the bank actually took the money — do they get charged twice? Can someone pay more than they owe, and if so, is that a credit or an error? None of these had answers, because the requirement never asked the questions. Each one became a defect, a production incident, or an angry call to the contact centre — and each was found at the most expensive possible moment, after release.
Here is the uncomfortable truth: every one of those defects was sitting in plain sight in that two-line story before anyone wrote code. “User can pay their rates” is not a requirement — it’s a wish. “Fast and secure” is not testable — how fast, secure against what? A tester who read that story properly would have surfaced a dozen questions in ten minutes, and answering them in a meeting costs minutes. Answering them via a production incident costs days and trust.
That ten-minute review is one of the most valuable things a senior tester does. This lesson teaches you how to do it: how to read a requirement like a tester, name the defects in it, and turn a wish into something you could actually verify.
2 The Rule
A requirement you can’t test is a defect, not a requirement. Reviewing requirements is testing — it’s static testing, and it’s the cheapest testing there is, because a defect caught in a sentence costs a question, while the same defect caught in production costs an incident. If you can’t write a test from it, the requirement isn’t done.
3 The Analogy
A builder reading a blueprint with a missing dimension.
A good builder reads the plans before pouring concrete. If the blueprint says “put a window here” but never gives a size, the builder doesn’t guess and pour — they ring the architect and ask, because fixing a wrong window on paper costs a phone call, and fixing it in a finished wall costs a demolition. The plans are the requirement; the concrete is the code.
A tester reviewing requirements is the builder who reads the plans first. “Pay rates online” with no answer for partial payments or double-charges is a window with no dimension. The whole skill is catching the missing dimension while it’s still just a line on paper — because once it’s set in code, every gap you didn’t question becomes something you have to demolish.
4 The Five Requirement Defects
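Requirements fail in recognisable ways. Learn the five and you can scan any story and name what’s wrong with it:
- Ambiguous: more than one reasonable reading.
- Missing / incomplete: a case isn’t addressed.
- Contradictory: two statements can’t both be true.
- Untestable: there is no objective way to verify it.
- Unstated assumption: something is taken for granted that you can’t verify.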
5 What Makes a Requirement Testable
A testable requirement is one you can turn into a pass/fail check without inventing the answer yourself. The properties to look for, which unpack the “Testable” in the classic INVEST checklist, are:
- Specific: one clear behaviour, not a bundle. “Lock the account after 5 failed logins within 15 minutes” — not “handle bad logins sensibly.”
- Measurable: any quality is quantified. Not “fast,” but “the dashboard renders within 2 seconds on a 4G connection for a dataset of up to 500 rows.”
- Unambiguous: only one reasonable reading. Name the actor, the trigger, the data, and the boundary explicitly.
- Bounded: the edges are defined. What’s the minimum, the maximum, the empty case, the invalid input?
- Complete for its scope: it covers the unhappy paths it implies — what happens on failure, on timeout, on bad data.
The most useful words to delete from any requirement are weasel words: fast, secure, intuitive, user-friendly, robust, seamless, appropriate, and the like. Each one hides an untested decision. Every time you see one, your job is to replace it with a number or a named, checkable condition, or to ask the question that gets you one.
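The weasel-word check is mechanical enough to sketch in a few lines of Python. This is a minimal illustration, not a finished tool: the function name `find_weasel_words` is hypothetical, and the word list is only the starter set named above, so a real team would extend it.

```python
import re

# Starter list taken from the weasel words named in this lesson; extend as needed.
WEASEL_WORDS = (
    "fast", "secure", "intuitive", "user-friendly",
    "robust", "seamless", "appropriate",
)


def find_weasel_words(requirement: str) -> list[str]:
    """Return the weasel words present in a requirement, in WEASEL_WORDS order."""
    text = requirement.lower()
    found = []
    for word in WEASEL_WORDS:
        # Word boundaries stop "fast" from matching inside "breakfast".
        if re.search(rf"\b{re.escape(word)}\b", text):
            found.append(word)
    return found


print(find_weasel_words("Payment is fast and secure."))  # → ['fast', 'secure']
```

A hit is a prompt for a question, not a verdict: the scan tells you where a number or a named condition is missing, and the conversation supplies it.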
6 Good Acceptance Criteria
Acceptance criteria are the bridge from a story to a test: the specific, checkable conditions that must be true for the story to be “done.” A clean and widely used format is Given / When / Then (the same Gherkin structure you’ll use in BDD): the Given sets the starting state, the When is the action, and the Then is the observable, verifiable outcome.
Notice what that does that “user can pay their rates” never could: it names the starting state, a specific amount, and an exact, observable outcome — you could write an automated test straight from it. Good acceptance criteria also cover the negative and boundary cases explicitly, because those are where defects live:
That second one — the timeout-and-double-charge case — is exactly the gap that bit Harbourview District. Written as acceptance criteria before the build, it becomes a test the developer must satisfy. Left unwritten, it becomes a production incident.
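To show how a timeout-and-double-charge criterion can become an executable check, here is a minimal Python sketch. Everything in it is an assumption for illustration: `FakeGateway`, `pay_with_retry`, the idempotency key, and the $250.00 amount are hypothetical and do not represent any real payment API or the Harbourview portal's design.

```python
from dataclasses import dataclass, field


@dataclass
class FakeGateway:
    """Hypothetical stand-in for a payment gateway; not a real API."""
    charges: dict[str, int] = field(default_factory=dict)  # idempotency key -> cents
    fail_next_response: bool = False

    def charge(self, idempotency_key: str, amount_cents: int) -> str:
        # A repeated key must never take the money a second time.
        if idempotency_key not in self.charges:
            self.charges[idempotency_key] = amount_cents
        if self.fail_next_response:
            # Simulates the bank taking the money but the response being lost.
            self.fail_next_response = False
            raise TimeoutError("gateway timed out after recording the charge")
        return "confirmed"


def pay_with_retry(gateway: FakeGateway, key: str, amount_cents: int, attempts: int = 2) -> str:
    """Retry a timed-out payment, always reusing the same idempotency key."""
    for attempt in range(attempts):
        try:
            return gateway.charge(key, amount_cents)
        except TimeoutError:
            if attempt == attempts - 1:
                raise
    raise ValueError("attempts must be at least 1")


def test_timeout_then_retry_charges_once() -> None:
    # Given a gateway that takes the money but times out before responding
    gateway = FakeGateway(fail_next_response=True)
    # When a payment of $250.00 is retried with the same key
    result = pay_with_retry(gateway, key="rates-payment-0001", amount_cents=25_000)
    # Then the payment is confirmed and the ratepayer was charged exactly once
    assert result == "confirmed"
    assert sum(gateway.charges.values()) == 25_000


test_timeout_then_retry_charges_once()
```

The test's comments follow the Given / When / Then shape on purpose: once the criterion is written that precisely, the automated check is close to a transcription of it.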
7 Watch Me Review a Story
Here is a real-feeling story, reviewed the way a senior tester would — out loud, in minutes:
Reading it sentence by sentence, the defects fall out:
- Ambiguous / contradictory: “the link logs them in.” Does it log them in, or take them to a “set a new password” screen? Those are different features with different security profiles. A reset link that silently logs you in is a security smell.
- Missing — unhappy paths: What if the email isn’t a registered account? (For security you should respond identically either way — but that’s a decision someone must make.) What if the link is used twice? What if it’s expired?
- Missing — security boundaries: Does the link expire, and after how long? Is it single-use? Is there a rate limit on requests, or can an attacker spam someone’s inbox? None are stated — and for an auth flow these are the requirement.
- Untestable: “gets a reset link” — by what channel, in what timeframe? “Within 2 minutes by email” is testable; “gets a link” isn’t.
- Unstated assumption: assumes every account has one verified email; what about accounts with none, or a changed-but-unverified address?
That’s eight or nine real questions from two sentences, in about three minutes. Each is cheaper to answer now than to discover later — and several (link expiry, single-use, rate limiting) are the difference between a secure auth flow and a breach. The output of the review isn’t criticism; it’s a list of questions and a set of rewritten, testable acceptance criteria the team agrees on before estimating.
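Once the team has answered the expiry and single-use questions, the rules are small enough to sketch and test directly. The Python below is a hedged illustration only: the 30-minute lifetime, the `ResetLink` class, and the choice to show a set-new-password screen rather than log the user in are all assumptions standing in for decisions the team has to make.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed value for illustration; the real lifetime is a team decision.
LINK_LIFETIME = timedelta(minutes=30)


@dataclass
class ResetLink:
    """Hypothetical model of a password-reset link."""
    issued_at: datetime
    used: bool = False

    def redeem(self, now: datetime) -> str:
        if self.used:
            return "rejected: already used"
        # A link exactly at the lifetime is still accepted; that boundary is itself a decision.
        if now - self.issued_at > LINK_LIFETIME:
            return "rejected: expired"
        self.used = True
        return "show set-new-password screen"


issued = datetime(2024, 1, 1, 9, 0)
link = ResetLink(issued_at=issued)
assert link.redeem(issued + timedelta(minutes=10)) == "show set-new-password screen"
assert link.redeem(issued + timedelta(minutes=11)) == "rejected: already used"
assert ResetLink(issued_at=issued).redeem(issued + timedelta(minutes=31)) == "rejected: expired"
```

Each assertion corresponds to one question from the review, which is the point: a question answered in the meeting becomes a check that can fail.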
8 The Three Amigos
Requirement review works best as a conversation, not a gate. The Three Amigos is the practice of three perspectives reviewing a story together before it’s built:
- Business (PO / BA) — what the user needs and why; owns the intent.
- Development — how it could be built; surfaces technical constraints and effort.
- Testing — how we’ll know it works, and how it could fail; surfaces the edge cases, the unhappy paths, the ambiguities.
The tester’s role in that room is distinctive: you are the person whose job is to ask “what about when…” The developer is thinking about building it, the PO about the goal; you are the only one incentivised to hunt the holes. Your questions — “what about a part payment? a timeout? more than they owe?” — are what turn a vague story into a clear one before anyone commits to building the wrong thing. Done well, the Three Amigos converts the most expensive defects (misunderstood requirements) into the cheapest (a five-minute conversation), and it’s the natural home for the Given/When/Then criteria you write together.
9 Common Mistakes
🚫 Waiting for working software before you start testing
Why it happens: “Testing” feels like it needs something to click on.
The fix: Reviewing requirements is testing — static testing — and it’s the cheapest there is. The defects you find in a sentence cost a question; the same defects found in a build cost a rework cycle. Start at the story, not the demo.
🚫 Only checking the happy path
Why it happens: The happy path is what the story describes, so it feels like the whole job.
The fix: The story tells you the happy path; your value is the unhappy ones. For every requirement, ask the negative case, the boundary, and the “what if it fails halfway” — that’s where the defects hide.
🚫 Accepting weasel words
Why it happens: “Fast,” “secure,” and “user-friendly” sound like requirements and nobody wants to seem pedantic.
The fix: Each one hides an untested decision. Replace it with a number or a named, checkable condition — or ask the question that produces one. “Fast” becomes “within 2s on 4G”; “secure” becomes a specific control.
🚫 Treating a missing requirement as “not my problem”
Why it happens: If it’s not written down, it feels like it’s out of scope.
The fix: Missing requirements are the most dangerous kind precisely because there’s nothing to point at. Noticing the absence — the empty state, the error case, the boundary nobody mentioned — is the senior skill. Raise it as a question, not an assumption.
🚫 Turning the review into a critique of the author
Why it happens: Finding holes can sound like fault-finding.
The fix: The output of a review is a list of questions and improved acceptance criteria the team agrees on — not a scorecard. Frame everything as “what should happen when…” so it’s collaborative; you’re strengthening the story, not grading the PO.
10 Now You Try
Three graded exercises. Write your answer, then compare it to the model answer.
Read the user story below. Find at least 4 requirement defects and classify each as Ambiguous, Missing, Contradictory, Untestable, or Unstated assumption. Explain why each is a defect.
“As a shopper, I want to apply a discount code at checkout so I can save money. Acceptance criteria: The user enters a code and the discount is applied. Invalid codes show an error. Discounts should be applied quickly. The total updates automatically. Multiple codes can be stacked, and only one code can be used per order.”
List and classify the defects:
Model answer:
There are at least five defects; any four, well explained, earn full marks.
1. CONTRADICTORY: "Multiple codes can be stacked" directly contradicts "only one code can be used per order." Both can't be true; someone must decide.
2. UNTESTABLE: "Discounts should be applied quickly." No number, so it can never objectively pass or fail. Needs e.g. "the total updates within 1 second of applying a code."
3. MISSING (unhappy/boundary paths): Nothing about a code that's expired, a code below a minimum spend, a discount larger than the order total (does it go negative, or floor at $0?), or a code already used. The happy path is the only path described.
4. AMBIGUOUS: "Invalid codes show an error." Invalid how: doesn't exist, expired, not yet active, wrong region, already redeemed? Each is a different message and a different rule.
5. UNSTATED ASSUMPTION: Assumes a code applies to the whole order; what about codes that only apply to certain items or categories? Also assumes the discount is a percentage or fixed amount without saying which.
Marking: full marks identify FOUR distinct defects with the CORRECT type and a clear reason. The contradiction and the untestable "quickly" are the two most obvious; missing the contradiction is a common slip.
Rewrite the weak acceptance criteria below into testable Given / When / Then criteria. Include at least one happy path, one negative case, and one boundary case. Context: a fictional NZ bank’s “daily transfer limit” feature.
“Users have a daily transfer limit. They shouldn’t be able to transfer too much. The limit should work properly.”
Rewrite as Given / When / Then (happy + negative + boundary):
Model answer:
A good rewrite picks a concrete limit (e.g. $5,000/day) and writes observable outcomes:
Happy path:
  Given a customer with a daily transfer limit of $5,000 and $0 transferred today
  When they transfer $3,000
  Then the transfer succeeds and their remaining daily limit shows $2,000
Negative case:
  Given a customer with a $5,000 daily limit and $4,000 already transferred today
  When they attempt to transfer $2,000
  Then the transfer is rejected with a message stating they have $1,000 of their daily limit remaining
  And no funds are moved
Boundary case:
  Given a customer with a $5,000 daily limit and $0 transferred today
  When they transfer exactly $5,000
  Then the transfer succeeds and their remaining daily limit shows $0
  (Companion case: a transfer of $5,000.01 is rejected.)
What makes these testable: a specific limit, a specific starting state, and an observable, checkable outcome; you could automate each directly. Bonus marks for the at-exactly-the-limit boundary and the "and no funds are moved" assertion on the rejection. Weak answers keep "too much" or "properly" or give only a happy path.
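To back the claim that each criterion could be automated directly, here is a minimal Python sketch of the three cases. The function `attempt_transfer` and its return shape are hypothetical, the $5,000 limit is the example figure from the model answer, and rejecting zero or negative amounts is an added assumption the criteria do not state.

```python
from decimal import Decimal

DAILY_LIMIT = Decimal("5000.00")  # example limit from the model answer


def attempt_transfer(transferred_today: Decimal, amount: Decimal) -> tuple[bool, Decimal]:
    """Return (accepted, remaining daily limit after the attempt)."""
    remaining = DAILY_LIMIT - transferred_today
    # Assumption: zero or negative amounts are rejected; the criteria don't say.
    if amount <= 0 or amount > remaining:
        return False, remaining  # rejected: the remaining limit is unchanged
    return True, remaining - amount


# Happy path: $3,000 of a fresh $5,000 limit leaves $2,000.
assert attempt_transfer(Decimal("0"), Decimal("3000")) == (True, Decimal("2000"))
# Negative case: $2,000 with only $1,000 left is rejected and $1,000 remains.
assert attempt_transfer(Decimal("4000"), Decimal("2000")) == (False, Decimal("1000"))
# Boundary: exactly $5,000 succeeds; one cent more is rejected.
assert attempt_transfer(Decimal("0"), Decimal("5000")) == (True, Decimal("0"))
assert attempt_transfer(Decimal("0"), Decimal("5000.01")) == (False, Decimal("5000"))
```

`Decimal` is used rather than floats so that the one-cent boundary case compares exactly.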
You’re walking into a Three Amigos for the thin story below. Write the clarifying questions a tester should ask — aim for at least 6, covering boundaries, error/unhappy paths, non-functional requirements, and unstated assumptions. Context: a fictional NZ health provider.
Model answer:
A strong set spans several categories, not six versions of one question:
Boundaries / data:
1. How far ahead can a patient book: tomorrow only, or months out? Is there a minimum notice (e.g. can't book a slot in 5 minutes)?
2. Can a patient hold more than one appointment, or book multiple in one session?
Concurrency / race:
3. What happens if two patients pick the same slot at the same moment? Who gets it, and what does the other see?
Error / unhappy paths:
4. What if the booking fails after the slot is taken but before confirmation? Is the slot released or stuck?
5. Can a patient cancel or reschedule, and does that free the slot? (Implied but unstated.)
Non-functional / compliance:
6. How is the patient identified and authenticated? This is health data under the Privacy Act 2020 and the Health Information Privacy Code 2020 (HIPC). What confirmation channel (email/SMS), and what if it fails?
7. Accessibility: does the booking flow meet WCAG for patients using screen readers?
Unstated assumptions:
8. Does "available time" account for the doctor's leave, double-booking rules, or appointment length by type?
Marking: full marks give 6+ questions across at least three categories (boundary, error/concurrency, NFR/compliance, assumptions), not six rewordings of "what times are available". Bonus for the concurrency race and the Privacy Act/health-data angle, which are the senior-level catches.
11 Self-Check
Q1: Why is reviewing requirements considered the cheapest form of testing?
Because a defect caught in a sentence costs a question, while the same defect caught in a build costs a rework cycle, and in production costs an incident and lost trust. It’s static testing — finding defects before any code exists, when they’re cheapest to fix.
Q2: Name the five requirement defect types.
Ambiguous (more than one reading), Missing/Incomplete (a case isn’t addressed), Contradictory (two statements can’t both be true), Untestable (no objective way to verify), and Unstated assumption (something taken for granted that you can’t verify).
Q3: Which defect type is the most dangerous, and why?
Missing requirements — because there’s nothing on the page to argue with. You have to notice the absence (the empty state, the error path, the boundary nobody mentioned) rather than critique something written, which takes deliberate skill.
Q4: What’s wrong with “the report should load quickly,” and how do you fix it?
“Quickly” is a weasel word — untestable, because there’s no number to pass or fail against. Fix it by quantifying: “the report renders within 2 seconds on a 4G connection for up to 500 rows.” Replace every weasel word with a number or a named, checkable condition.
Q5: In a Three Amigos, what is the tester’s distinctive contribution?
To ask “what about when…” — the edge cases, unhappy paths, and ambiguities. The developer is focused on building it and the PO on the goal; the tester is the only one whose job is to hunt how it could fail, and to turn that into testable acceptance criteria before anyone commits to building.
12 Interview Prep
Real questions asked in NZ QA interviews for senior roles. Read the model answers, then practise your own version.
“A PO hands you a one-line user story and asks for an estimate. What do you do?”
I don’t estimate a one-liner — I review it first, because a vague story is a defect waiting to happen. I read it for the five requirement defects: is anything ambiguous, missing, contradictory, untestable, or an unstated assumption? For each sentence I ask “could this mean something else, what’s the unhappy version, and how would I prove it’s done?” That usually surfaces a handful of questions in a few minutes — partial cases, error paths, boundaries, NFRs. We answer those, turn them into testable Given/When/Then criteria, and then estimate. Estimating before that is estimating the wrong thing.
“How do you make a vague requirement like ‘the system should be fast and secure’ testable?”
Those are weasel words — each hides an untested decision — so I replace them with numbers or named conditions, or I ask the question that gets me one. “Fast” becomes something like “renders within 2 seconds on 4G for up to 500 rows.” “Secure” isn’t one requirement at all; I’d break it into specific controls — password reset links expire in 30 minutes and are single-use, login locks after 5 failed attempts, data in transit is encrypted — each of which I can actually test. If I can’t write a pass/fail check from it, it isn’t a requirement yet.
“Tell me about a time finding a defect early saved real cost.”
The pattern I’d describe: in a requirements review for a payment feature, the story covered the happy path but said nothing about a gateway timeout where the bank still takes the money. I raised it as a question in the Three Amigos — “what happens on a timeout, could we double-charge?” — and we wrote an acceptance criterion making the payment idempotent before a line was built. That one question turned what would have been a production incident, refunds, and angry customers into a five-minute conversation and one extra test. That’s the whole argument for shift-left: the same defect is trivially cheap in a sentence and very expensive in production.