Senior QA Interview Prep
30 model questions and answers for QA professionals at the 3–6 year mark. Covers risk strategy, technical depth, NZ-specific context, and the behavioural questions that separate senior candidates from mid-level ones.
1 Risk-Based Testing
Interviewers use these questions to gauge whether you can allocate limited test effort intelligently — the defining skill at senior level.
Q1. You have three days left before release and a backlog of 200 test cases you haven’t run. How do you prioritise?
Model answer
I start by scoring each area using a risk exposure matrix: likelihood of failure × business impact. Anything touching money movement, authentication, or data persistence goes to the top automatically because the blast radius of a miss is catastrophic. I then group the remaining cases by feature area, look at recent change history in git, and prioritise areas touched by recent commits, because changed code carries higher defect probability. From there I identify which cases give maximum coverage for minimum execution time: end-to-end happy paths and the three or four edge cases most likely to surface at the boundaries of new code. I communicate the triage explicitly to the PM in writing (“we are deliberately deferring X, Y, Z; here is the residual risk”) so the decision to ship is made consciously, not by default. That written acknowledgement protects the team and creates an audit trail. If a short extension is possible, even one day, I ask for it: the risk reduction per additional day is usually disproportionately high at this stage.
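The scoring and triage described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the `BacklogCase` fields, the one-point likelihood bump for recently changed code, and the sample backlog are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class BacklogCase:
    name: str
    likelihood: int            # 1 (unlikely to fail) .. 5 (very likely)
    impact: int                # 1 (cosmetic) .. 5 (money, auth, data loss)
    minutes: int               # estimated execution time
    recently_changed: bool = False

    @property
    def exposure(self) -> int:
        # Risk exposure = likelihood x impact. Recently changed code gets a
        # one-point likelihood bump (an assumed weighting), capped at 5.
        likelihood = min(5, self.likelihood + (1 if self.recently_changed else 0))
        return likelihood * self.impact


def triage(cases, budget_minutes):
    """Return (run, deferred): highest exposure first, within the time budget."""
    run, deferred, used = [], [], 0
    for case in sorted(cases, key=lambda c: (-c.exposure, c.minutes)):
        if used + case.minutes <= budget_minutes:
            run.append(case)
            used += case.minutes
        else:
            deferred.append(case)
    return run, deferred


backlog = [
    BacklogCase("refund to card", likelihood=3, impact=5, minutes=40, recently_changed=True),
    BacklogCase("login lockout", likelihood=2, impact=5, minutes=20),
    BacklogCase("FAQ page links", likelihood=2, impact=1, minutes=15),
    BacklogCase("bulk export", likelihood=4, impact=4, minutes=60),
]
run, deferred = triage(backlog, budget_minutes=120)
print("run:", [c.name for c in run])
print("deferred (residual risk):", [c.name for c in deferred])
```

The deferred list is the part that goes to the PM in writing: it names exactly what was not run and how each item scored.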
Q2. Explain risk-based testing to a non-technical stakeholder who has never heard the term.
Model answer
I usually use the airport security analogy. Airport security doesn’t search every single passenger with the same intensity — they use profiling, scanners, and risk signals to decide where to invest their attention. Testing works the same way. We have finite time and can’t test every combination of inputs, browsers, and data states. Risk-based testing means we deliberately focus our heaviest scrutiny on the parts of the system where a failure would hurt the most — for example, the checkout flow on a retail site or the payment gateway on a banking app. Lower-risk areas like a contact form or a FAQ page get lighter coverage. The practical upshot is that the team ships with confidence where it matters most, and we document the trade-offs we made so everyone understands what we chose not to test and why. It’s not about cutting corners — it’s about cutting smart.
Q3. How do you identify high-risk areas at the start of a project when you don’t yet know the codebase?
Model answer
I use five lenses. First, money and data: anything that moves funds, stores personal information, or generates financial records is automatically high-risk. Second, regulatory exposure: in NZ that means Privacy Act 2020, RBNZ prudential standards, and FMA guidance if financial services are involved — regulatory breaches carry fines and reputational damage far beyond a bug fix cost. Third, integration points: third-party APIs, payment gateways like Windcave or POLi, external data feeds — these fail in unpredictable ways because you don’t control them. Fourth, historical defect density: I ask the team where bugs have hurt them before — old pain points tend to recur. Fifth, complexity and change velocity: modules with high cyclomatic complexity or that were recently rewritten carry higher inherent risk. I synthesise these inputs into a simple risk register that the whole team can see and challenge — risk identification is a team sport, not a solo QA exercise.
Q4. What should a risk register contain, and how do you keep it from becoming shelfware?
Model answer
A risk register needs at minimum: the risk description, the likelihood score (1–5), the impact score (1–5), the exposure score (likelihood × impact), the owner, the mitigation action, and the current status. Without an owner and a status, a register is just a list of worries. To stop it becoming shelfware I do three things. First, I link it directly to sprint planning — the top five risks are a standing agenda item in sprint kickoffs, so prioritisation decisions are made against visible risk data. Second, I keep the format lightweight: a Confluence table or a Google Sheet works; a 12-tab Excel monster dies immediately. Third, I update it publicly when risks materialise or are retired — visible feedback loops teach the team that the register is a live decision tool, not a compliance artefact. At organisations I’ve worked in, the risk register was most effective when developers could also add items to it freely, not just QA.
Q5. How do you communicate residual risk to a PM who is pushing hard to ship on Friday?
Model answer
I frame it as a business decision, not a QA veto. I’ll say: “I want to share what we tested and what we didn’t, so you can make an informed call.” I then present the residual risk in business language — not “we haven’t run 47 test cases” but “we haven’t validated the bulk export under load, and if it fails in production it would affect every enterprise customer simultaneously and require a same-day hotfix.” I quantify where I can: how many users affected, what the likely support cost is, whether a workaround exists. I put this in writing before the ship/no-ship meeting so it’s documented regardless of the outcome. If the PM chooses to ship, that’s their call to make — my job is to make sure the risk is visible and understood, not to block the business. If they ask what one more day buys us, I give them a concrete answer: “Another day closes these three gaps and reduces that scenario from likely to unlikely.”
2 Technical Testing
Senior QAs are expected to test at the API and database layers, not just the UI. These questions expose whether your technical depth is real or surface-level.
Q6. Walk me through how you would test a REST API endpoint that returns a list of customer orders.
Model answer
I’d approach this in layers. First, happy path and schema validation: a valid authenticated request should return HTTP 200 with a JSON body matching the documented contract — I’d use Postman or Newman to assert on field names, types, and required fields. Second, authentication and authorisation: no token returns 401; a valid token for a different role returns 403 if that role shouldn’t see orders; a customer should never see another customer’s orders. Third, boundary inputs: what happens with no orders (empty array vs null vs 404?), pagination edge cases, very large result sets. Fourth, error handling: malformed query parameters, unsupported HTTP methods, content-type mismatch — the API should return meaningful 4xx codes with helpful error messages. Fifth, performance: under load, does the endpoint maintain acceptable response times when queried concurrently? I’d run a quick k6 or Artillery smoke test to establish a baseline. Finally, I’d check the contract against whatever API spec exists (OpenAPI/Swagger) so the tests can be regenerated if the spec changes.
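The schema and authorisation layers above can be expressed as a small checking function that a test harness calls on each response. The sketch below assumes a hypothetical contract (`id`, `customer_id`, `status`, `total_cents`); the real field list would come from the OpenAPI spec, and in practice the same assertions could live in a Postman test script.

```python
import json

# Hypothetical contract for one order object: field name -> expected type.
ORDER_CONTRACT = {"id": int, "customer_id": int, "status": str, "total_cents": int}


def contract_violations(status_code, body, expected_customer_id):
    """Return a list of human-readable problems; an empty list means pass."""
    problems = []
    if status_code != 200:
        problems.append(f"expected HTTP 200, got {status_code}")
    if not isinstance(body, list):
        # "No orders" should be an empty array, not null or an error object.
        problems.append("body is not a JSON array")
        return problems
    for i, order in enumerate(body):
        if not isinstance(order, dict):
            problems.append(f"order[{i}] is not an object")
            continue
        for field, expected_type in ORDER_CONTRACT.items():
            if field not in order:
                problems.append(f"order[{i}] is missing '{field}'")
            elif not isinstance(order[field], expected_type):
                problems.append(f"order[{i}].{field} should be {expected_type.__name__}")
        # Authorisation: a customer must only ever see their own orders.
        if order.get("customer_id") != expected_customer_id:
            problems.append(f"order[{i}] belongs to another customer")
    return problems


own_orders = json.loads('[{"id": 1, "customer_id": 7, "status": "paid", "total_cents": 4599}]')
leaky = json.loads('[{"id": 2, "customer_id": 8, "status": "paid", "total_cents": "45.99"}]')

print(contract_violations(200, own_orders, expected_customer_id=7))
print(contract_violations(200, [], expected_customer_id=7))
print(contract_violations(200, leaky, expected_customer_id=7))
```

Returning a list of problems rather than failing on the first one means a single run reports every contract breach in the response.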
Q7. Explain SQL JOINs and how you use them in a testing context.
Model answer
An INNER JOIN returns rows where both tables have a match; a LEFT JOIN returns all rows from the left table and matching rows from the right, with NULLs where there is no match; a RIGHT JOIN is the mirror image. In testing, I use JOINs constantly to verify that application actions produce the correct data state across multiple tables. For example, after a user places an order I’d run a query joining the orders, order_items, and customers tables to confirm the order was written with the correct customer ID, the correct line items, and the correct status. LEFT JOINs are especially useful for finding orphaned records — for instance, SELECT o.id FROM orders o LEFT JOIN customers c ON c.id = o.customer_id WHERE c.id IS NULL would surface orders with no valid customer, which is usually a data integrity bug. I also use JOINs in test data setup scripts to confirm the database is in a clean known state before test execution begins, which is essential for deterministic test results.
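To make the INNER versus LEFT JOIN distinction concrete, here is a small runnable sketch using Python's built-in `sqlite3` module with an in-memory database. The table layout and sample rows are invented for illustration; order 12 deliberately points at a customer that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT);
    INSERT INTO customers VALUES (1, 'Aroha'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 'PAID'), (11, 2, 'PENDING'), (12, 99, 'PAID');
""")

# INNER JOIN: only orders that have a matching customer row.
matched = conn.execute(
    "SELECT o.id, c.name FROM orders o "
    "INNER JOIN customers c ON c.id = o.customer_id ORDER BY o.id"
).fetchall()

# LEFT JOIN + IS NULL: orders whose customer_id points at nothing.
orphans = conn.execute(
    "SELECT o.id FROM orders o "
    "LEFT JOIN customers c ON c.id = o.customer_id WHERE c.id IS NULL"
).fetchall()

print(matched)  # [(10, 'Aroha'), (11, 'Ben')]
print(orphans)  # [(12,)]
```

The orphan query doubles as a pre-test sanity check: if it returns any rows before execution starts, the environment is not in a clean known state.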
Q8. How do you test for SQL injection vulnerabilities?
Model answer
SQL injection testing happens at several levels. At the exploratory level I manually enter classic payloads (' OR '1'='1, '; DROP TABLE users; --, and encoded variants such as URL-encoded apostrophes) into every input that interacts with the database, including search boxes, login forms, and URL parameters. I look for unexpected error messages (which often leak table names or query structure), unexpected data appearing in responses, or application crashes. For more thorough coverage I use OWASP ZAP or sqlmap in a controlled test environment, never against production. At the code review level I flag anywhere the application builds SQL strings by concatenating user input rather than using parameterised queries or an ORM. In NZ enterprise contexts SQL injection is a Privacy Act 2020 exposure: a successful injection attack that leaks customer data becomes a notifiable privacy breach if it is likely to cause serious harm, so I document these findings as critical-severity regardless of the exploitability assessment. I also verify that database accounts used by the application follow least-privilege principles, so even a successful injection has a limited blast radius.
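The difference between concatenated and parameterised queries is easy to demonstrate against a throwaway in-memory database. This sketch uses Python's `sqlite3`; the `users` table, the plain-text passwords, and both login functions are hypothetical and exist only to show the classic payload at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, password TEXT);
    INSERT INTO users VALUES (1, 'a@example.com', 'secret-a'),
                             (2, 'b@example.com', 'secret-b');
""")

payload = "' OR '1'='1"


def login_concatenated(email, password):
    # Vulnerable: user input is spliced straight into the SQL text.
    sql = f"SELECT id FROM users WHERE email = '{email}' AND password = '{password}'"
    return conn.execute(sql).fetchall()


def login_parameterised(email, password):
    # Safe: the driver binds the values, so the payload is treated as data.
    sql = "SELECT id FROM users WHERE email = ? AND password = ?"
    return conn.execute(sql, (email, password)).fetchall()


print(login_concatenated("a@example.com", payload))   # every user row comes back
print(login_parameterised("a@example.com", payload))  # no rows: login rejected
```

The concatenated version returns every user because the injected `OR '1'='1'` makes the WHERE clause true for all rows, which is exactly the "unexpected data in the response" signal to look for during exploratory testing.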
Q9. Describe your cross-browser testing strategy for a web application targeting NZ users.
Model answer
I start with real analytics data rather than assumptions. In NZ, Chrome on desktop and Chrome on Android are typically dominant, with Safari on iPhone a strong second due to high iOS adoption in the market. I tier browsers into must-pass (Chrome, Safari iOS, Firefox, Edge) and should-pass (Samsung Internet, older Safari on macOS). For automation I use Playwright, which natively supports Chromium, WebKit, and Firefox in the same test suite without separate driver management, so every automated test run covers three engines with little extra setup. Cross-browser issues I watch for specifically in NZ contexts: date picker rendering (browser-native date inputs display in the browser or OS locale, so a device left on US English shows MM/DD/YYYY where NZ users expect DD/MM/YYYY), system font rendering differences, and flexbox/grid quirks on older iOS WebKit versions, which are common in government and health-sector (formerly DHB) environments where device upgrade cycles are slow. I also include a mobile-first check, weighted by the mobile share in the product’s own analytics rather than a headline market figure. Real device testing matters for touch events and viewport behaviour, and BrowserStack gives me a device farm without maintaining physical devices.
Q10. How do you approach accessibility testing, and what standards apply in a NZ government context?
Model answer
NZ government agencies are required to meet the NZ Government Web Accessibility Standard 1.1, which requires WCAG 2.1 Level AA compliance with a small number of listed exceptions. In practice that means no information conveyed by colour alone, minimum 4.5:1 contrast ratio for normal text (3:1 for large text), all functionality operable by keyboard, form fields with descriptive labels, images with meaningful alt text, and pages that work with screen readers. My testing process uses three layers. Automated scanning with Axe (the Deque library, integrated into our CI pipeline) catches the 30–40% of issues detectable programmatically: missing alt text, invalid ARIA roles, low contrast. Manual keyboard testing catches focus traps, missing focus indicators, and illogical tab order that automated tools can’t reason about. Screen reader testing with NVDA on Windows or VoiceOver on macOS/iOS validates the actual user experience for people relying on assistive technology. I log accessibility bugs with severity mapped to WCAG success criteria. A Level A failure is P1 equivalent for a government client because it can constitute unlawful discrimination under the Human Rights Act 1993.
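The 4.5:1 and 3:1 thresholds come from the WCAG contrast ratio formula, which is simple enough to compute by hand when a scanner's result needs double-checking. The sketch below follows the WCAG 2.x definition of relative luminance; the function names are my own, not part of any library.

```python
def _linear(channel_8bit):
    # Convert an 8-bit sRGB channel to linear light, per the WCAG 2.x definition.
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour):
    hex_colour = hex_colour.lstrip("#")
    r, g, b = (int(hex_colour[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground, background):
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    # Level AA: 4.5:1 for normal text, 3:1 for large text.
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


print(round(contrast_ratio("#000000", "#FFFFFF"), 2))    # 21.0
print(passes_aa("#767676", "#FFFFFF"))                   # True
print(passes_aa("#777777", "#FFFFFF"))                   # False
print(passes_aa("#777777", "#FFFFFF", large_text=True))  # True
```

The two greys differ by a single step per channel, which is why borderline brand colours are worth computing rather than eyeballing.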
Q11. Describe your performance testing experience. What metrics matter and what tools have you used?
Model answer
Performance testing for me covers four scenarios: load testing (expected normal traffic), stress testing (traffic beyond capacity to find the breaking point), soak/endurance testing (sustained load over hours to find memory leaks), and spike testing (sudden traffic surges, especially relevant for NZ retail around Black Friday or election night for government platforms). The metrics I care about are response time at percentiles (p50, p95, p99, because averages lie), throughput (requests per second), error rate, and resource utilisation on the server side (CPU, memory, DB connection pool saturation). I’ve used k6 for scripted load tests because its test scripts are written in JavaScript, it integrates with CI, and its metrics can be fed into Grafana dashboards. For legacy projects I’ve used JMeter. I always define acceptance criteria before testing, such as “p95 response under 800ms at 200 concurrent users”, because “is it fast enough?” is unanswerable without a baseline. I also flag performance regressions between releases by running a baseline test suite in CI and alerting when p95 degrades by more than 20% compared to the previous release.
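A short sketch shows why percentiles are reported instead of averages, and how the 20% regression gate can be expressed. The sample timings are invented, and the nearest-rank percentile used here is one of several common definitions; load tools may interpolate and give slightly different numbers.

```python
import statistics


def percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it. pct is an integer from 1 to 100."""
    ordered = sorted(samples_ms)
    rank = max(1, (len(ordered) * pct + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


def p95_regressed(previous_p95, current_p95, tolerance=0.20):
    """True when p95 is more than `tolerance` worse than the previous release."""
    return current_p95 > previous_p95 * (1 + tolerance)


# 18 fast responses and two slow outliers (hypothetical timings in ms).
samples = [120] * 18 + [900, 2500]

print("mean:", statistics.mean(samples))  # 278: matches no real user's experience
print("p50:", percentile(samples, 50))    # 120
print("p95:", percentile(samples, 95))    # 900
print("p99:", percentile(samples, 99))    # 2500
print(p95_regressed(previous_p95=800, current_p95=1000))  # True
```

The mean of 278ms hides the fact that one request in ten took close to a second or more, which is exactly what p95 and p99 expose.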
3 Strategy & Planning
These questions test whether you can operate strategically, not just tactically. Interviewers want to see structure and judgement alongside technical skill.
Q12. How do you estimate testing effort for a new feature when requirements are still fuzzy?
Model answer
Fuzzy requirements are the norm, so I estimate in ranges rather than point estimates and I make assumptions explicit. My process: I decompose the feature into testable areas as best I can from what exists (wireframes, stories, verbal description), then assign a T-shirt size to each area based on complexity and risk. I convert T-shirt sizes to time using historical velocity from similar work: if a medium API integration historically takes two days to test properly, I start there. I then add risk buffers: 20% for a feature touching existing complex modules, 50% for anything involving third-party APIs or new infrastructure. I document the assumptions underpinning the estimate in writing: “This assumes the API contract is finalised before test execution begins; if it changes after we start, re-estimate.” That way scope creep hits the estimate visibly rather than silently degrading coverage. I also give the PM a confidence rating with the estimate (“medium confidence, ±40%”), which is honest and helps them make better planning decisions than a false-precision single number would.
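The arithmetic behind a ranged estimate is worth showing once. The sketch below applies a risk buffer to the historical baseline and then a confidence band around the result; the function name and the choice to apply the band after the buffer are assumptions for illustration.

```python
def estimate_range(base_days, risk_buffer, confidence_band):
    """Return (low, expected, high) in days, rounded to one decimal place.

    risk_buffer:     e.g. 0.20 for complex existing modules, 0.50 for third-party APIs
    confidence_band: e.g. 0.40 for "medium confidence, plus or minus 40%"
    """
    expected = base_days * (1 + risk_buffer)
    low = expected * (1 - confidence_band)
    high = expected * (1 + confidence_band)
    return round(low, 1), round(expected, 1), round(high, 1)


# A medium API integration: 2 days historically, third-party API involved.
print(estimate_range(base_days=2, risk_buffer=0.50, confidence_band=0.40))
# (1.8, 3.0, 4.2)
```

Quoting "three days, likely between two and four" gives the PM a planning range that a bare "three days" would hide.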
Q13. What belongs in a test strategy document, and who should write it?
Model answer
A test strategy is the “why and how we test” document for a project or system. It should cover: the objectives and scope of testing (what is in and explicitly out); the risk assessment and how risks drive prioritisation; the test levels and types (unit, integration, system, UAT, performance, security — and which the team owns vs which are outsourced); the entry and exit criteria for each phase; the tools and environments; roles and responsibilities; and the defect management process. It should not include specific test cases — those go in test plans or Jira. On who writes it: the lead QA should own the draft, but it should be a collaborative document reviewed by the dev lead, PM, and a business representative. A strategy written solely by QA in isolation often doesn’t reflect the business risk model or the team’s actual delivery constraints. I keep strategies as short as possible — two to four pages for most projects — because a 30-page document nobody reads is worse than a one-pager everyone knows.
Q14. A PM adds three new stories to the sprint on Wednesday. How do you handle scope creep and protect test coverage?
Model answer
I treat mid-sprint additions as a capacity and risk conversation, not a QA problem. My first move is to make the impact visible immediately: I calculate the testing hours the three new stories require and show that against the hours remaining in the sprint. Then I present three options: (1) accept the new stories and explicitly defer test coverage on existing stories, with the deferred items chosen together, (2) reduce scope of the new stories to something testable within remaining time, or (3) push the new stories to the next sprint. I resist the implicit option (4) that is never named but often assumed: add the stories and silently reduce test depth without telling anyone. That option is how defects reach production without a deliberate decision being made. I also raise this in the sprint review as a pattern if it recurs. Chronic mid-sprint scope additions are a process problem that affects dev capacity as well as QA, and surfacing the data across multiple sprints makes it a team conversation rather than a QA complaint.
Q15. How do you manage regression testing when the team is deploying to CI/CD multiple times per day?
Model answer
High-frequency deployment makes manual regression completely unviable — the only sustainable answer is a layered automated regression suite in the pipeline itself. I structure regression in tiers: a unit and integration test suite that runs on every commit (under two minutes, gives fast feedback to developers), a focused smoke test of critical paths that runs on every deploy to staging (five to ten minutes, catches show-stopping regressions before anyone tests manually), and a broader UI and API regression suite that runs on a schedule or on release candidates (twenty to forty minutes). The smoke and regression suites run in parallel where possible, and they block promotion to production if they fail. The key maintenance discipline is treating a flaky test as a P1 bug — a flaky test is worse than no test because it erodes trust in the pipeline and gets skipped. I also track regression coverage as a metric over time and prioritise adding automation for any area where a regression bug reached production, because that is evidence of a coverage gap.
Q16. Explain shift-left testing and how you’ve applied it in practice.
Model answer
Shift-left means moving testing activities earlier in the development cycle rather than waiting until code is complete. The earlier a defect is found, the cheaper it is to fix — a bug caught in a requirements review costs minutes to fix; the same bug found post-production can cost weeks. In practice I’ve applied shift-left in several ways. I attend requirements and design sessions to ask testability questions up front: “How will we know this is correct?” and “What happens in this edge case?” catch ambiguities before anyone writes code. I write acceptance criteria collaboratively with developers using Given-When-Then before development starts, which means the definition of done is unambiguous. I review pull requests for testability — flagging code that is hard to test in isolation encourages better separation of concerns. I also run brief three-amigo sessions (BA, dev, QA) on complex stories where alignment on behaviour prevents rework. The hardest part of shift-left is cultural: developers sometimes feel QA involvement earlier is intrusive. I frame it as reducing their rework burden rather than reviewing their work, which lands better.
4 Defect Analysis
Senior QAs are expected to think analytically about defects, not just report them. These questions test systemic thinking and root cause rigour.
Q17. Describe the most complex bug you’ve found. Walk me through how you identified and reported it.
Model answer
The most complex bug I found was a race condition in a payment processing service. The symptom was intermittent double-charging of customers — roughly one in 400 transactions would be charged twice, only visible in the bank statement, not in the application UI which showed a single charge. Reproducing it required simulating concurrent requests to the same endpoint with the same customer token, which I did using a multi-threaded Postman collection run with a shared idempotency key. The root cause was that the service checked for existing transactions before writing, but the check and the write were not atomic — two concurrent requests could both pass the check before either had written. The fix was a database-level unique constraint and an idempotency key pattern at the API layer. My defect report included: the exact reproduction steps with timing, the frequency rate from log analysis, the financial impact calculation (we found 23 affected transactions in three weeks), the root cause hypothesis (confirmed with the dev lead), and the recommended fix. I classified it P0 and flagged it to the engineering manager directly rather than waiting for the normal triage cycle, because financial double-charging in NZ has potential FMA implications.
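The fix described above, a database-level unique constraint combined with an idempotency key, can be sketched with `sqlite3`. This is an illustration of the pattern only: the `charges` table, the `charge` function, and the key format are hypothetical and are not taken from the service in the story.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE charges (
        id INTEGER PRIMARY KEY,
        idempotency_key TEXT NOT NULL UNIQUE,  -- the database enforces "at most once"
        customer_id INTEGER NOT NULL,
        amount_cents INTEGER NOT NULL
    )
""")


def charge(idempotency_key, customer_id, amount_cents):
    """Insert a charge. A replayed key returns the original row id instead of
    creating a second charge. Returns (charge_id, created)."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO charges (idempotency_key, customer_id, amount_cents) "
                "VALUES (?, ?, ?)",
                (idempotency_key, customer_id, amount_cents),
            )
        created = True
    except sqlite3.IntegrityError:
        # The unique constraint rejected the duplicate: no check-then-write gap.
        created = False
    row = conn.execute(
        "SELECT id FROM charges WHERE idempotency_key = ?", (idempotency_key,)
    ).fetchone()
    return row[0], created


first = charge("order-123", customer_id=7, amount_cents=4599)
replay = charge("order-123", customer_id=7, amount_cents=4599)
total = conn.execute("SELECT COUNT(*) FROM charges").fetchone()[0]
print(first, replay, total)  # (1, True) (1, False) 1
```

Because the insert itself is the check, two concurrent requests cannot both pass: the database accepts one and rejects the other, which removes the window the original check-then-write code left open.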
Q18. How do you distinguish between a root cause and a symptom when analysing a defect?
Model answer
I use the Five Whys technique as a starting point, asking “why did this happen?” until the answer is something that can be permanently fixed rather than patched. A symptom is what the user sees — “the page showed an error.” The first why might reveal “the API returned a 500.” The second why: “a null pointer exception in the order service.” The third why: “the discount code field was not validated for null before being passed to the calculation engine.” The fourth why: “there was no validation requirement written in the acceptance criteria.” The root cause is the missing validation requirement, not the null pointer, and the permanent fix is adding null checks plus updating the AC template to always include validation rules. Fixing only the null pointer (the symptom one level up) leaves the process gap open for the next field that gets missed. I also ask: “is this a one-off error or does this pattern repeat elsewhere in the codebase?” Root cause thinking drives a search for related instances, not just a single bug fix, and that search often surfaces multiple related defects in one go.
Q19. What are the most common failure modes you see in NZ enterprise software projects?
Model answer
From my experience in NZ enterprise, the recurring failure patterns are as follows. Timezone mishandling: the NZST/NZDT shift catches teams regularly because most server infrastructure runs on UTC and the daylight saving transition (forward on the last Sunday of September, back on the first Sunday of April) creates off-by-one-hour bugs in scheduled jobs, report generation, and timestamp comparisons. GST edge cases: the 15% GST rate applied with incorrect rounding, especially on multi-line invoices or split payments. Integration fragility with legacy systems: NZ enterprise often involves connecting modern APIs to legacy systems (COBOL-era banking cores, SAP instances with limited test environments), and the integration layer is usually where the most serious and hardest-to-reproduce bugs live. Data sovereignty assumptions: assuming cloud provider infrastructure is in NZ when it may not be, which creates Privacy Act exposure. Insufficient load testing for NZ-scale traffic: teams sometimes dismiss load testing because NZ user numbers look small next to overseas benchmarks, and so miss genuine bottlenecks that appear at much lower thresholds.
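The GST rounding problem on multi-line invoices is easiest to see with numbers. The sketch below uses `Decimal` to compare rounding GST on each line against rounding once on the invoice total; the three $1.10 lines are invented, and which method is correct depends on the documented business rule, so the test should assert whichever one the requirements specify.

```python
from decimal import Decimal, ROUND_HALF_UP

GST_RATE = Decimal("0.15")
CENT = Decimal("0.01")


def gst_per_line(line_amounts):
    # Round the GST on every line to the cent, then add the rounded values.
    return sum((a * GST_RATE).quantize(CENT, rounding=ROUND_HALF_UP) for a in line_amounts)


def gst_on_total(line_amounts):
    # Add the GST-exclusive lines first, then round the GST once.
    return (sum(line_amounts) * GST_RATE).quantize(CENT, rounding=ROUND_HALF_UP)


lines = [Decimal("1.10"), Decimal("1.10"), Decimal("1.10")]
print(gst_per_line(lines))  # 0.51
print(gst_on_total(lines))  # 0.50
```

A one-cent gap per invoice looks trivial until the PDF export and the ledger use different methods and every invoice fails reconciliation.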
Q20. A defect can’t be reproduced when you hand it over to the developer. How do you handle this?
Model answer
Non-reproducible defects are rarely truly non-reproducible — usually the reproduction environment or state differs in a way that matters. My first step is to verify I can still reproduce it in my own environment and document the exact steps, including data state, user account, browser/device, and time of day. If the developer can’t reproduce it, I look for environmental differences: are they using a different database seed? A different user role? Did a recent deployment change the code between when I found it and when they tested? I also check whether the bug is time-sensitive — race conditions and caching bugs are highly state-dependent and may require specific timing or concurrency to trigger. I always capture a screen recording and relevant network logs or console errors at the time I first find a bug, precisely because reproduction can be fragile — those artefacts become the evidence set. If after thorough investigation neither the developer nor I can reproduce it, I don’t close it — I move it to a “monitor” state with a note that it occurred once under documented conditions, and I add it to the next exploratory testing session as a target area. Intermittent bugs that close without root cause understanding tend to resurface.
5 NZ-Specific
NZ employers value QAs who know the local regulatory and cultural context, not just international testing theory. These questions separate candidates who’ve worked in NZ from those who haven’t.
Q21. How does the Privacy Act 2020 affect how you test a system that stores customer personal information?
Model answer
The Privacy Act 2020 replaced the 1993 Act and moved NZ closer to GDPR-style expectations, most visibly through mandatory breach notification. For testing it creates several concrete obligations. First, test data should not use real customer personal information unless absolutely necessary and properly authorised, so production data masking or synthetic data generation is the default for test environments. Second, I test the information privacy principles (IPPs) directly: IPP 3 (collection notice) means I verify that users are told what data is collected and why before it is collected; IPP 6 (access) means I verify that users can request and receive their own data; IPP 7 (correction) means the system must allow users to request correction of inaccurate information. Third, I test mandatory breach notification flows: the Act requires notification to the Privacy Commissioner and affected individuals for breaches likely to cause serious harm, so I verify that breach detection and notification workflows exist and work. Fourth, for any system handling health data I apply the Health Information Privacy Code 2020, which has additional requirements. I log Privacy Act gaps as P1 equivalent defects because failing to notify the Privacy Commissioner of a notifiable breach is an offence carrying a fine of up to $10,000, and reputational damage to NZ businesses from public breach notifications is substantial.
Q22. What are the NZ government accessibility standards and how do you verify compliance?
Model answer
The NZ Government Web Accessibility Standard 1.1 requires WCAG 2.1 Level AA, with a small number of listed exceptions, for the websites and web applications of mandated government organisations. The Standard is mandatory under a Cabinet mandate rather than being merely a best-practice recommendation, and an inaccessible service can also raise discrimination issues under the Human Rights Act 1993. Compliance verification involves three tiers. Automated testing with tools like Axe, WAVE, or Lighthouse catches programmatically detectable issues: missing alt text, insufficient colour contrast, missing form labels, invalid HTML structure. Manual testing covers keyboard navigation (tab order, focus visibility, no focus traps), skip links, logical heading hierarchy, and ensuring modals can be dismissed without a mouse. Assistive technology testing with NVDA on Windows, JAWS in enterprise environments, and VoiceOver on macOS and iOS validates that screen reader users can complete key tasks. I also run quick user tests with team members unfamiliar with the interface using keyboard-only navigation, which catches usability issues that automated tools miss. For government clients I produce a WCAG 2.1 AA conformance report documenting every success criterion and its pass/fail status, which gives the agency an audit trail for their own accessibility obligations under the DIA web standards.
Q23. Describe how NZST/NZDT daylight saving transitions cause bugs and how you test for them.
Model answer
New Zealand observes NZST (UTC+12) in winter and NZDT (UTC+13) in summer. The transition forward happens on the last Sunday of September at 2am (clocks jump to 3am), and the transition back happens on the first Sunday of April at 3am (clocks go back to 2am). The forward transition creates a missing hour and the backward transition creates a repeated hour, and both cause bugs. Common failure modes: scheduled jobs that fire at 2:30am on transition day either skip or double-fire; timestamps stored as local time become ambiguous during the repeated hour; date range queries that span midnight on transition day return wrong row counts; “send reminder 24 hours before” logic calculates 23 or 25 hours instead. My testing approach: I set the system clock (or the database now() function in a test environment) to the minute before the transition, then step through the transition checking scheduled jobs, log timestamps, and any time-sensitive business logic. I also test with Australia/Lord_Howe time if the system serves users across the Tasman, because Lord Howe Island has a 30-minute DST offset, which is genuinely unusual. The root fix is always storing timestamps as UTC and converting to local time only for display, and I flag any system not doing this as a structural risk.
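The three failure modes above (wrong reminder interval, repeated hour, missing hour) can be reproduced with Python's standard `zoneinfo` module, assuming the machine has the IANA time zone database available. The dates are the 2024 transitions: 7 April (back) and 29 September (forward). The appointment scenario is invented for illustration.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

NZ = ZoneInfo("Pacific/Auckland")
UTC = timezone.utc

# 1. "24 hours before" computed on the wall clock is really 23 hours
#    when the spring-forward transition falls in between.
appointment = datetime(2024, 9, 29, 9, 0, tzinfo=NZ)
wall_clock_reminder = appointment - timedelta(hours=24)  # 28 Sep 09:00 local
elapsed = appointment.astimezone(UTC) - wall_clock_reminder.astimezone(UTC)
print(elapsed)  # 23:00:00

# Doing the arithmetic in UTC gives a true 24-hour gap (28 Sep 08:00 local).
utc_reminder = (appointment.astimezone(UTC) - timedelta(hours=24)).astimezone(NZ)
print(utc_reminder.hour)  # 8

# 2. Repeated hour: 02:30 on 7 April 2024 happens twice, told apart by `fold`.
first_pass = datetime(2024, 4, 7, 2, 30, tzinfo=NZ)           # still NZDT
second_pass = datetime(2024, 4, 7, 2, 30, fold=1, tzinfo=NZ)  # now NZST
print(first_pass.utcoffset(), second_pass.utcoffset())  # 13:00:00 12:00:00

# 3. Missing hour: 02:30 on 29 September 2024 never exists on the wall clock.
#    A round trip through UTC lands on 03:30 instead.
ghost = datetime(2024, 9, 29, 2, 30, tzinfo=NZ)
print(ghost.astimezone(UTC).astimezone(NZ).hour)  # 3
```

Note that subtracting two datetimes that share the same `tzinfo` object compares wall-clock values, which is why the example converts to UTC before measuring elapsed time.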
Q24. What should a QA know about NZ banking regulations when testing a fintech application?
Model answer
NZ banking regulation sits across several agencies. The Reserve Bank of New Zealand (RBNZ) governs prudential standards for registered banks and now non-bank deposit takers under the Deposit Takers Act 2023. The Financial Markets Authority (FMA) regulates investment and financial advice services, including platforms offering KiwiSaver or managed funds. For QA this means any feature touching funds movement needs to be tested against AML/CFT (Anti-Money Laundering and Countering Financing of Terrorism) requirements: transaction monitoring, identity verification thresholds, and suspicious activity reporting flows all need explicit test coverage. Payment rail testing should cover both the high-value SWIFT-based rails and the retail bulk-payment rails that operate under Payments NZ rules. Settlement timing rules matter for some transaction types: same-day settlement vs T+2 for securities. I also pay close attention to disclosure obligations. The FMA requires clear disclosure of fees, returns, and risks, so I test that disclosure language matches the approved wording and that it appears at the legally required points in the user flow. Consumer credit contracts under the Credit Contracts and Consumer Finance Act 2003 (CCCFA) also have mandatory disclosure elements that must appear correctly in lending applications.
Q25. How do you approach testing UI that includes Te Reo Māori text, names, and macrons?
Model answer
Te Reo Māori support is increasingly a baseline expectation for NZ software, particularly in government, education, and health sectors. The technical focus areas are: macron rendering — macronated vowels (ā, ē, ī, ō, ū and their uppercase equivalents) must render correctly in all fonts used, including system fallback fonts, and must survive round-trips through the database and API (UTF-8 encoding throughout the stack, no Latin-1 or ISO-8859-1 anywhere). Name field validation must not reject macronated characters — I test rātapu (Sunday), names like Māia or Tūhoe, and Te Reo place names like Whanganui in every text input and database write path. Search and sort behaviour matters: does a search for “Maori” find results spelled “Māori”? Sorting Māori words alphabetically may require locale-aware collation. Copy accuracy: I verify that Te Reo text in the UI matches the approved source text exactly, including macrons, because incorrect spelling is culturally disrespectful and a reputational risk. I also check that screen readers pronounce macronated words correctly — some TTS engines mispronounce Māori words without macrons, so correct encoding improves the audio experience for vision-impaired Māori users.
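The search, sort, and encoding checks above can be sketched with Python's standard `unicodedata` module. The helper names and sample data are hypothetical, and stripping combining marks is only an approximation of locale-aware collation; a production system would more likely rely on the database's collation or an ICU library.

```python
import unicodedata


def fold_macrons(text):
    """Strip combining marks and case so 'Maori' and 'Māori' compare equal."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def search(query, records):
    needle = fold_macrons(query)
    return [r for r in records if needle in fold_macrons(r)]


def survives_round_trip(text, encoding):
    """True if the text can be encoded and decoded without loss."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeEncodeError:
        return False


records = ["Te Reo Māori", "Maori Hill", "Marae bookings"]
print(search("maori", records))  # ['Te Reo Māori', 'Maori Hill']

places = ["Taupō", "Ōtaki", "Napier"]
print(sorted(places))                    # ['Napier', 'Taupō', 'Ōtaki']
print(sorted(places, key=fold_macrons))  # ['Napier', 'Ōtaki', 'Taupō']

print(survives_round_trip("Māia", "utf-8"))    # True
print(survives_round_trip("Māia", "latin-1"))  # False
```

The default sort puts Ōtaki last because Ō has a higher code point than any unaccented letter, which is the kind of ordering bug users notice immediately in a dropdown of place names.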
6 Behavioural
Behavioural questions reveal how you operate under pressure and with people. Use the STAR format (Situation, Task, Action, Result) and be specific — vague answers signal shallow experience.
Q26. Tell me about a time you prevented a critical defect from reaching production.
Model answer
Situation: During a release at a Wellington SaaS company, we were two hours from deploying a major update to our subscription billing system. Task: I was the lead QA for the release and responsible for final sign-off. Action: During final regression I noticed that the upgrade path for customers on legacy annual plans was not included in the test matrix — it had been scoped out in a planning meeting I wasn’t present for. I ran a quick smoke test against a legacy plan account in staging and found that upgrading triggered a double-charge: one for the remaining balance on the old plan and one immediate charge for the new plan with no credit applied. I stopped the release, wrote up the defect with repro steps and the financial impact calculation (we had 340 customers on legacy annual plans), and escalated to the engineering lead and the CEO in the same message. Result: Release was delayed by three days. The fix was implemented and tested. We saved an estimated $47,000 in refunds and avoided what would have been a serious trust incident. The incident led us to create a mandatory legacy plan regression checklist for all billing changes going forward.
Q27. Describe a situation where you mentored a junior QA. What was your approach and what was the outcome?
Model answer
Situation: A junior QA joined our team with three months of testing experience from a bootcamp. They were enthusiastic but writing test cases that only covered happy paths and had no feel for when to escalate a defect versus fix it themselves. Task: I was assigned as their informal mentor alongside my own workload. Action: I ran weekly one-hour pairing sessions where we tested a feature together and I narrated my thinking out loud — why I was choosing a particular input, what I was looking for in the response, how I was deciding severity. I gave them progressively more complex features to own independently, with me reviewing their test plans before execution rather than after. When they logged a P3 bug that I thought was actually P1, I walked through the impact analysis with them rather than just changing the severity. I also encouraged them to attend sprint planning so they understood the business context of what they were testing. Result: Within four months they were independently testing API integrations, writing Postman collections, and accurately assessing severity without my review. They also started flagging requirements ambiguities in refinement sessions, which is a shift-left behaviour I hadn’t explicitly taught but that they picked up from the pairing sessions. They were promoted to mid-level QA the following year.
Q28. Tell me about a time you disagreed with a PM’s decision about a defect or release. What did you do?
Model answer
Situation: We had a known defect where certain PDF exports showed incorrect GST calculations for line items over $10,000, caused by a rounding error in the tax engine. The PM wanted to ship and fix it in the next sprint because the release had already been delayed twice and client pressure was high. Task: I disagreed with the call because the GST error was a financial accuracy issue, and in the NZ context an incorrect tax invoice could expose our client to IRD compliance risk. Action: I didn’t argue in the moment or in public. I wrote a one-page risk summary after the meeting: what the defect was, the specific scenarios where it triggered, how many clients would likely be affected, and the IRD compliance angle, specifically that incorrect GST on a tax invoice is a compliance problem under the Goods and Services Tax Act 1985 that the end client, not us, would have to remediate. I sent it to the PM and the account manager and asked for a 30-minute conversation. Result: After seeing the compliance framing written out, the PM agreed to a 48-hour targeted fix. The developer fixed the rounding logic in an afternoon and we shipped with it resolved. The PM later told me the compliance angle changed their calculation entirely: they had been thinking of it as a UX annoyance rather than a legal exposure. I now keep a “risk memo” template for exactly these situations.
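To make the rounding risk concrete, here is a minimal sketch of how a one-cent GST discrepancy can arise. The answer above does not say what caused the actual defect, so this is only one plausible mechanism: a tax engine using a different rounding mode from the one the business approved. The function name gst_on and the sample amounts are assumptions for illustration; the 15% rate is the NZ standard GST rate.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

GST_RATE = Decimal("0.15")  # NZ standard GST rate
CENT = Decimal("0.01")


def gst_on(exclusive_amount: str, rounding: str = ROUND_HALF_UP) -> Decimal:
    """GST on a GST-exclusive line item, rounded to the nearest cent.

    Assumes round-half-up is the approved rule; confirm this against the
    real specification before using it as a test oracle.
    """
    return (Decimal(exclusive_amount) * GST_RATE).quantize(CENT, rounding=rounding)


# A round amount hides the problem: every rounding mode agrees
print(gst_on("10000.00"))                            # 1500.00

# An amount whose GST lands exactly on half a cent exposes it
print(gst_on("10000.30"))                            # 1500.05
print(gst_on("10000.30", rounding=ROUND_HALF_EVEN))  # 1500.04
```

The testing lesson is to pick line-item values whose tax falls exactly on a half-cent boundary, because round amounts like $10,000.00 will pass under any rounding rule.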
Q29. Describe a time you had to negotiate scope or timeline to protect quality.
Model answer
Situation: We were building a data migration tool for a local government client moving five years of records into a new system. The project had slipped and testing time had been squeezed from three weeks to five days. Task: I was responsible for the testing plan and I needed to either protect enough time to do the job properly or escalate the risk formally. Action: I put together a testing scope matrix showing what five days would actually cover versus the three weeks originally scoped. Five days covered migration accuracy for the most common record types but left edge cases — records with NULL fields, duplicate record IDs from a 2018 data clean-up, and Māori language characters in address fields — untested. I attached specific historical data showing that similar migrations had found critical errors in exactly those categories. I proposed a middle path: twelve working days covering 90% of the original scope by descoping the least-risky record types. The trade-off was documented in a risk matrix. Result: The client approved twelve days. We found eleven critical migration errors and six data corruption issues during that extended window, including one that would have silently dropped 847 records with macronated place names. None of those errors would have been caught in five days. The client later cited the testing rigour as a project success factor in their post-implementation review.
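The edge-case categories in this answer (NULL fields, duplicate record IDs, macronated characters) lend themselves to an automated reconciliation check. The sketch below is illustrative only: the reconcile function, the record layout, and the sample data are all hypothetical, and a real migration would compare database extracts rather than in-memory lists.

```python
from collections import Counter


def reconcile(source: list[dict], target: list[dict], key: str = "id") -> dict:
    """Compare source and migrated records and report three kinds of problem."""
    source_counts = Counter(record[key] for record in source)
    target_by_id = {record[key]: record for record in target}

    # IDs that appear more than once in the source need manual review
    duplicates = sorted(i for i, n in source_counts.items() if n > 1)
    # IDs present in the source but absent from the target were dropped
    missing = sorted(i for i in source_counts if i not in target_by_id)
    # Unique IDs whose field values differ after migration were altered
    changed = sorted({
        record[key]
        for record in source
        if source_counts[record[key]] == 1
        and record[key] in target_by_id
        and record != target_by_id[record[key]]
    })
    return {"duplicates": duplicates, "missing": missing, "changed": changed}


# Hypothetical sample data covering each edge-case category
source = [
    {"id": 1, "address": "12 Main St, Taupō", "phone": None},
    {"id": 2, "address": "5 Beach Rd, Ōtaki", "phone": "04 555 0100"},
    {"id": 3, "address": "8 High St, Napier", "phone": None},
    {"id": 3, "address": "8 High Street, Napier", "phone": None},  # duplicate ID
    {"id": 4, "address": "1 Hill Rd, Whangārei", "phone": None},
]
target = [
    {"id": 1, "address": "12 Main St, Taup?", "phone": None},    # macron mangled
    {"id": 3, "address": "8 High St, Napier", "phone": None},
    {"id": 4, "address": "1 Hill Rd, Whangārei", "phone": ""},   # NULL became ""
]

report = reconcile(source, target)
print(report)  # {'duplicates': [3], 'missing': [2], 'changed': [1, 4]}
```

A silently dropped record shows up under missing, while encoding damage and NULL coercion both show up under changed, which is why a simple row-count comparison is not enough on its own.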
Q30. How do you stay current with QA tools, techniques, and the broader testing industry?
Model answer
I use a mix of structured and informal learning. For structured learning I hold an ISTQB Foundation certificate and I’m working toward the ISTQB Advanced Level Test Analyst (CTAL-TA) certification, which forces me to engage with testing theory at a level that day-to-day work doesn’t always reach. I read the Ministry of Testing blog, follow the Testival Community Slack, and follow a handful of practitioners on LinkedIn who write substantively about testing rather than just sharing motivational content. For tools I set aside time each quarter to run a spike on one new tool: I’ll spend a day or two on a personal or work project using a tool I haven’t used before (recently k6 for load testing, Playwright for its component testing capabilities, and Allure for test reporting). I also attend the NZ ISIG (Information Systems Interest Group) events and the Wellington Test meetup group when they run, because conversations with local practitioners surface what’s actually being used in NZ enterprise rather than what’s trending on international Twitter. Finally, I believe the best way to consolidate learning is to teach it: writing internal lunch-and-learn sessions on techniques I’ve learned forces me to understand them well enough to explain them to someone who hasn’t used them.