Usability Testing

A system that passes every functional test can still be so confusing that users give up. Usability testing is how you catch the gap between "it works" and "people can use it".

Senior · ISTQB CTAL-TA 3.3 · ~15 min read + exercise

1 The Hook

The NZ Ministry of Social Development launched a redesigned online application form for a benefit. Functional testing passed completely — every field saved correctly, every validation fired, every submission processed and reached the back-end system. The sign-off report had no open defects.

During the first week of live operation, completion rates were 31% lower than the previous form. Exit surveys told a consistent story: users were confused by the document upload step. The new design required selecting a document category before the upload button appeared. The button was hidden behind a dropdown that first-time users didn't recognise as a prerequisite step. UX had tested the design internally with team members who already knew how the form worked.

No usability test was run with actual benefit recipients before launch. MSD had to add supporting instructional text, reorganise the page layout, and ship a patch three weeks after go-live. The rework cost significantly more than a half-day think-aloud session with five users would have.

2 The Rule

Functional correctness and usability are separate properties. Test them separately. A page can be pixel-perfect and completely unusable.

3 The Analogy

Analogy — Use this in interviews

Nielsen's 10 heuristics are a building code for software.

A building can pass a structural inspection (the foundation is sound, the roof won't collapse) and still fail a usability inspection. The staircase is in the wrong place. The exits aren't signposted. The bathroom door opens inward into a tiny cubicle. The building is structurally sound, but people struggle to use it every day. Functional testing is the structural inspection. Heuristic evaluation is the usability inspection. You need both.

4 Watch Me Do It

Heuristic evaluation of a fictitious NZ government portal — MyService NZ. Walk through each of Nielsen's 10 heuristics and document findings.

| Heuristic | Finding on MyService NZ | Verdict |
| --- | --- | --- |
| 1. Visibility of system status | "Upload processing" spinner disappears before the confirmation message appears, so users don't know whether the upload succeeded or timed out. | FAIL |
| 2. Match system & real world | "Lodgement" used throughout instead of "Submit application". Correct legal term, but the general public will hesitate. | FAIL |
| 3. User control & freedom | No Cancel button once document upload starts. Users cannot abort a mis-selected file without refreshing the page and losing form data. | FAIL |
| 4. Consistency & standards | "Next" button on page 2 is positioned left of "Back", reversing the standard NZ government design system order. | FAIL |
| 5. Error prevention | Date field accepts 13 as a month value with no client-side validation. Only caught server-side after form submission. | FAIL |
| 6. Recognition over recall | Document category dropdown has no helper text. Users must remember category names from a previous page to make the correct selection. | FAIL |
| 7. Flexibility & efficiency | No way to save progress and return later. Users with slow internet or interruptions lose all data and must restart. | FAIL |
| 8. Aesthetic & minimalist | Page 1 has 4 separate instructional banners, each repeating similar messaging. Cognitive overload for first-time users. | FAIL |
| 9. Help users recognise errors | Validation error messages appear at the top of the page, not adjacent to the field that caused the error. Users must scroll up and then back down. | FAIL |
| 10. Help & documentation | Help link opens a new browser tab to a 47-page PDF manual. No inline contextual help for specific fields. | FAIL |

A heuristic evaluation like this takes 2–3 hours and produces a structured defect list before a single user session is run. Raise each finding as a defect with severity (cosmetic / minor / major / critical) and the specific heuristic violated.
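One way to keep such findings consistent is to record each one as a small structured object before it goes into the defect tracker. The sketch below is illustrative only: the `HeuristicFinding` and `Severity` names, the numeric severity values, and the severities assigned to the two sample findings are assumptions, not part of the MyService NZ evaluation.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Severity labels from the text; the numeric values are an assumption."""
    COSMETIC = 1
    MINOR = 2
    MAJOR = 3
    CRITICAL = 4


@dataclass(frozen=True)
class HeuristicFinding:
    heuristic: str      # which of Nielsen's heuristics is violated
    description: str    # what the evaluator observed
    severity: Severity  # evaluator's judgement, not derived automatically


def triage(findings):
    """Return findings ordered from most to least severe."""
    return sorted(findings, key=lambda f: f.severity, reverse=True)


# Hypothetical severities for two findings from the walkthrough.
findings = [
    HeuristicFinding("4. Consistency & standards",
                     '"Next" button positioned left of "Back"',
                     Severity.MINOR),
    HeuristicFinding("3. User control & freedom",
                     "No Cancel button once document upload starts",
                     Severity.MAJOR),
]

for f in triage(findings):
    print(f"[{f.severity.name}] {f.heuristic}: {f.description}")
```

Sorting by severity puts the findings most likely to block users at the top of the report, which is usually where stakeholders stop reading.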

SUS (System Usability Scale). Run the standard 10-question Likert survey with users after any usability session. Each question is answered on a 1–5 scale. For odd-numbered questions, subtract 1 from the response. For even-numbered questions, subtract the response from 5. Sum the ten adjusted values and multiply by 2.5 to get a score from 0 to 100. The commonly cited average is 68: a score above it is above average, and a score below it is a warning sign for poor adoption and elevated support load, not a guarantee of either. Present the raw score together with its percentile benchmark to stakeholders, not just "users found it confusing".
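The scoring rule is easy to get wrong by hand, so a minimal sketch may help. It assumes each respondent's answers arrive as a list of ten integers in question order; the function names `sus_score` and `mean_sus` are illustrative, not from any particular library.

```python
def sus_score(responses):
    """Return the SUS score (0-100) for one respondent's ten answers (each 1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for number, answer in enumerate(responses, start=1):
        if number % 2 == 1:
            total += answer - 1   # odd-numbered question: response minus 1
        else:
            total += 5 - answer   # even-numbered question: 5 minus response
    return total * 2.5


def mean_sus(all_responses):
    """Average the per-respondent SUS scores for a session or study."""
    if not all_responses:
        raise ValueError("need at least one respondent")
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)


# One respondent who answers 4 on odd questions and 2 on even questions.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # 75.0
```

Scoring each respondent first and then averaging keeps individual outliers visible, which is useful when only five users took part.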

Pro tip: Run a think-aloud session with five users before presenting heuristic findings. Ask users to speak their thoughts aloud while completing a task. Don't guide them. Five users typically uncover about 85% of usability problems (Nielsen's rule of thumb, which assumes each user surfaces roughly a third of the problems). Record the session with permission, then clip the moments of confusion to show stakeholders. A 30-second video of a user saying "I don't understand what I'm supposed to do here" is more persuasive than a written defect.
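The five-user figure comes from a simple discovery model usually attributed to Nielsen and Landauer: if each user independently surfaces a share p of the problems, n users together find 1 - (1 - p)^n of them. The sketch below assumes p = 0.31, the average rate Nielsen reports; the function name `proportion_found` is illustrative, and the real rate for your product and user group may be quite different.

```python
def proportion_found(users, per_user_rate=0.31):
    """Expected share of usability problems found by `users` test participants.

    Assumes every participant independently finds `per_user_rate` of the
    problems, which is a simplification of real sessions.
    """
    if users < 0:
        raise ValueError("users must be zero or more")
    if not 0 <= per_user_rate <= 1:
        raise ValueError("per_user_rate must be between 0 and 1")
    return 1 - (1 - per_user_rate) ** users


for n in range(1, 9):
    print(f"{n} users: {proportion_found(n):.0%}")
```

With the default rate, five users land at roughly 84%, which is where the familiar "85%" comes from. If users only surface a tenth of the problems each (for example on a large or very varied interface), five users find well under half, so treat five as a starting point, not a guarantee.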

5 When to Use It

Run usability testing when: a new feature has complex multi-step interaction flows; a public-facing government service is about to launch; drop-off or abandonment rates are unexpectedly high; accessibility testing has passed but users with disabilities or low digital literacy still struggle; or a redesign has replaced a familiar interface.

Heuristic evaluation (no users needed) is the right choice when: you need rapid feedback with no budget for sessions; you want to front-load defect discovery before user research; the product isn't stable enough to put in front of real users yet.

When you can skip it: internal admin tools used exclusively by trained power users who already know the domain. Even then, a quick heuristic pass is worth an hour of anyone's time before a major release.

6 Common Mistakes

🚫 Treating usability as UX's problem, not QA's

I used to think: if UX designed it, usability is their responsibility, not mine.
Actually: testers are uniquely positioned to run heuristic evaluations because we read interfaces critically and we know the business rules. UX owns the design. QA validates that the design actually works for users. These are different activities and both need to happen.

🚫 Thinking usability testing needs an expensive lab setup

I used to think: usability testing requires specialist UX researchers and dedicated lab facilities.
Actually: a think-aloud session with five users over Zoom typically uncovers about 85% of usability issues. You can run one in an afternoon. You need a task list, a Zoom call with recording permission, and enough discipline not to help users when they get stuck.

🚫 Dismissing SUS as a satisfaction survey

I used to think: SUS is just a quick satisfaction check with no real analytical value.
Actually: SUS is a validated psychometric scale, developed by John Brooke at DEC in 1986 and benchmarked across a large body of studies since. A score below 68 is below the commonly cited average and is a warning sign for poor adoption and high support overhead. Cite the benchmark when you report the score, not just the number.

7 Now You Try

🧪 Prompt Lab — RealMe Login Usability

Evaluate this user story against Nielsen's 10 heuristics: "As a first-time user of the RealMe login page, I need to create an account." List 3 potential usability issues and the heuristic each violates. For each issue, suggest a specific fix. Then ask the AI to identify any additional issues you may have missed.

<div class="teach-section" id="check">
          <h2><span class="section-num">8</span> Self-Check</h2>
          <p style="color:var(--ink-3);margin-bottom:var(--sp-4);">Click each question to reveal the answer.</p>

<div class="self-check" onclick="this.classList.add('revealed')">
            <p class="q">Q1. What is a cognitive walkthrough and how does it differ from a heuristic evaluation?</p>
            <p class="a">A cognitive walkthrough steps through a specific user task action by action, asking four questions at each step: Will the user try to achieve the right outcome? Will the user notice that the correct action is available? Will the user connect that action with the outcome they want? After the action, will the user understand the feedback and see that they are making progress? A heuristic evaluation assesses the overall interface against a fixed set of principles without following a specific task path. Cognitive walkthrough is narrower and task-focused; heuristic evaluation is broader and principle-based. Use both when you have time, or heuristic evaluation alone for a rapid pass.</p>
          </div>

<div class="self-check" onclick="this.classList.add('revealed')">
            <p class="q">Q2. A user completes a task but says "that was confusing." Does the usability test pass or fail?</p>
            <p class="a">It depends on what you're measuring. If the only metric is task completion ("did they finish the task"), it passes. But usability is not just task completion. Satisfaction, efficiency, and perceived difficulty are all usability dimensions. A user who finishes the task while saying "that was confusing" has given you a usability finding. Raise it as a severity-2 (minor) issue: the task completes but causes unnecessary cognitive load. Completion rate and SUS score together give you the full picture.</p>
          </div>

<div class="self-check" onclick="this.classList.add('revealed')">
            <p class="q">Q3. What SUS score would you flag to your Test Lead as requiring design remediation?</p>
            <p class="a">Below 68. This is the established benchmark for "below average" usability, validated across thousands of studies. A score of 51–67 is "OK" but marginal — flag it and document the specific issues driving it down. A score below 51 is "Poor" — recommend a design review before release. Always report the score with the percentile rank, not just the number: "Our SUS score of 58 is in the 25th percentile, below the industry average of 68."</p>
          </div>
        </div>

<div class="teach-section" id="istqb">
          <h2><span class="section-num">9</span> ISTQB Mapping</h2>
          <div class="istqb-box">
            <p style="margin:0 0 var(--sp-2);"><strong>CTAL-TA v3.1.2 — Section 3.3.3</strong>: Non-functional test design — usability. Covers heuristic evaluation as a structured technique for non-functional quality assessment, alongside performance and reliability testing.</p>
            <p style="margin:0;"><strong>CTFL v4.0, Section 4.4</strong>: Experience-based test techniques, applicable to heuristic evaluation as a form of exploratory and checklist-based testing. Also aligns with Section 2.2.2 (test types, where non-functional testing covers usability as an ISO 25010 product quality characteristic).</p>
          </div>
        </div>

<div class="teach-section" id="next">
          <h2><span class="section-num">10</span> Next Steps</h2>
          <div class="teach-nav">
            <a href="/specialised/usability-testing/" class="btn btn-teal">Usability Testing Specialised Track →</a>
            <a href="/senior/learning/accessibility-testing/" class="btn btn-ghost">Accessibility Testing</a>
            <a href="/senior/practice/" class="btn btn-ghost">Senior Practice Page</a>
          </div>
        </div>

</article>
    </div>
  </div>
</main>