New · ISO/IEC 42119

Testing AI Systems — ISO/IEC 42119 in Practice

The first NZ-localised guide to testing AI systems against the international standard.

ISO/IEC TS 42119-2:2025 is the new technical specification for testing AI. It extends the ISO/IEC/IEEE 29119 software testing series into the AI domain, with a risk-based approach and AI-specific test types that traditional testing never covered — data representativeness, model drift, fairness, and explainability. This module shows you how to apply it on real NZ systems.


This section covers

Why AI Testing Is Different · Data Quality Testing · Model Testing · Bias & Fairness · Audit-Ready Artefacts · Applying 42119

Certification alignment

ISO/IEC TS 42119-2:2025

Aligned to ISO/IEC TS 42119-2:2025. This is not a certification programme — it is a practical guide to applying the standard. Because 42119 is a Technical Specification, frame your compliance claims as “aligned to” rather than “certified against”.

Who this is for

Senior testers, Test Leads, QA Architects, and anyone working on AI systems in a regulated NZ context. Assumes ISTQB Foundation Level or equivalent experience.

Standard structure

How ISO/IEC 42119 is organised

42119 is a multi-part Technical Specification. The lessons below map to the specific parts that NZ practitioners are most likely to encounter. Parts 7 & 8 (Red Teaming and GenAI quality) are covered in the AI Evaluation section.

Part 2

Overview & Risk-Based Testing

Extends ISO/IEC/IEEE 29119 into the AI lifecycle. Defines the risk-based test approach that drives scope. Lessons 1, 2, 4, 5.

Part 3

Verification & Validation Analysis

Simulation, formal verification methods, lifecycle evaluation, and documentation evidence requirements. Lessons 2, 3, 6, 7.

Parts 7 & 8

Red Teaming & GenAI Quality

Adversarial testing of AI systems and quality assessment of text-to-text (GenAI) outputs. Covered by the Prompt Injection and RAG Evaluation practical labs.

The 8 lessons

Aligned to ISO/IEC TS 42119-2:2025

Lesson 1

Why AI Testing Is Different

Deterministic versus probabilistic systems. The five AI-specific failure modes traditional testing misses. The AI lifecycle, how 42119 extends 29119, and the risk register that drives test scope.

~30 min read · ~70 min with exercises · Part 2

Lesson 2

Data Quality Testing

Representativeness, provenance, and label correctness testing. The data quality dimensions that map to AI failure modes. What a data test case looks like, and the audit-ready evidence 42119 requires.

~30 min read · ~70 min with exercises · Parts 2 & 3
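A representativeness check can be as simple as comparing category shares in the training sample with a reference population. The sketch below is illustrative only: the function names, the category labels, the reference shares, and the 0.05 tolerance are all made-up assumptions, not values taken from 42119.

```python
def representation_gaps(sample_labels, reference_shares):
    """Return observed share minus expected share for each reference category."""
    n = len(sample_labels)
    if n == 0:
        raise ValueError("sample is empty")
    gaps = {}
    for category, expected in reference_shares.items():
        observed = sample_labels.count(category) / n
        gaps[category] = observed - expected
    return gaps


def under_represented(gaps, tolerance):
    """List categories whose observed share falls short by more than the tolerance."""
    return sorted(c for c, gap in gaps.items() if gap < -tolerance)


# Hypothetical reference shares and sample; illustrative numbers, not real statistics.
reference_shares = {"urban": 0.6, "regional": 0.3, "rural": 0.1}
sample = ["urban"] * 80 + ["regional"] * 18 + ["rural"] * 2

gaps = representation_gaps(sample, reference_shares)
flagged = under_represented(gaps, tolerance=0.05)
print(flagged)
```

In a data test case, the reference shares and the tolerance would come from the risk register and be recorded as evidence alongside the result.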

Lesson 3

Model Testing

Model performance, adversarial, and explainability testing. Accuracy, precision, recall, and F1 in plain English. Drift testing and continuous validation — the model that fails silently after go-live.

~35 min read · ~80 min with exercises · Part 3
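As a quick reference, accuracy, precision, recall, and F1 can all be derived from the four confusion-matrix counts of a binary classifier. This is a minimal sketch with made-up labels; the function names are hypothetical and the zero-division fallbacks are a design choice of this example.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1; undefined ratios fall back to 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels for illustration: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these labels there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so all four metrics come out at 0.75.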

Lesson 4

Bias and Fairness Testing

Where AI bias comes from. Counterfactual fairness and demographic parity testing. Protected characteristics under the NZ Human Rights Act 1993. Fairness as a testable quality characteristic, not an opinion.

~30 min read · ~75 min with exercises · Part 2
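One common way to make demographic parity testable is to compare the rate of favourable decisions across groups and check the gap against an agreed threshold. The sketch below assumes two hypothetical groups, "A" and "B", made-up decisions, and an illustrative 0.1 threshold; none of these values come from the standard or from NZ legislation.

```python
from collections import defaultdict


def selection_rates(groups, decisions):
    """Return the share of favourable decisions (truthy values) per group."""
    totals = defaultdict(int)
    favourable = defaultdict(int)
    for group, decision in zip(groups, decisions):
        totals[group] += 1
        if decision:
            favourable[group] += 1
    return {group: favourable[group] / totals[group] for group in totals}


def demographic_parity_difference(groups, decisions):
    """Return the gap between the highest and lowest group selection rates."""
    rates = selection_rates(groups, decisions)
    return max(rates.values()) - min(rates.values())


# Made-up data: 1 = favourable decision, 0 = unfavourable decision.
groups = ["A"] * 5 + ["B"] * 5
decisions = [1, 1, 1, 0, 1, 1, 0, 0, 1, 0]

PARITY_THRESHOLD = 0.1  # illustrative assumption; set per the project's risk register

gap = demographic_parity_difference(groups, decisions)
within_threshold = gap <= PARITY_THRESHOLD
print(gap, within_threshold)
```

Here group A is selected at 0.8 and group B at 0.4, so the test fails the illustrative threshold. A pass/fail result like this is what turns fairness into a testable quality characteristic.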

Lesson 5

Risk-Based AI Testing

Calibrating test depth to the consequences of failure. The AI risk equation, the AI-specific risk dimensions, and a risk register that maps each risk to the 42119 test types it demands.

~30 min read · ~70 min with exercises · Part 2

Lesson 6

Drift, Monitoring & Ongoing Testing

Why a model good at release degrades in production. Data, concept, and model drift; monitoring strategies; drift detection in plain terms; and the retraining-and-regression loop that keeps a model honest.

~30 min read · ~70 min with exercises · Part 3
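To make data drift concrete, one widely used measure is the Population Stability Index (PSI), which compares how a feature is distributed across bins at release and in production. This is a minimal sketch, not the method prescribed by 42119: the bin edges and sample values are made up, and the 0.2 alert level is a common industry rule of thumb rather than a requirement of the standard.

```python
import math


def bin_proportions(values, edges, eps=1e-6):
    """Share of values in each bin defined by sorted edges; empty bins are floored at eps."""
    if not values:
        raise ValueError("values is empty")
    counts = [0] * (len(edges) + 1)
    for v in values:
        index = sum(1 for edge in edges if v >= edge)  # number of edges at or below v
        counts[index] += 1
    return [max(c / len(values), eps) for c in counts]


def population_stability_index(baseline, live, edges):
    """PSI = sum over bins of (live - baseline) * ln(live / baseline)."""
    expected = bin_proportions(baseline, edges)
    actual = bin_proportions(live, edges)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


# Made-up feature values, e.g. applicant age at release versus in production.
baseline = [20, 25, 30, 35, 40, 45, 50, 55]
live = [40, 45, 50, 55, 60, 65, 70, 75]
edges = [30, 45, 60]

PSI_ALERT_LEVEL = 0.2  # rule-of-thumb assumption, not taken from the standard

drift_score = population_stability_index(baseline, live, edges)
drift_detected = drift_score > PSI_ALERT_LEVEL
print(drift_score, drift_detected)
```

A monitoring job could run a check like this on a schedule and open a retraining-and-regression ticket whenever the alert level is exceeded.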

Lesson 7

Audit-Ready Test Artefacts

The mandatory fields a 42119 test case carries. Risk-based traceability. What a test summary report looks like to an FMA or RBNZ auditor. Evidence requirements for every AI test type.

~30 min read · ~70 min with exercises · Part 3

Lesson 8

Applying 42119 in a Real NZ Project

An end-to-end walkthrough: risk register in sprint 1, data tests during development, fairness testing before go-live, drift detection post-deployment. What good looks like — and the 42119 roadmap ahead.

~35 min read · ~80 min with exercises · All Parts

Why this section

A standard for a new kind of system

Traditional software testing was built for systems that do the same thing every time. AI systems do not. The same input can produce a different output. The system is shaped by its training data, not just its code. It can keep learning after go-live, and its behaviour can drift without anyone touching a line of code. It can be biased. It can fail silently.

ISO/IEC TS 42119-2:2025 is the international response to that gap. It does not replace the testing you already know — it builds on ISO/IEC/IEEE 29119 and adds the AI-specific test types those standards never had. It is risk-based throughout, it ties to the ISO/IEC 25059 AI quality model, and it requires test artefacts a regulator can actually read.

NZ testers need this now. The FMA, the RBNZ, the Algorithm Charter for Aotearoa New Zealand, and public sector procurement are all moving in the same direction: if an AI system makes or shapes a decision about a person, you will be asked to show how it was tested. This module teaches you to answer that question, with NZ examples throughout, on the systems you actually work on.

The Test with AI modules

Ch 1

GenAI Foundations

How large language models work, and what they can and cannot do for testers.

Ch 3

Managing AI Risks

Hallucination, bias, data privacy under the NZ Privacy Act 2020, and non-determinism.

Ch 5

Adopting GenAI

Shadow AI, GenAI test strategy, model selection, and building AI capability in a test team.

Track context

This track

AI Testing & Quality — testing AI systems themselves, not using AI to test

Best for

Senior testers, test leads, test managers, and developers working on AI products or integrating LLMs into production systems

Foundation

Senior Tester track
Risk-based testing and API testing foundations make this track significantly more effective.

Standard

ISO/IEC TS 42119-2:2025
The international technical specification for testing AI systems, covered in depth in the ISO module.