NZ Privacy Compliance Checklist

Five checks every tester must complete before pasting any data into a generative AI tool when working for a New Zealand government agency.

Why government agencies are different

Private sector testers can often use synthetic or anonymised data with relatively low friction. Government agencies in New Zealand operate under additional constraints: the Privacy Act 2020, the New Zealand Information Security Manual (NZISM), the Protective Security Requirements (PSR), and agency-specific AI policies that may prohibit certain tools entirely.

The risk is not hypothetical. Pasting a real IRD number, a Health NZ patient record, or an immigration case reference into a commercial LLM means that data has left your agency's security boundary — potentially permanently, depending on the provider's data retention policy. If the data relates to an identifiable individual, that is a privacy breach, and a notifiable one under the Privacy Act if serious harm is reasonably likely to result.

Work through the five checks below before every session. Tick each one off when you're confident it passes.

1. Classify your data

Is this UNCLASSIFIED, IN CONFIDENCE, SENSITIVE, RESTRICTED, or CONFIDENTIAL?

What to check

Every piece of information handled by a New Zealand government agency must be classified under the New Zealand Government Security Classification System (part of the PSR). The classifications in ascending sensitivity are:

  • UNCLASSIFIED — public or non-sensitive information. May be shared externally with appropriate care.
  • IN CONFIDENCE — limited official use; release would cause embarrassment or minor harm.
  • SENSITIVE — release would cause significant harm to individuals or operations.
  • RESTRICTED — unauthorised release would compromise security or significant interests.
  • CONFIDENTIAL / SECRET / TOP SECRET — highest protection; never leaves approved systems.

Commercial LLM tools (including tools accessed via browser or API) are not approved environments for anything above UNCLASSIFIED unless your agency has a specific security accreditation in place for that tool and that classification level.

If the data is IN CONFIDENCE or above, stop here. Use synthetic data or request an approved internal AI environment from your ICT team.
Safe swap: Replace real classified content with clearly fictional placeholders — e.g. "Agency X processed 1,000 applications in March" instead of the actual figures.
Legal basis: Protective Security Requirements (PSR) — Security Classification Policy; NZISM Chapter 2 (Information Classification).

Do not proceed. Use your agency's approved internal AI tool, or replace the data with synthetic equivalents before using an external tool.

2. Check for personally identifiable information (PII)

Does the data contain details that could identify a real person?

What counts as PII under the Privacy Act 2020

The Privacy Act 2020 (IPP 1–13) applies to any "information about an identifiable individual." In a government testing context, common sources of PII in test data include:

  • Full names combined with any other identifier (DOB, address, IRD number)
  • IRD numbers, NHI numbers, passport numbers, driver licence numbers
  • Health and disability information (including ACC claim references)
  • Immigration status, visa details, case file numbers
  • Real email addresses or phone numbers of citizens
  • Any database row that was exported from a production system

De-identification is not just name removal

Removing a name is not sufficient. A record with IRD number 049-123-456, suburb "Remuera", employer "Auckland City Council", and a mortgage amount can still uniquely identify a person. True de-identification requires removing or replacing all fields that could be used — alone or in combination — to re-identify an individual.
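
One way to make the combination risk concrete is a k-anonymity-style count over quasi-identifiers: any record whose combination is unique in the dataset could still single out one person. A minimal Python sketch (the field names and records below are illustrative, not drawn from any real dataset):

    from collections import Counter

    # Flag records whose quasi-identifier combination appears only once --
    # i.e. records that could still re-identify an individual after direct
    # identifiers (names, numbers) have been removed.
    def unique_combinations(records, quasi_ids):
        counts = Counter(tuple(r[q] for q in quasi_ids) for r in records)
        return [r for r in records
                if counts[tuple(r[q] for q in quasi_ids)] == 1]

    # Illustrative records only -- not from any real dataset.
    records = [
        {"suburb": "Remuera", "employer": "Auckland City Council", "age_band": "40-49"},
        {"suburb": "Remuera", "employer": "Auckland City Council", "age_band": "40-49"},
        {"suburb": "Ponsonby", "employer": "Health NZ", "age_band": "30-39"},
    ]

    for risky in unique_combinations(records, ["suburb", "employer", "age_band"]):
        print("Re-identification risk:", risky)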

Production database exports are almost never safe to use directly. Even "anonymised" exports from some agencies still contain partial identifiers. Treat any real export as PII until proven otherwise.
Safe swap: Use a data generation tool (e.g., Faker with its en_NZ locale) to create structurally valid but entirely fictional records; a minimal sketch follows this check. Replace real IRD numbers with valid-format but fake ones (e.g., 123-456-789).
Legal basis: Privacy Act 2020, Information Privacy Principle 5 (storage security) and Principle 11 (limits on disclosure). A notifiable privacy breach requires notification to the Privacy Commissioner and affected individuals (ss 112–115).

De-identify the data before proceeding. Remove all direct identifiers, then check for combinations that could still re-identify an individual. When in doubt, use synthetic data instead.
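
As an illustration of the safe swap above, here is a minimal Python sketch assuming the Faker package is installed (pip install Faker); the record schema and helper names are illustrative, not a real system's schema:

    import random
    from faker import Faker

    # Structurally valid, entirely fictional NZ-flavoured test records.
    # Faker falls back to generic English providers for anything the
    # en_NZ locale does not implement.
    fake = Faker("en_NZ")

    def fake_ird_number() -> str:
        # Random digits in the xxx-xxx-xxx shape only; no IRD check-digit
        # logic is applied, so treat these as placeholders, not valid numbers.
        return "-".join(f"{random.randint(0, 999):03d}" for _ in range(3))

    def fake_record() -> dict:
        # Illustrative field names -- substitute your system's own schema.
        return {
            "name": fake.name(),
            "dob": fake.date_of_birth(minimum_age=18, maximum_age=90).isoformat(),
            "ird_number": fake_ird_number(),
            "address": fake.address().replace("\n", ", "),
            "email": fake.email(),
        }

    if __name__ == "__main__":
        for _ in range(3):
            print(fake_record())

Because every value is generated rather than transformed from a real record, there is nothing to re-identify, which is why generation is generally safer than de-identifying a production export.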

3. Verify data residency and provider policy

Will this data leave New Zealand, and does the provider retain it?

The cross-border transfer problem

When you send data to a commercial LLM via a browser or API, that data is processed on servers that may be located in the United States, Europe, or elsewhere. For government agencies, this constitutes a transfer of information outside New Zealand under IPP 12 of the Privacy Act 2020.

IPP 12 requires that you do not disclose personal information to a foreign person or entity unless you have reasonable grounds to believe the recipient provides comparable protections to New Zealand law, or the individual has authorised the transfer.

What to check in provider terms

  • Data retention for training: Does the provider use your inputs to train future models? (Many do by default; enterprise tiers often opt out.)
  • Data residency: Are there NZ or Australian data-region options? Is there a Data Processing Agreement (DPA) available?
  • Logging and access: Can provider staff access your submitted content for safety review?
  • Subprocessors: What third parties does the provider share data with?
Many free-tier AI tools explicitly state in their terms that submitted content may be used to improve their models. Pasting any data — even apparently innocuous test data — into these tools means that data may persist indefinitely on overseas servers.
Safe swap: Check whether your agency has a whole-of-government agreement for an AI tool with NZ/AU data residency (e.g., a Microsoft 365 Copilot enterprise agreement with AU data region). If not, use only locally-run or fully synthetic data.
Legal basis: Privacy Act 2020, IPP 12 (Disclosure of information outside New Zealand). GCDO Cloud Procurement guidance — Agencies must assess cloud providers against the Privacy Act and NZISM requirements.

Do not proceed until you have confirmed the provider's data residency region and training data policy. Raise the question with your ICT security team if you are uncertain.

4. Confirm your agency has approved this AI tool

Is this tool on the agency's approved software list, or has it been through security review?

Shadow AI is a real risk in NZ government

Shadow AI refers to the use of AI tools that have not been reviewed, approved, or procured through official channels. In a government agency, this is not just a policy violation — it can constitute a security incident, particularly if the tool processes information that falls under the PSR or Privacy Act.

Common shadow AI scenarios in testing teams:

  • Using a personal ChatGPT or Claude account for work tasks
  • Pasting bug report content or log files into a browser-based AI tool
  • Using a VS Code AI extension that was not procured or reviewed
  • Sharing test data with a publicly available AI "test generator"

How to check

Ask your ICT team, security team, or check your agency's intranet for an approved software list or AI use policy. Many NZ central government agencies published AI use policies in 2023–2025. The GCDO (Government Chief Digital Officer) has also issued cross-agency guidance on generative AI use.

If the tool is not on an approved list and there is no formal review in progress, do not use it for any work-related testing. The productivity gain is not worth the personal and organisational liability.
Safe swap: Use the agency's approved Microsoft 365 Copilot, or raise a request to your ICT team to assess the tool. Document the request so there is a paper trail.
Legal basis: NZISM Chapter 14 (Software Management); PSR — Governance and oversight requirements. Agency AUPs (Acceptable Use Policies) typically prohibit unapproved software. GCDO Generative AI Guidance (2024).

Raise a formal tool approval request with your ICT security team before using this tool for any government work. Use approved tools in the meantime.

5. Apply a de-identification protocol and document it

Have you replaced all real values with synthetic equivalents and recorded what you did?

Why documentation matters

If a privacy incident is later investigated, you need to demonstrate that reasonable steps were taken to protect the information. "I thought I removed all the names" is not a defence. A brief written record of what data you received, what substitutions you made, and what tool you used to make them is both protection for you and a repeatable process for your team.

A minimal de-identification checklist

  • List every field type in the source data (name, DOB, IRD, address, etc.)
  • For each field: confirm it has been removed, generalised (e.g., suburb → region), or replaced with a fictional value
  • Run a text search for NZ-specific patterns: \d{2,3}-\d{3}-\d{3} (IRD, 8 or 9 digits), [A-Z]{3}\d{4} (NHI, older AAANNNN format), \d{2}/\d{2}/\d{4} (DOB), and real street names (a minimal scan script follows this list)
  • Have a second person spot-check the final dataset if the source contained highly sensitive data
  • Record: date, source data description, substitution method, who performed it, who reviewed it
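
A minimal Python sketch of that text search, assuming the de-identified data sits in a plain-text or CSV export; the pattern set and exit-code convention are illustrative starting points:

    import re
    import sys

    # Last-pass scan for NZ-specific identifier patterns in a de-identified
    # export. The patterns mirror the checklist above; they are deliberately
    # broad, so expect (and manually review) some false positives.
    PATTERNS = {
        "IRD number": re.compile(r"\b\d{2,3}-\d{3}-\d{3}\b"),
        "NHI number": re.compile(r"\b[A-Z]{3}\d{4}\b"),  # older AAANNNN format
        "DOB dd/mm/yyyy": re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),
        "Email address": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    }

    def scan(text):
        return [(label, m.group())
                for label, rx in PATTERNS.items()
                for m in rx.finditer(text)]

    if __name__ == "__main__":
        hits = scan(open(sys.argv[1], encoding="utf-8").read())
        for label, match in hits:
            print(f"POSSIBLE {label}: {match}")
        sys.exit(1 if hits else 0)  # non-zero exit can fail a CI step

A hit is not proof of PII, but every hit should be explained or substituted before the export leaves your agency's environment.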

Minimum record to keep (one line is enough)

2026-04-24 | Test data for billing regression | Source: IRD sandbox export | De-identified: names → Faker NZ, IRD nos → random valid-format, addresses → Auckland suburbs only | Reviewed: nil (low sensitivity) | Tool used: ChatGPT Enterprise (AU region)

A substitution that is not documented is not auditable. If your agency is subject to an OIA request or Privacy Commissioner investigation, undocumented "I think I anonymised it" processes will not hold up.
Safe swap: Store de-identification records in your test management tool (e.g., as a test data note in Jira or Azure DevOps), not in a personal document. That way the record survives staff changes.
Legal basis: Privacy Act 2020, IPP 5 (storage security) and ss 112–115 (notifiable privacy breaches). Privacy Commissioner guidance on de-identification (2022). NZISM Chapter 16 (Disposal and destruction of data).

Complete the de-identification record before proceeding. Even a brief note is better than none — it demonstrates due diligence.

Quick reference: the five checks

# | Check | Fail condition | Law / policy
1 | Data classification | Anything above UNCLASSIFIED | PSR; NZISM Ch 2
2 | PII present | Identifiable-individual info not fully de-identified | Privacy Act 2020, IPP 1–13
3 | Data residency & retention | Unknown region, or provider retains inputs for training | Privacy Act 2020, IPP 12
4 | Tool approval | Tool not approved (shadow AI) | NZISM Ch 14; GCDO AI Guidance
5 | De-identification documented | No written record of what was substituted | Privacy Act 2020 ss 112–115; NZISM Ch 16

When all five pass

If all five checks pass, you can proceed with reasonable confidence. However, "proceed" means use the tool with the specific dataset you have just checked — not a blanket clearance to use it with any future data. Repeat the checklist each time your data source or tool changes.

This is a minimum baseline, not a complete framework

Some agencies have additional requirements: sector-specific legislation (Health Information Privacy Code, Tax Administration Act, Immigration Act), classification requirements above UNCLASSIFIED for certain test data types, or security reviews that must be completed before any AI tool is used on a project. Always check your agency's AI use policy first — it may be stricter than this checklist.

Interview prep: common questions

What is the Privacy Act 2020 and why does it matter to testers?

The Privacy Act 2020 governs how agencies collect, store, use, and share personal information about New Zealand individuals. It matters to testers because test data is often derived from real production data. If a tester pastes real customer or citizen data into an AI tool, they may be triggering an IPP breach — especially of IPP 11 (limits on disclosure) or IPP 12 (cross-border transfers) — which can result in a notifiable privacy breach requiring notification to the Privacy Commissioner and affected individuals.

What is a notifiable privacy breach?

Under sections 114 and 115 of the Privacy Act 2020, an agency must notify the Privacy Commissioner and affected individuals if it has suffered a privacy breach that it is reasonable to believe has caused, or is likely to cause, serious harm. Sending real citizen data to an unapproved overseas AI service would likely qualify — the data may have been retained, the agency may have lost control of it, and individuals could be harmed if the provider is compromised or misuses the data.

What is the NZISM and who does it apply to?

The New Zealand Information Security Manual (NZISM) is the New Zealand Government's manual for information security. It is mandatory for Public Service agencies and strongly recommended for other Crown entities and local government. It covers data classification, system security, access controls, software management, and more. Testers working in agencies subject to NZISM must ensure that any tools they use — including AI tools — meet the relevant security requirements for the classification level of the data being processed.

How do you explain the data residency risk to a developer who says "it's just test data"?

Test data that came from a production export is not "just test data" — it is production data being used in a test context. Even genuinely synthetic data still represents the business logic and structure of the system. More importantly, once data is sent to a commercial LLM, you cannot retrieve it. Even if the provider claims they do not retain data, there is no audit trail proving this. The risk is asymmetric: the productivity gain is small and temporary; the liability from a breach is large and permanent. The correct response is to use synthetic data generated by a tool your agency controls.

What would you do if your team was already using an unapproved AI tool?

Raise it immediately with your team lead and ICT security team — not as an accusation but as a process gap. Frame it as: "I've noticed we're using [tool] — can you confirm it's been through security review?" If it hasn't, the response is to stop using it for work data until approval is in place, and to document that the concern was raised. Staying silent and continuing to use it is not a neutral act — it makes you complicit in the continued risk. If you raised it and were ignored, escalate to the security team directly.