Lesson 3 of 3 · Shift-Left & Shift-Right

Continuous Testing

Continuous testing is not running tests continuously. It’s having the right test feedback at the right time, for the right audience, at every stage from commit to production. DORA metrics measure whether you’ve got it right.

Shift-Left & Shift-Right · CTFL v4.0, Section 2.1.5 · ~35 min read · ~70 min with exercises

1 The Hook

Two NZ SaaS companies. Company A: the CI test run takes 45 minutes. Developers merge PRs without waiting for results. Failed tests are “investigated tomorrow.” Build failures accumulate. Once a month, someone spends a week fixing the backlog. Deploy frequency: once per month, with only emergency deploys in between.

Company B: CI takes 8 minutes (fast unit and integration tests) with slower E2E running in parallel on a separate job. Test failures block the PR — you cannot merge without a green build. Fix before merge is the norm. Deploy frequency: multiple times per day.

The difference is not test suite size or engineering talent. It’s pipeline design. Company B staged their tests by speed and coverage so developers get fast feedback when they need it. Company A ran everything together and made fast feedback impossible.

DORA Deployment Frequency is a proxy for test pipeline quality. You cannot deploy frequently if your pipeline takes 45 minutes.

2 The Rule

Continuous testing means the right tests run at the right time. Unit tests run in seconds. Integration tests run in minutes. A small E2E smoke set can run with them; the full E2E suite runs after merge, overnight, or on demand. Each layer gives different feedback at a different speed.

3 The Analogy

Continuous testing is like a hospital triage system.

Critical emergencies get treated immediately. Moderate injuries get seen within 2 hours. Routine check-ups are scheduled. The system does not treat everything with the same urgency — it matches resources to risk level. Running 600 E2E tests on every commit is like treating every patient as a code blue regardless of their condition. It wastes resources, creates bottlenecks, and delays the feedback that actually matters.

4 Watch Me Do It

A continuous testing pipeline for a NZ SaaS product, staged by speed and coverage.

Stage 1 Pre-commit — developer’s machine, <30 seconds

  • Linting and type checking (ESLint, TypeScript)
  • Unit tests affected by the changed files only (Jest: --findRelatedTests or --onlyChanged)
  • Optional: mutation testing on changed functions (~2 min, run when time permits)

Audience: The developer, before they push. Feedback must be fast enough to not break their flow.
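A minimal sketch of the "affected tests only" step, assuming Jest's `--findRelatedTests` and `--passWithNoTests` flags. The helper name `buildPreCommitCommand` and the file-extension filter are hypothetical, and how the list of staged files is obtained (for example from a Git hook) is left out.

```typescript
// Hypothetical helper: turn a list of staged files into a Jest invocation
// that runs only the tests related to those files.
const SOURCE_FILE = /\.(ts|tsx|js|jsx)$/;

function buildPreCommitCommand(stagedFiles: string[]): string[] | null {
  // Assumption: docs, images and similar files do not affect unit tests.
  const sources = stagedFiles.filter((file) => SOURCE_FILE.test(file));
  if (sources.length === 0) {
    return null; // nothing relevant changed, skip the Jest run
  }
  // --findRelatedTests runs the tests that depend on the given source files;
  // --passWithNoTests avoids a failure when no test covers them yet.
  return ["jest", "--findRelatedTests", "--passWithNoTests", ...sources];
}

// Example: only the TypeScript file is passed on to Jest.
const command = buildPreCommitCommand(["src/cart.ts", "README.md"]);
console.log(command === null ? "skip tests" : command.join(" "));
```

Returning `null` for documentation-only changes keeps the pre-commit step inside its 30-second budget on commits that cannot break a unit test.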

Stage 2 PR Pipeline — CI, <10 minutes (blocks merge)

  • All unit tests (~450 tests, avg 2ms each = ~1 second)
  • All API integration tests (~80 tests, avg 800ms each = ~64 seconds)
  • Chromium-only E2E smoke test — critical path only, ~15 tests
  • SAST security scan (Semgrep, Snyk)
  • Code coverage report (fails if coverage drops below threshold)

Audience: The developer and reviewers. Must be fast enough that people wait for it before merging.
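The Stage 2 budget can be checked with simple arithmetic. The sketch below sums serial run times per test group and compares the total with the 10-minute limit. The 35-second average for an E2E smoke test is an assumption borrowed from the exercise figures later in this lesson; SAST scanning, checkout and dependency installation are not modelled, and the type and function names are illustrative.

```typescript
interface TestGroup {
  name: string;
  count: number;
  avgMs: number; // average duration per test in milliseconds
}

// Serial duration of one group in seconds (no parallelism assumed).
function serialSeconds(group: TestGroup): number {
  return (group.count * group.avgMs) / 1000;
}

// True when the serial sum of all groups fits inside the stage budget.
function fitsBudget(groups: TestGroup[], budgetSeconds: number): boolean {
  const total = groups.reduce((sum, g) => sum + serialSeconds(g), 0);
  return total <= budgetSeconds;
}

const prStage: TestGroup[] = [
  { name: "unit", count: 450, avgMs: 2 },             // 0.9 s
  { name: "api-integration", count: 80, avgMs: 800 }, // 64 s
  { name: "e2e-smoke", count: 15, avgMs: 35000 },     // 525 s (assumed average)
];

console.log(fitsBudget(prStage, 10 * 60)); // budget of 600 s
```

Under these assumptions the smoke tests dominate the stage, which suggests that any tightening of the PR gate should start with them rather than with the unit tests.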

Stage 3 Main Branch Pipeline — <20 minutes (runs after merge)

  • Full E2E suite — all browsers (Chromium, Firefox, WebKit)
  • Performance tests (k6 — critical API endpoints, response time thresholds)
  • Accessibility tests (axe-core — full page scan, WCAG 2.2 AA ruleset)
  • Visual regression (Percy — screenshot comparison on key pages)

Audience: QA and the team. Failures here block deployment to production. Fix within hours, not tomorrow.
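A full cross-browser E2E suite only fits a 20-minute budget when it runs in parallel. As a rough sketch, assuming 120 E2E tests averaging 35 seconds each (the exercise figures from later in this lesson), three browsers, evenly distributable tests and no worker start-up cost, the number of parallel workers can be estimated like this. The function name is hypothetical.

```typescript
// Minimum number of equal parallel workers so the suite finishes within budget.
// Assumes tests split evenly across workers and ignores start-up overhead.
function workersNeeded(
  testCount: number,
  avgSeconds: number,
  browsers: number,
  budgetSeconds: number,
): number {
  const totalSeconds = testCount * avgSeconds * browsers;
  return Math.max(1, Math.ceil(totalSeconds / budgetSeconds));
}

// 120 tests x 35 s x 3 browsers = 12,600 s of serial work (3.5 hours).
// A 20-minute budget (1,200 s) needs ceil(12600 / 1200) = 11 workers.
const workers = workersNeeded(120, 35, 3, 20 * 60);
console.log(workers);
```

Real suites rarely split evenly, so treat the result as a lower bound and leave headroom for the slowest shard.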

Stage 4 Production — continuous, every 5 minutes

  • Synthetic monitoring — critical user journeys (login, checkout, key API calls)
  • Alerting on error rate spikes and p95 response time degradation
  • Business metric monitoring (booking conversion rate, payment success rate)

Audience: On-call engineers and QA. Failures here mean production is broken right now.
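The alerting rule in Stage 4 can be sketched as a check over a window of synthetic probe results. This is an illustration, not a monitoring product's API: the types, the nearest-rank percentile method and the threshold values are all assumptions.

```typescript
interface ProbeResult {
  ok: boolean;        // did the synthetic journey succeed?
  durationMs: number; // end-to-end response time of the probe
}

interface AlertThresholds {
  maxErrorRate: number; // fraction between 0 and 1
  maxP95Ms: number;
}

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no samples");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

function shouldAlert(results: ProbeResult[], t: AlertThresholds): boolean {
  // Assumption: a probe that reports nothing is itself a failure signal.
  if (results.length === 0) return true;
  const errorRate = results.filter((r) => !r.ok).length / results.length;
  const p95 = percentile(results.map((r) => r.durationMs), 95);
  return errorRate > t.maxErrorRate || p95 > t.maxP95Ms;
}

// Ten healthy probes at 200 ms against hypothetical thresholds: no alert.
const healthy: ProbeResult[] = Array.from({ length: 10 }, () => ({ ok: true, durationMs: 200 }));
console.log(shouldAlert(healthy, { maxErrorRate: 0.05, maxP95Ms: 500 }));
```

Using p95 rather than the mean keeps one slow outlier from hiding behind many fast responses, which matches the "p95 response time degradation" criterion listed above.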

DORA metrics and what they reveal about your pipeline:

Metric | What it measures | Pipeline signal
Deployment Frequency | How often you deploy to production | Low frequency = slow pipeline or no confidence in tests
Lead Time for Changes | Commit to production time | Long lead time = slow CI or manual gates
Change Failure Rate | % of deployments causing incidents | High rate = insufficient test coverage or wrong tests
MTTR | How fast you recover from failures | Long MTTR = no synthetic monitoring, slow root-cause analysis

Pro tip: If your CI takes more than 15 minutes, developers will not wait for it. They will merge, move on, and check the results an hour later. At that point, test failures have lost their urgency. Invest in pipeline speed (parallelise tests, use test impact analysis, cache aggressively) before adding more tests. A fast pipeline with fewer tests delivers more value than a slow one with comprehensive coverage.
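As a rough illustration of how the four DORA metrics above could be derived from a deployment log, the sketch below works over a hypothetical `Deployment` record. The field names are invented for the example, and time to restore is measured from the deployment time, which assumes the incident starts when the change goes live.

```typescript
interface Deployment {
  committedAt: Date;       // first commit of the change
  deployedAt: Date;        // when it reached production
  causedIncident: boolean;
  restoredAt?: Date;       // set once service was restored after an incident
}

interface DoraSummary {
  deploysPerDay: number;
  medianLeadTimeHours: number;
  changeFailureRate: number;             // fraction between 0 and 1
  meanTimeToRestoreHours: number | null; // null when nothing was restored
}

const HOUR_MS = 60 * 60 * 1000;

function hoursBetween(start: Date, end: Date): number {
  return (end.getTime() - start.getTime()) / HOUR_MS;
}

function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function summarise(deployments: Deployment[], periodDays: number): DoraSummary {
  if (deployments.length === 0 || periodDays <= 0) {
    throw new Error("need at least one deployment and a positive period");
  }
  const leadTimes = deployments.map((d) => hoursBetween(d.committedAt, d.deployedAt));
  const failed = deployments.filter((d) => d.causedIncident);
  const restoreTimes: number[] = [];
  for (const d of failed) {
    if (d.restoredAt !== undefined) {
      restoreTimes.push(hoursBetween(d.deployedAt, d.restoredAt));
    }
  }
  const meanRestore =
    restoreTimes.length === 0
      ? null
      : restoreTimes.reduce((a, b) => a + b, 0) / restoreTimes.length;
  return {
    deploysPerDay: deployments.length / periodDays,
    medianLeadTimeHours: median(leadTimes),
    changeFailureRate: failed.length / deployments.length,
    meanTimeToRestoreHours: meanRestore,
  };
}
```

The median is used for lead time so that a single long-running change does not distort the picture; a team could equally report a mean or a percentile, as long as it does so consistently.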

5 When to Use It

Continuous testing is the right investment for any team shipping software more than once a week. If you deploy less than once a month:

  • You do not have continuous delivery — you have a waterfall or staged-gate release model
  • A continuous testing pipeline is the wrong starting point. Fix the delivery model first
  • Monthly deployments usually indicate risk aversion, not stability. The irony: low deployment frequency means each release is larger and riskier

For NZ SaaS companies deploying weekly or more: start with Stage 1 and Stage 2. Get those right before building Stage 3. Synthetic monitoring (Stage 4) should follow naturally once deployment confidence is established.

6 Common Mistakes

🚫 “I used to think: continuous testing means running the full E2E suite on every commit.”

Actually: Running 600 E2E tests on every commit would take hours. Stage your tests by speed and coverage. Fast tests (unit, API integration) gate PRs. Slow tests (full E2E, visual regression, performance) run after merge on the main branch. Each stage has a different audience and a different acceptable time budget.

🚫 “I used to think: DORA metrics are management vanity metrics.”

Actually: DORA metrics are leading indicators of delivery performance, and they correlate strongly with test pipeline quality. Teams in the DORA ‘Elite’ category (multiple deploys per day, MTTR under an hour) have fast, reliable pipelines that give developers confidence to deploy frequently. Teams with high Change Failure Rate almost always have either insufficient coverage or tests that do not match production conditions.

🚫 “I used to think: the CI pipeline is the DevOps team’s responsibility.”

Actually: DevOps maintains the CI infrastructure — the runners, caching, secrets management. QA is responsible for what runs in the pipeline: which tests, in which order, with what pass criteria. A CI pipeline that DevOps built without QA input will monitor infrastructure metrics but miss business-logic failures. QA must own the test pipeline design even if they don’t own the CI platform.
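One way to make that ownership concrete is to keep the pass criteria as data that QA maintains, separate from the CI platform configuration. The sketch below is hypothetical: the `GatePolicy` shape, the field names and the 80% coverage floor are assumptions for illustration, not a feature of any particular CI tool.

```typescript
// Hypothetical QA-owned gate definition: what counts as a pass for a stage.
interface GatePolicy {
  stage: string;
  blocksMerge: boolean;
  maxFailedTests: number;  // usually 0 for a merge gate
  minLineCoverage: number; // percentage, 0 to 100
}

interface StageResult {
  failedTests: number;
  lineCoverage: number; // percentage, 0 to 100
}

// Returns the violated criteria; an empty list means the gate passes.
function evaluateGate(policy: GatePolicy, result: StageResult): string[] {
  const violations: string[] = [];
  if (result.failedTests > policy.maxFailedTests) {
    violations.push(`${result.failedTests} failed tests (max ${policy.maxFailedTests})`);
  }
  if (result.lineCoverage < policy.minLineCoverage) {
    violations.push(`coverage ${result.lineCoverage}% below ${policy.minLineCoverage}%`);
  }
  return violations;
}

// A non-blocking stage never stops a merge; a blocking one needs zero violations.
function canMerge(policy: GatePolicy, result: StageResult): boolean {
  return !policy.blocksMerge || evaluateGate(policy, result).length === 0;
}

const prGate: GatePolicy = { stage: "pr", blocksMerge: true, maxFailedTests: 0, minLineCoverage: 80 };
console.log(evaluateGate(prGate, { failedTests: 0, lineCoverage: 78.5 }));
```

Keeping the policy as plain data means the thresholds can be reviewed and changed by QA without touching runner or caching configuration.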

7 Now You Try

📋 Prompt Lab — Design a Continuous Testing Pipeline

Design a continuous testing pipeline for a NZ fintech that deploys microservices 10 times per day. The system has: 450 unit tests (avg 2ms each), 80 API integration tests (avg 800ms each), and 120 E2E tests (avg 35s each). Allocate these tests to the right pipeline stage and explain your rationale for each decision.

8 Self-Check

What are the four DORA metrics and what does each measure?

Deployment Frequency (how often you deploy to production: on demand, multiple times per day, is elite); Lead Time for Changes (commit-to-production time: under one hour is elite); Change Failure Rate (% of deployments that cause an incident: under 5% is elite); Mean Time to Restore, MTTR (how fast you recover from production failures: under one hour is elite). The exact elite thresholds shift between annual DORA reports, so treat these figures as indicative. All four correlate with test pipeline quality.

Why should fast unit tests gate PRs but slow E2E tests not gate PRs?

PR gate tests must be fast enough that developers wait for results before merging. If a gate takes 45 minutes, developers merge without waiting and the gate becomes meaningless. Fast tests (unit, API integration, smoke E2E) can realistically run in 8–10 minutes and block merges. Full E2E suites belong after merge on the main branch where a slower pipeline is acceptable because the feedback is less time-sensitive.

Who should own the test pipeline design — QA or DevOps?

QA owns what runs in the pipeline and the pass criteria. DevOps owns the infrastructure the pipeline runs on. This is a shared responsibility, but the test logic — which tests, at which stage, with which failure thresholds — is a QA decision. A pipeline designed only by DevOps will optimise for infrastructure health, not business-logic correctness.

9 ISTQB Mapping

CTFL v4.0 Section 2.1.4 (DevOps and Testing) and Section 2.1.5 (Shift-Left Approach). Continuous testing is explicitly named as a DevOps testing practice. The syllabus covers the relationship between CI/CD pipelines and test automation.

CTAL Test Automation Engineer (CTAL-TAE) goes deeper on pipeline architecture. CTAL-TAE Section 5 covers CI/CD integration including test stage design, parallel execution, and test result reporting. DORA metrics are not ISTQB-mapped but are the industry-standard measurement framework for DevOps performance, which includes testing outcomes.