Senior · Integration Technique

Microservices Testing

In a monolith, you test locally. In microservices, services live in separate codebases, deploy independently, and fail independently. Testing becomes distributed debugging: one failed call to a downstream service can cascade through the entire system.

Senior · ISTQB CTAL-TTA 6.2 · K4 Analyse · ~13 min read + exercise

1 The Hook — Why This Matters

In 2023, a major NZ fintech company deployed a payment processing system built on microservices: a Payment Service, Fraud Detection Service, Notification Service, and Ledger Service. Each team owned their service, deployed independently, and trusted that interfaces were stable. Everything worked in production for three months. Then, without warning, 40% of payments were failing.

Investigation revealed: the Fraud Detection Service had added a new required field, riskScore, but hadn't informed the Payment Service. The Payment Service was calling Fraud Detection without the field. Fraud Detection rejected the request with a cryptic error. The Payment Service retried, then timed out, then crashed. Customers couldn't pay. The ledger got out of sync. Recovery took six days.

Microservices testing finds these breaks before they hit production. You must test service boundaries, contract violations, failure modes, and data consistency across multiple services. A single failing service can bring down the entire system.

2 The Rule — The One-Sentence Version

If services have contracts, test the contracts. If services depend on each other, test the failure modes. If data crosses service boundaries, test consistency.

Microservices testing spans three layers: (1) Contract Testing — does each service provide what consumers expect? (2) Integration Testing — do multiple services work together under normal and failure conditions? (3) Observability Testing — can you trace, debug, and understand what happened when things fail? Test layers in isolation: contract tests are fast and run per-service, integration tests are slower and run across services, observability tests validate that you can detect and debug problems.

3 The Analogy — Think Of It Like...

Analogy

Testing a supply chain with multiple vendors, not a single factory.

In a single factory (monolith), you control everything. In a supply chain (microservices), each vendor operates independently. The vendor who supplies motors doesn't know about the vendor who supplies wheels. The factory (orchestrator) must negotiate contracts with each vendor, handle delays when a vendor is slow, reroute orders if a vendor fails, and reconcile inventory across all vendors. Test that each vendor delivers to spec, that the factory can recover if a vendor disappears, and that inventory stays consistent even when some vendors are down.

Senior engineer insight

For years I treated microservices testing as a harder version of integration testing — spin up more services, write more mocks, done. What changed my thinking was a production incident where our PACT tests were all green but a NZ payments platform was dropping roughly one in twelve transactions. The contract was correct; the semantics were wrong. The provider was returning available: true on items that were soft-reserved by another service — something no schema check could catch. Real microservices testing has to cover the data lifecycle across service boundaries, not just the wire format.

The most common mistake senior testers make: trusting that passing contract tests mean services are compatible — they prove the interface shape, not the business logic flowing through it.

From the field

On a NZ government digital-services platform we had twelve microservices behind an Istio service mesh with mTLS enforced between every hop. The team had excellent unit coverage and a solid PACT pipeline — and we all assumed the mesh would handle resilience for us. What we discovered during a controlled chaos experiment was that when the identity service slowed under load, the mesh's default retry budget caused a retry storm: downstream services started retrying simultaneously, the identity service CPU spiked to 100%, and the entire platform fell over in forty seconds. The thing that generalises: infrastructure-level resilience (service mesh retries, load balancer timeouts) and application-level resilience (circuit breakers, backpressure) can amplify each other catastrophically if you have not tested them together under realistic load.

4 Watch Me Do It — Step by Step

Here is a real NZ e-commerce example: an Order Service calls an Inventory Service and a Shipping Service. Follow these steps to test the boundaries.

Define and test service contracts
Use PACT (consumer-driven contracts). The Order Service (consumer) defines: "I expect Inventory Service to accept a POST /check-stock with {sku, qty} and return {available: boolean, quantity: number}." Write a PACT test that runs the Order Service client against a PACT mock of Inventory Service. The test passes locally and records the expectation as a contract file. Then, in CI, Inventory Service verifies that same contract against its real implementation. If the contract is broken, the build fails before anyone merges.

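// Order Service (consumer) PACT test, written for Jest and the classic pact-js Pact class
const { Pact } = require('@pact-foundation/pact');

const pact = new Pact({consumer: 'OrderService', provider: 'InventoryService', port: 1234}); // port is an assumption
beforeAll(() => pact.setup());   // start the mock Inventory Service
afterAll(() => pact.finalize()); // write the pact file and stop the mock

test('stock check for SKU-001', async () => {
  await pact.addInteraction({
    state: 'product SKU-001 has 50 units',
    uponReceiving: 'a stock check request',
    withRequest: {method: 'POST', path: '/check-stock', body: {sku: 'SKU-001', qty: 5}},
    willRespondWith: {status: 200, body: {available: true, quantity: 50}}
  });
  // inventoryClient must be configured to call the mock server (http://localhost:1234 here)
  await expect(inventoryClient.checkStock('SKU-001', 5)).resolves.toEqual({available: true, quantity: 50});
  await pact.verify(); // fails if the expected request was never made
});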

Test backward compatibility
When Inventory Service adds a new response field (e.g., warehouseId), Order Service must keep working by ignoring it. Test: deploy a build of Inventory Service that returns warehouseId, then run the Order Service tests unchanged. They should pass, which shows that Order Service tolerates unknown fields.
Found: Order Service was deserializing the response using strict schema validation. When Inventory Service added warehouseId, Order Service crashed trying to deserialize an unexpected field.
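
One way to avoid the crash described above is a tolerant reader: validate only the fields the consumer relies on and ignore everything else. The sketch below assumes a hypothetical `parseStockResponse` helper inside the Order Service's inventory client; the name and the sample `warehouseId` value are illustrative, not taken from a real codebase.

```javascript
// Hypothetical helper for the Order Service's inventory client.
// It checks only the fields the consumer depends on; unknown fields are ignored.
function parseStockResponse(body) {
  if (body === null || typeof body !== 'object') {
    throw new TypeError('response body must be an object');
  }
  if (typeof body.available !== 'boolean') {
    throw new TypeError('available must be a boolean');
  }
  if (typeof body.quantity !== 'number') {
    throw new TypeError('quantity must be a number');
  }
  // Copy only the known fields so extra provider fields never leak further in.
  return { available: body.available, quantity: body.quantity };
}

// A new provider field (warehouseId) does not break the consumer.
const parsed = parseStockResponse({ available: true, quantity: 50, warehouseId: 'AKL-1' });
console.log(parsed); // { available: true, quantity: 50 }
```

A backward-compatibility test can then feed this function a response with extra fields and a response with a missing or mistyped required field, and assert that only the second case fails.
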
Test service boundaries with network mocks
Use WireMock or Mountebank to simulate service behaviour. Test the happy path: Order Service calls Inventory, gets a response, calls Shipping. Test failures: Inventory Service returns 500, Shipping Service times out, Inventory responds with malformed JSON. For each failure, verify Order Service handles it gracefully (retries, circuit break, fallback, or explicit error).
```
// WireMock (Java DSL) stub: Inventory returns 503 for every stock check
stubFor(post(urlEqualTo("/check-stock"))
  .willReturn(aResponse()
    .withStatus(503)
    .withBody("Service Unavailable")));

// Order Service test (Jest): once any retries are exhausted, the order must fail with an explicit error
await expect(orderService.createOrder({...})).rejects.toThrow('InventoryServiceUnavailable');
```
Test timeout and resilience patterns
If Inventory Service is slow (say it responds after 10 seconds), Order Service should time out after 3 seconds, not wait forever. Test: configure WireMock to delay responses by 10 seconds, then verify Order Service times out and handles the failure. Test the circuit breaker: if Inventory Service fails 5 times in a row, stop calling it for 60 seconds (circuit open). After 60 seconds, let one trial call through (circuit half-open). If it succeeds, the circuit closes; if it fails, the circuit opens again.
Pattern: Use timeout + retry + circuit breaker. Timeout alone risks cascading failures. Retry alone risks overwhelming a failing service. Circuit breaker alone risks silent failures. Combined, they prevent cascades and allow graceful recovery.
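
The open, half-open, and closed states can be sketched as a small class. This is a minimal illustration under stated assumptions, not a production breaker: the class and option names are hypothetical, failures are counted consecutively, the clock is injected so a test can advance time without sleeping, and concurrent trial calls in the half-open state are not limited.

```javascript
// Minimal circuit breaker sketch. Defaults match the text: 5 failures, 60 s open.
class CircuitBreaker {
  constructor({ failureThreshold = 5, resetTimeoutMs = 60_000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;       // injected clock, so tests control time
    this.failures = 0;    // consecutive failures since the last success
    this.openedAt = null; // null means the circuit is closed
  }

  state() {
    if (this.openedAt === null) return 'closed';
    return this.now() - this.openedAt >= this.resetTimeoutMs ? 'half-open' : 'open';
  }

  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure() {
    this.failures += 1;
    // The counter is only reset on success, so a failed half-open trial re-opens the circuit.
    if (this.failures >= this.failureThreshold) this.openedAt = this.now();
  }

  // Wraps a call to the downstream service; rejects immediately while the circuit is open.
  async call(fn) {
    if (this.state() === 'open') throw new Error('CircuitOpen');
    try {
      const result = await fn();
      this.recordSuccess();
      return result;
    } catch (err) {
      this.recordFailure();
      throw err;
    }
  }
}

// Walk through the state machine with a fake clock.
let t = 0;
const breaker = new CircuitBreaker({ now: () => t });
for (let i = 0; i < 5; i++) breaker.recordFailure();
console.log(breaker.state()); // open
t += 60_000;
console.log(breaker.state()); // half-open
breaker.recordSuccess();
console.log(breaker.state()); // closed
```

Injecting the clock is the design choice that matters for testing: the 60-second wait becomes a variable assignment, so the circuit-breaker test stays in the fast suite.
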
Test eventual consistency and data synchronization
When Order Service places an order, it writes to its own database, then publishes an "OrderCreated" event. Inventory Service listens and decrements stock asynchronously. If the event broker is down, or Inventory Service is slow, the data falls out of sync. Test: place an order, verify it's written to the Order database, then poll until Inventory stock is decremented, failing the test if that has not happened within a deadline. Test the failure case: the order succeeds but the event broker is down, so Inventory stock is not decremented. When the event broker recovers, the pending OrderCreated events should be delivered and stock should be decremented.
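
The "poll until a deadline" part of that test can be sketched as a small helper. The `eventually` function and its option names are hypothetical, and the in-memory `stock` object with a timer only stands in for a real Inventory Service consuming the event.

```javascript
// Polls an async check until it returns true, or throws once the deadline has passed.
async function eventually(check, { timeoutMs = 5000, intervalMs = 100 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await check()) return;
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs} ms`);
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Demo: stock is decremented asynchronously, as an OrderCreated consumer would do.
const stock = { 'SKU-001': 50 };
setTimeout(() => { stock['SKU-001'] -= 5; }, 200); // simulated event handler

eventually(async () => stock['SKU-001'] === 45, { timeoutMs: 2000 })
  .then(() => console.log('inventory converged'));
```

In a real test the check would query the Inventory Service or its database, and the timeout should be generous enough for a slow CI machine but short enough that a lost event still fails the build.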

Test with Docker Compose locally
Spin up all three services (Order, Inventory, Shipping) in Docker Compose on localhost. Run integration tests against real service instances, not mocks. This catches environment-specific issues: database connection strings, service discovery, startup order, and timing.

# docker-compose.yml
services:
  order-service:
    build: ./order-service
    ports: ["8001:8080"]
    environment:
      INVENTORY_URL: "http://inventory-service:8080"
    # Order Service calls the other two services, so start them first.
    # depends_on controls start order only; it does not wait for readiness.
    depends_on: [inventory-service, shipping-service]
  inventory-service:
    build: ./inventory-service
    ports: ["8002:8080"]
  shipping-service:
    build: ./shipping-service
    ports: ["8003:8080"]

Test distributed tracing and correlation IDs
Assign a unique correlation ID to each order request and pass it through all service calls. When debugging a failed order, search logs across all services for the same correlation ID. Verify that each service logs the ID and passes it downstream. Test: create an order with correlation ID abc123, then verify it appears in Order Service logs, Inventory Service logs, and Shipping Service logs.
Pattern: Use OpenTelemetry or similar to propagate trace IDs, which serve as correlation IDs, automatically. Without this, debugging is close to guesswork: you see a failed order in the logs but can't trace what happened in downstream services.
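
Where a tracing library is not yet in place, hand-rolled propagation might look like the sketch below. The header name `x-correlation-id` is an assumption (a common convention, not a standard), and the three helper functions are hypothetical. Node's HTTP server lower-cases incoming header names, which is why the lookup uses a lower-case key.

```javascript
const { randomUUID } = require('node:crypto');

const HEADER = 'x-correlation-id'; // assumed header name

// Reuse the caller's ID if present; otherwise this service starts a new trace.
function correlationIdFrom(incomingHeaders) {
  return incomingHeaders[HEADER] || randomUUID();
}

// Every downstream call carries the same ID alongside its other headers.
function withCorrelationId(id, headers = {}) {
  return { ...headers, [HEADER]: id };
}

// Structured log line, so logs from all services can be searched by correlationId.
function logLine(service, id, message) {
  return JSON.stringify({ service, correlationId: id, message });
}

const id = correlationIdFrom({ 'x-correlation-id': 'abc123' });
console.log(logLine('order-service', id, 'order received'));
console.log(withCorrelationId(id, { 'content-type': 'application/json' }));
```

A propagation test then sends a request with a known ID such as abc123 and asserts that the ID shows up in the outbound headers and log lines of every hop.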

Pro tip: Use Testcontainers to spin up real dependencies (e.g., PostgreSQL, RabbitMQ, Redis) in Docker containers for each test run. It's slower than mocks but catches real integration issues. Run fast contract tests and mock-based tests on every CI build; save container-based tests for a pre-deployment gate.

5 When to Use It / When NOT to Use It

✅ Prioritise microservices testing when...

Services call other services (synchronous or async)
Services own separate databases (eventual consistency risk)
Services deploy independently (contract breaking risk)
You have more than 3 services in your architecture
You need to understand failure cascades
SLAs require 99.9%+ availability

❌ Don't fall into these traps...

Testing only happy path with all services running
Skipping contract tests because you "own" both services
Ignoring timeout and retry logic in favour of "fast" tests
Testing without correlation IDs (impossible to debug)
Deploying to staging/prod without chaos testing
Assuming eventual consistency will "eventually" be consistent

6 Common Mistakes — Don't Do This

❌ Testing only the happy path with all services running

I used to think: If Order, Inventory, and Shipping all run and the order succeeds, the system is working.
Actually: You need to test failures: Inventory times out, Shipping is offline, the event broker is down. Test each failure mode in isolation so you understand what happens and whether error handling is correct. In production, failures will happen, so your test suite should exercise each of these scenarios before production does.

❌ Skipping contract tests because you own both services

I used to think: Order Service and Inventory Service are both owned by the same team, so we can skip PACT and just test integration directly.
Actually: Different developers work on each service, they deploy on different schedules, and they can introduce breaking changes without realising. Contract tests catch these breaks before they hit production. They're not extra work; they prevent the six-day outage.

❌ Ignoring timeout and circuit breaker patterns

I used to think: If Inventory Service is slow, Order Service will just wait longer. No problem.
Actually: If Order Service waits indefinitely, its thread pool exhausts, it stops accepting new requests, and the entire system grinds to a halt. Timeouts prevent cascading failures. Retries allow transient failures to recover. Circuit breakers prevent hammering a failing service. Test all three patterns together.
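
A timeout can be as small as a wrapper that races the call against a timer. The `withTimeout` name is hypothetical, and the demo scales the numbers from the text down (a 300 ms response against a 100 ms limit) so it finishes quickly. Note that this sketch only stops waiting; it does not cancel the underlying request.

```javascript
// Rejects if the wrapped promise has not settled within ms milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Simulated slow Inventory Service: answers after 300 ms.
const slowInventory = new Promise((resolve) => {
  setTimeout(() => resolve({ available: true }), 300);
});

withTimeout(slowInventory, 100)
  .catch((err) => console.log(err.message)); // timed out after 100 ms
```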

7 Now You Try — Interview Warm-Up

🎯 Interactive Exercise

Question: You're testing a checkout flow: Cart Service → Payment Service → Notification Service. Payment Service occasionally fails with a 500 error. Your team says "that's unlikely in production, so we won't test it." What's your response, and what tests would you write?

Your response:

"If it's unlikely, fine. But unlikely doesn't mean never. In production, services fail. Let's test what happens: (1) Payment Service returns 500 once, then succeeds on retry. (2) Payment Service returns 500 five times in a row (circuit break?). (3) Payment Service times out. (4) Payment Service succeeds but Notification Service fails — does the payment still get recorded?"

Tests to write:

Mock Payment Service to return 500, then 200. Verify Cart Service retries and succeeds.
Mock Payment Service to return 500 five times. Verify Cart Service circuit breaks (stops retrying) and returns a user-friendly error.
Mock Payment Service to time out. Verify Cart Service gives up after its configured timeout (rather than waiting forever) and rolls back the order.
Mock Notification Service to fail after Payment succeeds. Verify payment is recorded in the database and notification is queued for retry.
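
The first test in that list (500, then 200) can be sketched without any HTTP at all. The `withRetry` and `flakyPayment` names are hypothetical, and the attempt count and delays are illustrative. Retrying a payment call is only safe when the operation is idempotent, so treat this as a test-harness sketch rather than a recommended production policy.

```javascript
// Retries an async operation with exponential backoff between attempts.
async function withRetry(operation, { maxAttempts = 3, baseDelayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        const delay = baseDelayMs * 2 ** (attempt - 1); // 100, 200, 400, ...
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError; // all attempts failed
}

// Fake Payment Service: fails with a 500 on the first call, succeeds on the second.
function flakyPayment() {
  let calls = 0;
  return async () => {
    calls += 1;
    if (calls === 1) throw new Error('500 Internal Server Error');
    return { status: 200, calls };
  };
}

withRetry(flakyPayment(), { baseDelayMs: 10 })
  .then((res) => console.log(res)); // { status: 200, calls: 2 }
```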

Why teams fail here

Writing contract tests only for the response schema and never for provider state — a contract can be green while the provider returns logically incorrect data for a given business state.
Treating eventual consistency as an implementation detail rather than a test concern — teams check that an event was published but never verify it was consumed, reprocessed after broker restart, or handled idempotently on duplicate delivery.
Testing services in isolation with perfect mocks in CI but deploying to a service-mesh environment where TLS policies, retry budgets, and circuit-breaker defaults are set by platform engineers — the gap between mock and mesh has caused multiple production outages on NZ distributed platforms.
Failing to include correlation IDs in test assertions — teams verify the end state but cannot reproduce or triage failures because no test ever validated that the trace ID propagated end-to-end through all hops.
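
The duplicate-delivery point above can be tested against an idempotent consumer like the one sketched here. The `eventId` field and the in-memory stores are assumptions made for illustration; a real consumer would persist the processed IDs and the stock change in the same transaction.

```javascript
// In-memory stand-ins for the consumer's persistent state.
const processedEventIds = new Set();
const stockLevels = new Map([['SKU-001', 50]]);

// Applies an OrderCreated event at most once, keyed by eventId.
// Returns true if the event was applied, false if it was a duplicate.
function handleOrderCreated(event) {
  if (processedEventIds.has(event.eventId)) return false; // duplicate delivery
  stockLevels.set(event.sku, stockLevels.get(event.sku) - event.qty);
  processedEventIds.add(event.eventId);
  return true;
}

const orderCreated = { eventId: 'evt-1', sku: 'SKU-001', qty: 5 };
handleOrderCreated(orderCreated);
handleOrderCreated(orderCreated); // redelivered, e.g. after a broker restart
console.log(stockLevels.get('SKU-001')); // 45
```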

Key takeaway

In a distributed system, a green test suite that never kills a downstream service tells you nothing useful about what happens in production — test the boundaries, the failures, and the data flow, not just the happy path.

8 Self-Check — Can You Actually Do This?

If you got all three, you're ready to test microservices.

Q1. What's the difference between a contract test and an integration test?

Contract test: Tests that a service honours its interface. Order Service (consumer) tests that Inventory Service returns the expected response shape. Runs locally, mocks the provider. Fast. Integration test: Tests that multiple services work together. Spins up real service instances, simulates real requests and failures. Slower but catches real issues like service discovery, environment variables, timing.

Q2. What is eventual consistency and why is it hard to test?

Eventual consistency means that data owned by different services is not updated in a single transaction: for a short time the services disagree, and they converge once all events have been processed. Order Service writes an order, publishes an event, and Inventory Service listens and decrements stock asynchronously. If Inventory is slow or offline, the data falls out of sync temporarily. It's hard to test because you can't just check the inventory immediately after placing an order; you must poll with a deadline and verify it eventually updates. Test with realistic delays and failure scenarios (event broker down, Inventory Service down).

Q3. Why is a correlation ID important in microservices testing?

When an order request flows through Order → Inventory → Shipping → Notification, each service logs events. A correlation ID (a UUID generated once per request) ties all these logs together. When debugging a failed order, search logs for the correlation ID and see the entire request flow across all services. Without it, you see a failed order in the Order Service logs but can't trace what actually happened in the Inventory or Shipping services.

9 Interview Prep — Common Questions

Q. "How do you test service contracts?"

I use PACT (consumer-driven contracts). The Order Service (consumer) defines what it expects from Inventory Service (provider). I write a PACT test that mocks Inventory Service with that expectation. The test passes locally. Then, in CI, Inventory Service runs the same PACT against its real implementation. If the contract is broken, the build fails. This catches breaking changes before they reach production. It's fast, automated, and prevents the "I deployed and broke a downstream service" surprise.

Q. "How do you handle service timeouts and cascading failures?"

I implement three layers: (1) Timeout: if Inventory Service doesn't respond within 3 seconds, stop waiting. (2) Retry: retry a small, fixed number of times with exponential backoff. (3) Circuit Breaker: if Inventory fails 5 times in a row, open the circuit and stop calling it for 60 seconds. This prevents exhausting thread pools and overwhelming a failing service. I test each pattern: mock Inventory to respond slowly and verify the timeout triggers; mock Inventory to fail repeatedly and verify the circuit opens.

Q. "How do you test eventual consistency?"

I place an order, verify it's written to the Order database immediately, then poll the Inventory database until stock is decremented or a deadline passes. I test the failure case: place an order, kill the event broker, verify inventory is not decremented, restart the event broker, verify the order is reprocessed and inventory eventually decrements. I use Testcontainers to spin up real databases and message brokers, not mocks. This catches real timing issues.

Q. "What's your strategy for testing with Docker Compose locally?"

I write a docker-compose.yml that spins up all services (Order, Inventory, Shipping), databases (PostgreSQL), and brokers (RabbitMQ) on localhost. I configure service discovery so each service knows how to find the others (e.g., INVENTORY_URL=http://inventory-service:8080). I run integration tests against this local environment before committing. This catches startup order issues, environment-specific bugs, and timing problems that mocks never reveal. It's slower than unit tests but faster than deploying to staging.
