Performance Testing — Senior Techniques

1 The Hook — The "Black Friday" Crash

Imagine a physical store in Wellington. Usually, 10 people shop there at a time. It works great. But on Black Friday, 1,000 people try to push through the front door at the exact same second. The door jams, the floorboards crack, and the staff quit. The store still "works" (the products are there), but nobody can buy anything.

This is what happens to your website without performance testing. Functional tests tell you it works for one person; performance tests tell you whether it still works for everyone at once.

2 The Rule — Load, Stress, and Spike

Performance Testing is the study of how a system's Throughput, Latency, and Scalability hold up under load.

Throughput: How much work can we do? (e.g. Orders per second).
Latency: How fast is each task? (e.g. Page load time).
Scalability: Can we add more "staff" (servers) to handle more "customers"?
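As a rough illustration of how the first two metrics are measured, the sketch below computes throughput and latency from a small, made-up list of request records. The Request type, its field names, and the sample timings are assumptions for illustration, not output from a real tool.

```python
from dataclasses import dataclass


@dataclass
class Request:
    start: float  # seconds since the test began
    end: float    # seconds since the test began


def throughput_per_second(requests: list[Request]) -> float:
    """Completed requests divided by the wall-clock span of the run."""
    span = max(r.end for r in requests) - min(r.start for r in requests)
    return len(requests) / span


def latencies(requests: list[Request]) -> list[float]:
    """How long each individual request took, in seconds."""
    return [r.end - r.start for r in requests]


# Hypothetical sample: 4 requests completed inside a 2-second window.
sample = [
    Request(0.0, 0.5),
    Request(0.0, 1.0),
    Request(1.0, 1.5),
    Request(1.0, 2.0),
]

print(throughput_per_second(sample))  # 2.0 requests per second
print(latencies(sample))              # [0.5, 1.0, 0.5, 1.0]
```

Throughput is a property of the whole run, while latency is a property of each request, which is why the two can move independently.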

3 The Analogy — The Motorway

Throughput vs. Latency.

Think of the SH1 Wellington Motorway.
Latency is the time it takes for *your* car to get from A to B (Speed).
Throughput is the number of cars that can pass through the tunnel in one minute (Capacity).
You can have high speed (Low Latency) at 2 AM, but if there's only one lane, your Throughput is low. If you add 10 lanes, your Throughput goes up, but if the speed limit is 10 km/h, your Latency is still terrible. Performance Testing is finding the balance between lanes and speed.

4 Watch Me Do It — The "Wellington Wind" Load

Scenario: You're testing a NZ power company's portal. During a Wellington storm, everyone logs in at once to report power cuts.

Define the Goal: Handle 5,000 concurrent users with page loads under 2 seconds.
Step 1: The Load Test: Ramp up to 5,000 users over 10 minutes. Result: The app stays fast. Great!
Step 2: The Spike Test: Trigger 5,000 users in 10 *seconds*. Result: The database CPU hits 100% and the app crashes. Bug Found!
The Fix: Implement a "Queue" or "Waiting Room" for high-traffic moments, so users are admitted at a rate the database can sustain instead of all at once.
Step 3: Verification: Run the Spike Test again. Result: The app slows down slightly but stays alive. Pass!
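The difference between the load test, the spike test, and the waiting-room fix can be sketched with a toy second-by-second model. The 5,000 users, the 10-minute ramp, and the 10-second spike come from the scenario above; the admit_limit of 100 users per second is a hypothetical figure, and a real portal would need its own measured limit.

```python
from typing import Optional


def arrivals(total_users: int, ramp_seconds: int) -> list[int]:
    """Spread total_users as evenly as possible over ramp_seconds."""
    base, extra = divmod(total_users, ramp_seconds)
    return [base + (1 if i < extra else 0) for i in range(ramp_seconds)]


def admit(arrivals_per_second: list[int],
          admit_limit: Optional[int] = None) -> tuple[int, int]:
    """Return (peak users reaching the backend in one second,
    seconds until every user has been admitted).

    With admit_limit set, a waiting room holds the overflow and
    releases at most admit_limit users per second.
    """
    waiting = 0
    peak = 0
    second = 0
    while second < len(arrivals_per_second) or waiting > 0:
        if second < len(arrivals_per_second):
            waiting += arrivals_per_second[second]
        admitted = waiting if admit_limit is None else min(waiting, admit_limit)
        waiting -= admitted
        peak = max(peak, admitted)
        second += 1
    return peak, second


load = arrivals(5000, 600)   # Step 1: ramp over 10 minutes
spike = arrivals(5000, 10)   # Step 2: same users in 10 seconds

print(admit(load))                    # (9, 600)
print(admit(spike))                   # (500, 10)
print(admit(spike, admit_limit=100))  # (100, 50), hypothetical limit
```

In this model the waiting room caps the backend at 100 new users per second, at the cost of taking 50 seconds to admit everyone instead of 10. That matches the "slows down slightly but stays alive" outcome in Step 3.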

5 Decision Tool — Which test do I run?

Test Type	Question it answers	Analogy
Load Testing	"Can we handle the expected traffic?"	A busy Saturday at the supermarket.
Stress Testing	"When and how will we eventually break?"	Putting weight on a bridge until it collapses.
Spike Testing	"Can we handle a sudden surge?"	Everyone rushing inside when it rains.
Soak Testing	"Does it fail after 24 hours of use?"	A long-distance marathon (checking for leaks).
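One way to picture the four test types is as different shapes of "users over time". The functions below are a minimal sketch of those shapes. Apart from the 5,000-user target borrowed from the earlier scenario, every number (step sizes, baseline, durations) is a hypothetical default chosen for illustration.

```python
def load_test(t: int, expected: int = 5000, ramp: int = 600) -> int:
    """Ramp linearly to the expected traffic, then hold it."""
    return min(expected, expected * t // ramp)


def stress_test(t: int, step_users: int = 1000, step_seconds: int = 300) -> int:
    """Keep adding users in steps; the test ends when something breaks."""
    return step_users * (t // step_seconds + 1)


def spike_test(t: int, baseline: int = 100, peak: int = 5000,
               spike_start: int = 60, spike_length: int = 120) -> int:
    """Sit at a low baseline, jump straight to the peak, then drop back."""
    in_spike = spike_start <= t < spike_start + spike_length
    return peak if in_spike else baseline


def soak_test(t: int, steady: int = 2000) -> int:
    """Hold a moderate, constant load for many hours."""
    return steady


# Target user count at a few points in time (t is seconds since start).
print(load_test(300), load_test(900))      # 2500 5000
print(stress_test(0), stress_test(900))    # 1000 4000
print(spike_test(30), spike_test(60))      # 100 5000
print(soak_test(0), soak_test(24 * 3600))  # 2000 2000
```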

6 Common Mistakes

🚫 Testing on a "Tiny" Dev Database

I used to think: The code is fast on my laptop, so it's fast everywhere.
Actually: A search that takes 0.1s on a database with 10 rows might take 10s on a database with 10 million rows. You must test with Production-scale data volumes.
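A quick way to see this effect is to time the same query against tables of different sizes. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The customers table, the email column, and the row counts are all assumptions for illustration, and the timings will differ from machine to machine.

```python
import sqlite3
import time


def time_search(n_rows: int) -> float:
    """Build an in-memory table with n_rows rows and time one search."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO customers (email) VALUES (?)",
        ((f"user{i}@example.com",) for i in range(n_rows)),
    )
    start = time.perf_counter()
    # No index on email and a leading wildcard, so every row is inspected.
    conn.execute(
        "SELECT COUNT(*) FROM customers WHERE email LIKE ?", ("%user7@%",)
    ).fetchone()
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed


for n in (10, 1_000, 100_000):
    print(f"{n:>7} rows: {time_search(n) * 1000:.3f} ms")
```

The query text never changes, only the amount of data behind it, which is exactly the variable a tiny dev database hides.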

🚫 Only measuring "Average" time

I used to think: The average response time is 1s, which is perfect.
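Actually: If 90% of users get 0.5s but 10% get 5.5s, the average is exactly 1s (0.9 × 0.5 + 0.1 × 5.5), so it looks fine, yet 1 in 10 customers is waiting long enough to quit. Report high percentiles such as p95 or p99 instead of the average; here both would show 5.5s.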
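The gap between an average and a high percentile is easy to check by hand. The sketch below uses a hypothetical sample of 100 response times and a simple nearest-rank percentile. Real load-testing tools may interpolate differently, so treat the helper as an illustration rather than a reference implementation.

```python
def percentile(samples: list[float], p: int) -> float:
    """Nearest-rank percentile: the value at rank ceil(p * n / 100)."""
    ordered = sorted(samples)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceil(p * n / 100)
    return ordered[rank - 1]


# Hypothetical sample: 90 fast requests and 10 slow ones (seconds).
response_times = [0.5] * 90 + [5.5] * 10

mean = sum(response_times) / len(response_times)
print(mean)                            # 1.0
print(percentile(response_times, 50))  # 0.5
print(percentile(response_times, 95))  # 5.5
print(percentile(response_times, 99))  # 5.5
```

The mean of 1.0s describes a response time that no user in this sample actually experienced, while p95 and p99 point straight at the slow group.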

7 Now You Try — Calculate the Throughput

🎯 Interactive Exercise

Scenario: Your system processes 300 orders in 5 minutes. What is the Throughput per minute?

Math: 300 orders / 5 minutes = 60 orders per minute (that is, 1 order per second). Simple, but critical for capacity planning!

8 Self-Check

Q1. What is a "Memory Leak"?

It's a bug where a program doesn't "clean up" its memory after it's finished with a task. Over time (Soak Testing), the app uses more and more memory until it eventually crashes. It's like a dripping tap filling up a bucket.
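A leak of this kind can be reproduced in a few lines. In the sketch below, the two handler functions and the 10 KB payload size are hypothetical; tracemalloc is Python's standard-library memory tracer. The leaky handler keeps a reference to every payload, so memory grows with each request, while the clean handler lets each payload be freed.

```python
import tracemalloc

_history = []  # module-level list that is never cleared: the "leak"


def handle_request_leaky(payload: bytes) -> int:
    _history.append(payload)  # keeps every payload alive forever
    return len(payload)


def handle_request_clean(payload: bytes) -> int:
    return len(payload)  # nothing retained; payload can be freed


def memory_growth(handler, requests: int = 1_000) -> int:
    """Bytes still allocated after `requests` calls, per tracemalloc."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(requests):
        handler(bytes(10_000))  # a fresh 10 KB payload per request
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


print("leaky:", memory_growth(handle_request_leaky), "bytes")
print("clean:", memory_growth(handle_request_clean), "bytes")
```

With 1,000 requests the leaky handler should retain on the order of 10 MB, and the clean one close to nothing. A soak test is the same experiment run for hours against the real application.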

Q2. What is the "Knee" in a performance graph?

The Knee is the point where the response time stops rising gently and starts to climb sharply as load increases. It marks the limit of your system's healthy capacity. Once you pass the knee, you are in the "Stress" zone.
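There is no single agreed formula for locating the knee. One simple heuristic, sketched below, is to take the lowest-load measurement as a baseline and report the last load level whose response time stayed within some multiple of it. The measurements and the factor of 2 are hypothetical, and other heuristics (for example, looking at the change in slope) are equally reasonable.

```python
def last_healthy_load(points: list[tuple[int, float]], factor: float = 2.0):
    """Return the highest load whose response time stayed within
    factor x the baseline, where the baseline is the first measurement.

    points: (concurrent users, p95 response time in seconds), sorted by users.
    """
    baseline = points[0][1]
    healthy = None
    for users, seconds in points:
        if seconds > factor * baseline:
            break  # we have passed the knee
        healthy = users
    return healthy


# Hypothetical stress-test measurements.
measurements = [
    (1000, 0.40),
    (2000, 0.42),
    (3000, 0.45),
    (4000, 0.50),
    (5000, 1.20),
    (6000, 4.00),
]

print(last_healthy_load(measurements))  # 4000
```

In this made-up data the knee sits between 4,000 and 5,000 users, so 4,000 would be the planning figure for healthy capacity.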

9 Interview Prep

"What are the most important metrics to track during a load test?"

Answer: "I focus on four pillars: 1. Latency (specifically p95 response times), 2. Throughput (requests per second), 3. Error Rate (do we start failing under load?), and 4. Resource Utilization (CPU, Memory, and Database connections)."
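In practice these four pillars are often turned into pass/fail thresholds for a test run. The sketch below shows one way that could look. The LoadTestResult type and every threshold except the 2-second latency goal from the earlier scenario are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class LoadTestResult:
    p95_seconds: float          # pillar 1: latency
    requests_per_second: float  # pillar 2: throughput
    error_rate: float           # pillar 3: failed requests / total requests
    cpu_utilization: float      # pillar 4: resource utilization, 0.0 to 1.0


def failed_checks(result: LoadTestResult,
                  max_p95: float = 2.0,
                  min_rps: float = 100.0,
                  max_error_rate: float = 0.01,
                  max_cpu: float = 0.80) -> list[str]:
    """Return the names of the pillars that missed their threshold."""
    failures = []
    if result.p95_seconds > max_p95:
        failures.append("latency")
    if result.requests_per_second < min_rps:
        failures.append("throughput")
    if result.error_rate > max_error_rate:
        failures.append("error rate")
    if result.cpu_utilization > max_cpu:
        failures.append("resource utilization")
    return failures


# Hypothetical run: fast and within CPU budget, but 3% of requests failed.
run = LoadTestResult(p95_seconds=1.4, requests_per_second=120.0,
                     error_rate=0.03, cpu_utilization=0.65)
print(failed_checks(run))  # ['error rate']
```

Checking all four together matters because a run can look fast precisely because it is failing requests quickly.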

10 Next Step

You've handled the traffic. Now, let's make sure everyone can use it. Next: Accessibility Testing.
