There is no single "performance test." There are five distinct types, each targeting a different risk. A load test that proves your system handles normal traffic tells you nothing about what happens during a Black Friday spike. A spike test that shows recovery after a burst tells you nothing about a memory leak that crashes the system after 48 hours. You need different types for different risks.
Load testing is the most common type. You simulate the expected number of users doing realistic things. If your app normally has 500 concurrent users, you generate 500 virtual users and see what happens. The goal is to verify that the system meets performance requirements under expected conditions.
Think of load testing like testing a bridge. You know 100 cars cross it per hour normally. So you put 100 cars on it and check: does the bridge hold? Is there traffic congestion? Is anyone delayed? If the bridge handles 100 cars smoothly, you have confidence in normal operations.
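To make the idea concrete, here is a minimal sketch of a load test in Python. It is not a replacement for a real load-testing tool. `fake_request` is a hypothetical stand-in for one user action (in a real test it would call your application), and `run_load_test` and `percentile` are illustrative helper names, not part of any library.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request():
    """Hypothetical stand-in for one user action; returns latency in seconds."""
    start = time.perf_counter()
    time.sleep(0.001)  # pretend the server needed about 1 ms
    return time.perf_counter() - start


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


def run_load_test(virtual_users, requests_per_user, request_fn=fake_request):
    """Run concurrent virtual users and return every observed latency."""
    def user_session(_user_id):
        return [request_fn() for _ in range(requests_per_user)]

    # One thread per virtual user, so all users are active at the same time.
    with ThreadPoolExecutor(max_workers=virtual_users) as pool:
        sessions = list(pool.map(user_session, range(virtual_users)))
    return [latency for session in sessions for latency in session]


if __name__ == "__main__":
    latencies = run_load_test(virtual_users=50, requests_per_user=10)
    p95_ms = percentile(latencies, 95) * 1000
    print(f"requests={len(latencies)} p95={p95_ms:.1f} ms")
```

The pass/fail decision then comes from comparing numbers such as the 95th percentile latency against your performance requirements. Dedicated tools do the same thing at much larger scale and with more realistic user behavior.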
Stress testing pushes beyond expected load to find the breaking point. Where load testing asks "can it handle normal traffic?", stress testing asks "how far can we push it before it breaks?" You gradually increase load from expected (500 users) to extreme (5,000 users) and observe when response times degrade, errors spike, or the system crashes.
The value of stress testing is not finding that the system breaks -- every system breaks eventually. The value is knowing WHERE it breaks. Does the database run out of connections at 1,200 users? Does the API gateway start returning 503s at 2,000 users? Does the app server run out of memory at 3,500 users? Knowing these limits lets you plan capacity and set up alerts.
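One way to turn stress test results into a concrete limit is to scan the load steps in order and report the first one that violates a threshold. The sketch below assumes you already collected a 95th percentile latency and an error rate per step. The thresholds (1,000 ms and 1% errors) and the sample numbers are made up for illustration, not measurements.

```python
from dataclasses import dataclass


@dataclass
class StepResult:
    users: int         # concurrent virtual users during this step
    p95_ms: float      # 95th percentile response time in milliseconds
    error_rate: float  # fraction of failed requests, 0.0 to 1.0


def find_breaking_point(steps, max_p95_ms=1000.0, max_error_rate=0.01):
    """Return the first step (lowest user count) that violates a threshold."""
    for step in sorted(steps, key=lambda s: s.users):
        if step.p95_ms > max_p95_ms or step.error_rate > max_error_rate:
            return step
    return None  # the system never broke within the tested range


# Illustrative data only: errors appear at 1,200 users before latency degrades.
steps = [
    StepResult(users=500, p95_ms=180.0, error_rate=0.0),
    StepResult(users=1000, p95_ms=240.0, error_rate=0.001),
    StepResult(users=1200, p95_ms=950.0, error_rate=0.04),
    StepResult(users=2000, p95_ms=4200.0, error_rate=0.31),
]

broken = find_breaking_point(steps)
print(f"breaking point: {broken.users} users" if broken else "no break found")
```

The user count it reports is the number you would feed into capacity planning and alert thresholds.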
Spike testing simulates a sudden traffic burst. Not a gradual increase -- a near-instant jump. Think flash sales, viral social media posts, breaking news, or the moment a Super Bowl ad airs. Your system goes from 100 users to 5,000 users in 30 seconds. The questions: Does the system survive the spike? Does it recover when the spike subsides? How quickly does it return to normal?
The classic spike test story: an e-commerce site runs a "deal of the day" at noon. Traffic jumps 20x in 2 minutes. The auto-scaler takes 3 minutes to spin up new instances. For that 3-minute window, every user sees a loading spinner. The fix? Pre-warm the auto-scaler before deal time. But you would never discover this without a spike test.
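The arithmetic behind that story can be sketched in a few lines. The model below is deliberately simple and rests on several assumptions: traffic ramps linearly, the existing instances can serve a fixed 200 users, and the new instances fully cover the peak once they arrive. The function name and all numbers except the 20x jump, the 2-minute ramp, and the 3-minute scale-out delay are hypothetical.

```python
def overloaded_seconds(baseline, multiplier, ramp_s, capacity,
                       scale_delay_s, trigger_s=0):
    """Count the seconds in which demand exceeds capacity before scale-out.

    trigger_s is when scaling starts relative to the start of the spike;
    a negative value means the scaler was pre-warmed ahead of time.
    """
    peak = baseline * multiplier
    new_capacity_at = trigger_s + scale_delay_s
    overloaded = 0
    for t in range(new_capacity_at):  # empty if capacity arrives before t=0
        demand = baseline + (peak - baseline) * min(t, ramp_s) / ramp_s
        if demand > capacity:
            overloaded += 1
    return overloaded


# Scaling starts when the spike starts: users wait for most of the 3 minutes.
reactive = overloaded_seconds(baseline=100, multiplier=20, ramp_s=120,
                              capacity=200, scale_delay_s=180)

# Scaling was triggered 3 minutes before deal time: no overloaded window.
prewarmed = overloaded_seconds(baseline=100, multiplier=20, ramp_s=120,
                               capacity=200, scale_delay_s=180, trigger_s=-180)

print(reactive, prewarmed)
```

Under these assumptions the reactive case is overloaded for 173 of the 180 seconds, while the pre-warmed case has no overloaded window at all. A real spike test replaces the model with measurements, but the comparison it makes is the same.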
Soak testing runs expected load for an extended period -- 4 hours, 12 hours, 24 hours, even 72 hours. The goal is to find problems that only appear over time. Memory leaks are the classic example. Your app uses 500 MB of memory on startup. After 8 hours under load, it uses 4 GB. After 24 hours, it runs out of memory and crashes. A 30-minute load test would never catch this.
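A soak test is easier to act on if you turn the memory samples into a growth rate. The sketch below fits a straight line through (hours, megabytes) samples and estimates when memory would hit a limit. The samples follow the example above, with 4 GB rounded to 4,000 MB for simplicity. The 11,000 MB limit and the function names are assumptions for illustration, and real leaks are not always linear.

```python
def fit_line(samples):
    """Least-squares fit; returns (slope, intercept) for (x, y) samples."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    variance = sum((x - mean_x) ** 2 for x, _ in samples)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def hours_until_limit(samples, limit_mb):
    """Estimate hours from test start until memory reaches limit_mb."""
    slope, intercept = fit_line(samples)
    if slope <= 0:
        return None  # memory is flat or shrinking; no leak trend to project
    return (limit_mb - intercept) / slope


# (hours since start, memory in MB), sampled every 2 hours during the soak.
memory_samples = [(0, 500), (2, 1375), (4, 2250), (6, 3125), (8, 4000)]

growth_mb_per_hour, _ = fit_line(memory_samples)
eta_hours = hours_until_limit(memory_samples, limit_mb=11000)
print(f"growth={growth_mb_per_hour:.1f} MB/h, limit reached after {eta_hours:.1f} h")
```

A steady positive slope under constant load is the signal to investigate, even when the test itself finishes without a crash.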
Scalability testing measures how performance changes when you add resources. If you double the number of servers, does throughput double? Ideally it would, but in practice it often falls short because of database bottlenecks, shared state, or lock contention. Scalability testing tells you whether throwing more hardware at the problem will actually solve it.
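A simple way to summarize a scalability test is scaling efficiency: measured throughput divided by the throughput you would get if scaling were perfectly linear. The sketch below uses made-up throughput numbers, and `scaling_efficiency` is an illustrative helper name.

```python
def scaling_efficiency(baseline_servers, baseline_rps, servers, rps):
    """Measured throughput as a fraction of ideal linear scaling (1.0 = ideal)."""
    ideal_rps = baseline_rps * servers / baseline_servers
    return rps / ideal_rps


# Illustrative runs: (servers, measured requests per second).
runs = [(1, 1000), (2, 1800), (4, 2800)]

base_servers, base_rps = runs[0]
for servers, rps in runs:
    efficiency = scaling_efficiency(base_servers, base_rps, servers, rps)
    print(f"{servers} server(s): {rps} rps, efficiency {efficiency:.0%}")
```

Efficiency that drops as you add servers, as in this made-up data, points to a shared bottleneck that more hardware will not fix on its own.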
| Test Type | What It Tests | Duration | Users | Key Question |
|---|---|---|---|---|
| Load | Normal conditions | 30-60 min | Expected (500) | Can it handle normal traffic? |
| Stress | Breaking point | 30-60 min | Beyond expected (500→5,000) | Where does it break? |
| Spike | Sudden burst | 15-30 min | Sudden jump (100→5,000) | Does it survive a burst? |
| Soak | Long-term stability | 4-72 hours | Expected (500) | Does it degrade over time? |
| Scalability | Resource scaling | 30 min x N runs | Increasing | Does more hardware help? |
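The "Users" column is really a load shape over time. As a rough sketch, the function below returns how many virtual users to run at a given moment for each shape, using the user counts from the table. The spike start time (60 s) and hold time (300 s) are assumptions, and `target_users` is a hypothetical helper. Scalability is left out because it varies the hardware between runs rather than the shape of the load.

```python
def target_users(test_type, elapsed_s, duration_s):
    """Return the number of virtual users to run at a given moment of a test."""
    if test_type in ("load", "soak"):
        return 500  # steady expected load; soak only differs in duration

    if test_type == "stress":
        # Gradual ramp from expected load to the extreme over the whole test.
        fraction = min(max(elapsed_s / duration_s, 0.0), 1.0)
        return round(500 + (5000 - 500) * fraction)

    if test_type == "spike":
        spike_start_s, ramp_s, hold_s = 60, 30, 300  # start and hold are assumed
        if elapsed_s < spike_start_s:
            return 100  # quiet baseline before the burst
        if elapsed_s < spike_start_s + ramp_s:
            return round(100 + (5000 - 100) * (elapsed_s - spike_start_s) / ramp_s)
        if elapsed_s < spike_start_s + ramp_s + hold_s:
            return 5000  # hold the burst
        return 100  # back to baseline to observe recovery

    raise ValueError(f"no load shape defined for {test_type!r}")


for second in (0, 75, 90, 400):
    print(second, target_users("spike", second, duration_s=900))
```

Most load-testing tools let you express shapes like these directly, so a helper of this kind is mainly useful for reasoning about the test plan before you configure the tool.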
Every project should start with load testing. It is the baseline. After that, the type depends on your risk profile.
Never run performance tests against production without explicit approval, and even with approval, run them only during low-traffic windows. Performance tests generate massive load that can bring down a production system. Whenever you can, use a staging or pre-production environment that mirrors production as closely as possible.
Q: Explain the difference between load testing, stress testing, and spike testing with examples.
A: Load testing simulates expected traffic -- 500 users browsing for 30 minutes -- to verify the system meets performance requirements. Stress testing goes beyond expected traffic -- gradually increasing from 500 to 5,000 users -- to find the breaking point and understand failure behavior. Spike testing simulates a sudden burst -- jumping from 100 to 5,000 users in 30 seconds -- to test the system's ability to handle and recover from traffic surges. Example: for an e-commerce site, a load test verifies normal daily traffic, a stress test finds the maximum capacity for Black Friday planning, and a spike test simulates the moment a viral tweet sends 10x normal traffic.
Key Point: Five types of performance testing target five different risks: Load (normal traffic), Stress (breaking point), Spike (sudden bursts), Soak (long-term stability), and Scalability (resource scaling). Start with load testing, then choose based on your risk profile.