Chapter 10: Practice: Load Test a Web App

Running the Stress Test -- Find the Breaking Point

The load test told us the application handles expected peak load with some strain. Now the business wants to know: "What if the Diwali sale is even bigger than expected? What if we get 2.5x the traffic?" That is what stress testing answers. We push beyond the expected load, progressively, until something breaks. Then we know exactly where the ceiling is.

Think of it like this. The load test was checking if the bridge can handle Monday morning traffic. The stress test is driving increasingly heavy trucks across it until a cable snaps. You want to find that limit before a real truck does it.

Stress Test Strategy: Progressive Ramp

Do not just jump to 500 users. Increase gradually so you can identify exactly where performance degrades from "slower" to "broken." I recommend stepping up in increments of 50.

run-stress-test.sh (bash)

#!/usr/bin/env bash
set -euo pipefail  # stop if JMeter fails, so "complete" is never printed after an error

# Create results directory
mkdir -p results/stress-test

# Run stress test -- 250 users, 120s ramp-up, 10 min duration
# The longer ramp-up lets us see degradation as users increase
# Note: -o needs a report folder that is empty or does not exist yet
jmeter -n -t shopping-portal-load-test.jmx \
  -Jthreads=250 \
  -Jrampup=120 \
  -Jduration=600 \
  -Jloops=5 \
  -l results/stress-test/results.jtl \
  -e -o results/stress-test/html-report

echo "Stress test complete!"
echo "Open results/stress-test/html-report/index.html in browser"
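The script above ramps linearly to 250 users in a single run. If you prefer the discrete 50-user steps recommended earlier, one possible approach is a separate JMeter run per load level. The sketch below assumes the test plan reads the same `threads`, `rampup`, `duration`, and `loops` properties as above. The per-step ramp-up and duration values, the `step-N` output folders, and the `DRY_RUN` switch are illustrative assumptions, not part of the chapter's setup. By default it only prints the commands it would run.

```bash
#!/usr/bin/env bash
# Stepped stress run: one JMeter run per load level (illustrative sketch).

PLAN="shopping-portal-load-test.jmx"
MAX_USERS=250        # ceiling for this scenario (2.5x expected peak)
STEP=50              # increment recommended in the text
STEP_RAMPUP=30       # seconds of ramp-up per level (assumed value)
STEP_DURATION=120    # seconds held at each level (assumed value)
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the load levels on one line, e.g. "50 100 150 200 250".
step_levels() {
  max=$1; step=$2; n=$step; levels=""
  while [ "$n" -le "$max" ]; do
    levels="${levels:+$levels }$n"
    n=$((n + step))
  done
  echo "$levels"
}

# Build the JMeter command for one level, then print or run it.
run_step() {
  users=$1
  out="results/stress-test/step-$users"
  # Flags other than threads/rampup/duration mirror run-stress-test.sh
  set -- jmeter -n -t "$PLAN" \
    -Jthreads="$users" -Jrampup="$STEP_RAMPUP" -Jduration="$STEP_DURATION" -Jloops=5 \
    -l "$out/results.jtl"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    mkdir -p "$out"
    "$@"
  fi
}

for level in $(step_levels "$MAX_USERS" "$STEP"); do
  run_step "$level"
done
```

Run it with `DRY_RUN=0` to execute the steps. Separate runs give you one results file per level, which makes it easier to say "it was fine at 150 and failing at 200" than reading a single continuous ramp.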

How to Identify the Breaking Point

The breaking point is not always a dramatic crash. More often, it is a gradual collapse. Here are the signs to watch for in the stress test results:

Stage	Users	What You See	What It Means
Normal	0-100	Stable response times, 0% errors	Application is comfortable
Strain	100-150	Response times increasing, still < SLA	Application is working harder but coping
Degradation	150-200	Response times exceed SLA, occasional errors	Some resource is becoming scarce (CPU, memory, DB connections)
Saturation	200-230	Error rate climbing, throughput plateaus	A resource is maxed out, requests are queuing
Breaking	230+	Errors spike, response times explode, throughput drops	System cannot handle more load, requests are failing or timing out
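To see which stage a run has reached, it can help to group the raw samples by how many users were active when each request was sent. The sketch below is one way to do that, assuming `results.jtl` is in JMeter's CSV format with a header row that includes the `elapsed`, `success`, and `allThreads` columns. The function name `stage_report` and the 50-user bucket width are my own choices. It splits on plain commas, so it will misread rows whose messages contain commas.

```bash
#!/usr/bin/env bash
# stage_report FILE [BUCKET]: average response time and error rate per
# bucket of active threads (default bucket width: 50 users).
stage_report() {
  awk -F',' -v bucket="${2:-50}" '
    NR == 1 {                          # find columns by header name
      for (i = 1; i <= NF; i++) col[$i] = i
      next
    }
    {
      # 1-50 active threads -> bucket 0, 51-100 -> bucket 50, and so on
      b = int(($(col["allThreads"]) - 1) / bucket) * bucket
      n[b]++
      sum[b] += $(col["elapsed"])
      if ($(col["success"]) != "true") err[b]++
    }
    END {
      for (b in n)
        printf "%d-%d users: avg %.0f ms, errors %.1f%%\n",
               b + 1, b + bucket, sum[b] / n[b], 100 * err[b] / n[b]
    }
  ' "$1" | sort -n
}

# Only report if the stress test has already produced a results file.
if [ -f results/stress-test/results.jtl ]; then
  stage_report results/stress-test/results.jtl
fi
```

Reading the output top to bottom, the first bucket where the average crosses your SLA or the error rate leaves zero marks the move from strain into degradation.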

Stress Test Performance Curve

Comfort Zone, 0-100 users: fast, no errors, throughput grows linearly
→ Stress Zone, 100-200 users: slower, SLA breaches and occasional errors toward the top of the range, throughput growth slows
→ Buckle Zone, 200-230 users: error rate climbing, throughput flat
→ Break Zone, 230+ users: errors spike, response times 10x+, throughput drops

Key Metrics at the Breaking Point

stress-test-results.txt (text)

Breaking Point Analysis (250 users)
=====================================

Transaction          | p50    | p95     | p99      | Error %  | TPS
---------------------+--------+---------+----------+----------+------
01_Homepage          | 1.8s   | 4.2s    | 8.1s     | 2.3%     | 12.1
02_Login             | 1.2s   | 3.8s    | 12.4s    | 5.1%     | 8.4
03_Search            | 2.1s   | 6.9s    | 15.2s    | 8.7%     | 6.2
04_ProductDetail     | 0.8s   | 2.1s    | 4.3s     | 1.1%     | 11.8
05_AddToCart         | 0.6s   | 1.8s    | 5.9s     | 3.4%     | 9.3
06_ViewCart          | 0.9s   | 2.8s    | 6.7s     | 2.0%     | 10.5
07_Checkout          | 3.2s   | 9.8s    | 22.1s    | 12.3%    | 4.1
08_OrderConfirmation | 0.7s   | 1.9s    | 3.8s     | 1.8%     | 11.2

Overall Error Rate: 4.6%
Overall Throughput: 74.6 RPS (peaked at 89 RPS at ~180 users, then declined)

Bottleneck identified: Checkout and Search endpoints
- Checkout error rate 12.3% -- likely database lock contention
- Search p99 at 15.2s -- likely missing database index on product search
- Throughput peaked at 180 users then declined -- classic saturation pattern

Look at that throughput pattern. It peaked at ~180 users and then started declining even as we added more users. This is the classic saturation curve. Adding more users beyond 180 does not increase throughput -- it just makes everything slower because requests are queuing up behind a bottleneck. The system's practical capacity is around 180 concurrent users.
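You can locate that peak in your own run without the HTML report. The following is a rough sketch, assuming `results.jtl` is CSV with a header row, that `timeStamp` is in epoch milliseconds (JMeter's default), and that the `allThreads` column is present. The function name `throughput_peak` and the 10-second window are assumptions chosen for illustration. It counts requests per time window and reports the busiest window along with the highest active thread count seen in it.

```bash
#!/usr/bin/env bash
# throughput_peak FILE [WINDOW_SECONDS]: find the time window with the most
# samples and report its request rate and active thread count.
throughput_peak() {
  awk -F',' -v win="${2:-10}" '
    NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
    {
      w = int($(col["timeStamp"]) / (1000 * win))   # window index
      count[w]++
      t = $(col["allThreads"]) + 0
      if (t > threads[w]) threads[w] = t            # max threads in window
    }
    END {
      for (w in count)
        if (count[w] > best) { best = count[w]; at = threads[w] }
      printf "peak %.1f req/s at ~%d active threads\n", best / win, at
    }
  ' "$1"
}

# Only report if the stress test has already produced a results file.
if [ -f results/stress-test/results.jtl ]; then
  throughput_peak results/stress-test/results.jtl
fi
```

If the reported thread count is well below the number of users you ramped to, throughput peaked early and the remaining users only added queueing time.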

During stress tests, monitor the JMeter machine itself. If your machine runs out of CPU or memory, the bottleneck is in the test tool, not the application, and the results tell you nothing about the server. Always generate load in non-GUI mode (the -n flag used above). As a rough guide, JMeter on a standard laptop can drive about 300-500 threads, depending on the test plan. Beyond that, use distributed testing (Chapter 6).
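A quick way to sanity-check the load generator is to compare its load average with its core count while the test runs. This is a minimal sketch for Linux only, since it reads `/proc/loadavg` and calls `nproc`. The function name `loadgen_overloaded` and the "load above core count" threshold are assumptions, a common rule of thumb and not a JMeter recommendation. It does not check memory.

```bash
#!/usr/bin/env bash
# loadgen_overloaded LOAD CORES: exit 0 if the load average exceeds the
# number of cores (numeric comparison done in awk), exit 1 otherwise.
loadgen_overloaded() {
  awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load > cores) }'
}

# Take one sample on the machine running JMeter (Linux only).
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
  load=$(cut -d' ' -f1 /proc/loadavg)   # 1-minute load average
  cores=$(nproc)
  if loadgen_overloaded "$load" "$cores"; then
    echo "WARNING: load $load exceeds $cores cores; results may reflect the load generator"
  else
    echo "OK: load $load on $cores cores"
  fi
fi
```

Run it in a second terminal a few times during the ramp. If the warning appears before the application shows errors, treat the numbers from that run with suspicion.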

Key Point: Stress testing reveals the point where throughput stops growing and errors start climbing. Ramp up progressively and watch for the saturation curve. In our test, the system saturated at around 180 concurrent users. This is the number your stakeholders need to know.
