Chapter 8: Analyzing Results and Bottleneck Identification
Time to put everything together. In this exercise, you will run a load test against the Shopping Portal, analyze the results, identify bottlenecks, and write a mini performance report. This is the closest thing to what you will do on a real project.
1. Create a JMeter test plan targeting the Shopping Portal with three endpoints: Homepage (/shopping), Product List (/shopping/products), and Product Detail (any product page).
2. Configure a Thread Group: 50 users, ramp-up over 60 seconds, hold for 3 minutes, then ramp down.
3. Add a Summary Report listener and configure results file output (.jtl).
4. Run the test in non-GUI mode with HTML report generation: jmeter -n -t test.jmx -l results.jtl -e -o ./report
5. Open the HTML report. Answer: What is the overall error rate? What is the p95 response time? What is the peak throughput?
6. Look at the Response Times Over Time chart. Is the line flat, rising, or spiking? At what user count did degradation start (if any)?
7. Look at the Aggregate Report. Which endpoint is slowest? Which has the highest error rate?
8. Write a 3-sentence executive summary of your findings.
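If you want to cross-check the numbers in the HTML report, the same questions can be answered directly from the results file. The following is a minimal sketch, not JMeter's own calculation. It assumes the .jtl file was written in CSV format with a header row containing the default timeStamp, elapsed, label, and success columns. The function names, the nearest-rank percentile method, and the one-second throughput buckets are choices made for this illustration, so expect small differences from the report. The sample rows are invented purely to show the shape of the input.

```python
import csv
import io
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize_jtl(rows):
    """Summarize JTL rows (dicts with timeStamp, elapsed, label, success keys)."""
    elapsed_by_label = defaultdict(list)
    errors_by_label = defaultdict(int)
    samples_per_second = defaultdict(int)

    for row in rows:
        label = row["label"]
        elapsed_by_label[label].append(int(row["elapsed"]))
        if row["success"].strip().lower() != "true":
            errors_by_label[label] += 1
        # timeStamp is in epoch milliseconds; bucket samples by whole second.
        samples_per_second[int(row["timeStamp"]) // 1000] += 1

    all_elapsed = [ms for values in elapsed_by_label.values() for ms in values]
    total = len(all_elapsed)
    return {
        "samples": total,
        "error_rate_pct": 100 * sum(errors_by_label.values()) / total,
        "p95_ms": percentile(all_elapsed, 95),
        "peak_throughput_rps": max(samples_per_second.values()),
        "by_label": {
            label: {
                "p95_ms": percentile(values, 95),
                "error_rate_pct": 100 * errors_by_label[label] / len(values),
            }
            for label, values in elapsed_by_label.items()
        },
    }


# Invented sample rows, only to illustrate the expected CSV shape.
SAMPLE_JTL = """timeStamp,elapsed,label,responseCode,success
1700000000000,120,Homepage,200,true
1700000000400,300,Product List,200,true
1700000000900,150,Homepage,200,true
1700000001200,900,Product Detail,500,false
"""

summary = summarize_jtl(csv.DictReader(io.StringIO(SAMPLE_JTL)))
print(summary)
```

To analyze a real run, open results.jtl with open("results.jtl", newline="") and pass csv.DictReader(f) to summarize_jtl in place of the sample string. The by_label entry helps with the "which endpoint is slowest" question.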
Even without running a test, you can practice analysis skills. Study the following simulated data and answer the questions below.
Test Configuration:
- 300 concurrent users, ramp-up over 3 minutes, hold for 10 minutes
- 4 endpoints: /login, /dashboard, /api/transactions, /api/transfer
Aggregate Results:
──────────────────────────────────────────────────────────────────────
Endpoint            Samples  Avg(ms)  p50(ms)  p95(ms)  p99(ms)   Err%
/login               12,000      180      150      420      800   0.1%
/dashboard           30,000      220      180      550    1,200   0.2%
/api/transactions    25,000      450      300    1,800    4,500   1.5%
/api/transfer         8,000    2,100      800    8,200   15,000   6.8%
──────────────────────────────────────────────────────────────────────
Server Metrics During Peak Load:
- App Server CPU: 55%
- App Server Memory: 72% (stable)
- DB Server CPU: 94%
- DB Active Connections: 19 of 20 (connection pool almost full)
- DB Slow Query Log: SELECT * FROM accounts WHERE user_id = ? AND status = 'active' (avg 1,200ms, called 33,000 times)
Error Details:
- 95% of errors are HTTP 504 (Gateway Timeout) on /api/transfer
- 5% of errors are HTTP 500 on /api/transactions
Using the simulated data above, create a findings table that checks the results against the following SLA targets:
- Login p95 < 500ms
- Dashboard p95 < 1 second
- Transactions p95 < 2 seconds
- Transfer p95 < 3 seconds
- Overall error rate < 1%
- System must support 300 concurrent users
Mark each target as PASS or FAIL, then write a one-paragraph executive summary with a go/no-go recommendation.
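Once you have drafted your own table, one way to check the mechanical part of it is a short script. The sketch below hard-codes the simulated aggregate results and the numeric SLA targets, compares each p95 against its target, and derives the overall error rate by weighting each endpoint's error percentage by its sample count. The names RESULTS, P95_TARGETS_MS, and build_findings are illustrative only. The "must support 300 concurrent users" target is deliberately left out, because it is a judgment you make from the other rows rather than a single number to compare.

```python
# (endpoint, samples, p95 in ms, error %) copied from the simulated aggregate results.
RESULTS = [
    ("/login", 12_000, 420, 0.1),
    ("/dashboard", 30_000, 550, 0.2),
    ("/api/transactions", 25_000, 1_800, 1.5),
    ("/api/transfer", 8_000, 8_200, 6.8),
]

# SLA targets from the exercise, expressed in milliseconds.
P95_TARGETS_MS = {
    "/login": 500,
    "/dashboard": 1_000,
    "/api/transactions": 2_000,
    "/api/transfer": 3_000,
}
MAX_ERROR_PCT = 1.0


def build_findings(results):
    """Return rows of (metric, target, actual, verdict)."""
    findings = []
    for endpoint, _samples, p95_ms, _err_pct in results:
        target = P95_TARGETS_MS[endpoint]
        verdict = "PASS" if p95_ms < target else "FAIL"
        findings.append((f"{endpoint} p95", f"< {target} ms", f"{p95_ms} ms", verdict))

    # Overall error rate = total failed samples / total samples.
    total_samples = sum(samples for _, samples, _, _ in results)
    total_errors = sum(samples * err_pct / 100 for _, samples, _, err_pct in results)
    overall_pct = 100 * total_errors / total_samples
    verdict = "PASS" if overall_pct < MAX_ERROR_PCT else "FAIL"
    findings.append(("Overall error rate", f"< {MAX_ERROR_PCT}%", f"{overall_pct:.2f}%", verdict))
    return findings


findings = build_findings(RESULTS)
for metric, target, actual, verdict in findings:
    print(f"{metric:<24}{target:>12}{actual:>12}  {verdict}")
```

The script only produces verdicts. Explaining why a target failed, by tying it to the server metrics and error details, is still the part you write yourself in the executive summary.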
Practicing with simulated data is how you build pattern recognition. On a real project, you will not have time to ponder each chart for 30 minutes. You need to recognize "oh, app CPU is low but DB CPU is high -- it is a query problem" instantly. Run through these exercises multiple times until the analysis feels automatic.
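The pattern matching described above can be written down as a few explicit rules, which is a useful way to check that you can state your reasoning. The sketch below is a rough heuristic, not a diagnostic tool. The function name classify_bottleneck and both thresholds (85% CPU counted as saturated, 90% pool usage counted as nearly full) are assumptions chosen for illustration, and real systems need more signals than these three.

```python
def classify_bottleneck(app_cpu_pct, db_cpu_pct, pool_in_use, pool_size,
                        high_cpu_pct=85.0, pool_full_ratio=0.9):
    """Return a list of hints based on simple resource-saturation rules.

    The thresholds are illustrative assumptions, not industry standards.
    """
    hints = []
    if db_cpu_pct >= high_cpu_pct and app_cpu_pct < high_cpu_pct:
        hints.append("database CPU is saturated while app CPU has headroom: "
                     "inspect slow queries and indexes")
    if app_cpu_pct >= high_cpu_pct:
        hints.append("application CPU is saturated: profile application code")
    if pool_in_use / pool_size >= pool_full_ratio:
        hints.append("connection pool is nearly exhausted: "
                     "requests are likely queuing for a connection")
    return hints or ["no obvious resource saturation in these metrics"]


# Values taken from the simulated server metrics during peak load.
hints = classify_bottleneck(app_cpu_pct=55, db_cpu_pct=94, pool_in_use=19, pool_size=20)
for hint in hints:
    print("-", hint)
```

For the simulated data, both the database CPU rule and the connection pool rule fire, which matches the slow query log and the 504 timeouts on /api/transfer.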
Key Point: Practice makes the analysis automatic. Run real load tests against the Shopping Portal, study simulated data, and write findings tables until you can identify bottleneck patterns in minutes, not hours.
Key Point: Run load tests against the Shopping Portal, analyze reports top-down, correlate with server metrics, and practice writing findings tables with clear pass/fail verdicts.
Answer all 8 questions.
1. In a JMeter HTML report, which metric should you use instead of Average Response Time to understand realistic user experience?
2. During a load test, throughput increases proportionally with users, then plateaus, then DROPS while response times spike. What phase has the system entered?
3. Your load test shows 5% errors, all HTTP 502 Bad Gateway, but the application logs show no errors. What is the most likely explanation?
4. Application server CPU is 40%, database server CPU is 92%, and response times are increasing linearly with user count. Where is the bottleneck?
5. In a soak test under constant load, memory usage climbs steadily over 4 hours and never stabilizes. What does this indicate?
6. What does Little's Law state in the context of performance testing?
7. When presenting performance test results to a non-technical product manager, what should come FIRST in your report?
8. The p50 response time for an endpoint is 100ms, but the p99 is 5,000ms (a 50x ratio). What does this large gap indicate?