1. Why are percentiles preferred over averages for measuring response time?

- Percentiles are faster to calculate
- Averages hide outliers -- p95 shows what 95% of users actually experience
- Percentiles always give lower numbers, making reports look better
- Averages only work for small datasets
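The gap between an average and a percentile is easiest to see on a skewed sample. The sketch below uses a made-up dataset (90 fast requests, 10 slow ones) and a simple nearest-rank percentile; the `percentile` helper and the sample values are illustrative assumptions, not output from any real load test.

```python
import statistics


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    ordered = sorted(values)
    rank = max(1, -(-len(ordered) * pct // 100))  # ceiling of n * pct / 100
    return ordered[rank - 1]


# Hypothetical sample in milliseconds: 90 fast requests, 10 slow ones.
response_times_ms = [100] * 90 + [3000] * 10

mean_ms = statistics.mean(response_times_ms)
p50_ms = percentile(response_times_ms, 50)
p95_ms = percentile(response_times_ms, 95)

print(f"mean={mean_ms} ms  p50={p50_ms} ms  p95={p95_ms} ms")
```

In this sample the mean lands at a value that no request actually took, while p50 and p95 together show that most requests are fast and one in ten is very slow.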

2. What does Little's Law state?

- Response Time = Throughput + Error Rate
- Concurrent Users = Throughput × Average Response Time
- Throughput = CPU Usage × Memory Usage
- Error Rate = Failed Requests / Concurrent Users
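As a rough worked example of Little's Law (Concurrent Users = Throughput × Average Response Time), the sketch below converts between the three quantities. The function names and the numbers are assumptions chosen for illustration, and the relationship only holds for averages measured in steady state.

```python
def concurrent_users(throughput_rps, avg_response_time_s):
    """Little's Law: average number of requests in flight."""
    return throughput_rps * avg_response_time_s


def throughput(users, avg_response_time_s):
    """Little's Law rearranged: requests per second that a given
    number of in-flight requests corresponds to."""
    return users / avg_response_time_s


# Hypothetical numbers: 200 requests/second at 0.5 s average response time.
in_flight = concurrent_users(200, 0.5)
print(in_flight)  # 100.0

# The same system seen from the other side: 100 in flight at 0.5 s each.
print(throughput(100, 0.5))  # 200.0
```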

3. What happens when you run a load test without think time?

- The test fails to start
- Results are more accurate because there are no artificial delays
- The test generates 10-50x more load than real users, giving unrealistic results
- Think time has no impact on test results
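The effect of think time can be estimated with simple arithmetic. In a closed-loop test each virtual user sends one request, waits for the response, pauses for the think time, and repeats. The sketch below assumes that model and also assumes the response time stays constant under the heavier load, which a real system would not guarantee. All names and numbers are hypothetical.

```python
def requests_per_second(virtual_users, response_time_s, think_time_s=0.0):
    """Request rate produced by a closed-loop test in which each virtual
    user completes one request per (response time + think time) cycle."""
    return virtual_users / (response_time_s + think_time_s)


# Hypothetical scenario: 100 virtual users, 0.2 s responses, 5 s think time.
with_think = requests_per_second(100, 0.2, think_time_s=5.0)
without_think = requests_per_second(100, 0.2)

ratio = without_think / with_think
print(f"with think time:    {with_think:.1f} req/s")
print(f"without think time: {without_think:.1f} req/s")
print(f"load multiplier:    {ratio:.0f}x")
```

With these particular inputs the same 100 virtual users generate roughly 26 times more requests once think time is removed; the multiplier grows as think time gets longer relative to response time.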

4. If CPU usage is low but response time is high during a load test, what is the likely bottleneck?

- The CPU needs to be upgraded
- An external dependency (database, third-party API) is slow
- There are too many virtual users
- The test script has bugs

5. What is an acceptable error rate during a load test with expected traffic?

- Up to 10% is normal
- Up to 5% is acceptable
- Below 0.1% -- near zero under expected load
- Any error rate is acceptable if response time is fast

Metric	Use For	Healthy Value
p95 Response Time	SLA pass/fail	< 2s pages, < 500ms APIs
Throughput (RPS)	Capacity planning	Matches expected traffic
Error Rate	Reliability check	< 0.1% under normal load
CPU Usage	Server bottleneck	< 70% under load
Memory Usage	Leak detection	Stable, not growing
DB Connections	Pool exhaustion	< 80% of max pool
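The thresholds in the table can be turned into an automated pass/fail check at the end of a test run. The sketch below is one possible shape for that check. The metric keys, the `check_metrics` function, and the sample values are assumptions for illustration, it uses the API threshold (500 ms) for p95, and it leaves out the throughput and memory rows because those need a traffic target and a trend over time rather than a fixed limit.

```python
def check_metrics(m):
    """Compare collected metrics against the healthy values from the
    table and return the names of the checks that failed."""
    checks = {
        "p95_api_ms": m["p95_api_ms"] < 500,        # < 500 ms for APIs
        "error_rate": m["error_rate"] < 0.001,      # < 0.1% under normal load
        "cpu_pct": m["cpu_pct"] < 70,               # < 70% under load
        "db_pool_usage": m["db_connections"] / m["db_pool_max"] < 0.8,
    }
    return [name for name, ok in checks.items() if not ok]


# Hypothetical results from one test run.
run = {
    "p95_api_ms": 420,
    "error_rate": 0.0004,
    "cpu_pct": 85,
    "db_connections": 45,
    "db_pool_max": 50,
}

failed = check_metrics(run)
print("FAIL: " + ", ".join(failed) if failed else "PASS")
```

For this sample run the response time and error rate pass, while CPU usage and connection pool usage exceed their limits.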

Summary

What You Learned

Quick Reference Card
