Entry and exit criteria are the guard rails that prevent you from wasting time on bad tests and declaring victory on incomplete results. Without them, you end up in one of two traps: running tests on a broken environment and getting garbage data, or endlessly running "just one more test" because nobody defined what "done" looks like.
Entry criteria answer: "Should we even start this test?" If any entry criterion is not met, stop and fix it before wasting time on a test that will produce useless results. Think of it like a pre-flight checklist -- a pilot does not skip it because they are in a hurry.
| Entry Criterion | How to Verify | What Happens if Skipped |
|---|---|---|
| Build is stable (no critical bugs) | Functional test suite passes, no P0 bugs open | Performance data is corrupted by functional failures -- 500 errors look like performance issues |
| Test environment is provisioned and validated | Smoke test passes, resource monitoring is working | Results are environment artifacts, not application behavior |
| Test data is loaded and verified | Row counts match targets, sample queries return expected results | Misleadingly fast results (queries against near-empty tables) or spurious errors (missing data) |
| Test scripts are reviewed and validated | Scripts run successfully with 1 user, correlation is correct | Script errors during load test waste the entire test window |
| Monitoring tools are configured | Grafana dashboards show live metrics, alerts are active | You see response times but miss the root cause (CPU at 100%, disk full, etc.) |
| Stakeholder sign-off on test plan | Email confirmation or meeting minutes | After results come in, stakeholders dispute the test methodology |
| Third-party mocks are deployed and validated | Mock endpoints respond with expected latency and data | Tests hit real services accidentally, or mocks return errors |
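The checklist above can be automated as a simple gate that refuses to start a run while any criterion is unmet. The following is a minimal sketch, not a definitive implementation: `EntryCriterion`, `unmet_criteria`, and the placeholder lambdas are hypothetical names introduced for illustration, and each real check (smoke test, row counts, mock health) would need its own probe.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EntryCriterion:
    """One row of the entry-criteria table (hypothetical helper type)."""
    name: str
    check: Callable[[], bool]  # returns True when the criterion is met


def unmet_criteria(criteria: List[EntryCriterion]) -> List[str]:
    """Return the names of criteria that are not met.

    A check that raises is treated as unmet: if the probe itself is
    broken, the environment has not been validated.
    """
    failed = []
    for criterion in criteria:
        try:
            ok = bool(criterion.check())
        except Exception:
            ok = False
        if not ok:
            failed.append(criterion.name)
    return failed


if __name__ == "__main__":
    # Placeholder checks; replace each lambda with a real probe.
    criteria = [
        EntryCriterion("Build is stable", lambda: True),
        EntryCriterion("Environment validated", lambda: True),
        EntryCriterion("Test data loaded", lambda: False),
    ]
    blockers = unmet_criteria(criteria)
    if blockers:
        print("Do not start the test. Unmet criteria:", ", ".join(blockers))
    else:
        print("All entry criteria met. Test may start.")
```

Returning the list of unmet criteria, rather than a single boolean, makes it clear what has to be fixed before the test window is used.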
Exit criteria define "done." They prevent both premature declarations of success and endless test cycles. Each criterion should be binary -- pass or fail, no "kind of."
A performance budget is a hard limit on specific metrics that, if exceeded, blocks a release. It is the performance equivalent of saying "this build does not ship until unit tests pass." Without budgets, performance is always a suggestion -- "it would be nice if it were faster" is not enforceable.
# Performance Budget -- Banking Portal
# These are pass/fail gates. Exceeding ANY budget blocks the release.
response_time:
  # Measured at p95 under target load (3,000 concurrent users)
  login: 1500ms                 # Users abandon after 2s
  view_balance: 800ms           # Must feel instant
  fund_transfer: 2000ms         # Users tolerate more for money movement
  transaction_history: 1500ms   # Search + render
  bill_payment: 2000ms
  statement_download: 5000ms    # Acceptable for PDF generation
error_rate:
  overall: 0.1%                 # 1 in 1,000 requests
  fund_transfer: 0.01%          # 1 in 10,000 -- money movement has zero tolerance
throughput:
  sustained_minimum: 300        # TPS for 30 minutes without degradation
  peak_burst: 500               # TPS for 5-minute burst
resource_utilization:
  cpu_average: 70%              # Headroom for unexpected spikes
  cpu_peak: 85%                 # Brief peaks are OK
  memory_usage: 80%             # Must not grow over time (leak detection)
  disk_io_wait: 10%             # Storage should not be the bottleneck
  db_connection_pool: 80%       # 20% headroom for connection spikes
stability:
  memory_growth_per_hour: 50MB  # Max acceptable -- indicates potential leak
  gc_pause_p99: 200ms           # GC pauses directly impact user experience
  thread_count_growth: 0        # Thread count should stabilize, not climb

Q: What entry and exit criteria do you define for performance testing?
A: For entry criteria, I verify five things before starting: the build is functionally stable (no P0 bugs), the test environment matches the specification in the test plan, test data is loaded with production-equivalent volume, test scripts have been validated with a 1-user dry run, and monitoring tools are active and collecting data. For exit criteria, I require: all critical scenarios executed at target load, p95 response times within the agreed SLA, error rate below 0.1%, system resources stable during steady state (no memory leaks, no CPU climbing), and results documented with server-side metrics correlated to client-side response times. If any exit criterion fails, I log it as a defect and the performance test status is "fail with known issues" rather than "pass."
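Because each exit criterion is binary, the response-time and error-rate gates can be evaluated mechanically. The sketch below is one possible way to do that, under stated assumptions: the p95 limits and the 0.1% error rate are copied from the budget in this section, while `budget_violations` and the shape of the measured results (a plain dictionary of p95 values in milliseconds) are hypothetical and would depend on your load-testing tool's output.

```python
from typing import Dict, List

# p95 budgets in milliseconds, copied from the performance budget above.
P95_BUDGET_MS: Dict[str, float] = {
    "login": 1500,
    "view_balance": 800,
    "fund_transfer": 2000,
    "transaction_history": 1500,
    "bill_payment": 2000,
    "statement_download": 5000,
}
MAX_OVERALL_ERROR_RATE_PCT = 0.1  # 1 in 1,000 requests


def budget_violations(p95_ms: Dict[str, float], error_rate_pct: float) -> List[str]:
    """Return every violated gate; an empty list means the gates pass.

    A transaction with no measurement counts as a violation, because an
    exit criterion requires all critical scenarios to be executed.
    """
    violations = []
    for transaction, limit in P95_BUDGET_MS.items():
        measured = p95_ms.get(transaction)
        if measured is None:
            violations.append(f"{transaction}: no measurement")
        elif measured > limit:
            violations.append(
                f"{transaction}: p95 {measured:.0f}ms exceeds {limit:.0f}ms"
            )
    if error_rate_pct > MAX_OVERALL_ERROR_RATE_PCT:
        violations.append(
            f"error rate {error_rate_pct}% exceeds {MAX_OVERALL_ERROR_RATE_PCT}%"
        )
    return violations


if __name__ == "__main__":
    # Hypothetical results, for illustration only.
    results = dict(P95_BUDGET_MS, login=1720)
    for line in budget_violations(results, error_rate_pct=0.05):
        print("FAIL:", line)
```

Any non-empty result maps to the "fail with known issues" status described above, with each violation logged as a defect.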
Negotiate performance budgets BEFORE the first test, not after. Once stakeholders see the actual numbers, they will try to adjust the budget to match the results. "Well, 3 seconds is not that bad..." is easier to argue when there is no pre-agreed budget. Set budgets during the planning phase and get written sign-off.
Apdex (Application Performance Index) converts response times into a user satisfaction score between 0 and 1. It uses a threshold T (say 1 second) to classify every response as Satisfied (< T), Tolerating (T to 4T), or Frustrated (> 4T). The formula: Apdex = (Satisfied + Tolerating/2) / Total, where each term is a request count. Because Tolerating responses count half, an Apdex of 0.94 does not mean that 94% of users are satisfied; it is a weighted score, and the same value can come from different mixes of Satisfied and Tolerating responses. Many teams set a performance budget of Apdex > 0.85 or 0.90.
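The formula can be sketched in a few lines of Python. This is a minimal illustration, assuming response times are available as a list of seconds; the function name `apdex` is hypothetical, and treating a response of exactly T as Satisfied (and exactly 4T as Tolerating) is a boundary assumption made here.

```python
from typing import Iterable


def apdex(response_times_s: Iterable[float], threshold_s: float) -> float:
    """Compute Apdex = (Satisfied + Tolerating / 2) / Total.

    Assumed boundaries: <= T is Satisfied, <= 4T is Tolerating,
    anything slower is Frustrated.
    """
    satisfied = 0
    tolerating = 0
    total = 0
    for rt in response_times_s:
        total += 1
        if rt <= threshold_s:
            satisfied += 1
        elif rt <= 4 * threshold_s:
            tolerating += 1
        # Frustrated responses add nothing to the numerator.
    if total == 0:
        raise ValueError("Apdex is undefined for an empty sample")
    return (satisfied + tolerating / 2) / total


if __name__ == "__main__":
    # Two satisfied, two tolerating, one frustrated, with T = 1.0 second.
    sample = [0.4, 0.9, 1.5, 3.0, 6.0]
    print(apdex(sample, threshold_s=1.0))  # prints 0.6
```

In the five-request sample only 40% of responses are Satisfied, yet the score is 0.6, which shows why Apdex should not be read as a percentage of satisfied users.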
Threshold T = 1.0 second
Out of 10,000 requests:
Satisfied (< 1.0s): 8,500 requests
Tolerating (1.0s - 4.0s): 1,200 requests
Frustrated (> 4.0s): 300 requests
Apdex = (8,500 + 1,200/2) / 10,000
Apdex = (8,500 + 600) / 10,000
Apdex = 9,100 / 10,000
Apdex = 0.91
Interpretation:
1.00 = Every user is satisfied (unrealistic)
0.94 - 1.0 = Excellent
0.85 - 0.93 = Good
0.70 - 0.84 = Fair (some users frustrated)
0.50 - 0.69 = Poor (many users frustrated)
< 0.50 = Unacceptable

Key Point: Entry criteria prevent testing on broken setups, exit criteria define when you are done, and performance budgets set hard pass/fail thresholds -- negotiate all of these before the first test, not after.