Here is where performance testing goes from a one-time activity to a continuous quality gate. If performance tests only run before a big release, they catch problems too late. By the time you find a slow query introduced 6 sprints ago, nobody remembers the change that caused it. Integrating performance tests into your CI pipeline means every build gets tested, and regressions are caught within hours, not months.
Not all performance tests belong in CI. A 4-hour soak test in every build would make your pipeline take all day. The key is to run lightweight tests in CI and reserve heavy tests for scheduled runs.
| Test Type | CI Pipeline | Scheduled (Weekly) | Pre-Release (Manual) |
|---|---|---|---|
| Smoke (10 users, 2 min) | Every build | N/A | N/A |
| Baseline (50 users, 10 min) | Every merge to main | N/A | N/A |
| Load Test (target users, 30 min) | No (too slow) | Weekly | Every release |
| Stress Test (2x target, 30 min) | No | No | Every major release |
| Spike Test (burst pattern, 20 min) | No | No | When architecture changes |
| Soak Test (normal, 4-8 hours) | No | No | Quarterly or before major releases |
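One way to keep the lightweight tiers in a single script is to pick the load profile from a variable that each pipeline sets. The following is a minimal sketch in plain JavaScript, not a prescribed setup: the `PROFILES` map, the `selectProfile` helper, and the `TEST_PROFILE` variable name are hypothetical, while the smoke and baseline numbers come from the table above.

```javascript
// Hypothetical load profiles; the numbers mirror the smoke and baseline rows of the table.
const PROFILES = {
  smoke: { vus: 10, duration: "2m" },     // every build
  baseline: { vus: 50, duration: "10m" }, // every merge to main
};

// Return the named profile, falling back to the cheapest one so that a typo
// in the CI configuration never launches a heavier test by accident.
function selectProfile(name) {
  const known = Object.prototype.hasOwnProperty.call(PROFILES, name);
  return known ? PROFILES[name] : PROFILES.smoke;
}

// In a k6 script this could feed the exported options, for example:
//   export const options = selectProfile(__ENV.TEST_PROFILE);
console.log(selectProfile("baseline"));
```

Heavier tiers (load, stress, soak) are left out of the map on purpose, since the table reserves them for scheduled or manual runs.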
```yaml
name: Performance Smoke Test

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  performance-smoke:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install k6
        run: |
          sudo gpg -k
          sudo gpg --no-default-keyring --keyring /usr/share/keyrings/k6-archive-keyring.gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys C5AD17C747E3415A3642D57D77C6C491D6AC1D68
          echo "deb [signed-by=/usr/share/keyrings/k6-archive-keyring.gpg] https://dl.k6.io/deb stable main" | sudo tee /etc/apt/sources.list.d/k6.list
          sudo apt-get update
          sudo apt-get install -y k6

      - name: Run Performance Smoke Test
        # K6_OUT=json streams raw, line-delimited samples to results.json;
        # --summary-export writes the end-of-test aggregates as one JSON object.
        run: k6 run --summary-export=summary.json tests/performance/smoke-test.js
        env:
          K6_OUT: json=results.json
          BASE_URL: ${{ secrets.PERF_TEST_URL }}

      - name: Check Results Against Budget
        run: |
          # Parse k6 JSON output and check against budgets
          python3 scripts/check-perf-budget.py results.json

      - name: Upload Results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: performance-results
          path: |
            results.json
            summary.json

      - name: Comment PR with Results
        # always() lets the comment be posted even when an earlier step failed
        if: always() && github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            // results.json is line-delimited and cannot be parsed as one document,
            // so read the aggregated summary instead. Field names below assume
            // the layout written by k6's --summary-export.
            const { metrics } = JSON.parse(fs.readFileSync('summary.json', 'utf8'));
            const p95 = metrics.http_req_duration['p(95)'];
            const errorRate = metrics.http_req_failed.value * 100; // fraction -> percent
            const rps = metrics.http_reqs.rate;
            const status = (ok) => (ok ? 'PASS' : 'FAIL');
            // Post a summary comment on the PR
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: [
                '## Performance Smoke Test Results',
                '',
                '| Metric | Value | Budget | Status |',
                '|--------|-------|--------|--------|',
                `| p95 Response Time | ${p95.toFixed(0)}ms | <1500ms | ${status(p95 < 1500)} |`,
                `| Error Rate | ${errorRate.toFixed(2)}% | <0.1% | ${status(errorRate < 0.1)} |`,
                `| Throughput | ${rps.toFixed(1)} RPS | >50 RPS | ${status(rps > 50)} |`,
              ].join('\n'),
            });
```

The performance smoke test is designed to run in under 5 minutes with minimal resources. It does not validate capacity -- it catches regressions. If login was 200ms last build and is now 2,000ms, the smoke test catches it before it reaches production.
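The workflow's "Check Results Against Budget" step calls a script that is not shown here. As a rough illustration of what such a check might do, the sketch below reads k6's line-delimited JSON output, computes a p95 per metric, and compares it with a budget. It is written in JavaScript to match the k6 script and is not the `check-perf-budget.py` implementation. The helper names and the hand-written sample are hypothetical, and the nearest-rank percentile used here can differ slightly from the value k6 itself reports.

```javascript
// Collect the sample values of one metric from k6's line-delimited JSON output.
function metricValues(ndjson, metricName) {
  return ndjson
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line))
    .filter((entry) => entry.type === "Point" && entry.metric === metricName)
    .map((entry) => entry.data.value);
}

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(values, p) {
  if (values.length === 0) return NaN;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Compare each budgeted metric's p95 with its limit in milliseconds.
// A metric with no samples yields NaN and therefore fails the check.
function checkBudgets(ndjson, budgets) {
  return Object.entries(budgets).map(([metric, limitMs]) => {
    const p95 = percentile(metricValues(ndjson, metric), 95);
    return { metric, p95, limitMs, pass: p95 < limitMs };
  });
}

// Hand-written sample: login has regressed to around 2,000 ms.
const sample = [
  '{"type":"Point","metric":"login_duration","data":{"value":2000}}',
  '{"type":"Point","metric":"login_duration","data":{"value":1900}}',
  '{"type":"Point","metric":"balance_duration","data":{"value":300}}',
].join("\n");

const report = checkBudgets(sample, { login_duration: 1500, balance_duration: 800 });
const failed = report.some((row) => !row.pass);
console.log(report, failed ? "budget exceeded" : "within budget");
```

In a pipeline, a script along these lines would typically end with a non-zero exit code when `failed` is true so that the build is marked as failed.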
```javascript
import http from "k6/http";
import { check, sleep } from "k6";
import { Trend, Rate } from "k6/metrics";

// Custom metrics for budget checking
const loginDuration = new Trend("login_duration");
const balanceDuration = new Trend("balance_duration");
const errorRate = new Rate("errors");

export const options = {
  // Smoke test: 10 users for 2 minutes
  vus: 10,
  duration: "2m",
  thresholds: {
    login_duration: ["p(95)<1500"], // Same budgets as full test
    balance_duration: ["p(95)<800"],
    errors: ["rate<0.01"], // < 1% errors
    http_req_duration: ["p(95)<2000"], // Overall p95
  },
};

const BASE_URL = __ENV.BASE_URL || "https://staging.banking.app";

export default function () {
  // Login
  const loginRes = http.post(
    `${BASE_URL}/api/auth/login`,
    JSON.stringify({
      username: `testuser_${__VU}`,
      password: "Test@1234",
    }),
    { headers: { "Content-Type": "application/json" } }
  );
  loginDuration.add(loginRes.timings.duration);
  errorRate.add(loginRes.status !== 200);
  check(loginRes, { "login 200": (r) => r.status === 200 });
  if (loginRes.status !== 200) return;
  sleep(2); // Minimal think time for smoke test

  // Balance check
  const token = loginRes.json("token");
  const balanceRes = http.get(`${BASE_URL}/api/accounts/balance`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  balanceDuration.add(balanceRes.timings.duration);
  errorRate.add(balanceRes.status !== 200);
  check(balanceRes, { "balance 200": (r) => r.status === 200 });
  sleep(2);
}
```

Running performance tests in CI is only half the battle. You also need to track trends over time. A 10% regression per release compounds into roughly a 2.1x slowdown over 8 releases (1.1^8 ≈ 2.14). Store results in a time-series database (InfluxDB, TimescaleDB) or even a simple CSV file in your repository, and chart them.
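The compounding arithmetic, and one possible way to flag gradual drift, can be sketched as follows. The function names, the median-of-earlier-builds baseline, and the 15% tolerance are assumptions chosen for illustration rather than a recommended policy.

```javascript
// Slowdown factor after `releases` releases that each regress by `rate` (0.10 = 10%).
function compoundedSlowdown(rate, releases) {
  return Math.pow(1 + rate, releases);
}

// Flag gradual drift by comparing the latest p95 with the median of all earlier builds.
function hasDrifted(p95History, tolerance) {
  if (p95History.length < 2) return false;
  const earlier = p95History.slice(0, -1).sort((a, b) => a - b);
  const mid = Math.floor(earlier.length / 2);
  const median =
    earlier.length % 2 === 1 ? earlier[mid] : (earlier[mid - 1] + earlier[mid]) / 2;
  const latest = p95History[p95History.length - 1];
  return latest > median * (1 + tolerance);
}

console.log(compoundedSlowdown(0.10, 8).toFixed(2)); // "2.14"

// Each build is only 10% slower than the one before it, so a 15% build-to-build
// gate would pass every time, yet the latest value is well above the median.
console.log(hasDrifted([400, 440, 484, 532.4], 0.15)); // true
```

Comparing against a longer-running baseline instead of only the previous build is what lets a check like this notice slow, steady degradation.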
Store performance test results as build artifacts (JSON files) and build a simple dashboard that charts p95 response time, error rate, and throughput per build number. When a graph spikes, click through to the build, find the commit, and you have your root cause. Tools like Grafana with InfluxDB make this trivial.
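If results go to InfluxDB, each build can be written as one point in InfluxDB's line protocol. The following is a minimal sketch of formatting such a point; the `perf_smoke` measurement name, the tag and field names, and the shape of the `build` object are all hypothetical, and actually sending the line to a server is left out.

```javascript
// Escape characters that are special inside line-protocol tag values.
function escapeTag(value) {
  return String(value).replace(/[,= ]/g, (ch) => "\\" + ch);
}

// Escape characters that are special inside double-quoted string field values.
function escapeString(value) {
  return String(value).replace(/["\\]/g, (ch) => "\\" + ch);
}

// Format one build's results as a single line-protocol point.
function toLineProtocol(build) {
  const tags = `branch=${escapeTag(build.branch)}`;
  const fields = [
    `build=${build.number}i`, // trailing "i" marks an integer field
    `commit="${escapeString(build.commit)}"`,
    `p95_ms=${build.p95Ms}`,
    `error_rate=${build.errorRate}`,
    `rps=${build.rps}`,
  ].join(",");
  // Line protocol timestamps default to nanosecond precision.
  const timestampNs = BigInt(build.timestampMs) * 1000000n;
  return `perf_smoke,${tags} ${fields} ${timestampNs}`;
}

console.log(
  toLineProtocol({
    branch: "main",
    number: 412,
    commit: "abc1234",
    p95Ms: 850,
    errorRate: 0.002,
    rps: 62,
    timestampMs: 1700000000000,
  })
);
```

The build number and commit are stored as fields rather than tags here, because values that change on every build make poor tags. A dashboard can still display them next to each point, which supports the click-through from a spike to the commit.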
CI performance tests must run against a dedicated, isolated environment -- never against a shared staging server. If another team deploys a broken build to staging during your CI test, your build fails and everyone wastes time debugging a false positive. Use a dedicated performance test environment or spin up ephemeral environments for each test run.
Key Point: Run lightweight performance smoke tests in every CI build to catch regressions early, and schedule full load tests weekly. Track results over time to detect gradual degradation before it compounds.