Response Time Deep Dive

Response time is the metric users feel directly. When someone says "the app is slow," they are talking about response time. It is measured from the moment a request leaves the client to the moment the complete response is received. But that single number hides a lot of complexity.

What Makes Up Response Time

A request travels through many layers. Each layer adds time. Understanding where time is spent is the key to fixing performance issues.

Response Time Breakdown

DNS Resolution (1-50ms) -- Converting the domain name to an IP address. Usually cached after the first lookup, so later requests skip most of this cost.

TCP Connection (10-100ms) -- Establishing the connection. TLS/SSL handshake adds more time.

Request Transfer (1-10ms) -- Sending the request data over the network.

Server Processing (10-5000ms) -- Time the server spends handling the request. This is usually the biggest chunk. Includes application logic, database queries, and external API calls.

Response Transfer (1-1000ms) -- Sending the response back. Large payloads (images, big JSON) take longer.

Client Rendering (10-500ms) -- Browser parsing HTML, executing JavaScript, rendering the page. Not measured in API tests.
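The breakdown above can be sketched as a small calculation that adds up the phases and reports which one dominates. This is a minimal sketch: the phase names and timings are hypothetical values picked from within the ranges listed, not measurements, and client rendering is left out because it is not measured in API tests.

```python
# Hypothetical per-phase timings in milliseconds for a single API request.
# The values are illustrative and sit inside the ranges listed above.
phases_ms = {
    "dns": 20,
    "tcp_tls": 60,
    "request_transfer": 5,
    "server_processing": 400,
    "response_transfer": 50,
}


def summarize(phases):
    """Return (total_ms, slowest_phase, slowest_share_of_total)."""
    total = sum(phases.values())
    slowest = max(phases, key=phases.get)
    share = phases[slowest] / total
    return total, slowest, share


total, slowest, share = summarize(phases_ms)
print(f"total={total}ms, slowest phase={slowest}, share={share:.0%}")
```

With these assumed numbers, server processing accounts for roughly three quarters of the total, which is the pattern to check for first when a request is slow.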

Types of Response Time

Type	What It Measures	When to Use
Time to First Byte (TTFB)	DNS + connection + request transfer + server processing until first byte arrives	Server-side performance -- excludes response body transfer
Full Response Time	Total time from request sent to last byte received	API testing -- the complete server round-trip
Page Load Time	Full response + browser rendering + all sub-resources	End-user experience -- what the user actually perceives
Time to Interactive (TTI)	When the page becomes usable (not just visible)	Frontend performance -- when users can actually click things
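The difference between TTFB and full response time can be sketched from three timestamps. The class and field names below are assumptions made for illustration, and the timestamp values are invented. In practice they would come from a load-testing tool or HTTP client that exposes these events.

```python
from dataclasses import dataclass


@dataclass
class RequestEvents:
    """Timestamps in milliseconds, all taken from the same clock (hypothetical names)."""
    sent: float        # request leaves the client
    first_byte: float  # first byte of the response arrives
    last_byte: float   # last byte of the response arrives


def ttfb_ms(e: RequestEvents) -> float:
    """Time to First Byte: request sent until the first response byte."""
    return e.first_byte - e.sent


def full_response_ms(e: RequestEvents) -> float:
    """Full response time: request sent until the last response byte."""
    return e.last_byte - e.sent


def body_transfer_ms(e: RequestEvents) -> float:
    """Time spent receiving the response after its first byte."""
    return e.last_byte - e.first_byte


events = RequestEvents(sent=0.0, first_byte=180.0, last_byte=650.0)
print(ttfb_ms(events), full_response_ms(events), body_transfer_ms(events))
```

A high TTFB points toward connection setup or server processing, while a large gap between TTFB and full response time points toward payload size.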

What Good Response Times Look Like

Users perceive speed against three thresholds: 100ms, 1 second, and 3 seconds. Under 100ms feels instant -- the user does not notice any delay. Between 100ms and 1 second feels responsive -- the user notices a pause but does not lose focus. Above 1 second, the user starts feeling frustrated. Above 3 seconds, many users abandon the action.
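These perception bands can be written as a small classifier, which is handy for labelling measurements in a test report. The function name and labels are assumptions. Only the 100ms, 1 second, and 3 second cut-offs come from the text.

```python
def perceived_speed(response_ms: float) -> str:
    """Map a response time in milliseconds to the perception bands described above."""
    if response_ms < 100:
        return "instant"
    if response_ms <= 1000:
        return "responsive"
    if response_ms <= 3000:
        return "frustrating"
    return "abandonment risk"


for ms in (50, 400, 2500, 4000):
    print(ms, perceived_speed(ms))
```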

Operation	Target p95	Unacceptable	Rationale
API endpoint (simple)	< 200ms	> 1 second	APIs are building blocks -- slow APIs cascade
API endpoint (complex)	< 500ms	> 2 seconds	Report generation, search with filters
Web page load	< 2 seconds	> 5 seconds	Users expect pages to load in 2-3 seconds
Login/authentication	< 1 second	> 3 seconds	First interaction sets perception of speed
Checkout/payment	< 3 seconds	> 10 seconds	Users are anxious during payment -- slow = abandoned
File upload/download	Depends on size	No progress indicator	Users tolerate slow if they see progress
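One way to apply the table is to compute the p95 of measured samples and compare it with the target and unacceptable limits. The sketch below is one possible approach, not a standard tool: the dictionary keys and verdict strings are hypothetical, and p95 is computed with the nearest-rank method, which is only one of several common percentile definitions.

```python
import math

# (target_ms, unacceptable_ms) taken from the table above; key names are hypothetical.
LIMITS_MS = {
    "api_simple": (200, 1000),
    "api_complex": (500, 2000),
    "page_load": (2000, 5000),
    "login": (1000, 3000),
    "checkout": (3000, 10000),
}


def p95(samples):
    """Nearest-rank 95th percentile: the smallest sample that is >= 95% of all samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(95 * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def grade(operation, samples):
    """Compare the p95 of the samples with the limits for the given operation."""
    target, unacceptable = LIMITS_MS[operation]
    value = p95(samples)
    if value < target:
        return "meets target"
    if value > unacceptable:
        return "unacceptable"
    return "needs attention"


print(grade("api_simple", [120] * 95 + [900] * 5))
print(grade("api_simple", [120] * 90 + [1500] * 10))
```

In the second call, 10% of the requests are slow, so the p95 lands on a slow sample and the endpoint fails even though most requests are fast.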

A slow API call is paid for every time it runs. If a page makes 10 sequential calls to the same slow API, that API's latency is added to the page load 10 times. When measuring page performance, trace slow response times back to individual API calls. Often one slow API call is the root cause of a slow page.
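How much API calls cost a page depends on whether they run one after another or concurrently: sequential calls add up, while concurrent calls are bounded by the slowest one. The sketch below illustrates this with invented endpoint paths and timings, and also picks out the slowest call as the first candidate to investigate.

```python
# Hypothetical API calls made during one page load, with invented durations in ms.
calls_ms = {
    "/api/user": 40,
    "/api/cart": 55,
    "/api/recommendations": 900,
    "/api/banner": 30,
}


def api_time_ms(calls, parallel):
    """Time spent waiting on APIs: the slowest call if parallel, the sum if sequential."""
    durations = list(calls.values())
    return max(durations) if parallel else sum(durations)


def slowest_call(calls):
    """Return (endpoint, duration_ms) for the slowest call."""
    return max(calls.items(), key=lambda item: item[1])


print("sequential:", api_time_ms(calls_ms, parallel=False), "ms")
print("parallel:", api_time_ms(calls_ms, parallel=True), "ms")
print("slowest:", slowest_call(calls_ms))
```

Either way, the single 900ms call dominates the result, so fixing it helps more than tuning the three fast calls.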

Q: What is response time and what factors affect it?

A: Response time is the total duration from when a request is sent to when the complete response is received. It breaks down into DNS resolution (1-50ms), TCP/TLS connection (10-100ms), request transfer (1-10ms), server processing (10-5000ms -- usually the biggest factor), and response transfer (1-1000ms). Factors that affect it include: server-side processing speed (database queries, business logic), network latency, payload size, server load (more concurrent requests mean more queuing), and infrastructure (CDN, load balancer, caching). For performance testing, I focus on the p95 response time because averages hide outliers.
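The point about averages hiding outliers can be shown with a small invented data set in which 90% of requests are fast and 10% are very slow. This sketch uses Python's standard statistics module. The sample values are hypothetical, and the "inclusive" quantile method is just one reasonable choice.

```python
import statistics

# Invented sample: 90 fast requests and 10 slow ones, in milliseconds.
samples_ms = [100] * 90 + [2000] * 10

mean_ms = statistics.mean(samples_ms)
median_ms = statistics.median(samples_ms)
# quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
p95_ms = statistics.quantiles(samples_ms, n=100, method="inclusive")[94]

print(f"mean={mean_ms}ms median={median_ms}ms p95={p95_ms}ms")
```

With this sample, the mean (290ms) does not describe any real request: most finish in 100ms and the slow ones take 2000ms. The p95 of 2000ms shows what the slowest users actually experience.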

Key Point: Response time is what users feel. Under 100ms feels instant, under 1 second feels responsive, above 3 seconds users leave. Always measure p95, not average. Trace slow pages back to individual slow API calls.

