These three are the ones you will use most. COUNT to verify row counts. SUM to verify totals. AVG to verify averages. Let me show you how they work with real QA scenarios — not textbook examples.
-- Total orders in the system
SELECT COUNT(*) AS total_orders FROM orders;
-- Orders placed today
SELECT COUNT(*) AS today_orders
FROM orders
WHERE order_date = CURRENT_DATE;
-- Orders with a coupon applied (non-NULL coupon_code)
SELECT
COUNT(*) AS total_orders,
COUNT(coupon_code) AS orders_with_coupon,
COUNT(*) - COUNT(coupon_code) AS orders_without_coupon
FROM orders;
-- Count distinct customers who ordered this month
SELECT COUNT(DISTINCT customer_id) AS unique_customers
FROM orders
WHERE order_date >= DATE_TRUNC('month', CURRENT_DATE);COUNT(DISTINCT column) is your best friend for deduplication checks. If COUNT(*) is 500 but COUNT(DISTINCT email) is 480, you have 20 duplicate email entries. That is a data quality issue worth investigating.
-- Total revenue from all completed orders
SELECT SUM(total_amount) AS total_revenue
FROM orders
WHERE status = 'completed';
-- QA scenario: Verify the "Total Revenue" on the admin dashboard
-- Dashboard shows: Rs 12,45,678
-- Your query should return the same number
SELECT SUM(total_amount) AS expected_revenue
FROM orders
WHERE status IN ('completed', 'delivered')
AND order_date BETWEEN '2024-01-01' AND '2024-12-31';
-- Sum of quantities ordered per product
SELECT
product_id,
SUM(quantity) AS total_units_sold
FROM order_items
GROUP BY product_id
ORDER BY total_units_sold DESC;-- Average order value
SELECT
ROUND(AVG(total_amount), 2) AS avg_order_value
FROM orders;
-- QA scenario: Average API response time per endpoint
SELECT
endpoint,
ROUND(AVG(response_time_ms), 2) AS avg_response_ms,
COUNT(*) AS total_requests
FROM api_logs
WHERE log_date = CURRENT_DATE
GROUP BY endpoint
ORDER BY avg_response_ms DESC;
-- Average order value for each customer
SELECT
customer_id,
ROUND(AVG(total_amount), 2) AS avg_order_value,
COUNT(*) AS total_orders
FROM orders
GROUP BY customer_id
ORDER BY avg_order_value DESC;AVG ignores NULL values. If you have 5 rows with values 10, 20, NULL, 30, 40 — AVG returns 25 (sum of 100 / 4 non-NULL rows), not 20 (sum of 100 / 5 total rows). If NULLs should count as zero, use AVG(COALESCE(column, 0)) instead.
Q: What is the difference between COUNT(*) and COUNT(column_name)?
A: COUNT(*) counts all rows regardless of NULL values. COUNT(column_name) counts only rows where that column is not NULL. Example: if you have 100 orders but only 60 have a shipping_date filled in, COUNT(*) returns 100 and COUNT(shipping_date) returns 60. Use COUNT(*) to count total rows. Use COUNT(column_name) to count how many rows have a value in that specific column.
Key Point: COUNT for "how many," SUM for "what is the total," AVG for "what is the average." These three cover 80% of your data validation needs.