Aggregation is where SQL gets powerful. COUNT, SUM, AVG, MIN, MAX — combined with GROUP BY and HAVING. These are the queries your manager will ask for when they want reports.
How much revenue per order status? This tells you completed vs cancelled vs pending amounts.
SELECT status,
COUNT(*) AS order_count,
SUM(total_amount) AS total_revenue,
ROUND(AVG(total_amount), 2) AS avg_order_value
FROM orders
GROUP BY status
ORDER BY total_revenue DESC;Find the 5 customers who spent the most. Only count completed orders.
SELECT c.customer_id, c.first_name, c.last_name,
COUNT(o.order_id) AS total_orders,
SUM(o.total_amount) AS total_spent
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.status = 'completed'
GROUP BY c.customer_id, c.first_name, c.last_name
ORDER BY total_spent DESC
LIMIT 5;For each product category, show: total products, total units sold, total revenue generated, and average rating.
SELECT p.category,
COUNT(DISTINCT p.product_id) AS total_products,
COALESCE(SUM(oi.quantity), 0) AS units_sold,
COALESCE(SUM(oi.quantity * oi.unit_price), 0) AS revenue,
ROUND(AVG(r.rating), 1) AS avg_rating
FROM products p
LEFT JOIN order_items oi ON p.product_id = oi.product_id
LEFT JOIN reviews r ON p.product_id = r.product_id
GROUP BY p.category
ORDER BY revenue DESC;Show revenue per month for completed orders. This is a classic reporting query.
SELECT
DATE_FORMAT(order_date, '%Y-%m') AS month,
COUNT(*) AS order_count,
SUM(total_amount) AS monthly_revenue,
ROUND(AVG(total_amount), 2) AS avg_order_value
FROM orders
WHERE status = 'completed'
GROUP BY DATE_FORMAT(order_date, '%Y-%m')
ORDER BY month;Find cities that have more than 2 customers. This uses HAVING.
SELECT city, COUNT(*) AS customer_count
FROM customers
GROUP BY city
HAVING COUNT(*) > 2
ORDER BY customer_count DESC;Show how much revenue came from each payment method. Only count successful payments.
SELECT payment_method,
COUNT(*) AS transaction_count,
SUM(amount) AS total_amount,
ROUND(AVG(amount), 2) AS avg_transaction,
MIN(amount) AS smallest_payment,
MAX(amount) AS largest_payment
FROM payments
WHERE payment_status = 'success'
GROUP BY payment_method
ORDER BY total_amount DESC;Find products that need improvement. Average rating below 4, with at least 1 review.
SELECT p.product_id, p.product_name, p.category,
COUNT(r.review_id) AS review_count,
ROUND(AVG(r.rating), 1) AS avg_rating,
MIN(r.rating) AS worst_rating,
MAX(r.rating) AS best_rating
FROM products p
JOIN reviews r ON p.product_id = r.product_id
GROUP BY p.product_id, p.product_name, p.category
HAVING AVG(r.rating) < 4
ORDER BY avg_rating ASC;Categorize customers by order frequency. "1 order" = One-time, "2-3 orders" = Regular, "4+ orders" = Loyal. Show count in each bucket.
SELECT
CASE
WHEN order_count = 1 THEN 'One-time'
WHEN order_count BETWEEN 2 AND 3 THEN 'Regular'
ELSE 'Loyal'
END AS customer_segment,
COUNT(*) AS customer_count
FROM (
SELECT customer_id, COUNT(*) AS order_count
FROM orders
GROUP BY customer_id
) AS customer_orders
GROUP BY
CASE
WHEN order_count = 1 THEN 'One-time'
WHEN order_count BETWEEN 2 AND 3 THEN 'Regular'
ELSE 'Loyal'
END
ORDER BY customer_count DESC;COUNT(*) counts all rows. COUNT(column) counts non-NULL values. COUNT(DISTINCT column) counts unique values. In interviews, they will test if you know the difference. Always clarify what you are counting.
Q: Can you use WHERE with aggregate functions? Why or why not?
A: No. WHERE filters individual rows BEFORE aggregation happens. If you write WHERE COUNT(*) > 5, you get an error. You must use HAVING for aggregate conditions. The execution order is: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY. By the time WHERE runs, there are no groups yet — so aggregate functions are meaningless. HAVING runs after GROUP BY, so it can filter on aggregates.
Key Point: GROUP BY + HAVING is how you summarize and filter data. Know the execution order: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY.