Chapter 3: Aggregate Functions and Window Functions
WHERE filters individual rows before grouping. HAVING filters groups after aggregation. This is the key difference. You cannot use WHERE to say "only show customers with more than 5 orders" because the count does not exist yet when WHERE runs. That is where HAVING comes in.
-- Find customers with more than 5 orders
SELECT
customer_id,
COUNT(*) AS order_count,
SUM(total_amount) AS total_spent
FROM orders
GROUP BY customer_id
HAVING COUNT(*) > 5
ORDER BY total_spent DESC;
-- Find product categories with revenue above 1,00,000
SELECT
p.category,
SUM(oi.quantity * oi.unit_price) AS category_revenue
FROM order_items oi
INNER JOIN products p ON oi.product_id = p.product_id
GROUP BY p.category
HAVING SUM(oi.quantity * oi.unit_price) > 100000;
-- WHERE + HAVING together
-- Find customers who placed more than 3 completed orders
SELECT
customer_id,
COUNT(*) AS completed_orders,
SUM(total_amount) AS total_revenue
FROM orders
WHERE status = 'completed' -- WHERE: filter rows first
GROUP BY customer_id
HAVING COUNT(*) > 3 -- HAVING: filter groups after
ORDER BY total_revenue DESC;-- Detect duplicate emails (data quality check)
SELECT
email,
COUNT(*) AS occurrences
FROM customers
GROUP BY email
HAVING COUNT(*) > 1;
-- Find days with suspiciously high order volume
-- (possible bot attack or test data leak)
SELECT
order_date,
COUNT(*) AS order_count
FROM orders
GROUP BY order_date
HAVING COUNT(*) > 1000
ORDER BY order_date DESC;
-- Products ordered in unusual quantities
-- (possible UI bug allowing qty > 100)
SELECT
product_id,
MAX(quantity) AS max_qty_ordered,
AVG(quantity) AS avg_qty_ordered
FROM order_items
GROUP BY product_id
HAVING MAX(quantity) > 100;A quick memory trick: WHERE filters the menu before cooking. HAVING tastes the dish after cooking and sends it back if it is not right. WHERE works on raw ingredients (rows). HAVING works on the final dish (groups).
Q: Can you use WHERE instead of HAVING with aggregate functions?
A: No. WHERE runs before GROUP BY, so aggregate values do not exist yet. Writing WHERE COUNT(*) > 5 will throw an error. You must use HAVING for conditions on aggregated values. However, you should use WHERE whenever possible because it filters rows early, reducing the amount of data GROUP BY needs to process. Use HAVING only when you need to filter on aggregate results.
Key Point: WHERE filters rows before grouping. HAVING filters groups after aggregation. Use WHERE for column values, HAVING for aggregate results. Always prefer WHERE when possible — it is faster.