Chapter 3: Aggregate Functions and Window Functions

MIN, MAX, GROUP BY — Finding Extremes and Grouping Data

MIN and MAX find the extremes. GROUP BY splits your data into groups and runs aggregates on each group separately. Think of GROUP BY like sorting a deck of cards by suit — then counting how many cards are in each suit. Without GROUP BY, aggregates work on the entire table. With GROUP BY, they work on each group independently.

MIN and MAX — Finding Boundaries

min-max-examples.sqlsql

-- Cheapest and most expensive order
SELECT
  MIN(total_amount) AS smallest_order,
  MAX(total_amount) AS largest_order
FROM orders;

-- First and last order dates
SELECT
  MIN(order_date) AS first_order_ever,
  MAX(order_date) AS most_recent_order
FROM orders;

-- QA scenario: Detect outliers
-- If max order amount is 99,99,999 but your product max price is 50,000
-- something is very wrong
SELECT
  MIN(total_amount) AS min_amount,
  MAX(total_amount) AS max_amount,
  MAX(total_amount) - MIN(total_amount) AS range_spread
FROM orders
WHERE status = 'completed';

-- Earliest and latest login per user (detect dormant accounts)
SELECT
  user_id,
  MIN(login_at) AS first_login,
  MAX(login_at) AS last_login
FROM login_history
GROUP BY user_id
HAVING MAX(login_at) < CURRENT_DATE - INTERVAL '90 days';

MIN and MAX work on dates too. MIN(order_date) gives you the earliest date. MAX(order_date) gives the latest. This is how you verify "First order placed on" and "Last active on" fields in user profiles.

GROUP BY — Aggregate Per Category

Without GROUP BY, you get one row — one total for the whole table. With GROUP BY, you get one row per group. Group by status, you get totals per status. Group by customer, you get totals per customer. Group by date, you get totals per day.

group-by-examples.sqlsql

-- Orders grouped by status
SELECT
  status,
  COUNT(*) AS order_count,
  SUM(total_amount) AS total_revenue
FROM orders
GROUP BY status
ORDER BY order_count DESC;

-- Daily order summary
SELECT
  order_date,
  COUNT(*) AS orders_placed,
  SUM(total_amount) AS daily_revenue,
  ROUND(AVG(total_amount), 2) AS avg_order_value
FROM orders
GROUP BY order_date
ORDER BY order_date DESC;

-- Revenue by product category
SELECT
  p.category,
  COUNT(DISTINCT oi.order_id) AS orders_with_category,
  SUM(oi.quantity) AS units_sold,
  SUM(oi.quantity * oi.unit_price) AS category_revenue
FROM order_items oi
INNER JOIN products p ON oi.product_id = p.product_id
GROUP BY p.category
ORDER BY category_revenue DESC;

GROUP BY Execution Order

1

FROM — pick the table

2

WHERE — filter individual rows first

3

GROUP BY — split remaining rows into groups

4

Aggregate functions — run on each group

5

HAVING — filter groups (not individual rows)

6

SELECT — pick the columns to display

7

ORDER BY — sort the final result

Every column in SELECT must either be in the GROUP BY clause or inside an aggregate function. Writing SELECT customer_id, first_name, COUNT(*) GROUP BY customer_id will fail because first_name is not grouped or aggregated. Add first_name to GROUP BY or remove it from SELECT.

Q: Can you use GROUP BY with multiple columns?

A: Yes. GROUP BY customer_id, status creates a group for each unique combination of customer_id and status. Customer 101 with status "completed" is one group. Customer 101 with status "pending" is a separate group. This is useful when you need aggregates at a more granular level — like revenue per customer per month, or order count per category per status.

Key Point: MIN/MAX find extremes and boundaries. GROUP BY splits data into groups for per-category aggregation. Every non-aggregated column in SELECT must appear in GROUP BY.

Previous Up NextHAVING Clause — Filtering Groups, Not Rows