PARTITION BY inside OVER() is like GROUP BY for window functions. It divides the result set into groups (partitions), and the window function runs separately on each partition. Without PARTITION BY, the window covers all rows. With PARTITION BY, each partition gets its own independent calculation.
Imagine a building with multiple floors. Each floor is a partition. If you calculate ROW_NUMBER without PARTITION BY, you number everyone in the entire building from 1 to N. With PARTITION BY floor, you restart numbering on each floor — 1, 2, 3 on floor 1... then 1, 2, 3 on floor 2... independent counts.
-- Running total per customer (not across all customers)
SELECT
customer_id,
order_id,
order_date,
total_amount,
SUM(total_amount) OVER(
PARTITION BY customer_id
ORDER BY order_date
) AS customer_running_total
FROM orders
ORDER BY customer_id, order_date;
-- Result:
-- customer_id | order_id | order_date | total_amount | customer_running_total
-- 101 | 1 | 2024-01-01 | 500 | 500
-- 101 | 5 | 2024-01-15 | 300 | 800 <-- resets per customer
-- 101 | 9 | 2024-02-01 | 700 | 1500
-- 102 | 2 | 2024-01-03 | 400 | 400 <-- restarts here
-- 102 | 7 | 2024-01-20 | 600 | 1000-- Rank orders by amount within each customer
SELECT
customer_id,
order_id,
total_amount,
RANK() OVER(
PARTITION BY customer_id
ORDER BY total_amount DESC
) AS amount_rank_per_customer
FROM orders;
-- Each customer's order compared to their own average
SELECT
customer_id,
order_id,
total_amount,
ROUND(AVG(total_amount) OVER(PARTITION BY customer_id), 2) AS customer_avg,
total_amount - ROUND(AVG(total_amount) OVER(PARTITION BY customer_id), 2) AS diff_from_avg
FROM orders
ORDER BY customer_id, order_id;-- Each product's % contribution to its category's revenue
SELECT
p.category,
p.product_name,
SUM(oi.quantity * oi.unit_price) AS product_revenue,
SUM(SUM(oi.quantity * oi.unit_price)) OVER(
PARTITION BY p.category
) AS category_total,
ROUND(
SUM(oi.quantity * oi.unit_price) * 100.0 /
SUM(SUM(oi.quantity * oi.unit_price)) OVER(PARTITION BY p.category),
2
) AS pct_of_category
FROM order_items oi
INNER JOIN products p ON oi.product_id = p.product_id
GROUP BY p.category, p.product_name
ORDER BY p.category, pct_of_category DESC;PARTITION BY can take multiple columns. PARTITION BY category, status creates a separate partition for each combination of category and status. This is like GROUP BY with multiple columns — just for windows instead of collapsing.
Q: What is the difference between GROUP BY and PARTITION BY?
A: GROUP BY collapses rows into one row per group and is used with aggregate functions. PARTITION BY divides rows into groups for window functions but keeps all rows in the output. GROUP BY reduces the row count. PARTITION BY does not. If you GROUP BY customer_id with 5 customers, you get 5 rows. If you PARTITION BY customer_id, you still get all the original rows — but the window function calculates separately for each customer.
Key Point: PARTITION BY is GROUP BY for window functions. It creates independent partitions — the window function resets and recalculates for each partition. Rows are never collapsed.