What are Window Functions?
Window functions perform calculations across a set of rows related to the current row (the "window") without collapsing those rows, unlike GROUP BY.
Every input row stays visible in the output, so you get aggregated values and row-level data together in the same result.
🔹 Why Window Functions Are Important
- Needed for ranking
- Needed for running totals
- Needed for top N per group
- Used heavily in ETL transformations
- Avoids complex subqueries
🔹 Basic Syntax
function_name(expression)
OVER (
    PARTITION BY column
    ORDER BY column
    ROWS / RANGE clause
)
🔹 Example Table: sales
| emp | dept | amount |
|-----|------|--------|
| A   | HR   | 5000   |
| B   | HR   | 7000   |
| C   | IT   | 9000   |
| D   | IT   | 6000   |
🔹 Example 1: Running Total
SELECT emp, dept, amount,
       SUM(amount) OVER (
           PARTITION BY dept
           ORDER BY amount
       ) AS running_total
FROM sales;
➡ Calculates a cumulative sum of amount within each department, accumulating in ascending order of amount
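To see what Example 1 returns for the sample table, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. The in-memory database and the outer ORDER BY (added only to make the output order predictable) are assumptions for illustration, and the SQLite library needs to be version 3.25 or newer for window functions to work.

```python
import sqlite3  # assumes the bundled SQLite is 3.25+ (window function support)

# Build the sample sales table in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (emp TEXT, dept TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "HR", 5000), ("B", "HR", 7000), ("C", "IT", 9000), ("D", "IT", 6000)],
)

# Same query as Example 1; the outer ORDER BY is only for stable display.
running_totals = conn.execute(
    """
    SELECT emp, dept, amount,
           SUM(amount) OVER (
               PARTITION BY dept
               ORDER BY amount
           ) AS running_total
    FROM sales
    ORDER BY dept, amount
    """
).fetchall()

for row in running_totals:
    print(row)
# ('A', 'HR', 5000, 5000)
# ('B', 'HR', 7000, 12000)
# ('D', 'IT', 6000, 6000)
# ('C', 'IT', 9000, 15000)
```

The total restarts for each department because of PARTITION BY dept. One thing to watch: with only ORDER BY and no explicit frame, rows that tie on amount are added in together, so a ROWS frame may be preferable when the ordering column can contain duplicates.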
🔹 Example 2: Ranking Employees by Amount
SELECT emp, dept, amount,
       RANK() OVER (
           PARTITION BY dept
           ORDER BY amount DESC
       ) AS rank_in_dept
FROM sales;
🔹 Example 3: Top 1 Amount per Department
SELECT *
FROM (
    SELECT emp, dept, amount,
           ROW_NUMBER() OVER (
               PARTITION BY dept
               ORDER BY amount DESC
           ) AS rn
    FROM sales
) t
WHERE rn = 1;
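As a quick check of the top-1 pattern, the sketch below runs the Example 3 query against the sample table using Python's sqlite3 module. The in-memory database and the outer ORDER BY dept are assumptions added for illustration, and SQLite 3.25 or newer is assumed.

```python
import sqlite3  # assumes the bundled SQLite is 3.25+ (window function support)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (emp TEXT, dept TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "HR", 5000), ("B", "HR", 7000), ("C", "IT", 9000), ("D", "IT", 6000)],
)

# Number the rows inside each department, highest amount first,
# then keep only the first row of every department.
top_per_dept = conn.execute(
    """
    SELECT *
    FROM (
        SELECT emp, dept, amount,
               ROW_NUMBER() OVER (
                   PARTITION BY dept
                   ORDER BY amount DESC
               ) AS rn
        FROM sales
    ) t
    WHERE rn = 1
    ORDER BY dept
    """
).fetchall()

for row in top_per_dept:
    print(row)
# ('B', 'HR', 7000, 1)
# ('C', 'IT', 9000, 1)
```

The filter has to sit in an outer query because the window function is evaluated after the inner WHERE. ROW_NUMBER() always keeps exactly one row per department; if two employees tie on amount, which one is kept is arbitrary unless the ORDER BY includes a tie-breaker.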
🔹 Common Window Functions
🔸 Ranking Functions
- ROW_NUMBER()
- RANK()
- DENSE_RANK()
🔸 Aggregate Window Functions
- SUM()
- AVG()
- COUNT()
- MAX(), MIN()
🔸 Analytical Functions
- LAG()
- LEAD()
- FIRST_VALUE()
- LAST_VALUE()
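The three ranking functions listed above only differ when rows tie, which the sample table does not show. The sketch below uses a small hypothetical data set (employee E with the same amount as B is an assumption, not part of the original table) to compare them side by side, again via Python's sqlite3 module with SQLite 3.25 or newer assumed.

```python
import sqlite3  # assumes the bundled SQLite is 3.25+ (window function support)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (emp TEXT, dept TEXT, amount INTEGER)")
# Hypothetical rows: B and E tie on amount so the difference becomes visible.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "HR", 5000), ("B", "HR", 7000), ("E", "HR", 7000)],
)

ranking = conn.execute(
    """
    SELECT emp, amount,
           ROW_NUMBER() OVER (ORDER BY amount DESC, emp) AS row_num,   -- emp breaks the tie
           RANK()       OVER (ORDER BY amount DESC)      AS rnk,       -- leaves a gap after ties
           DENSE_RANK() OVER (ORDER BY amount DESC)      AS dense_rnk  -- no gap after ties
    FROM sales
    ORDER BY row_num
    """
).fetchall()

for row in ranking:
    print(row)
# ('B', 7000, 1, 1, 1)
# ('E', 7000, 2, 1, 1)
# ('A', 5000, 3, 3, 2)
```

ROW_NUMBER() always produces 1, 2, 3. RANK() gives the tied rows the same rank and then skips to 3, while DENSE_RANK() continues with 2.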
🔹 LAG & LEAD (ETL Usage)
SELECT emp, amount,
       LAG(amount) OVER (ORDER BY emp) AS prev_amount
FROM sales;
➡ Used for change detection / delta logic
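To make the delta idea concrete, here is a minimal sketch that extends the LAG query with a difference column. The delta column name and the in-memory SQLite setup are assumptions for illustration; real change detection would normally order by a date or version column rather than emp.

```python
import sqlite3  # assumes the bundled SQLite is 3.25+ (window function support)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (emp TEXT, dept TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "HR", 5000), ("B", "HR", 7000), ("C", "IT", 9000), ("D", "IT", 6000)],
)

# prev_amount is the previous row's amount in emp order;
# delta (hypothetical column) is the change from that previous row.
deltas = conn.execute(
    """
    SELECT emp, amount,
           LAG(amount) OVER (ORDER BY emp) AS prev_amount,
           amount - LAG(amount) OVER (ORDER BY emp) AS delta
    FROM sales
    ORDER BY emp
    """
).fetchall()

for row in deltas:
    print(row)
# ('A', 5000, None, None)
# ('B', 7000, 5000, 2000)
# ('C', 9000, 7000, 2000)
# ('D', 6000, 9000, -3000)
```

The first row has no predecessor, so LAG returns NULL (None in Python) and the delta is NULL as well. LAG accepts an optional offset and default value if a different fallback is wanted.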
🔹 GROUP BY vs Window Functions (Important)
| GROUP BY | Window Function |
|----------|-----------------|
| Aggregates rows | Keeps all rows |
| Reduces output rows | Same number of rows |
| Cannot show row-level + aggregate | Can show both |
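The difference in row counts can be checked directly. The sketch below computes the department total both ways on the sample table; the dept_total column name and the in-memory SQLite setup are assumptions for illustration, with SQLite 3.25 or newer assumed.

```python
import sqlite3  # assumes the bundled SQLite is 3.25+ (window function support)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (emp TEXT, dept TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "HR", 5000), ("B", "HR", 7000), ("C", "IT", 9000), ("D", "IT", 6000)],
)

# GROUP BY collapses the four rows into one row per department.
grouped = conn.execute(
    "SELECT dept, SUM(amount) FROM sales GROUP BY dept ORDER BY dept"
).fetchall()

# The window version keeps all four rows and repeats the total on each.
windowed = conn.execute(
    """
    SELECT emp, dept, amount,
           SUM(amount) OVER (PARTITION BY dept) AS dept_total
    FROM sales
    ORDER BY emp
    """
).fetchall()

print(grouped)
# [('HR', 12000), ('IT', 15000)]
for row in windowed:
    print(row)
# ('A', 'HR', 5000, 12000)
# ('B', 'HR', 7000, 12000)
# ('C', 'IT', 9000, 15000)
# ('D', 'IT', 6000, 15000)
```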
🔹 ETL / DataStage Context
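- Can replace Aggregator stage logic by pushing the calculation into SQL
- Used in delta loads, SCD (slowly changing dimension) processing, and ranking
- Often performs better than equivalent correlated subqueries, though the gain depends on the database and data volume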