LAG () and LEAD () are window (analytic) functions used to access data from another row without using self-joins.
· LAG () → gets value from a previous row
· LEAD () → gets value from a next row
LAG() and LEAD() allow comparison of a row with its previous or next row within a window.
🔹 Syntax
LAG (column, offset, default)
OVER (PARTITION BY column ORDER BY column)
LEAD (column, offset, default)
OVER (PARTITION BY column ORDER BY column)
· offset → how many rows before/after (default = 1)
- default → value if row doesn’t exist (optional)
🔹 Example Table: sales
|
month |
amount |
|
Jan |
1000 |
|
Feb |
1200 |
|
Mar |
1100 |
|
Apr |
1500 |
🔹 LAG() Example (Previous Value)
SELECT month, amount,
LAG(amount) OVER (ORDER BY month) AS prev_amount
FROM sales;
Output:
|
month |
amount |
prev_amount |
|
Jan |
1000 |
NULL |
|
Feb |
1200 |
1000 |
|
Mar |
1100 |
1200 |
|
Apr |
1500 |
1100 |
🔹 LEAD() Example (Next Value)
SELECT month, amount,
LEAD(amount) OVER (ORDER BY month) AS next_amount
FROM sales;
Output:
|
month |
amount |
next_amount |
|
Jan |
1000 |
1200 |
|
Feb |
1200 |
1100 |
|
Mar |
1100 |
1500 |
|
Apr |
1500 |
NULL |
🔹 Very Common Interview Use Case: Difference Calculation
SELECT month, amount,
amount - LAG(amount) OVER (ORDER BY month) AS diff
FROM sales;
➡ Shows month-to-month change
🔹 Using Default Value
LAG (amount, 1, 0) OVER (ORDER BY month)
➡ First row gets 0 instead of NULL
🔹 PARTITION BY
SELECT emp, dept, salary,
LAG(salary) OVER (
PARTITION BY dept
ORDER BY salary
) AS prev_salary
FROM emp;
➡ Comparison happens within each department
🔹 ETL / Data Engineering Use Cases
✔
Delta / incremental load detection
✔
Change data capture logic
✔
SCD Type-2 comparison
✔
Trend analysis
✔
Data quality checks
No comments:
Post a Comment