Tuesday, 23 December 2025

What are Window Functions in SQL?

 

What are Window Functions?

Window functions perform calculations across a set of rows related to the current row, without collapsing rows (unlike GROUP BY).

  You get aggregated values and row-level data together: every row stays visible, and the function's result is added as an extra column computed over that row's window (a set of related rows).


🔹 Why Window Functions Are Important

  • Needed for ranking
  • Needed for running totals
  • Needed for top N per group
  • Used heavily in ETL transformations
  • Often replaces correlated subqueries and self-joins

🔹 Basic Syntax

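function_name(expression) OVER (
  [PARTITION BY column]
  [ORDER BY column]
  [ROWS | RANGE frame_clause]
)

All three clauses inside OVER are optional. PARTITION BY splits the rows into groups, ORDER BY sorts the rows within each group, and the ROWS / RANGE frame clause limits which of those rows the function sees for the current row.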
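To see how the clauses combine, here is a minimal sketch that runs against an in-memory SQLite database through Python's sqlite3 module. The harness is an assumption made only to keep the example runnable (SQLite 3.25 or later is needed for window functions); the table mirrors the sales example used in this post. One column uses only PARTITION BY, the other uses ORDER BY with a ROWS frame.

```python
import sqlite3

# In-memory database holding the sample sales table from this post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (emp TEXT, dept TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "HR", 5000), ("B", "HR", 7000), ("C", "IT", 9000), ("D", "IT", 6000)],
)

query = """
SELECT emp, dept, amount,
       -- PARTITION BY only: total of the row's department
       SUM(amount) OVER (PARTITION BY dept) AS dept_total,
       -- ORDER BY + frame: current row plus the one before it
       SUM(amount) OVER (
         ORDER BY amount
         ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
       ) AS moving_sum
FROM sales
ORDER BY amount
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every input row appears in the output, each with its department total and a two-row moving sum.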


🔹 Example Table: sales

emp    dept    amount
A      HR      5000
B      HR      7000
C      IT      9000
D      IT      6000


🔹 Example 1: Running Total

SELECT emp, dept, amount,
       SUM(amount) OVER (
         PARTITION BY dept
         ORDER BY amount
       ) AS running_total
FROM sales;

This calculates a cumulative sum within each department, adding amounts in ascending order.
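One detail worth checking is what happens when two rows have the same amount. With ORDER BY and no frame clause, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so tied rows are summed together. The sketch below assumes an extra, hypothetical employee E in HR (not part of the sample table) and uses Python's sqlite3 module as a test harness to compare the default frame with an explicit ROWS frame.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (emp TEXT, dept TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("A", "HR", 5000),
        ("B", "HR", 7000),
        ("E", "HR", 7000),  # hypothetical row, added only to create a tie
    ],
)

query = """
SELECT emp, amount,
       -- default frame (RANGE): tied amounts are added in one step
       SUM(amount) OVER (
         PARTITION BY dept
         ORDER BY amount
       ) AS range_total,
       -- explicit ROWS frame with a tie-breaker: strictly row by row
       SUM(amount) OVER (
         PARTITION BY dept
         ORDER BY amount, emp
         ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS rows_total
FROM sales
ORDER BY amount, emp
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If a strictly row-by-row running total is wanted, spelling out the ROWS frame and adding a tie-breaker column to ORDER BY is the safer choice.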


🔹 Example 2: Ranking Employees by Amount

SELECT emp, dept, amount,
       RANK() OVER (
         PARTITION BY dept
         ORDER BY amount DESC
       ) AS rank_in_dept
FROM sales;
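The three ranking functions only differ when there are ties. As a rough illustration, the sketch below adds a hypothetical employee E with the same amount as B (an assumption, not part of the sample table) and runs the query through Python's sqlite3 module, which is used here only as a convenient harness.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (emp TEXT, dept TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("A", "HR", 5000),
        ("B", "HR", 7000),
        ("E", "HR", 7000),  # hypothetical row, added only to create a tie
    ],
)

query = """
SELECT emp, amount,
       -- unique sequence; emp breaks the tie so the result is deterministic
       ROW_NUMBER() OVER (PARTITION BY dept ORDER BY amount DESC, emp) AS row_num,
       -- ties share a rank, and the next rank is skipped
       RANK()       OVER (PARTITION BY dept ORDER BY amount DESC) AS rnk,
       -- ties share a rank, and no rank is skipped
       DENSE_RANK() OVER (PARTITION BY dept ORDER BY amount DESC) AS dense_rnk
FROM sales
ORDER BY amount DESC, emp
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

B and E share rank 1; A then gets 3 from RANK() but 2 from DENSE_RANK().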


🔹 Example 3: Top 1 Amount per Department

SELECT *
FROM (
  SELECT emp, dept, amount,
         ROW_NUMBER() OVER (
           PARTITION BY dept
           ORDER BY amount DESC
         ) AS rn
  FROM sales
) t
WHERE rn = 1;
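The same pattern extends to top N per group by changing the filter to rn <= N. Here is a small sketch using Python's sqlite3 module as a harness; the helper name top_n_per_dept is hypothetical and exists only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (emp TEXT, dept TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "HR", 5000), ("B", "HR", 7000), ("C", "IT", 9000), ("D", "IT", 6000)],
)


def top_n_per_dept(conn, n):
    """Return the n highest-amount rows of each department (hypothetical helper)."""
    query = """
    SELECT emp, dept, amount
    FROM (
      SELECT emp, dept, amount,
             ROW_NUMBER() OVER (
               PARTITION BY dept
               ORDER BY amount DESC
             ) AS rn
      FROM sales
    ) t
    WHERE rn <= ?
    ORDER BY dept, amount DESC
    """
    return conn.execute(query, (n,)).fetchall()


top1 = top_n_per_dept(conn, 1)
top2 = top_n_per_dept(conn, 2)
print(top1)
print(top2)
```

The subquery is needed because a window function cannot be referenced directly in the WHERE clause of the same SELECT.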


🔹 Common Window Functions

🔸 Ranking Functions

  • ROW_NUMBER()
  • RANK()
  • DENSE_RANK()

🔸 Aggregate Window Functions

  • SUM()
  • AVG()
  • COUNT()
  • MAX(), MIN()

🔸 Analytical Functions

  • LAG()
  • LEAD()
  • FIRST_VALUE()
  • LAST_VALUE()
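FIRST_VALUE() and LAST_VALUE() depend on the window frame, which is easy to overlook. With ORDER BY and no frame clause, the frame ends at the current row, so LAST_VALUE() tends to return the current row's own value. The sketch below, run through Python's sqlite3 module as an assumed harness, compares the default frame with a frame covering the whole partition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (emp TEXT, dept TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "HR", 5000), ("B", "HR", 7000), ("C", "IT", 9000), ("D", "IT", 6000)],
)

query = """
SELECT emp, dept, amount,
       -- employee with the lowest amount in the department
       FIRST_VALUE(emp) OVER (
         PARTITION BY dept ORDER BY amount
       ) AS lowest_emp,
       -- default frame stops at the current row
       LAST_VALUE(emp) OVER (
         PARTITION BY dept ORDER BY amount
       ) AS last_default,
       -- frame widened to the whole partition
       LAST_VALUE(emp) OVER (
         PARTITION BY dept ORDER BY amount
         ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS highest_emp
FROM sales
ORDER BY dept, amount
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only the last column reliably names the highest-amount employee of each department.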

🔹 LAG & LEAD (ETL Usage)

SELECT emp, amount,
       LAG(amount) OVER (ORDER BY emp) AS prev_amount
FROM sales;

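Used for change detection and delta logic: comparing each row with the previous one shows what changed. The first row has no previous row, so its prev_amount is NULL.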
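As a quick illustration of delta logic, the sketch below computes the previous amount, the next amount, and the difference from the previous row on the sample sales table. Python's sqlite3 module is used only as a harness so the query can be run as-is; the column names next_amount and delta are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (emp TEXT, dept TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "HR", 5000), ("B", "HR", 7000), ("C", "IT", 9000), ("D", "IT", 6000)],
)

query = """
SELECT emp, amount,
       LAG(amount)  OVER (ORDER BY emp) AS prev_amount,   -- value from the previous row
       LEAD(amount) OVER (ORDER BY emp) AS next_amount,   -- value from the next row
       amount - LAG(amount) OVER (ORDER BY emp) AS delta  -- change vs previous row
FROM sales
ORDER BY emp
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

NULL (None in Python) appears where there is no previous or next row, and any arithmetic on it is also NULL.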


🔹 GROUP BY vs Window Functions (Important)

GROUP BY                             Window Function
Aggregates rows                      Keeps all rows
Reduces output rows                  Same number of rows
Cannot show row-level + aggregate    Can show both
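The difference is easiest to see by running both styles on the same data. The sketch below uses Python's sqlite3 module as an assumed harness with the sample sales table: the GROUP BY query returns one row per department, while the window version returns all four rows with the department total attached.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (emp TEXT, dept TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("A", "HR", 5000), ("B", "HR", 7000), ("C", "IT", 9000), ("D", "IT", 6000)],
)

# GROUP BY collapses each department into a single row.
grouped = conn.execute(
    "SELECT dept, SUM(amount) AS dept_total FROM sales GROUP BY dept ORDER BY dept"
).fetchall()

# The window version keeps every row and adds the total as a column.
windowed = conn.execute(
    """
    SELECT emp, dept, amount,
           SUM(amount) OVER (PARTITION BY dept) AS dept_total
    FROM sales
    ORDER BY emp
    """
).fetchall()

print(grouped)
print(windowed)
```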


🔹 ETL / DataStage Context

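  • Can replace Aggregator stage logic with SQL
  • Used in delta loads, SCD (slowly changing dimension) handling, and ranking
  • Often performs better than equivalent correlated subqueries or self-joins, depending on the database engine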
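As one possible illustration of the SCD use case, LEAD() can derive the end date of each version of a record from the start date of the next version. Everything in this sketch is hypothetical: the customer_history table, its columns, and its rows are made up for the example, and Python's sqlite3 module is used only as a harness.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical history table: one row per version of a customer record.
conn.execute(
    "CREATE TABLE customer_history (customer_id INTEGER, city TEXT, valid_from TEXT)"
)
conn.executemany(
    "INSERT INTO customer_history VALUES (?, ?, ?)",
    [
        (1, "Pune", "2024-01-01"),
        (1, "Mumbai", "2024-06-01"),
        (2, "Delhi", "2024-03-15"),
    ],
)

query = """
SELECT customer_id, city, valid_from,
       -- the next version's start date closes the current version
       LEAD(valid_from) OVER (
         PARTITION BY customer_id
         ORDER BY valid_from
       ) AS valid_to
FROM customer_history
ORDER BY customer_id, valid_from
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

A NULL valid_to marks the current version of each customer in this sketch.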

 
