Wednesday, 3 December 2025

What is Slowly Changing Dimensions (SCD) – Type 1 & Type 2 ?


A Slowly Changing Dimension (SCD) tracks how dimension data changes over time.


ü SCD Type 1 (Overwrite old data)

Definition

No history tracked.

If dimension attributes change, update the existing row.

Useful for non-historical or correction-based fields.

Real-world example

Changing customer email ID or phone number where history is not important.


ü SCD Type 1 – Job Design (DataStage)

Job Stages

Source (DB extract / file)

Lookup Stage

Lookup on Dimension Key (e.g., Customer_ID)

Transformer

Compare columns → Identify changed rows

If no match → INSERT

If match but attributes changed → UPDATE

Two Output Links

Insert link → Dimension table

Update link → Dimension table

Target (DB Connector)

Change comparison logic example

IF Lkp_CustID IS NULL THEN 'I'ELSE IF Inp.Email <> Lkp.Email THEN 'U'ELSE 'R'   -- no change


Sample Input / Output – SCD Type 1

Source Input

Cust_ID

Name

Email

101

John

john@gmail.com

Existing Dimension Table

Cust_ID

Name

Email

101

John

john.old@gmail.com

Expected Output (after SCD1)

Cust_ID

Name

Email

101

John

john@gmail.com

Old email is overwritten, no history kept.


SCD Type 2 (Preserve full history)

Definition

Creates a new row when the attribute changes.

Maintains full history with date ranges or active flag.

Common SCD2 Fields

Column

Purpose

Start_Date

      When record became active

End_Date

     When record stopped being active

Current_Flag

     Y/N

Version_No     

Optional incremental version


SCD Type 2 – Job Design (DataStage)

Job Stages

Source Stage

Lookup Stage on business key

Transformer Stage

Compare incoming vs existing attribute(s)

Identify:

New record → INSERT

Changed record → EXPIRE existing + INSERT new

Two Output Links

Expired-Update Link → Update old row

Current_Flag = 'N'End_Date = CurrentTimestamp()

New-Insert Link → Insert new row

Current_Flag = 'Y'Start_Date = CurrentTimestamp()Version_No = previous+1

Target (DB Connector)


 Sample Input / Output – SCD Type 2

Source Input

Cust_ID

Name

City

101

John

Chennai

Existing Dimension Table

Dim_Key

Cust_ID

Name

City

Start_Date

End_Date

Curr_Flag

Version

1

101

John

Mumbai

2020-01-01

9999-12-31

Y

1

Incoming change:

City changed

Mumbai → Chennai

Expected Output

Expire old row

Dim_Key

Cust_ID

Name

City

Start_Date

End_Date

Curr_Flag

Version

1

101

John

Mumbai

2020-01-01

2024-12-03 17:00

N

1

Insert new row

Dim_Key

Cust_ID

Name

City

Start_Date

End_Date

Curr_Flag

Version

2

101

John

Chennai

2024-12-03

9999-12-31

Y

2


 

窗体底端

 

No comments:

Post a Comment

Most Recent posts

How to configure DB Connector Stages –