Friday, 26 December 2025

what is SCD ?

SCD (Slowly Changing Dimension) refers to dimension tables in a data warehouse where attribute values change slowly over time, not frequently like transaction data.

The challenge is how to handle those changes. SCD defines techniques to manage historical changes in dimension data.


🔹 Example Dimension Table: Customer

Cust_ID

 Name

  City

101

Ravi

Chennai

Later, customer moves to Bangalore.


🔹 Why SCD Is Important

Business wants:

  • Current customer data
  • OR historical data
  • OR both (depends on reporting needs)

SCD defines how much history to store.


🔹 Types of SCD .


🔹 SCD Type 0 – No Change

  • Data never changes
  • Ignore updates

Example: Date of Birth


🔹 SCD Type 1 – Overwrite

  • Old value is replaced
  • No history maintained

Example:

Cust_ID

Name

City

101

Ravi

Bangalore

Simple
History lost

Use case: Correction of wrong data


🔹 SCD Type 2 – Full History (Very Important)

  • Create new row for every change
  • Maintain history using:
    • Surrogate key
    • Start date / End date
    • Current flag

Example:

SK

Cust_ID

City

Start_Date

End_Date

Is_Current

1

101

Chennai

2019-01-01

2023-03-31

N

2

101

Bangalore

2023-04-01

9999-12-31

Y

Full history
Most commonly used


🔹 SCD Type 3 – Limited History

  • Stores previous value in another column

Cust_ID

City

Prev_City

101

Bangalore

Chennai

One level history
Not scalable


🔹 SCD Type 4 – History Table

  • Current data in main table
  • History stored in separate table

🔹 SCD Type 6 – Hybrid

  • Combination of Type 1 + Type 2 + Type 3

🔹 Which SCD Types Are Used Most?

1️Type 1 – Corrections
2️
Type 2 – History tracking (most important)
3️
Type 3 – Limited cases


🔹 ETL / DataStage Context

In DataStage:

  • Use Lookup stage to check existing record
  • Compare source vs target
  • Decide:
    • Insert new row (Type 2)
    • Update existing row (Type 1)
  • Manage surrogate keys

🔹 SCD Type 2 Logic (Simplified)

  1. Match business key
  2. Compare attributes
  3. If change detected:
    • Expire old record
    • Insert new record

No comments:

Post a Comment

Most Recent posts

what is SCD ?

SCD (Slowly Changing Dimension) refers to dimension tables in a data warehouse where attribute values change slowly over time, not frequen...