Learn ETL Datastage faster: What is a Surrogate Key ?

A Surrogate Key (SK) is an artificial, system-generated unique identifier assigned to dimension records in a data warehouse.
It has no business meaning and is used instead of natural/business keys.

🔹 Why Surrogate Keys are Used

✔️ Handles Slowly Changing Dimensions (SCD)
✔️ Improves join performance
✔️ Avoids issues with changing business keys
✔️ Ensures uniqueness across history

🔹 Example

Source (Business Key)

Customer_ID | Name | City

1001 | Ravi | Pune

Dimension Table

Cust_Key | Customer_ID | Name | City

1 | 1001 | Ravi | Pune

2 | 1001 | Ravi | Mumbai ← Type 2 change

Cust_Key is the surrogate key.

🔹 How Surrogate Keys are Handled in DataStage

DataStage provides multiple ways to generate and manage surrogate keys.

1️⃣ Surrogate Key Generator Stage.

🔸 What it does

Automatically generates sequential unique numbers
Thread-safe in parallel jobs

🔸 Job Flow

Source → Transformer → Surrogate Key Generator → Dimension Table

🔸 Configuration

Specify:

Key column name
Starting value
Increment value

2️⃣ Using Sequence in Database

🔸 How it works

Database sequence generates key
DataStage reads the generated value

🔸 Example

NEXTVAL(customer_seq)

3️⃣ Using Max + 1 Logic

SELECT MAX(CUST_KEY) + 1 FROM DIM_CUSTOMER

🔹 Surrogate Keys in SCD Handling

🔸 Type 1

Same surrogate key
Record updated

🔸 Type 2

New surrogate key generated
Old record expired

🔸 Type 3

Same surrogate key
Limited history columns updated

🔹 Surrogate Key Lookup During Fact Load

Fact table stores only surrogate keys, not business keys.

Fact Load Flow

Source → Lookup (Dimension) → Fact Table

Example

Fact Table

----------

Cust_Key | Amount

1 | 5000

🔹 Best Practices in DataStage

✔️ Always use Surrogate Key Generator stage
✔️ Maintain natural key + surrogate key
✔️ Never expose surrogate keys to business users
✔️ Use lookup caching for performance

Learn ETL Datastage faster

Pages

What is a Surrogate Key ?

No comments:

Post a Comment

Most Recent posts

Copy and Modify Stages