Dimension tables store
descriptive attributes, while fact tables store measurable business data
and foreign keys referencing dimensions. Fact data is loaded after
dimensions using surrogate keys generated during ETL.
Business Scenario (Sales Data)
A retail company wants to analyze:
-
Total sales
-
Sales by product
-
Sales by customer
-
Sales by date
-
Sales by store
Dimension Tables (Descriptive Data)
📌 DIM_CUSTOMER
Stores customer details.
| Customer_Key (SK) | Customer_ID (BK) | Customer_Name | City | Country |
|---|
| 101 | C001 | Rahul | Delhi | India |
| 102 | C002 | Anitha | Chennai | India |
👉 Customer_Key = Surrogate Key (generated in ETL)
👉 Customer_ID = Business Key (from source)
📌 DIM_PRODUCT
| Product_Key | Product_ID | Product_Name | Category |
|---|
| 201 | P1001 | Laptop | Electronics |
| 202 | P1002 | Mobile | Electronics |
📌 DIM_DATE
| Date_Key | Full_Date | Month | Year |
|---|
| 20240101 | 2024-01-01 | Jan | 2024 |
📌 DIM_STORE
| Store_Key | Store_ID | Store_Name | City |
|---|
| 301 | S01 | BigBazaar | Delhi |
3️⃣ Fact Table (Transactional / Measurable Data)
📌 FACT_SALES
| Date_Key | Customer_Key | Product_Key | Store_Key | Quantity | Sales_Amount |
|---|
| 20240101 | 101 | 201 | 301 | 2 | 120000 |
👉 Fact table contains only keys + measures
👉 No descriptive text
4️⃣ Source Data (OLTP / Flat File Example)
| Order_ID | Order_Date | Customer_ID | Product_ID | Store_ID | Qty | Amount |
|---|
| O1001 | 2024-01-01 | C001 | P1001 | S01 | 2 | 120000 |
5️⃣ How Data is Loaded (ETL Flow – Data Stage Style)
🔹 Step 1: Load Dimension Tables
Example: Customer Dimension
-
Read source customer data
-
Check if Customer_ID already exists
-
If new → generate Surrogate Key
-
Insert into DIM_CUSTOMER
📌 Tools used:
🔹 Step 2: Load Fact Table
-
Read sales transaction data
-
Lookup:
-
Populate measures (Quantity, Sales_Amount)
-
Insert into FACT_SALES
📌 Tools used:
-
Multiple Lookup stages
-
Transformer
-
Fact load job
6️⃣ DataStage Job Design (High Level)
No comments:
Post a Comment