Wednesday, 17 December 2025

What is Data Warehousing?

 


Data Warehousing is the process of collecting, cleaning, transforming, and storing data from multiple source systems in a central repository (a Data Warehouse) for reporting, analysis, and decision-making. It is not meant for daily transaction processing, but for analysis and business intelligence (BI).


A Data Warehouse is a centralized database designed for reporting and analysis, where historical and integrated data is stored.


🔹 Why Data Warehousing is Needed

Operational systems (OLTP) are:

·        Optimized for transactions

·        Not suitable for complex analytical queries over large volumes of data

Data Warehouse (OLAP) is:

·        Optimized for analysis

·        Supports business decisions

Example

·        OLTP: Banking transaction system

·        OLAP: Monthly revenue, customer trends, risk analysis
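To make the contrast concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `transactions` and its columns are assumptions made up for illustration, and a single in-memory database stands in for both workloads; in practice the OLTP system and the warehouse would be separate.

```python
import sqlite3

# One in-memory database stands in for both workloads (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "txn_id INTEGER PRIMARY KEY, account_id INTEGER, txn_date TEXT, amount REAL)"
)

# OLTP-style work: many small writes, one row per business event.
rows = [
    (1, 101, "2025-01-05", 200.0),
    (2, 102, "2025-01-20", 50.0),
    (3, 101, "2025-02-03", 125.0),
]
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)
conn.commit()

# OLTP-style read: fetch a single record by its key.
single = conn.execute(
    "SELECT amount FROM transactions WHERE txn_id = ?", (2,)
).fetchone()

# OLAP-style read: scan and aggregate many rows (monthly totals).
monthly_totals = conn.execute(
    "SELECT substr(txn_date, 1, 7) AS month, SUM(amount) "
    "FROM transactions GROUP BY month ORDER BY month"
).fetchall()

print(single)          # (50.0,)
print(monthly_totals)  # [('2025-01', 250.0), ('2025-02', 125.0)]
```

The key lookup touches one row, while the monthly total has to read every row in the period, which is why analytical queries are usually moved off the transactional system.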


🔹 Key Characteristics of a Data Warehouse

1️.  Subject-Oriented

·        Organized by business subjects

·        Example: Sales, Customer, Finance

2️.  Integrated

·        Data from multiple sources combined

·        Example: Oracle, MySQL, flat files

3️.  Time-Variant

  • Stores historical data
  • Example: Sales data for last 5 years

4️.  Non-Volatile

  • Data is read-only for users once it has been loaded
  • New data is added by periodic loads; existing records are rarely updated or deleted
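The time-variant and non-volatile characteristics can be sketched together as an append-only history. This is a simplified illustration in plain Python, not how any particular warehouse product stores data; `balance_history`, `load_snapshot`, and `balance_as_of` are hypothetical names.

```python
from datetime import date

# Hypothetical warehouse table kept as an append-only list of snapshot rows.
balance_history = []

def load_snapshot(snapshot_date, balances):
    """Append one dated row per account instead of overwriting old values."""
    for account_id, balance in balances.items():
        balance_history.append(
            {"snapshot_date": snapshot_date, "account_id": account_id, "balance": balance}
        )

# Two periodic loads; the January rows are never modified by the February load.
load_snapshot(date(2025, 1, 31), {"A1": 500.0, "A2": 80.0})
load_snapshot(date(2025, 2, 28), {"A1": 650.0, "A2": 80.0})

def balance_as_of(account_id, as_of):
    """Return the most recent balance recorded on or before the given date."""
    rows = [
        r for r in balance_history
        if r["account_id"] == account_id and r["snapshot_date"] <= as_of
    ]
    return max(rows, key=lambda r: r["snapshot_date"])["balance"] if rows else None

print(balance_as_of("A1", date(2025, 2, 15)))  # 500.0
print(balance_as_of("A1", date(2025, 3, 1)))   # 650.0
```

Because old rows are kept, a question such as "what was the balance in mid-February?" can still be answered after later loads.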

🔹 Data Warehousing Architecture

🔸 3-Tier Architecture

Source Systems (OLTP, Files, APIs)
        ↓
ETL Layer (DataStage, Informatica)
        ↓
Data Warehouse (Tables, Facts, Dimensions)
        ↓
BI / Reporting (Tableau, Power BI)
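The flow from sources through ETL to reporting can be sketched in a few lines of Python. This is only a toy version of what tools such as DataStage or Informatica do at scale; the sample data, the `sales` table, and the function names `extract`, `transform`, and `load` are assumptions chosen for illustration.

```python
import csv
import io
import sqlite3

# Extract: two hypothetical sources, a flat file and rows from an OLTP system.
FLAT_FILE = "customer,amount\n alice ,100.50\nBOB,20\n"
OLTP_ROWS = [{"cust_name": "Alice", "amt": 30.0}]

def extract():
    file_rows = [
        (r["customer"], r["amount"]) for r in csv.DictReader(io.StringIO(FLAT_FILE))
    ]
    oltp_rows = [(r["cust_name"], r["amt"]) for r in OLTP_ROWS]
    return file_rows + oltp_rows

def transform(rows):
    # Clean and standardize: trim whitespace, unify case, cast amounts to float.
    return [(name.strip().title(), float(amount)) for name, amount in rows]

def load(conn, rows):
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

# An in-memory SQLite database stands in for the warehouse.
warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract()))

# Reporting layer: the kind of query a BI tool might issue.
report = warehouse.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()
print(report)  # [('Alice', 130.5), ('Bob', 20.0)]
```

Note how " alice " from the file and "Alice" from the OLTP rows end up as one customer after the transform step; this is the integration that the ETL layer is responsible for.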


🔹 Fact and Dimension Tables

🔸 Fact Table

·        Stores measurable data

·        Example: sales_amount, quantity

🔸 Dimension Table

·        Stores descriptive data

·        Example: customer, product, date
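A small star schema shows how the two kinds of table work together. The sketch below again uses Python's sqlite3 module; the table names (`fact_sales`, `dim_product`, `dim_date`), the key columns, and the sample rows are illustrative assumptions, while `sales_amount` and `quantity` are the measures from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per product / date.
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, full_date TEXT, year INTEGER);

-- Fact table: numeric measures plus foreign keys to the dimensions.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    quantity INTEGER,
    sales_amount REAL
);

INSERT INTO dim_product VALUES (1, 'Laptop', 'Electronics'), (2, 'Desk', 'Furniture');
INSERT INTO dim_date VALUES (20250115, '2025-01-15', 2025), (20250210, '2025-02-10', 2025);
INSERT INTO fact_sales VALUES
    (1, 20250115, 2, 2000.0),
    (1, 20250210, 1, 1000.0),
    (2, 20250210, 3, 450.0);
""")

# Typical analytical query: aggregate the measures, grouped by a dimension attribute.
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.quantity), SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(sales_by_category)  # [('Electronics', 3, 3000.0), ('Furniture', 3, 450.0)]
```

The fact table stays narrow and numeric, and the descriptive text lives once in each dimension, so the same facts can be sliced by product, by date, or by both.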

🔹 Types of Data Warehouse

1️. Enterprise Data Warehouse (EDW)

·        Organization-wide data

2️. Data Mart

·        Department-specific

·        Example: Finance mart, Sales mart

3️. Operational Data Store (ODS)

·        Holds current, near real-time operational data


🔹 Data Warehousing vs Database

| Feature | Database (OLTP) | Data Warehouse (OLAP) |
|---------|-----------------|-----------------------|
| Purpose | Transactions    | Analysis              |
| Data    | Current         | Historical            |
| Updates | Frequent        | Rare                  |
| Queries | Simple          | Complex               |


🔹 Real-World Example (Banking)

  • Source systems:
    • Core banking
    • Credit card system
    • Loan system
  • Data Warehouse:
    • Customer profitability
    • Fraud analysis
    • Risk reporting

 

 
