Wednesday, 17 December 2025

what is full load and incremental load in Datastage.?

In IBM DataStage (DS), Full Load and Incremental Load describe how data is extracted and loaded from source to target. 


1️. Full Load

🔹 What is Full Load?

A Full Load means loading the entire source data into the target table every time the job runs, regardless of previous loads.

🔹 How it works

·        Target table is usually:

o   Truncated before load OR

o   Dropped & recreated

·        All records from source are loaded again

🔹 DataStage Implementation

·        Use Sequential File / Database stage

·        Optional Execute Command stage to truncate target

·        No comparison with old data

🔹 Example

Source table

EMP

1  Ravi

2  Anil

3  Meera

Target after Full Load

1  Ravi

2  Anil

3  Meera

Next day, even if only one record changed → all records are loaded again

🔹 When to use Full Load

·        Small data volume

·        Initial (first-time) load

·        Dimension tables (sometimes)

 Full load reloads the entire data set and is simple but not performance efficient for large data.


2️. Incremental Load (Delta Load)

🔹 What is Incremental Load?

An Incremental Load loads only new or changed records since the last successful run.

🔹 How it works

·        Uses:

o   Date columns (LAST_UPDATED_DATE)

o   Sequence IDs

o   CDC (Change Data Capture)

·        Target contains only delta data

🔹 Types of Incremental Load in DataStage

 a) Based on Date Column

SELECT *

FROM EMP

WHERE LAST_UPDATED_DATE > #LAST_RUN_DATE#

·        LAST_RUN_DATE passed via parameter file

               b) Based on Surrogate / Sequence ID

WHERE EMP_ID > #MAX_EMP_ID#

          c) Using Lookup Stage

·        Lookup target table

·        Identify:

o   New records → Insert

o   Existing records → Update

       d) Using CDC / Change Capture Stage

·        Reads database logs

·        Captures Insert / Update / Delete


🔹 DataStage Job Design (Incremental)

Typical stages

Source → Filter → Lookup → Transformer → Target

🔹 Example

Source

1 Ravi

2 Anil

3 Meera (updated)

4 Sita (new)

Target before

1 Ravi

2 Anil

3 Meera

Target after Incremental Load

3 Meera (updated)

4 Sita (new)


🔹 When to use Incremental Load

  • Large data volume
  • Daily/hourly loads
  • Fact tables: - Incremental load improves performance by loading only changed data.”

   Full Load vs Incremental Load

 

Feature

Full Load

Incremental Load

Data Loaded

All records

Only new/changed

Performance

Low (large data)

High

Complexity

Simple

Complex

Runtime

Long

Short

Use Case

Initial load

Regular load

 

Real Project Example.

  • Customer Master → Full load once, then incremental
  • Transaction Table → Incremental load daily
  • Reference tables → Full load

 

No comments:

Post a Comment

Most Recent posts

Which Partitioning Method Does a Lookup Stage Use in Data Stage?

  In IBM Data Stage (Parallel Jobs), a Lookup stage typically uses HASH partitioning. 🔹 Default Partitioning of Lookup Stage   H...