Learn ETL Datastage faster: IBM DataStage Architecture

Wednesday, 3 December 2025

IBM DataStage Architecture

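The architecture can be described in three major layers. (IBM's own documentation splits it further into client, services, engine, and metadata repository tiers, but the three-layer view is enough for an overview.)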

1. Client Layer

This is where developers work.

DataStage Designer → Build ETL jobs
DataStage Director → Run and monitor jobs
DataStage Administrator → Manage projects and permissions

These tools are installed on the developer’s machine.

2. Server Layer

This is the engine where ETL jobs run. It includes:

DataStage Engine
Parallel Engine
Job execution components
Runtime metadata
Job logs

This layer handles:

Job execution
Resource allocation
Parallel processing (data is partitioned across multiple processing nodes)
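
To make the idea of splitting work across nodes concrete, here is a minimal Python sketch of a modulus-style partitioner for integer keys. It is only an illustration of the concept, not DataStage code: the function name, the `customer_id` column, and the node count of 2 are assumptions made for the example. In a real parallel job the number of nodes comes from the configuration file that the `APT_CONFIG_FILE` environment variable points to.

```python
def partition_by_modulus(rows, key, node_count):
    """Split rows into node_count partitions using key % node_count.

    Illustrative only: mimics the idea of spreading rows over
    processing nodes so each node can work on its own partition.
    """
    partitions = [[] for _ in range(node_count)]
    for row in rows:
        # Rows with the same key value always land on the same node.
        node = row[key] % node_count
        partitions[node].append(row)
    return partitions


# Hypothetical sample data: eight rows keyed by an integer customer_id.
rows = [{"customer_id": i, "amount": i * 10} for i in range(8)]
partitions = partition_by_modulus(rows, "customer_id", node_count=2)

for node, chunk in enumerate(partitions):
    print(f"node {node}: {[r['customer_id'] for r in chunk]}")
```

Each partition can then be processed independently, which is what lets the same job design scale when more nodes are made available.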

3. Repository Layer

This layer stores all DataStage metadata, including:

Job designs
Schema definitions
Shared containers
Table definitions
Parameter sets

The metadata is stored in a repository database (XMETA).

✔️ Architecture Flow

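A developer creates a job in Designer
The job is compiled → converted to OSH (Orchestrate Shell script, the parallel job code)
The job runs on the server layer using the available resources
Logs and run results are stored (job logs are listed under the server layer above; where they are kept can depend on configuration)
Director is used to monitor the run and review the job log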
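
The run and monitor steps can also be done without the Director GUI, through the `dsjob` command-line tool that ships with DataStage. The sketch below only builds the command lines in Python and prints one of them; it does not run anything. The project name `dstage_dev`, the job name `LoadCustomers`, and the `RUN_DATE` parameter are hypothetical, and the options should be checked against the `dsjob` documentation for your version.

```python
import shlex


def build_run_command(project, job, params=None):
    """Return an argv list that runs a job and waits for its final status."""
    cmd = ["dsjob", "-run", "-jobstatus"]
    for name, value in (params or {}).items():
        # Each job parameter is passed as its own -param name=value pair.
        cmd += ["-param", f"{name}={value}"]
    return cmd + [project, job]


def build_log_summary_command(project, job, max_entries=20):
    """Return an argv list that prints a summary of the job's log entries."""
    return ["dsjob", "-logsum", "-max", str(max_entries), project, job]


# Hypothetical project, job, and parameter names, for illustration only.
run_cmd = build_run_command("dstage_dev", "LoadCustomers", {"RUN_DATE": "2025-12-03"})
log_cmd = build_log_summary_command("dstage_dev", "LoadCustomers")

print(shlex.join(run_cmd))
print(shlex.join(log_cmd))
```

Building the commands as argument lists means they could later be handed to `subprocess.run` on a machine where `dsjob` is installed, without any shell quoting problems.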
