Wednesday, 17 December 2025

Config File, Parameter File, Value File and Schema File

 

In IBM DataStage, these files are used to control environment setup, job behavior, and data structure.


1. Configuration File (APT_CONFIG_FILE)

🔹 What it is

Defines the parallel processing environment for DataStage: the logical nodes a job runs on, the host each node maps to (fastname), node pools, and disk resources.

🔹 Why it is used

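·        Controls the number of nodes, and therefore the degree of parallelism

·        Enables parallel processing

·        Lets the same job scale by switching configuration files, without redesigning the job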

🔹 Used in

·        Parallel Jobs only

🔹 Example

{
  node "node1" {
    fastname "server1"
    pools ""
    resource disk "/data1" { }
  }
}

🔹 Set in

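·        Job Properties → Parameters → Add Environment Variable (per-job override)

·        Variable: APT_CONFIG_FILE (the project-wide default is set in the Administrator client)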
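
Since the number of node entries determines how many partitions a job runs with, it can help to check a configuration file before pointing APT_CONFIG_FILE at it. The following is a minimal Python sketch, not a DataStage utility: it counts node entries with a regular expression. The second node and the host name "server2" are assumptions added for illustration.

```python
import re

# Sample configuration text. The second node ("node2" on "server2")
# is an illustrative assumption, not part of a real environment.
CONFIG_TEXT = '''
{
  node "node1" {
    fastname "server1"
    pools ""
    resource disk "/data1" { }
  }
  node "node2" {
    fastname "server2"
    pools ""
    resource disk "/data2" { }
  }
}
'''

def list_nodes(config_text):
    """Return the logical node names declared in a configuration file."""
    # Each logical node is introduced by: node "<name>" {
    return re.findall(r'\bnode\s+"([^"]+)"\s*\{', config_text)

nodes = list_nodes(CONFIG_TEXT)
print(f"{len(nodes)} node(s): {', '.join(nodes)}")
```

A job run with this file would use two logical nodes; a one-node file like the example above would run the same job without partitioned parallelism.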


2. Parameter File (.parms)

🔹      What it is

A text file that supplies runtime values to job parameters.

🔹      Why it is used

·        Avoid hard-coding values

·        Support multiple environments (DEV/QA/PROD)

·        Easy job migration

🔹 Example

[ETL_Job]
SRC_PATH=/data/source/
TGT_PATH=/data/target/
RUN_DATE=2025-12-17

🔹      Used when

·        Running jobs via Director

·        Running jobs via the dsjob command line:

dsjob -run -paramfile etl.parms project JobName
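
As an alternative to passing the file itself, a wrapper script can read the name=value pairs and pass each one with dsjob's -param option. The sketch below is one possible approach in Python, assuming the file layout shown in the example above. The helper names read_params and build_dsjob_command are hypothetical, and the script only builds the command; it does not run it.

```python
def read_params(text):
    """Parse name=value lines, skipping blank lines and [section] headers."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("[") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        params[name.strip()] = value.strip()
    return params

def build_dsjob_command(project, job, params):
    """Build a dsjob argument list with one -param option per value."""
    cmd = ["dsjob", "-run"]
    for name, value in params.items():
        cmd += ["-param", f"{name}={value}"]
    return cmd + [project, job]

# Same content as the example parameter file above.
SAMPLE = """[ETL_Job]
SRC_PATH=/data/source/
TGT_PATH=/data/target/
RUN_DATE=2025-12-17
"""

params = read_params(SAMPLE)
print(" ".join(build_dsjob_command("project", "JobName", params)))
```

Keeping one such file per environment (DEV/QA/PROD) means only the file changes between runs, not the job design.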


3. Value File

🔹 What it is

A file that contains lookup or reference data values used inside jobs.

🔹 Why it is used

·        Validation

·        Code mapping

·        Business rules

🔹    Example

A,Active
I,Inactive

🔹      Used in

·        Lookup stage

·        Transformer (via Sequential File / Dataset)
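
To make the code-mapping idea concrete, here is a small Python sketch of what a lookup against such a file does conceptually. It is not DataStage code: the function names and the "Unknown" default for unmatched codes are assumptions chosen for illustration.

```python
import csv
import io

# Same content as the example value file above.
VALUE_FILE = "A,Active\nI,Inactive\n"

def load_lookup(text):
    """Load code,description rows into a dictionary keyed by code."""
    reader = csv.reader(io.StringIO(text))
    return {row[0].strip(): row[1].strip() for row in reader if len(row) >= 2}

def map_status(code, lookup, default="Unknown"):
    """Return the description for a code, or a default when there is no match."""
    return lookup.get(code, default)

lookup = load_lookup(VALUE_FILE)
for code in ["A", "I", "X"]:
    print(code, "->", map_status(code, lookup))
```

The unmatched code "X" falls back to the default, which is similar in spirit to choosing how a Lookup stage should handle a lookup failure.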


4. Schema File (.schema)

🔹 What it is

Defines the metadata (structure) of a data file.

🔹 Why it is used

  • Avoids defining columns manually
  • Ensures consistent structure
  • Used in Sequential File / Dataset stages

🔹 Example

record (
  emp_id:int32;
  emp_name:string[50];
  salary:decimal[10,2];
)

🔹 Used when

  • Reading delimited / fixed-width files
  • Controlling schema drift when the layout is supplied at run time (typically together with runtime column propagation)

🔹 Schema files define record layout externally and improve reusability.
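
Because the layout lives in a plain text file, other tools can read it too. The following Python sketch parses a simple record schema like the one above into (name, type) pairs and checks that a delimited row has the expected number of columns. It handles only flat name:type; fields and is not a full parser for the schema format; the function names and the sample row are hypothetical.

```python
# Same layout as the example schema above.
SCHEMA_TEXT = """
record (
  emp_id:int32;
  emp_name:string[50];
  salary:decimal[10,2];
)
"""

def parse_schema(text):
    """Return (field_name, field_type) pairs from a flat record schema."""
    # Take everything between the outermost parentheses.
    body = text[text.index("(") + 1 : text.rindex(")")]
    fields = []
    for part in body.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, type_ = part.partition(":")
        fields.append((name.strip(), type_.strip()))
    return fields

def has_expected_columns(row, fields, delimiter=","):
    """Check that a delimited row has one value per schema field."""
    return len(row.split(delimiter)) == len(fields)

fields = parse_schema(SCHEMA_TEXT)
print(fields)
print(has_expected_columns("101,Asha,55000.00", fields))
```

A check like this can catch a file whose column count has drifted from the agreed layout before the job reads it.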


Quick Comparison Table

File Type      | Purpose                  | Used In       | Key Benefit
Config File    | Parallel execution setup | Parallel jobs | Performance
Parameter File | Runtime values           | All jobs      | Flexibility
Value File     | Reference data           | Lookups       | Business logic
Schema File    | File structure           | File stages   | Consistency


🧠 Real-World Example

In a banking ETL project:

  • Config file → Controls nodes for large data loads
  • Parameter file → Separate DEV/QA/PROD paths
  • Value file → Account status mapping
  • Schema file → Standard customer file format

 
