In IBM DataStage, four kinds of files control environment setup, job behavior, and data structure: configuration files, parameter files, value files, and schema files.
1. Configuration File (APT_CONFIG_FILE)
🔹 What it is
Defines how parallelism works in DataStage: how many logical nodes a job runs on, which host each node uses, and which disk resources each node can access.
🔹 Why it is used
· Controls number of nodes
· Enables parallel processing
· Improves performance
🔹 Used in
· Parallel Jobs only
🔹 Example
{
  node "node1"
  {
    fastname "server1"
    pools ""
    resource disk "/data1" { }
  }
}
🔹 Set in
· Job Properties → Parameters → Add Environment Variable (or project-wide in the Administrator client)
· Variable: APT_CONFIG_FILE
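A multi-node configuration file simply repeats the node block once per logical node. The Python sketch below generates such a file so the pattern is easy to see. The function name `build_apt_config`, the host name, and the disk and scratch paths are all placeholders chosen for illustration, and the `resource scratchdisk` line is added on the assumption that your installation defines one alongside `resource disk`, as parallel configuration files normally do.

```python
def build_apt_config(num_nodes, fastname="server1",
                     disk="/data1", scratch="/scratch1"):
    """Return the text of a parallel configuration file with num_nodes nodes.

    Host name and paths are illustrative placeholders, not real values.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    lines = ["{"]
    for i in range(1, num_nodes + 1):
        lines += [
            f'  node "node{i}"',
            "  {",
            f'    fastname "{fastname}"',
            '    pools ""',
            f'    resource disk "{disk}" {{pools ""}}',
            f'    resource scratchdisk "{scratch}" {{pools ""}}',
            "  }",
        ]
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a two-node configuration; save the output to a file and
    # point APT_CONFIG_FILE at that file.
    print(build_apt_config(2))
```

Keeping one file per node count (for example a 2-node and a 4-node file) lets the same job run at different degrees of parallelism just by changing APT_CONFIG_FILE.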
2. Parameter File (.parms)
🔹 What it is
A text file that supplies runtime values to job parameters.
🔹 Why it is used
· Avoid hard-coding values
· Support multiple environments (DEV/QA/PROD)
· Easy job migration
🔹 Example
[ETL_Job]
SRC_PATH=/data/source/
TGT_PATH=/data/target/
RUN_DATE=2025-12-17
🔹 Used when
· Running jobs via Director
· Running jobs via DSJOB
dsjob -run -paramfile etl.parms project JobName
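If a scheduler or wrapper script needs the same values, the file can be read and turned into individual `-param name=value` arguments for `dsjob -run`. The Python sketch below shows one way to do that. The helper names `parse_param_text` and `build_dsjob_command` are hypothetical, and skipping `[section]` headers and `#` comment lines is an assumption of this sketch rather than documented dsjob behavior. The command is only built, not executed.

```python
def parse_param_text(text):
    """Parse name=value lines into a dict.

    Blank lines, '#' comment lines and [section] headers are skipped
    (an assumption of this sketch, not a documented file format).
    """
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a name=value line: {line!r}")
        params[name.strip()] = value.strip()
    return params


def build_dsjob_command(project, job, params):
    """Build (but do not run) a dsjob argument list, one -param per value."""
    cmd = ["dsjob", "-run"]
    for name, value in params.items():
        cmd += ["-param", f"{name}={value}"]
    return cmd + [project, job]


if __name__ == "__main__":
    sample = "[ETL_Job]\nSRC_PATH=/data/source/\nRUN_DATE=2025-12-17\n"
    print(build_dsjob_command("project", "JobName", parse_param_text(sample)))
```

Building the argument list separately makes it easy to log exactly which values a run used before handing the list to something like `subprocess.run`.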
3. Value File
🔹 What it is
A file that contains lookup or reference data values used inside jobs.
🔹 Why it is used
· Validation
· Code mapping
· Business rules
🔹 Example
A,Active
I,Inactive
🔹 Used in
· Lookup stage
· Transformer (via Sequential File / Dataset)
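Conceptually, a lookup against a value file is a key match that returns the description, with a fallback when the code is not found. The sketch below illustrates that logic in plain Python. It is not DataStage code, and the names `load_code_map` and `lookup_status` and the `UNKNOWN` default are assumptions made for the example.

```python
import csv
import io

# Same two rows as the value file example: code, description.
VALUE_FILE_TEXT = "A,Active\nI,Inactive\n"


def load_code_map(text):
    """Load code,description rows into a dict keyed by code."""
    reader = csv.reader(io.StringIO(text))
    return {row[0].strip(): row[1].strip() for row in reader if len(row) >= 2}


def lookup_status(code, code_map, default="UNKNOWN"):
    """Return the description for a code, or the default if it is not mapped."""
    return code_map.get(code, default)


if __name__ == "__main__":
    code_map = load_code_map(VALUE_FILE_TEXT)
    for code in ("A", "I", "X"):
        print(code, "->", lookup_status(code, code_map))
```

In a job, the unmatched case is where you would decide between a default value, a reject link, or failing the run.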
4. Schema File (.schema)
🔹 What it is
Defines the metadata (structure) of a data file.
🔹 Why it is used
- Avoids defining columns manually
- Ensures consistent structure
- Used in Sequential File / Dataset stages
🔹 Example
record (
  emp_id:int32;
  emp_name:string[50];
  salary:decimal[10,2];
)
🔹 Used when
- Reading delimited / fixed-width files
- Guarding against unexpected changes in file layout (schema drift)
🔹 Schema files define record layout externally and improve reusability.
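Because the layout lives in a text file, other tools can read it too, for example to check a file's column list before a load. The Python sketch below pulls the column names and types out of a simple flat schema like the one in the example. The function name `parse_record_schema` is hypothetical, and the parser is deliberately minimal: it assumes one `name:type;` entry per field and does not handle record-level or field-level properties in braces, or nested records.

```python
import re

SCHEMA_TEXT = "record ( emp_id:int32; emp_name:string[50]; salary:decimal[10,2]; )"


def parse_record_schema(text):
    """Return a list of (column_name, type_spec) from a flat record schema."""
    match = re.search(r"record\s*\((.*)\)", text, re.DOTALL)
    if match is None:
        raise ValueError("no record ( ... ) definition found")
    fields = []
    # Fields are separated by ';' (the comma in decimal[10,2] is left intact).
    for part in match.group(1).split(";"):
        part = part.strip()
        if not part:
            continue
        name, sep, type_spec = part.partition(":")
        if not sep:
            raise ValueError(f"not a name:type field: {part!r}")
        fields.append((name.strip(), type_spec.strip()))
    return fields


if __name__ == "__main__":
    for name, type_spec in parse_record_schema(SCHEMA_TEXT):
        print(f"{name:10s} {type_spec}")
```

A check like this can compare the parsed column names against a file's header row and flag a mismatch before the job runs.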
Quick Comparison Table
| File Type | Purpose | Used In | Key Benefit |
|---|---|---|---|
| Config File | Parallel execution setup | Parallel jobs | Performance |
| Parameter File | Runtime values | All jobs | Flexibility |
| Value File | Reference data | Lookups | Business logic |
| Schema File | File structure | File stages | Consistency |
🧠 Real-World Example
In a banking ETL project:
- Config file → Controls nodes for large data loads
- Parameter file → Separate DEV/QA/PROD paths
- Value file → Account status mapping
- Schema file → Standard customer file format