OSH (Orchestrate Shell) is the scripting language executed by the parallel engine of IBM DataStage.
When you run a parallel job, DataStage converts the job design into an OSH script, and the parallel engine executes that script. The script controls how stages run, how data is partitioned, and how processing is distributed across nodes.
Why OSH Is Important
- Controls job execution
- Manages parallelism
- Handles partitioning
- Manages data flow between stages
- Executes on multiple nodes
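To make "handles partitioning" concrete, here is a minimal Python sketch of hash partitioning, the general idea of routing rows to nodes by a key. It is an illustration only, not DataStage's implementation: the function name hash_partition, the column cust_id, and the use of CRC32 as the hash are all assumptions made for this example.

```python
import zlib
from collections import defaultdict


def hash_partition(rows, key, num_nodes):
    """Group rows by node number, chosen by hashing the key column.

    Illustrative only: the real parallel engine uses its own hash
    function and node configuration.
    """
    partitions = defaultdict(list)
    for row in rows:
        # CRC32 is deterministic across runs, unlike Python's built-in hash() for strings.
        node = zlib.crc32(str(row[key]).encode("utf-8")) % num_nodes
        partitions[node].append(row)
    return dict(partitions)


# Hypothetical sample data: eight rows spread over two nodes.
rows = [{"cust_id": i, "amount": i * 10} for i in range(8)]
parts = hash_partition(rows, "cust_id", num_nodes=2)

for node, node_rows in sorted(parts.items()):
    print(node, [r["cust_id"] for r in node_rows])
```

The property that matters is that rows with the same key value always land on the same node, so per-key operations such as aggregation can run on each node independently.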
Simple Flow
DataStage Job Design
↓
Generated OSH Script
↓
APT Engine executes OSH
Where OSH Exists
- OSH scripts are created temporarily during job execution
- Location (example): $APT_TMPDIR
- Usually deleted automatically after the job completes (unless debugging is enabled)
What OSH Contains
- Stage operators
- Link definitions
- Partitioning logic
- Sorting logic
- File paths
- Node allocations
Example (Conceptual)
ds_operator input | ds_transform | ds_aggregator | ds_operator output
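The conceptual line above can be read as an ordered list of operators joined by pipes. The Python sketch below only mimics that shape: it joins and splits the placeholder names used in the example. The helper names build_pipeline and split_pipeline are hypothetical, the operator names are the conceptual ones from this section rather than real OSH operators, and real generated OSH carries far more detail (schemas, options, partitioning).

```python
def build_pipeline(operators):
    """Join operator invocations into one pipe-separated line."""
    return " | ".join(operators)


def split_pipeline(script):
    """Split a pipe-separated line back into its operator invocations."""
    return [op.strip() for op in script.split("|")]


# Placeholder operator names taken from the conceptual example, not real OSH.
stages = ["ds_operator input", "ds_transform", "ds_aggregator", "ds_operator output"]

script = build_pipeline(stages)
print(script)
print(split_pipeline(script))
```

Reading a script this way, one operator per pipeline step, is a useful habit when matching a failing operator in a log back to a stage in the job design.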
OSH vs Unix Shell
| Aspect | OSH | Unix Shell |
| --- | --- | --- |
| Purpose | DataStage job execution | OS command execution |
| Used by | DataStage engine | Users / scripts |
| Parallelism | Built-in | Manual |
| User writes it? | No (auto-generated) | Yes |
When You See OSH (Real Projects)
- Job failure analysis
- Performance tuning
- Debugging parallel jobs
- DS_SUPPORT / DSENGINE logs