Friday, 9 January 2026

What is OSH?

OSH (Orchestrate Shell) is the scripting language that the parallel engine of IBM DataStage uses to run parallel jobs.

DataStage converts a Parallel Job's design into an OSH script, which the parallel engine then executes. The script controls how stages run, how data is partitioned, and how processing is distributed across nodes.


Why OSH Is Important

  • Controls job execution
  • Manages parallelism
  • Handles partitioning
  • Manages data flow between stages
  • Executes on multiple nodes
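
To make "partitioning" concrete, here is a minimal Python sketch of hash partitioning, where rows with the same key always land on the same node. It only illustrates the idea and is not how the DataStage engine implements it; the function name `hash_partition`, the sample columns, and the use of CRC32 are all assumptions made for this example.

```python
from collections import defaultdict
import zlib

def hash_partition(rows, key, num_nodes):
    """Assign each row to a partition by hashing its key column (illustrative only)."""
    partitions = defaultdict(list)
    for row in rows:
        # crc32 is stable across runs, unlike Python's built-in hash() for strings
        node = zlib.crc32(str(row[key]).encode("utf-8")) % num_nodes
        partitions[node].append(row)
    return dict(partitions)

# Hypothetical sample data: two rows share cust_id 101
rows = [
    {"cust_id": 101, "amount": 20},
    {"cust_id": 102, "amount": 35},
    {"cust_id": 101, "amount": 15},
    {"cust_id": 103, "amount": 50},
]

parts = hash_partition(rows, "cust_id", num_nodes=2)
for node, node_rows in sorted(parts.items()):
    print(node, node_rows)
```

Because both rows for `cust_id` 101 end up in the same partition, a downstream aggregation on that key can run on each node independently.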

Simple Flow

DataStage Job Design
        ↓
Generated OSH Script
        ↓
APT Engine executes OSH


Where OSH Exists

  • OSH scripts are created temporarily during job execution
  • Location (example):
    • $APT_TMPDIR
  • Usually deleted automatically after the job completes (unless debugging is enabled)
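
If you want to check what is currently in that directory, a small Python sketch like the one below can list its files, newest first. It assumes the `APT_TMPDIR` environment variable mentioned above is set in your session. The helper name `list_temp_files` is hypothetical, and the sketch makes no assumption about how the files are named.

```python
import os
from pathlib import Path

def list_temp_files(env_var="APT_TMPDIR"):
    """Return the files in the directory named by env_var, newest first."""
    tmp_dir = os.environ.get(env_var)
    if not tmp_dir or not Path(tmp_dir).is_dir():
        return []  # variable unset or directory missing
    files = [p for p in Path(tmp_dir).iterdir() if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)

for path in list_temp_files():
    print(path)
```

Running it while a job is executing, and again after it finishes, is one way to see which files are temporary.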

What OSH Contains

  • Stage operators
  • Link definitions
  • Partitioning logic
  • Sorting logic
  • File paths
  • Node allocations

Example (Conceptual)

ds_operator input | ds_transform | ds_aggregator | ds_operator output
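
The step from job design to script can be pictured with a few lines of Python. This is a toy sketch only: it reuses the placeholder stage names from the conceptual line above, the function `to_conceptual_pipeline` is hypothetical, and the result is not real OSH syntax.

```python
def to_conceptual_pipeline(stages):
    """Join stage names with '|' to mimic the conceptual form shown above."""
    if not stages:
        raise ValueError("a job needs at least one stage")
    return " | ".join(stages)

# Stages in the order they are linked on the job canvas (placeholder names)
job_design = [
    "ds_operator input",
    "ds_transform",
    "ds_aggregator",
    "ds_operator output",
]

print(to_conceptual_pipeline(job_design))
# prints: ds_operator input | ds_transform | ds_aggregator | ds_operator output
```

A real generated script carries much more than the stage order, including the partitioning, sorting, file path, and node details listed earlier.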


OSH vs Unix Shell

Aspect           | OSH                     | Unix Shell
Purpose          | DataStage job execution | OS command execution
Used by          | DataStage engine        | Users / scripts
Parallelism      | Built-in                | Manual
User writes it?  | No (auto-generated)     | Yes


When You See OSH (Real Projects)

  • Job failure analysis
  • Performance tuning
  • Debugging parallel jobs
  • DS_SUPPORT / DSENGINE logs

 
