Tuesday, 6 January 2026

How do you identify whether a DataStage job is running in parallel or sequentially?

 

1. Check the Job Type in DataStage Designer

This is the first and simplest check.

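·        Parallel Job → runs on the parallel engine; how many nodes it actually uses depends on the configuration file

·        Server Job → runs sequentially on the server engine

·        Sequence Job → a control job that orchestrates other jobs; it does not process rows in parallel itself

📌 If it’s a Server Job, it cannot run on the parallel engine.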


2. Check the Stage Types Used

Some stages are always sequential.

Sequential-only stages:

·        Server Sequential File

·        Server Transformer

·        Server Lookup

·        Server Join

📌 If your job mainly uses Server stages, the job is sequential.


3. Look at the Job Log (Very Important)

Open Director → Job Log.

Parallel job log shows:

Operator: pxfunnel

Operator: pxpartition

Operator: pxsort

Number of nodes = 4

Sequential execution indicators:

  • No mention of px operators
  • No mention of nodes
  • Single process messages only

📌 If you don’t see px* operators → job is behaving sequentially.
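The log check above can be automated on a log exported to text. The following is a minimal sketch, assuming the log contains lines worded like the ones quoted in this post ("Operator: px..." and "Number of nodes = N"); real log wording varies by version and job, and the function name summarize_log is made up for illustration.

```python
import re

def summarize_log(log_text):
    """Return the px operators and node count found in exported job log text."""
    # Operator names starting with "px", as in the log lines quoted in the post.
    operators = re.findall(r"Operator:\s*(px\w+)", log_text)
    # "Number of nodes = N" line; None if the log never mentions nodes.
    match = re.search(r"Number of nodes\s*=\s*(\d+)", log_text)
    nodes = int(match.group(1)) if match else None
    # Treat the run as parallel only if px operators appear on more than one node.
    looks_parallel = bool(operators) and nodes is not None and nodes > 1
    return {"operators": operators, "nodes": nodes, "looks_parallel": looks_parallel}

# Sample text copied from the indicators listed in the post.
sample = """Operator: pxfunnel
Operator: pxpartition
Operator: pxsort
Number of nodes = 4"""

print(summarize_log(sample))
print(summarize_log("Job started\nJob finished"))
```

A log with px operators but a node count of 1 is reported as not parallel, which matches the idea that a parallel job can still behave sequentially.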


4. Check Environment Variable: $APT_CONFIG_FILE

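The configuration file it points to defines the processing nodes, and so the degree of parallelism.

·        If it points to a configuration file with a single node, the job runs on 1 node

·        If it points to a valid configuration file with several nodes → parallel execution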

📌 Verify in:

Job Properties → Parameters → Environment
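To see how many nodes a configuration file defines, you can count its node entries. This is a rough sketch, assuming the usual layout in which each logical node is declared as node "name" { ... }; the host name and paths in the sample are invented for illustration, and count_nodes is a hypothetical helper, not part of DataStage.

```python
import re

def count_nodes(config_text):
    """Return the logical node names declared in a parallel configuration file."""
    # Each logical node is declared on its own line as: node "name"
    return re.findall(r'^\s*node\s+"([^"]+)"', config_text, flags=re.MULTILINE)

# Invented two-node example; host name and directories are placeholders.
sample_config = '''
{
  node "node1"
  {
    fastname "etlhost"
    pools ""
    resource disk "/data/ds/node1" {pools ""}
    resource scratchdisk "/scratch/ds/node1" {pools ""}
  }
  node "node2"
  {
    fastname "etlhost"
    pools ""
    resource disk "/data/ds/node2" {pools ""}
    resource scratchdisk "/scratch/ds/node2" {pools ""}
  }
}
'''

nodes = count_nodes(sample_config)
print(len(nodes), nodes)
```

To check a real file, read the path held in $APT_CONFIG_FILE and pass its contents to the function. One node means no partition parallelism.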


5. Check Number of Partitions on Links

In Designer:

·        Right-click link → Properties

·        Check Partitioning

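Sequential behavior if:

·        Partition count = 1

·        Partitioning = Same fed by a single upstream partition (Same only keeps the existing partitioning), or Entire (every partition gets a full copy of the data, so the work is not divided)

Parallel behavior:

·        Hash / Range / Round-Robin with multiple partitions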
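To make the difference concrete, here is a small illustration of how round-robin and hash partitioning spread rows over partitions. It is a conceptual sketch only: the crc32 hash, the function names, and the cust_id column are assumptions for the example and are not how the DataStage engine works internally.

```python
from collections import Counter
import zlib

def round_robin(rows, partitions):
    """Assign row i to partition i % partitions."""
    return [i % partitions for i, _ in enumerate(rows)]

def hash_partition(rows, key, partitions):
    """Assign each row by a stable hash of its key column."""
    return [zlib.crc32(str(row[key]).encode()) % partitions for row in rows]

# Eight made-up rows with a hypothetical key column.
rows = [{"cust_id": n, "amount": n * 10} for n in range(8)]

for count in (1, 4):
    rr = Counter(round_robin(rows, count))
    hp = Counter(hash_partition(rows, "cust_id", count))
    print(f"{count} partition(s): round-robin {dict(rr)}, hash {dict(hp)}")
```

With one partition every row lands in partition 0, which is the sequential case. With four partitions the rows are spread out, and hash partitioning always sends rows with the same key to the same partition.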


6. CPU & Process Monitoring (OS Level)

On the DataStage server:

·        Parallel job → multiple osh processes (a conductor, plus section leaders and players per node); dsapi_slave processes belong to client sessions

·        Sequential job → single process

Commands:

ps -ef | grep -E "osh|dsapi"

top
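Counting processes by hand gets tedious, so the same check can be scripted. This is a minimal sketch, assuming the common 8-column `ps -ef` layout (UID PID PPID C STIME TTY TIME CMD); the sample listing is invented, and count_processes and live_count are hypothetical helper names.

```python
import subprocess

def count_processes(ps_output, name):
    """Count lines of `ps -ef` output whose command column mentions `name`."""
    count = 0
    for line in ps_output.splitlines()[1:]:      # skip the header row
        fields = line.split(None, 7)             # UID PID PPID C STIME TTY TIME CMD
        if len(fields) == 8 and name in fields[7]:
            count += 1
    return count

def live_count(name):
    """Run `ps -ef` on this machine and count matching processes."""
    result = subprocess.run(["ps", "-ef"], capture_output=True, text=True, check=True)
    return count_processes(result.stdout, name)

# Invented listing for illustration; real output has different users, PIDs and arguments.
sample_ps = """UID        PID  PPID  C STIME TTY          TIME CMD
dsadm     3990     1  0 09:55 ?        00:00:00 dsapi_slave
dsadm     4101     1  0 10:02 ?        00:00:00 osh
dsadm     4102  4101  0 10:02 ?        00:00:01 osh
dsadm     4103  4101  0 10:02 ?        00:00:01 osh
"""

print("osh:", count_processes(sample_ps, "osh"))
print("dsapi_slave:", count_processes(sample_ps, "dsapi_slave"))
```

On the DataStage server, calling live_count("osh") while the job runs gives the live figure. A count that stays at one or zero suggests the job is not fanning out.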


7. Dataset vs Sequential File

  • Dataset (.ds) → supports parallelism
  • Sequential File (.txt, .dat) → often forces serialization (unless the stage is set up with multiple readers/writers, for example through its Number of Readers Per Node property)

📌 Heavy use of Sequential Files can make a parallel job behave sequentially.


8. Peek / Debug Mode

If you add a Peek stage and its log output comes from only one partition, that part of the job is not running in parallel.
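The Peek check can also be done on exported log text. The sketch below assumes Peek messages are prefixed with the stage name and partition number, in the form Peek_1,0: ...; that prefix format, the sample lines, and the helper name peek_partitions are assumptions, so adjust the pattern to what your log actually shows.

```python
import re

def peek_partitions(log_text, stage="Peek"):
    """Return the sorted partition numbers that produced Peek output."""
    parts = set()
    # Assumed prefix: <stage name>,<partition number>:
    pattern = re.compile(rf"\s*{re.escape(stage)}\w*,(\d+):")
    for line in log_text.splitlines():
        m = pattern.match(line)
        if m:
            parts.add(int(m.group(1)))
    return sorted(parts)

# Invented log lines for illustration.
sample_log = """Peek_1,0: cust_id:1 amount:10
Peek_1,1: cust_id:2 amount:20
Peek_1,0: cust_id:3 amount:30
main_program: Step execution finished"""

print(peek_partitions(sample_log))
```

If the returned list holds a single partition number, the Peek stage saw only one data stream.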


In short: I identify whether a DataStage job is running sequentially or in parallel by checking the job type, stage types, partitioning on links, $APT_CONFIG_FILE, and especially the job log for px operators and node count.


 
