Saturday, 20 December 2025

Debugging Stages

What is Row Generator in IBM DataStage?

Row Generator is a DataStage stage used to generate dummy or test data.
It does not read data from any source; instead, it creates rows internally based on values you define.

Key Purpose of Row Generator

  • Create test data
  • Generate sequence numbers
  • Produce constant or derived values
  • Support unit testing, debugging, and job validation

How Row Generator Works

You define:

  1. Number of rows to generate
  2. Column metadata
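  3. How each column's values are produced (generator settings in the column metadata, or derivations in a Transformer that follows the stage)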

The stage then produces that many rows and sends them to the next stage.
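The idea can be modelled outside DataStage with a few lines of Python. This is only a sketch of the concept, not how the stage is implemented: `row_generator` and the `columns` dictionary are hypothetical names, and each column rule is assumed to be a function of the 1-based row number.

```python
from datetime import date


def row_generator(num_rows, columns):
    """Yield num_rows rows; each column value comes from its rule(row_number)."""
    for row_num in range(1, num_rows + 1):
        yield {name: rule(row_num) for name, rule in columns.items()}


# Hypothetical column definitions: a running number, a constant, and a function call.
columns = {
    "ID": lambda n: n,
    "NAME": lambda n: "TEST",
    "LOAD_DT": lambda n: date.today(),
}

rows = list(row_generator(3, columns))
for row in rows:
    print(row)
```

No input is read anywhere: the row count and the column rules are the only things that drive the output, which is the same contract the stage offers.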


Important Properties

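1. Number of Records (rows per partition)

  • Defines how many rows each partition generates
  • Total rows = Number of Records × number of partitions
  • The stage runs sequentially (one partition) by default, so the multiplication only matters when it is set to run in parallel

Example:

  • Number of Records = 100
  • Partitions = 4
  • Total rows = 100 × 4 = 400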
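The arithmetic also works in reverse: if you need a specific total, divide by the partition count to find the per-partition setting. A small Python sketch, with hypothetical helper names `total_rows` and `rows_per_partition_for`:

```python
def total_rows(rows_per_partition, partitions):
    """Every partition generates the same number of rows."""
    return rows_per_partition * partitions


def rows_per_partition_for(target_total, partitions):
    """Smallest per-partition count that yields at least target_total rows."""
    return -(-target_total // partitions)  # ceiling division


print(total_rows(100, 4))               # 400
print(total_rows(100, 1))               # 100
print(rows_per_partition_for(1000, 4))  # 250
print(rows_per_partition_for(10, 4))    # 3 (gives 12 rows, slightly over)
```

When the target does not divide evenly, the job produces a few extra rows; trim them downstream if the exact count matters.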

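2. Column Derivations

The Row Generator itself fills each column from the generator settings in its column metadata (for example a cycling or random value). Expressions like the ones in the example are written in a Transformer stage placed directly after it, where you can use:

  • Constants
  • Functions
  • System variables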

Example:

ID        = @INROWNUM

NAME      = "TEST"

LOAD_DT   = CurrentDate()


Common Use Cases

🔹 1. Generate Sequence Numbers

EMP_ID = @INROWNUM
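One caveat: @INROWNUM counts rows within each partition, so a job running on several partitions produces the same numbers more than once. A commonly used Transformer pattern for partition-unique values is (@INROWNUM - 1) * @NUMPARTITIONS + @PARTITIONNUM + 1. The Python sketch below only models that arithmetic, assuming 0-based partition numbers and an equal row count per partition; the function names are hypothetical.

```python
def partition_rows(rows_per_partition, num_partitions):
    """Model (partition number, per-partition row number) pairs."""
    return [
        (p, n)
        for p in range(num_partitions)
        for n in range(1, rows_per_partition + 1)
    ]


def unique_id(inrownum, partitionnum, numpartitions):
    """Interleave partitions so that every row gets a distinct number."""
    return (inrownum - 1) * numpartitions + partitionnum + 1


pairs = partition_rows(3, 2)
naive_ids = [n for _, n in pairs]
unique_ids = [unique_id(n, p, 2) for p, n in pairs]

print(naive_ids)   # [1, 2, 3, 1, 2, 3]
print(unique_ids)  # [1, 3, 5, 2, 4, 6]
```

With a single partition the formula reduces to the plain row number, so it is safe to use in either mode.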


🔹 2. Create Dummy Test Data

Useful when:

  • The source system is not available
  • You only need to test the job flow

CUST_ID = @INROWNUM

CUST_NM = "Customer_" : StringFromInt(@INROWNUM)


🔹 3. Debugging / Unit Testing

  • Test transformer logic
  • Test lookup logic
  • Validate target table mappings

🔹 4. Control Table Initialization

Used to:

  • Load initial control / parameter tables
  • Generate static reference data

🆚 Row Generator vs Sequential File

Feature              Row Generator    Sequential File
Reads source data    No               Yes
Generates data       Yes              No
Used for testing     Yes              ⚠️ Limited
Requires file        No               Yes


Important Points

  • No input link
  • Total rows depend on the number of partitions
  • Commonly paired with @INROWNUM in a downstream Transformer
  • Mostly for testing, debugging, and POC work
  • Not normally used in production loads


🎯 Sample Job Flow

Row Generator → Transformer → Sequential File / DB Target
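As a rough analogy, the same three-step flow can be sketched in Python with generators. Nothing here is DataStage code: the function names are hypothetical stand-ins for the stages, and an in-memory buffer plays the role of the target file.

```python
import csv
import io


def row_generator(num_rows):
    """Source step: creates rows without reading any input."""
    for n in range(1, num_rows + 1):
        yield {"CUST_ID": n}


def transformer(rows):
    """Middle step: derives a new column from each incoming row."""
    for row in rows:
        yield {**row, "CUST_NM": f"Customer_{row['CUST_ID']}"}


def sequential_file_target(rows, handle):
    """Target step: writes rows as CSV and returns how many were written."""
    writer = csv.DictWriter(handle, fieldnames=["CUST_ID", "CUST_NM"])
    writer.writeheader()
    written = 0
    for row in rows:
        writer.writerow(row)
        written += 1
    return written


buffer = io.StringIO()
written = sequential_file_target(transformer(row_generator(3)), buffer)
output = buffer.getvalue()
print(output)
```

Swapping `row_generator` for a real reader later leaves the other two steps untouched, which is why the generator is handy for testing transformer and target logic before the source exists.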

