What is Row Generator in IBM DataStage?
Row Generator is a DataStage stage used to generate dummy or test data.
It does not read data from any source; instead, it creates rows internally
based on values you define.
Key Purpose of Row Generator
✔ Create test data
✔ Generate sequence numbers
✔ Produce constant or derived values
✔ Used in unit testing, debugging, and job validation
How Row Generator Works
You define:
- Number of rows to generate
- Column metadata
- Derivations for each column
The stage then produces that many rows and sends them to the next stage.
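The three inputs above can be modelled outside DataStage to make the behaviour concrete. This is a minimal Python sketch, not DataStage code: `generate_rows` and the `Derivation` type are hypothetical names introduced for illustration, and it assumes each column value is computed from a 1-based row number.

```python
from typing import Any, Callable, Dict, Iterator

# Assumption: a derivation is a function of the 1-based row number.
Derivation = Callable[[int], Any]


def generate_rows(num_rows: int, derivations: Dict[str, Derivation]) -> Iterator[Dict[str, Any]]:
    """Yield num_rows rows, computing each column from its derivation."""
    for row_num in range(1, num_rows + 1):
        yield {name: derive(row_num) for name, derive in derivations.items()}


# Two columns: a running number and a constant.
rows = list(generate_rows(3, {"ID": lambda n: n, "NAME": lambda n: "TEST"}))
for row in rows:
    print(row)
```

No input is read anywhere in the sketch, which mirrors the point that the stage creates rows internally and simply hands them to the next stage.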
Important Properties
1️⃣ Rows per partition
- Defines how many rows each partition generates
- Total rows = Rows per partition × number of partitions
Example:
- Rows per partition = 100
- Partitions = 4
➡ Total rows = 400
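The arithmetic in this example can be checked with a short simulation. This is a rough Python sketch rather than anything DataStage runs: `total_rows` and `simulate` are hypothetical helper names, and partitions are simply numbered from 0 for illustration.

```python
def total_rows(rows_per_partition: int, partitions: int) -> int:
    """Total rows = rows per partition x number of partitions."""
    return rows_per_partition * partitions


def simulate(rows_per_partition: int, partitions: int) -> list:
    """Return (partition, row) pairs, one batch of rows per partition."""
    return [
        (part, row)
        for part in range(partitions)
        for row in range(1, rows_per_partition + 1)
    ]


generated = simulate(100, 4)
print(total_rows(100, 4), len(generated))  # prints: 400 400
```

The practical consequence is that a job expected to produce 100 rows will produce more if the stage runs on several partitions, so it is worth checking the partition count before relying on the row total.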
2️⃣ Column Derivations
You can use:
- Constants
- Functions
- System variables
Example:
ID = @INROWNUM
NAME = "TEST"
LOAD_DT = CurrentDate()
Common Use Cases
🔹 1. Generate Sequence Numbers
EMP_ID = @INROWNUM
🔹 2. Create Dummy Test Data
Useful when:
- The source system is not available
- You only need to test the job flow
CUST_ID = @INROWNUM
CUST_NM = "Customer_" : StringFromInt(@INROWNUM)
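For readers who want to see what these two derivations produce, here is a small Python sketch of the same idea. It is only an illustration: `make_customers` is a hypothetical helper, and the row number is assumed to start at 1.

```python
def make_customers(count: int) -> list:
    """Build dummy customer rows: a running ID and a name derived from it."""
    return [
        {"CUST_ID": n, "CUST_NM": f"Customer_{n}"}
        for n in range(1, count + 1)
    ]


customers = make_customers(3)
for customer in customers:
    print(customer["CUST_ID"], customer["CUST_NM"])
```

Because the name is built from the ID, every generated row is distinct, which is usually enough to exercise downstream logic without a real source.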
🔹 3. Debugging / Unit Testing
- Test transformer logic
- Test lookup logic
- Validate target table mappings
🔹 4. Control Table Initialization
Used to:
- Load initial control / parameter tables
- Generate static reference data
🆚 Row Generator vs Sequential File
| Feature | Row Generator | Sequential File |
|---|---|---|
| Reads source data | ❌ No | ✅ Yes |
| Generates data | ✅ Yes | ❌ No |
| Used for testing | ✅ | ⚠️ Limited |
| Requires file | ❌ | ✅ |
Important Points
✔ No input link
✔ Total rows depend on the number of partitions
✔ Commonly used with @INROWNUM
✔ Mostly for testing, debugging, and POC work
✔ Not normally used in production loads
🎯 Sample Job Flow
Row Generator → Transformer → Sequential File / DB Target
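The same three-step flow can be imitated in plain Python to show how generated rows move through a job. This is a sketch under stated assumptions, not DataStage code: `row_generator`, `transformer`, and `write_sequential_file` are hypothetical functions, the upper-casing step stands in for whatever logic the Transformer would hold, and an in-memory buffer replaces the target file.

```python
import csv
import io


def row_generator(num_rows):
    """Source step: create rows internally, with no input."""
    for n in range(1, num_rows + 1):
        yield {"ID": n, "NAME": "test"}


def transformer(rows):
    """Middle step: apply the logic under test (here, upper-casing NAME)."""
    for row in rows:
        yield {**row, "NAME": row["NAME"].upper()}


def write_sequential_file(rows, handle):
    """Target step: write rows as delimited text with a header line."""
    writer = csv.DictWriter(handle, fieldnames=["ID", "NAME"], lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


buffer = io.StringIO()  # stands in for the target file
write_sequential_file(transformer(row_generator(2)), buffer)
output = buffer.getvalue()
print(output)
```

Swapping the middle function is all it takes to test different logic against the same generated rows, which is the main appeal of this job layout.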