The Peek stage is a Development/Debug stage in IBM DataStage used to view (peek at) data flowing through a job at runtime. It helps developers verify data, troubleshoot issues, and validate transformations without writing data to a file or table.
What does the Peek stage do?
- Displays actual data rows passing through a link
- Shows data in the job log
- Helps in debugging and testing
- Can be enabled or disabled easily
Key characteristics
🔹 Pass-through stage (does not change data)
🔹 Does not affect row count
🔹 Used only during development
🔹 Should be removed or disabled in production
Where is the output shown?
- Data appears in the Director → Job Log
- Output is shown as formatted rows under Peek stage messages
Example log output:
Peek_1, link output: Row 1: Eno=101, Ename=John, Salary=50000
Peek_1, link output: Row 2: Eno=102, Ename=Mary, Salary=60000
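If you copy Peek messages out of the job log, a short script can turn them back into records for quick checks. The Python sketch below assumes the simplified `Row N: col=value, ...` layout shown above; the real log layout depends on the DataStage version and the stage settings, and `parse_peek_line` is a hypothetical helper, not part of DataStage.

```python
import re

# Matches the simplified layout used in this post, e.g.
# "Peek_1, link output: Row 1: Eno=101, Ename=John, Salary=50000"
# (assumed layout; adjust the pattern to match your actual job log).
PEEK_LINE = re.compile(
    r"^(?P<stage>\w+), link output: Row (?P<row>\d+): (?P<fields>.*)$"
)

def parse_peek_line(line):
    """Return (stage, row_number, {column: value}), or None if the line does not match."""
    match = PEEK_LINE.match(line.strip())
    if match is None:
        return None
    fields = {}
    # Values are kept as strings; a value containing ", " would break this simple split.
    for pair in match.group("fields").split(", "):
        name, _, value = pair.partition("=")
        fields[name.strip()] = value.strip()
    return match.group("stage"), int(match.group("row")), fields

log_lines = [
    "Peek_1, link output: Row 1: Eno=101, Ename=John, Salary=50000",
    "Peek_1, link output: Row 2: Eno=102, Ename=Mary, Salary=60000",
]
parsed = [parse_peek_line(line) for line in log_lines]
for stage, row_number, fields in parsed:
    print(stage, row_number, fields)
```

Lines that do not follow the assumed layout return `None`, so they can be filtered out before further checks.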
Common use cases
1. Validate source data
Check whether source file or table data is read correctly.
2. Debug transformation logic
Confirm output after:
- Transformer
- Aggregator
- Lookup
- Filter
3. Identify data issues
- NULL values
- Incorrect calculations
- Unexpected records
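Once peeked rows are available as records, the NULL and unexpected-record checks listed above can be automated. This is a minimal Python sketch, not a DataStage feature: `find_issues` is a hypothetical helper, and the sample rows (including the `999` record) are made up for illustration.

```python
def find_issues(rows, required, expected_ids):
    """Return a list of human-readable problems found in peeked rows."""
    issues = []
    for index, row in enumerate(rows, start=1):
        # Flag NULL (None) or empty values in columns that must be populated.
        for column in required:
            if row.get(column) in (None, ""):
                issues.append(f"row {index}: {column} is NULL or empty")
        # Flag records whose key is not in the set we expect to see.
        if row.get("Eno") not in expected_ids:
            issues.append(f"row {index}: unexpected Eno {row.get('Eno')}")
    return issues

# Made-up sample rows, as they might look after being read from a Peek log.
rows = [
    {"Eno": 101, "Ename": "John", "Salary": 50000},
    {"Eno": 102, "Ename": None, "Salary": 60000},
    {"Eno": 999, "Ename": "Test", "Salary": 0},
]
issues = find_issues(rows, required=["Eno", "Ename", "Salary"], expected_ids={101, 102})
for issue in issues:
    print(issue)
```

A salary of `0` is deliberately not treated as NULL here; only `None` and empty strings are flagged.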
Important configuration options
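- Rows to display – Limit the number of rows printed to the log (the limit applies per partition, so a job running on several partitions logs more rows in total)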
- Column selection – Choose specific columns to display
- Enable/Disable – Can be turned off without deleting
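The effect of these options can be modelled outside DataStage. The Python sketch below is only a conceptual model, not how the stage is implemented or configured (that is done in the Designer): the hypothetical `peek` generator passes every row through unchanged and logs at most `limit` rows, optionally restricted to selected columns. The parameter names are assumptions chosen to mirror the options listed above.

```python
def peek(rows, limit=10, columns=None, enabled=True, log=print):
    """Yield every row unchanged, logging at most `limit` of them."""
    for count, row in enumerate(rows, start=1):
        if enabled and count <= limit:
            # Column selection: show only the requested columns, or all of them.
            shown = row if columns is None else {c: row[c] for c in columns}
            log(f"Row {count}: {shown}")
        # The row itself is never modified or dropped.
        yield row

source = [
    {"Eno": 101, "Ename": "John", "Salary": 50000},
    {"Eno": 102, "Ename": "Mary", "Salary": 60000},
]

logged = []  # collect log messages instead of printing them
passed = list(peek(source, limit=1, columns=["Eno", "Salary"], log=logged.append))

print(logged)            # only the first row, with two columns
print(passed == source)  # the data reaching the target is unchanged
```

With `enabled=False` nothing is logged, while the rows still flow through to the next stage.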
Example job flow
Source
|
Transformer
|
Peek
|
Target
Peek vs other debugging options
| Stage | Purpose |
|---|---|
| Peek | View data in job log |
| Sequential File | Write data to file |
| Dataset | Store data for reuse |
| Row Generator | Generate test data |
Why should Peek stage not be used in production?
Because it writes data to job logs, increases log size, affects performance, and may expose sensitive data.