Thursday, 4 December 2025

How to configure DB Connector Stages in DataStage?


Difference between ODBC Connector, Oracle Connector, and DB2 Connector stages.

    1. ODBC Connector: Generic connector; can connect to any DB supporting ODBC drivers.

    2. Oracle Connector: Optimized specifically for Oracle; supports array insert, bulk load, and Oracle-specific features (like direct path load).

    3. DB2 Connector: Optimized for IBM DB2; supports DB2 CLI features and better pushdown optimization.

Note: Use native connectors (Oracle/DB2) instead of ODBC for better performance.


1. Configuring connection parameters dynamically.

Use Job Parameters or Parameter Sets for the DB name, user, password, and table name. In the connector properties, reference these parameters, for example:

Server = #DB_SERVER#

User = #DB_USER#

Password = #DB_PASSWORD#

This allows reuse across environments (Dev, QA, Prod).

2. Main properties for reading data.

Table name / SQL Query

Column list

Read Mode (Sequential / Parallel)

Fetch Buffer Size / Array Size

Before / After SQL

Isolation Level.
      These control what and how data is extracted.
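
For example, a typical read-side setup might look like the sketch below (property names and values are illustrative and vary slightly between the ODBC, Oracle, and DB2 Connectors; #SCHEMA# is an assumed job parameter):

        Read mode        = Select
        Generate SQL     = No
        Select statement = SELECT CUST_ID, CUST_NAME, BALANCE FROM #SCHEMA#.CUSTOMER
        Array size       = 2000
        Isolation level  = Read committed
        Before SQL       = ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD'   (Oracle example)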

3. Difference between Read, Write, and User-defined SQL:

        Read → Automatically generates a SELECT * FROM table.

        Write → Automatically generates INSERT INTO table VALUES(...).

        User-defined SQL → Lets you write custom SQL (e.g., joins, filters, stored procedure calls).
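
For example, a user-defined SQL statement can push joins and filters down to the database (table and column names below are illustrative):

        SELECT e.EMP_ID, e.EMP_NAME, d.DEPT_NAME
        FROM   EMPLOYEE e
        JOIN   DEPARTMENT d ON d.DEPT_ID = e.DEPT_ID
        WHERE  e.STATUS = 'ACTIVE'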


4. Array Size / Batch Size:

         Defines how many rows are fetched or inserted per database round trip.

         Higher value = fewer DB calls = better performance.

        Too large a value can cause memory issues. Common values: 1000–5000.
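
As a rough illustration (numbers are made up): loading 1,000,000 rows with Array Size = 2000 needs about 1,000,000 / 2000 = 500 round trips to the database, compared with 1,000,000 round trips at an array size of 1.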

5. Commit frequency:

        Defines how often a transaction is committed (e.g., every 10,000 rows).

       Helps control rollback size and improve performance.
       If set too high → long rollbacks; if set too low → frequent commits slow the job down.

     Auto-Commit = Yes.

           Each record is committed individually.
                ✅ Useful for small updates or CDC jobs.
               ❌ Avoid for bulk loads (very slow).

       Transaction Size:

        Similar to commit frequency — defines number of rows per commit.

         Example: Set Transaction Size = 10,000 → commits every 10k rows.
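
As a combined illustration (values are made up): with Array Size = 2000 and Transaction Size = 10,000, the connector sends 2,000 rows per round trip and commits after every 5 batches, so a 1,000,000-row load issues about 100 commits instead of 1,000,000 commits under Auto-Commit = Yes.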


6. Capturing rejected records / SQL errors:

Enable Reject link or Reject file in the stage.

Properties: “Reject link”, “Write to log”, “Abort on error”.

You can also output SQLSTATE, SQLCODE, and DBMSCODE for debugging.
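
As a rough sketch, the reject link then carries the original input columns plus the error columns (layout is illustrative; exact column names depend on the connector and version):

        CUST_ID, CUST_NAME, BALANCE, SQLSTATE, SQLCODE, DBMSCODE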


7. Performance tuning for large inserts:

Use Array Size and Transaction Size appropriately.

Disable Auto-Commit.

Use bulk load mode if supported (e.g., Oracle direct path load).

Avoid index updates during load — rebuild indexes afterward.

Tune the commit interval based on rollback/undo segment capacity.
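
On Oracle, for example, the index handling can be driven from the connector's Before/After SQL properties (the index name below is hypothetical; note that unique indexes cannot simply be skipped this way):

        Before SQL : ALTER INDEX SALES_FACT_IX1 UNUSABLE
        After SQL  : ALTER INDEX SALES_FACT_IX1 REBUILD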


8. Handling Deadlock / ORA-00060:

Identify conflicting transactions (using DB logs).

Add retry logic in the job or re-run the batch.

Use smaller commit intervals.

For update operations, maintain a consistent order of access to rows.
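
For example, if several jobs update the same table, driving each update from a source query that orders rows by the primary key makes all sessions lock rows in the same sequence (table, column, and parameter names are illustrative):

        SELECT ORDER_ID, STATUS
        FROM   ORDERS
        WHERE  BATCH_ID = #BATCH_ID#
        ORDER  BY ORDER_ID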


9. Can we pass a query as a parameter in DataStage?

 Yes, we can pass a query as a parameter in DataStage, but only in certain stages and with some limitations.

    1. ODBC Connector / DB2 Connector / Oracle Connector

            You can define a job parameter like:

            #MyQuery#

            And assign a full SQL query to it in the job parameters:

            SELECT EMP_ID, EMP_NAME FROM EMPLOYEE WHERE DEPT = 'SALES';

            Then in the stage SQL, you put:

            #MyQuery#

            This is valid.


     2. Stored Procedure / Command Stage

            You can pass the SQL query as a parameter to a stored procedure or shell command.
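
For example, the job parameter can be embedded in the call itself (the procedure name here is hypothetical):

            CALL ETL_UTIL.RUN_EXTRACT('#MyQuery#')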

 
