Sequential File Stage

What is a Sequential File?

A Sequential File is a flat, text-based file in which:

  • Records are processed line by line
  • Data is accessed only in order (top to bottom)

File formats:

  • .txt, .csv, and .dat files, as well as fixed-width files
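
A fixed-width file has no delimiter; each field occupies a fixed range of character positions. The following is a minimal Python sketch of that idea, not DataStage code. The field names and widths in `LAYOUT` are assumptions made up for illustration.

```python
# Hypothetical layout: (field name, width in characters).
LAYOUT = [("id", 4), ("name", 10), ("country", 5)]

def parse_fixed_width(line, layout=LAYOUT):
    """Slice one fixed-width line into named fields, trimming the padding."""
    record = {}
    position = 0
    for name, width in layout:
        record[name] = line[position:position + width].strip()
        position += width
    return record

# Build a sample line by padding each value to its field width.
line = "1001" + "John".ljust(10) + "USA".ljust(5)
print(parse_fixed_width(line))
# {'id': '1001', 'name': 'John', 'country': 'USA'}
```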

Key Characteristics

  • No indexing
  • No random access
  • Must read records from the beginning
  • Simple and lightweight
  • Used for file ingestion and extraction
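
To make "no random access" concrete, here is a small Python sketch (an analogy, not DataStage code). Reaching the n-th record means scanning every record before it. The helper name `read_nth_record` is hypothetical, and `io.StringIO` stands in for an open text file.

```python
import io

def read_nth_record(stream, n):
    """Return the n-th record (1-based) by scanning from the top of the stream."""
    for line_number, line in enumerate(stream, start=1):
        if line_number == n:
            return line.rstrip("\n")
    return None  # the stream holds fewer than n records

# io.StringIO stands in for an open text file in this sketch.
sample = io.StringIO("first\nsecond\nthird\n")
print(read_nth_record(sample, 3))  # prints "third" after passing records 1 and 2
```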

🔹 Sequential File in DataStage

The Sequential File stage is used to:

  • Read data from a file
  • Write data to a file
  • Act as a source or target in jobs

It supports:

  • Delimited files (CSV, pipe-separated)
  • Fixed-width files
  • Single or multiple output files (in parallel jobs)

For example, given a pipe-delimited file like:

      1001|John|USA
      1002|Maria|UK
      1003|Ahmed|India

DataStage reads it record by record, from line 1 to line N.
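
The same record-by-record read can be sketched in plain Python with the standard `csv` module. This only illustrates the idea and is not how the stage is implemented. The column names in `COLUMNS` are assumed, since the sample file has no header row.

```python
import csv
import io

# Column names are assumed for illustration; the sample file has no header.
COLUMNS = ["id", "name", "country"]

SAMPLE = "1001|John|USA\n1002|Maria|UK\n1003|Ahmed|India\n"

def read_records(stream, delimiter="|"):
    """Yield one dict per line, in file order."""
    reader = csv.reader(stream, delimiter=delimiter)
    for row in reader:
        yield dict(zip(COLUMNS, row))

# io.StringIO stands in for the file on disk.
records = list(read_records(io.StringIO(SAMPLE)))
for record in records:
    print(record)
```

Each line becomes one record, and the records come out in the same order as the lines in the file.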

        

1. Sequential File in Parallel Jobs

This is the most common case.

🔹 Parallel Mode (default)

When used in a Parallel Job, the Sequential File stage runs in parallel:

  • Each partition (node) reads or writes a separate file.
  • File naming uses partition-based suffixes, for example:
    • output.ds
    • output.ds.0
    • output.ds.1
    • output.ds.2
  • Performance is high because the I/O happens in parallel.
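
The idea of one file per partition can be simulated in plain Python. This is a conceptual sketch only, not DataStage code: a thread pool stands in for the nodes, the round-robin split is an assumed partitioning method, and the `output.<n>` file names are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def write_partition(base_path, index, rows):
    """Write one partition's rows to its own part file and return the path."""
    part_path = f"{base_path}.{index}"
    with open(part_path, "w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(row + "\n")
    return part_path

def write_in_parallel(base_path, rows, partitions=3):
    """Split rows round-robin and let each worker write its own part file."""
    # Row i goes to partition i % partitions.
    chunks = [rows[i::partitions] for i in range(partitions)]
    with ThreadPoolExecutor(max_workers=partitions) as pool:
        futures = [pool.submit(write_partition, base_path, i, chunk)
                   for i, chunk in enumerate(chunks)]
        return [future.result() for future in futures]

rows = ["1001|John|USA", "1002|Maria|UK", "1003|Ahmed|India"]
with tempfile.TemporaryDirectory() as workdir:
    paths = write_in_parallel(os.path.join(workdir, "output"), rows)
    print([os.path.basename(p) for p in paths])
    # ['output.0', 'output.1', 'output.2']
```

Because each worker owns its file, no two writers contend for the same handle. The trade-off is that the result is several files rather than one.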

🔹 Sequential Mode (forced)

You can force sequential behavior even in a Parallel Job by setting either of:

  • “Combine part files” = Yes (for output)
  • “Read method” = Sequential (for input)

In this case:

  • Only one node reads or writes the file.
  • Data flows through a single partition.
  • Performance is lower, but this mode is useful when:
    • External systems expect a single file
    • The order of records must be preserved
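
As a rough analogy in plain Python (not DataStage code), a single writer that appends part files one after another produces one output file with a deterministic record order. The function name `combine_part_files` and the file names are hypothetical.

```python
import os
import tempfile

def combine_part_files(part_paths, target_path):
    """Append each part file to target_path, in the order given."""
    with open(target_path, "w", encoding="utf-8") as target:
        for part_path in part_paths:
            with open(part_path, "r", encoding="utf-8") as part:
                for line in part:
                    target.write(line)
    return target_path

parts = [["1001|John|USA"], ["1002|Maria|UK"], ["1003|Ahmed|India"]]
with tempfile.TemporaryDirectory() as workdir:
    part_paths = []
    for index, rows in enumerate(parts):
        part_path = os.path.join(workdir, f"part.{index}")
        with open(part_path, "w", encoding="utf-8") as handle:
            handle.write("\n".join(rows) + "\n")
        part_paths.append(part_path)
    target = combine_part_files(part_paths, os.path.join(workdir, "combined.txt"))
    with open(target, encoding="utf-8") as handle:
        combined = handle.read()

print(combined, end="")
```

Only one process touches the target file, so the work is serialized, which is why throughput is lower than in the parallel case.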

2. Sequential File in Server Jobs

  • Always runs in sequential mode
  • No partitioning
  • A single process reads or writes the file

🔹 When Do We Use Sequential Files?

    • Source data from external systems
    • Deliver data to downstream systems
    • Archive or export data
    • Interface files (for example, in banking, insurance, or e-commerce projects)

A Sequential File in DataStage is a flat file where data is read or written record by record, in sequence. It is commonly used to exchange data between DataStage and external systems. A Sequential File stage can work in both sequential and parallel mode. In Parallel Jobs, it works in parallel by default, creating part files. If we enable options like “Combine part files” or a sequential read, it behaves sequentially. In Server Jobs, it always runs sequentially.


 

 

