Configuring a Distributor Node

Data Consumption Methods
  • From File: The node reads data from the path specified in the File Path setting.
  • From Map: The map executes and produces batches based on the defined input card.
  • Parameterization: To use a flow variable for the File Path, enclose the variable name in percent signs (for example, %my_data_file%).
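
The percent-sign convention above amounts to a simple placeholder substitution. As a minimal illustration in Python (the resolve_flow_variables helper and the variable names here are hypothetical, not part of the product API):

```python
import re

def resolve_flow_variables(path: str, flow_vars: dict) -> str:
    """Replace %name% placeholders in a path with flow-variable values."""
    return re.sub(r"%(\w+)%", lambda m: str(flow_vars[m.group(1)]), path)

# A File Path of "/data/%my_data_file%" resolves against the current flow variables:
resolved = resolve_flow_variables("/data/%my_data_file%", {"my_data_file": "orders.csv"})
print(resolved)  # → /data/orders.csv
```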

Consuming Data from a Map

If a map provides the data, it must run in Burst Mode. This is configured in the map input card settings:
  • Set the Action property (Fetch As) to Burst.

  • Burst Mode is supported by the FILE and REST adapters, and by messaging adapters (for example, Kafka, JMS, and MQ).

  • The Fetch Unit controls the number of records per batch. If it is not set (the default is 0), the node fetches all records.

  • Map Batch Size: You can override the map’s fetch unit at runtime using the map_batch_size property, allowing for flexible tuning via flow variables.
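
The Fetch Unit arithmetic above can be sketched as follows (the plan_batches function is ours for illustration; the product performs this calculation internally, and we assume here that a fetch unit of 0 means a single batch of all records):

```python
import math

def plan_batches(total_records: int, fetch_unit: int) -> int:
    """Return the number of batches produced for a record count and fetch unit.

    Assumption: a fetch unit of 0 (the default) fetches all records
    together, i.e. one batch."""
    if fetch_unit <= 0:
        return 1
    return math.ceil(total_records / fetch_unit)

print(plan_batches(1_000_000, 100_000))  # → 10, as in the example scenario below
```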

Distributed Instances

The Distributor node splits data into batches. Each batch is processed by downstream nodes as a separate flow instance. These instances can run on the same REST runtime or on other available REST runtime instances.
  • Parallel execution is limited by the Maximum Instances setting.
  • If Maximum Instances exceeds the number of available REST runtime instances, parallel execution is capped at the total number of available REST instances.
  • A flow that distributes batches does not count toward the execution process limit for flow executions.
  • For a single flow instance, each flow executor processes at most one distributed batch at a time.

    Example Scenario

    Consider a system with 5 available executors and a Distributor node configured with the following parameters:

    • Source: A CSV file containing 1,000,000 records.

    • Maximum Instances: 5.

    • Batch Size: 100,000 records (resulting in 10 total batches).

    Execution Logic:

    1. Initial State: The Distributor node in the main flow generates 5 initial requests for distributed batches.

    2. Concurrency: Since each executor can process only one request at a time, all 5 executors are immediately engaged.

    3. Queueing: As each distributed batch completes, the main Distributor instance issues a new request. This cycle continues until all 10 distributed batches have been processed.
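
The three steps above can be sketched as a simple work queue: the main Distributor instance keeps up to Maximum Instances requests in flight and issues a new request as each batch completes. A simulation under simplified assumptions (every batch takes one "round"; the function is illustrative, not product code):

```python
from collections import deque

def simulate_distribution(total_batches: int, max_instances: int, executors: int):
    """Simulate round-by-round batch distribution.

    Each round, every in-flight batch completes, and the main Distributor
    instance issues new requests up to the concurrency cap.
    Returns (peak concurrency, number of rounds)."""
    cap = min(max_instances, executors)  # parallelism is capped by both limits
    pending = deque(range(total_batches))
    in_flight = []
    peak = rounds = 0
    while pending or in_flight:
        # Issue new requests until the cap is reached (one batch per executor).
        while pending and len(in_flight) < cap:
            in_flight.append(pending.popleft())
        peak = max(peak, len(in_flight))
        in_flight.clear()  # all in-flight batches complete this round
        rounds += 1
    return peak, rounds

print(simulate_distribution(10, 5, 5))  # → (5, 2)
```

With 10 batches, Maximum Instances of 5, and 5 executors, all 5 executors are engaged at once and the 10 batches complete in two waves, matching the scenario above.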