FAQ

File-based input:

If a file contains 1,000 records and the batch size is 100, does the Distributor node first create all batches and then start distributing them, or does it create each batch and immediately send it for processing?

The Distributor node creates one batch at a time. If the maximum number of instances has not been reached, it sends the batch payload to Redis immediately. If the maximum has been reached, it keeps creating batches but holds them back: no further payloads are sent to Redis until a running instance finishes processing and returns a response. When a response arrives, the node sends the next batch, provided that batch has already been created.

For example, if the maximum number of instances is set to 1, only one payload is sent to Redis at a time. The node continues splitting the file into batches, but no additional payloads are sent until the active distributed instance finishes. This means only one pod will be actively running a distributed instance at any given time.
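
The throttling described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the product's implementation: the names DistributorSketch, make_batches, on_batch_created, and on_instance_completed are hypothetical, and the "sent" list merely stands in for payloads pushed to Redis.

```python
from collections import deque
from itertools import islice


def make_batches(records, batch_size):
    """Yield batches one at a time instead of building them all up front."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


class DistributorSketch:
    """Hypothetical model of the throttling behavior (not product code)."""

    def __init__(self, max_instances):
        self.max_instances = max_instances
        self.running = 0          # instances currently processing a batch
        self.pending = deque()    # batches created but not yet sent
        self.sent = []            # stand-in for payloads pushed to Redis

    def _send(self, batch):
        self.sent.append(batch)
        self.running += 1

    def on_batch_created(self, batch):
        # Send immediately if capacity allows, otherwise hold the batch.
        if self.running < self.max_instances:
            self._send(batch)
        else:
            self.pending.append(batch)

    def on_instance_completed(self):
        # A response frees one slot; the next held batch (if any) goes out.
        self.running -= 1
        if self.pending:
            self._send(self.pending.popleft())


# 1,000 records, batch size 100, maximum of 1 instance.
dist = DistributorSketch(max_instances=1)
for batch in make_batches(range(1000), 100):
    dist.on_batch_created(batch)

print(len(dist.sent), len(dist.pending))   # 1 sent, 9 held back
dist.on_instance_completed()
print(len(dist.sent), len(dist.pending))   # 2 sent, 8 held back
```

For simplicity, this model finishes splitting the whole file before the first instance completes. In practice a response may arrive while splitting is still in progress, in which case the freed slot would be filled as soon as the next batch exists.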

Map case with adapter-side bursting:

Does the Distributor node wait until all bursts or batches are created before starting distribution, or does it begin distributing as soon as each burst becomes available?

The algorithm is the same as when splitting CSV data from a file. The Distributor node begins distributing each batch as soon as it becomes available, provided the maximum number of instances has not been reached.