Ali

Reputation: 97

How acknowledgement works with Google Cloud PubSubIO.Read

I went through the documentation but could not find how the PubSubIO.Read function handles acknowledgement. Specifically, I am interested in whether messages are acknowledged one by one or in a micro-batch fashion. If the latter is the case, can we set the batch size?

Any help would be appreciated.

Upvotes: 1

Views: 565

Answers (1)

Ben Chambers

Reputation: 6130

From "When does Dataflow acknowledge a message of batched items from PubSubIO?":

Dataflow executes your code in bundles. After successful execution, each bundle is committed to avoid re-execution of successfully processed elements. Bundles are not necessarily committed between every step in the pipeline. See the description of fusion optimization for details about when PCollections are materialized and committed.

For PubSub, messages that were read as part of a bundle will be acknowledged as part of committing the completion of that bundle. This means if you look at the PubSub read step, and any ParDos after it, these will be executed (and committed) together.

So, the messages are acknowledged neither one-by-one nor in controllable batches. It depends on how and when the processing of the messages is committed downstream.
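To make the bundle-commit idea concrete, here is a small toy model in plain Python. It is not Dataflow or Beam code and does not use any Pub/Sub API: `process_bundle`, the `(ack_id, payload)` message shape, and the `parse`/`double` steps are all hypothetical names chosen for illustration. It only sketches the behaviour described above, assuming the read step and the following steps are fused: either the whole bundle succeeds and all of its messages are acknowledged together, or nothing is acknowledged.

```python
from typing import Any, Callable, List, Set, Tuple

# Hypothetical message shape for this sketch: (ack_id, payload).
Message = Tuple[str, str]


def process_bundle(
    bundle: List[Message],
    steps: List[Callable[[Any], Any]],
    acked: Set[str],
) -> bool:
    """Run all fused steps over every message in the bundle.

    Ack ids are recorded only after the whole bundle has succeeded,
    which stands in for "committing" the bundle in this toy model.
    """
    try:
        for _, payload in bundle:
            value: Any = payload
            for step in steps:
                value = step(value)
    except Exception:
        # Bundle failed: nothing is acknowledged, so in a real system the
        # messages would become eligible for redelivery.
        return False
    acked.update(ack_id for ack_id, _ in bundle)
    return True


def parse(s: str) -> int:
    return int(s)


def double(n: int) -> int:
    return n * 2


acked: Set[str] = set()
ok_bundle = [("a1", "1"), ("a2", "2")]
bad_bundle = [("b1", "3"), ("b2", "oops")]  # "oops" makes parse() raise

first_ok = process_bundle(ok_bundle, [parse, double], acked)
second_ok = process_bundle(bad_bundle, [parse, double], acked)
print(first_ok, second_ok, sorted(acked))
```

Note that `b1` is not acknowledged even though it was processed without error, because its bundle as a whole did not commit. The bundle boundaries are passed in by hand here; in Dataflow they are chosen by the service, which is why the batch size is not something the pipeline author sets.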

Upvotes: 1
