cjt

Reputation: 383

Throttle down GCP DataFlow?

I'm using the standard GCP-provided Cloud Storage text file to Pub/Sub Dataflow template, but although I have set the number of worker nodes to 1, the throughput of processed messages is too high for the downstream components.

A Cloud Function that runs on message events from Pub/Sub hits GCP quotas, and with Cloud Run I get a bunch of 500, 429 and 503 errors at the start (due to the too-steep burst rate).

Is there any way to control the processing rate of Dataflow? I need a softer/slower start so the downstream components have time to scale up.

Anyone?

Upvotes: 0

Views: 932

Answers (1)

Jayadeep Jayaraman

Reputation: 2825

You can use a stateful ParDo to achieve this: buffer events in batches per key, then make a single API call for the whole batch at once. This is very nicely explained with code snippets here.
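For reference, a minimal sketch of that batching pattern using the Apache Beam Python SDK. The class name and the batch-size/flush-interval values are illustrative assumptions, not taken from the linked post:

    import apache_beam as beam
    from apache_beam.coders import StrUtf8Coder
    from apache_beam.transforms.timeutil import TimeDomain
    from apache_beam.transforms.userstate import BagStateSpec, TimerSpec, on_timer
    from apache_beam.utils.timestamp import Duration, Timestamp


    class BatchElementsFn(beam.DoFn):
        """Buffers elements per key and releases them in batches,
        smoothing the rate seen by downstream components."""

        BUFFER = BagStateSpec('buffer', StrUtf8Coder())
        FLUSH_TIMER = TimerSpec('flush', TimeDomain.REAL_TIME)

        MAX_BATCH_SIZE = 100       # illustrative; tune to downstream quotas
        FLUSH_INTERVAL_SECS = 10   # illustrative; tune to downstream quotas

        def process(self,
                    element,
                    buffer=beam.DoFn.StateParam(BUFFER),
                    flush_timer=beam.DoFn.TimerParam(FLUSH_TIMER)):
            _, value = element  # stateful DoFns require keyed input
            buffer.add(value)
            batch = list(buffer.read())
            if len(batch) >= self.MAX_BATCH_SIZE:
                # A full batch: clear the state and emit it as one element.
                buffer.clear()
                yield batch
            else:
                # Otherwise schedule a flush of whatever has accumulated.
                flush_timer.set(
                    Timestamp.now() + Duration(seconds=self.FLUSH_INTERVAL_SECS))

        @on_timer(FLUSH_TIMER)
        def flush(self, buffer=beam.DoFn.StateParam(BUFFER)):
            batch = list(buffer.read())
            buffer.clear()
            if batch:
                yield batch

Because state is partitioned per key, the input has to be keyed first; spreading elements over a small, fixed number of keys bounds the parallelism and therefore the downstream request rate:

    batched = (
        lines
        | 'KeyByShard' >> beam.Map(lambda line: (hash(line) % 10, line))
        | 'Batch' >> beam.ParDo(BatchElementsFn()))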

Upvotes: 1
