Graham Polley

Reputation: 14791

Hitting Cloud Dataflow quota limits

Suddenly, we started getting the following warnings for one of our pipelines:

698382 [main] WARN  com.google.cloud.dataflow.sdk.runners.DataflowPipelineJob  - There were problems getting current job status: 429 Too Many Requests
{
  "code" : 429,
  "errors" : [ {
    "domain" : "global",
    "message" : "Request throttled due to project QPS limit being reached.",
    "reason" : "rateLimitExceeded"
  } ],
  "message" : "Request throttled due to project QPS limit being reached.",
  "status" : "RESOURCE_EXHAUSTED"
}.

The pipeline is reading a lot of data from BigQuery.

Is this quota exhaustion related to the BigQuery API? It's not clear from the message.

Upvotes: 2

Views: 1029

Answers (2)

Ben Chambers

Reputation: 6130

Due to a mixup, the quota for the RPC used by the BlockingDataflowPipelineRunner to check on the job status was too strictly limited. This has been fixed and should not have affected the behavior of a running job. Please let us know if you continue to see problems.

You could also avoid making these RPCs by using the DataflowPipelineRunner, which won't poll for job status after the job is submitted.
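
A minimal sketch of selecting the non-blocking runner with the Dataflow SDK for Java 1.x (the project ID and staging location below are placeholders):

import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner;

public class NonBlockingSubmit {
  public static void main(String[] args) {
    DataflowPipelineOptions options = PipelineOptionsFactory.fromArgs(args)
        .withValidation()
        .as(DataflowPipelineOptions.class);

    // DataflowPipelineRunner submits the job and returns immediately,
    // so no status-polling RPCs are issued after submission.
    options.setRunner(DataflowPipelineRunner.class);
    options.setProject("my-gcp-project");                  // placeholder
    options.setStagingLocation("gs://my-bucket/staging");  // placeholder

    Pipeline p = Pipeline.create(options);
    // ... build the pipeline here ...
    p.run();  // returns without blocking on job completion
  }
}

You can still check on the job afterwards in the Cloud Console, which doesn't count against the same per-project RPC quota.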

Upvotes: 2

James

Reputation: 2331

Disclaimer: I am not a Dataflow expert. :)

The Dataflow quota is 1000 queries per second (QPS), while the BigQuery quota is 100 QPS, so I'd suspect you're hitting the BQ limit. The BQ docs also mention that "if you make more than 100 requests per second, throttling might occur", and your error seems to reflect that.

If you look at your BigQuery API usage, do you see 4xx errors?

Upvotes: -1
