Tyn

Reputation: 705

AWS Glue jobs running in parallel fail with "Rate exceeded" ThrottlingException (Status Code: 400)

I have a simple Glue 2.0 job (it just prints hello) that runs in parallel, triggered from a Step Functions Map state. The Glue job's Maximum concurrency is set to 40, and so is the Map state's MaxConcurrency.

[Screenshot: Step Functions state machine in Workflow Studio]

It runs fine if I kick off fewer than 20 parallel Glue jobs, but above that (the most I tried was 35) I get intermittent errors like this:

Rate exceeded (Service: AWSGlue; Status Code: 400; Error Code: ThrottlingException; Request ID: 0a350b23-2f75-4951-a643-20429799e8b5; Proxy: null)

I've checked the service quotas documentation https://docs.aws.amazon.com/general/latest/gr/glue.html and my account settings. A maximum of 200 should comfortably handle my 35 parallel jobs.

[Screenshot: AWS Service Quotas]

There are no other Glue jobs scheduled to run at the same time in my AWS account.

Should I just blindly request a quota increase and see if that fixes it, or is there anything I can do to work around this?

Upvotes: 2

Views: 9621

Answers (2)

Tyn

Reputation: 705

Thanks to luk2302 and Robert for the suggestions. Based on their advice, I arrived at a solution.

Add a retry to the Glue Task state. (I tried IntervalSeconds 1 and BackoffRate 1, but that was too low and didn't work.)

"Resource": "arn:aws:states:::glue:startJobRun",
"Type": "Task",
"Retry": [
  {
    "ErrorEquals": [
      "Glue.AWSGlueException"
    ],
    "BackoffRate": 2,
    "IntervalSeconds": 2,
    "MaxAttempts": 3
  }
]
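To see why the higher values help, here is a rough sketch of the wait schedule those Retry fields produce: the first retry waits IntervalSeconds, and each later retry multiplies the previous wait by BackoffRate. The helper name `retry_delays` is made up for illustration, and MaxAttempts 3 for the failed 1/1 attempt is an assumption, since that value isn't stated above.

```python
def retry_delays(interval_seconds, backoff_rate, max_attempts):
    """Return the wait (in seconds) before each retry attempt.

    Hypothetical helper: retry n (counting from 0) waits
    interval_seconds * backoff_rate ** n.
    """
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


# Settings that worked: IntervalSeconds 2, BackoffRate 2, MaxAttempts 3
working = retry_delays(2, 2, 3)
print(working)  # [2, 4, 8]

# Settings that were too low (MaxAttempts 3 is assumed here)
too_low = retry_delays(1, 1, 3)
print(too_low)  # [1, 1, 1]
```

With a BackoffRate of 1 the retries all fire one second apart, so they likely collide with the same burst of StartJobRun calls; with a rate of 2 the retries spread out over about 14 seconds.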

Hope this helps someone.

Upvotes: 3

Robert Kossendey

Reputation: 7028

The quota you are hitting is not Glue's concurrent job quota but the Start Job Run API rate quota: you requested too many job runs per second. If possible, just wait between Start Job Run calls.
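If the job runs were started from your own code rather than from a Map state, the "wait between calls" idea could look roughly like the sketch below. The function `start_paced`, the 0.5 second pause, and the stand-in callables are all assumptions for illustration; in real use each callable would wrap something like boto3's `glue.start_job_run(JobName=...)`.

```python
import time


def start_paced(start_calls, pause_seconds=0.5, sleep=time.sleep):
    """Invoke each callable in order, pausing between consecutive calls.

    Hypothetical helper: `start_calls` is a list of zero-argument
    callables, each of which would start one job run.
    """
    results = []
    for index, start_call in enumerate(start_calls):
        if index > 0:
            # Space out requests so they stay under the per-second API rate.
            sleep(pause_seconds)
        results.append(start_call())
    return results


# Demo with stand-in callables and a recording "sleep" so nothing actually waits.
waits = []
fake_starts = [lambda i=i: f"run-{i}" for i in range(3)]
run_ids = start_paced(fake_starts, pause_seconds=0.5, sleep=waits.append)
print(run_ids, waits)
```

The right pause depends on the account's actual Start Job Run rate limit, which this sketch does not know; treat 0.5 seconds as a placeholder.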

Upvotes: 1
