Re-run failed glue job using a step function

Question

I want to retry a Glue job up to two times when it ends with status FAILED or TIMEOUT, before moving on to the next stage. My state machine looks like this:

{
  "Comment": "A description of my state machine",
  "StartAt": "Glue StartJobRun",
  "States": {
    "Glue StartJobRun": {
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun",
      "Parameters": {
        "JobName": "test-to-fail"
      },
      "End": true,
      "Retry": [
        {
          "ErrorEquals": [
            "States.ALL"
          ],
          "MaxAttempts": 2,
          "IntervalSeconds": 300,
          "BackoffRate": 1
        }
      ]
    }
  }
}

However, the job runs only once when I execute the state machine. Did I misunderstand the logic behind Retry: does it apply to the Step Functions state rather than to the Glue job? I followed this tutorial, which seems to confirm that Retry reacts to the job status. What is wrong, then, and how can I achieve this?

Thank you!

Answers (1)
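
A likely cause, offered as a sketch rather than a confirmed diagnosis: the resource `arn:aws:states:::glue:startJobRun` uses the request-response pattern, so the state succeeds as soon as the StartJobRun API call returns, regardless of how the job run ends. Retry then only fires if the API call itself fails. Appending `.sync` makes Step Functions wait for the job run to finish, and a failed run should then fail the state and trigger Retry. The definition below is the one from the question with only the resource ARN changed; I am assuming the execution role is also allowed to poll the job run (for example `glue:GetJobRun`), which the `.sync` integration needs.

```json
{
  "Comment": "Sketch: retry a Glue job when the job run itself fails",
  "StartAt": "Glue StartJobRun",
  "States": {
    "Glue StartJobRun": {
      "Type": "Task",
      "Resource": "arn:aws:states:::glue:startJobRun.sync",
      "Parameters": {
        "JobName": "test-to-fail"
      },
      "Retry": [
        {
          "ErrorEquals": [
            "States.ALL"
          ],
          "MaxAttempts": 2,
          "IntervalSeconds": 300,
          "BackoffRate": 1
        }
      ],
      "End": true
    }
  }
}
```

Note that `MaxAttempts: 2` means two retries after the initial attempt, so the job can run up to three times in total. If the state machine should continue to the next stage even after all retries are exhausted, a `Catch` block with a `Next` target on this state is one way to do that.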