Julian

Reputation: 21

Failing AWS Step Functions after Catching

I have 3 stages in my AWS Step Function:

  1. Stage 1 - Lambda
  2. Stage 2 - AWS Batch
  3. Stage 3 - AWS Batch (Mandatory Cleanup)

Everything works in that, if Stage 1 fails, execution moves on to the Cleanup stage. However, because the Cleanup stage always succeeds, the Step Function's final result is always a pass. If Stage 1 or Stage 2 fails, I still need the cleanup to run, but the final result of the Step Function should be a failure.

Options investigated:

  1. One way to solve this is to keep a flag in a cache recording whether an error occurred, but I was wondering whether there is a built-in way to do this.
  2. Another option is to use the ResultPath to check for an error, but I am not sure how to access that result from an AWS Batch job.

Appreciate any advice on this, thanks.

I have added the following Catch block in Stage 1 and 2:

"Catch": [
  {
    "ErrorEquals": [
      "States.ALL"
    ],
    "Next": "Cleanup"
  }
]

The Cleanup stage is as follows:

"Cleanup": {
  "Type": "Task",
  "Resource": "arn:aws:states:::batch:submitJob.sync",
  "Parameters": {
    "JobDefinition": "arn:aws:batch:<region>:<account>:job-definition/MyCleanupJob",
    "JobName": "cleanup",
    "JobQueue": "arn:aws:batch:<region>:<account>:job-queue/MyCleanupQueue",
    "ContainerOverrides": {
      "Command": [
        "java",
        "-jar",
        "cleanup.jar" ############ need to specify whether an error occurred as a command line parameter ###########
      ]
    }
  },
  "End": true
}

Upvotes: 1

Views: 2170

Answers (1)

Julian

Reputation: 21

I used the mechanism below; credit to @LRutten for pointing me down this path.

  1. For every stage that succeeds, write the response to its own ResultPath; otherwise the previous results are overwritten.
  2. When an exception is caught, write the error to a ResultPath (here $.error).
  3. Use a Choice state to decide whether the Step Function should fail, based on whether the error element is present.
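
To make the three steps concrete, this is roughly what the state data might look like on reaching the Choice state after the Lambda stage has thrown and the cleanup has run. It is only an illustrative sketch: the runId input field, the error name, and the cause text are hypothetical, and the cleanup entry is trimmed to two fields of the Batch job details.

```json
{
  "runId": "example-123",
  "error": {
    "Error": "ExampleError",
    "Cause": "Hypothetical failure message from the Lambda stage"
  },
  "cleanup": {
    "JobName": "cleanup",
    "Status": "SUCCEEDED"
  }
}
```

On a run with no failures, the error key would be absent and the data would instead carry the per-stage result keys, which is the difference the Choice state tests for.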

Here is the resulting definition:

"MyLambda": {
  "Type": "Task",
  "Resource": "arn:aws:lambda:<region>:<account>:function:MyLambda",
  "ResultPath": "$.mylambda",   #### All results from the lambda are added to "mylambda" in the JSON
  "Catch": [
    {
      "ErrorEquals": [
        "States.ALL"
      ],
      "ResultPath": "$.error",  #### If an error occurs it is appended to the result path as an "error" element
      "Next": "Cleanup"
    }
  ],
  "Next": "MyBatch"
},

"MyBatch": {
  "Type": "Task",
  "Resource": "arn:aws:states:::batch:submitJob.sync",
  "Parameters": {
    "JobDefinition": "arn:aws:batch:<region>:<account>:job-definition/MyBatchJob",
    "JobName": "mybatch",
    "JobQueue": "arn:aws:batch:<region>:<account>:job-queue/MyBatchQueue",
    "ContainerOverrides": {
      "Command": [
        "java",
        "-jar",
        "mybatch.jar"
      ]
    }
  },
  "ResultPath": "$.mybatch",
  "Catch": [
    {
      "ErrorEquals": [
        "States.ALL"
      ],
      "ResultPath": "$.error",
      "Next": "Cleanup"
    }
  ],
  "Next": "Cleanup"
},
"Cleanup": {
  "Type": "Task",
  "ResultPath": "$.cleanup",
  "Resource": "arn:aws:states:::batch:submitJob.sync",
  "Parameters": {
    "JobDefinition": "arn:aws:batch:<region>:<account>:job-definition/MyCleanupJob",
    "JobName": "cleanup",
    "JobQueue": "arn:aws:batch:<region>:<account>:job-queue/MyCleanupQueue",
    "ContainerOverrides": {
      "Command": [
        "java",
        "-jar",
        "cleanup.jar"
      ]
    }
  },
  "Next": "Should Fail"
},
"Should Fail": {
  "Type": "Choice",
  "Choices": [
    {
      "Variable": "$.error",   #### If an error element is present it means it is a Failure
      "IsPresent": true,
      "Next": "Fail"
    }
  ],
  "Default": "Pass"
},
"Fail": {
  "Type": "Fail",
  "Cause": "Step function failed"
},
"Pass": {
  "Type": "Pass",
  "Result": "Step function passed",
  "End": true
}
 
}
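
If the original error should show up as the execution's failure reason instead of a fixed message, one option might be to have the Fail state read it from the state data. This is a sketch, not something I have run against this exact workflow: it assumes the ErrorPath and CausePath fields of the Fail state are available in your Step Functions environment, and that $.error holds the Error and Cause strings written by the Catch blocks above. The outer braces are only there to make the fragment valid JSON on its own.

```json
{
  "Fail": {
    "Type": "Fail",
    "ErrorPath": "$.error.Error",
    "CausePath": "$.error.Cause"
  }
}
```

This would take the place of the Fail state with the hard-coded Cause; the Choice state guarantees $.error is present before this state is reached.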

Upvotes: 1
