Sam Fen

Reputation: 5264

How to see why a long-running AWS Step Function failed

I have an AWS Step Function with many state transitions that can run for a half hour or more.

There are only a few states, and the application loops through them until it runs out of items to process.

I have a run that failed after about half an hour. I can look at the log under "Execution event history", but because it records every state and transition, there are thousands of events. Clicking the "Load More" button enough times to reach the relevant events hangs my browser window.

There is no way to sort or filter this list that I can see.

How can I find the cause of the failure? Is there a way to export the Execution event history somewhere? Or send it to CloudWatch?

Upvotes: 1

Views: 1502

Answers (2)

Icehorn

Reputation: 1357

You can use the AWS CLI command aws stepfunctions get-execution-history with the --reverse-order flag to list the events newest first, so the failure events (which come at the end of the run) appear at the top of the output.
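If you would rather script this than read raw CLI output, the same API call is available through boto3. The following is only a rough sketch: it assumes boto3 is installed and credentials are configured, the function names are my own, and the execution ARN passed on the command line is whatever your failed run reports. It fetches the newest events and keeps the ones whose type ends in Failed, TimedOut, or Aborted.

```python
import sys


def failure_events(events):
    """Return (id, type, error, cause) for each failure-like event.

    `events` is a list of event dicts shaped like the "events" list
    returned by get_execution_history.
    """
    found = []
    for event in events:
        if event["type"].endswith(("Failed", "TimedOut", "Aborted")):
            # Details live under a key such as "executionFailedEventDetails";
            # not every event type carries one, hence the empty default.
            details = next(
                (v for k, v in event.items() if k.endswith("EventDetails")), {}
            )
            found.append(
                (event["id"], event["type"], details.get("error"), details.get("cause"))
            )
    return found


def fetch_recent_events(execution_arn, max_results=100):
    """Fetch the newest events of one execution (newest first)."""
    import boto3  # imported here so failure_events works without boto3

    client = boto3.client("stepfunctions")
    response = client.get_execution_history(
        executionArn=execution_arn,
        maxResults=max_results,
        reverseOrder=True,
    )
    return response["events"]


if __name__ == "__main__":
    # Usage (hypothetical script name): python find_failure.py <execution-arn>
    for row in failure_events(fetch_recent_events(sys.argv[1])):
        print(row)
```

Only the newest events are requested, so this avoids loading the thousands of earlier loop iterations. If the failure details are not within the first batch, raise max_results or follow the nextToken in the response.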

Upvotes: 5

koxon

Reputation: 898

How do you process your steps? Docker containers on ECS or Fargate? Give us some details on that.

Your tasks should be sending logs to CloudWatch as they execute. If you run Docker on a machine you can SSH into, you can also look at the Docker logs directly on that host.

Upvotes: 0
