TestingInProd
TestingInProd

Reputation: 379

SWF workflow causing START_TO_CLOSE for selected workflow

I am using SWF for one of my application where I used ExponentialRetry to poll continuously for an activity result. However sometime the activity caused START_TO_CLOSE timeout. This does not happen for all my workflows and happens 1 in 30 there by making it difficult to debug/reproduce. From the deciders logs i can see below. Could someone explain what the issue might be?

com.amazon.metrics.swf.AwsSwfMetricsRequestHandler: An error was thrown in previous decision in the thread <SWF Decider AWSFuncTaskList_110.0 7>, task-token <AAAAKgAAAAIAAAAAAAAAAgwtVps42342343434343A3hvItwKY0Vav/Kpexk2cat5fsWkiN1SxhfeoRwVgl+F2/EZQrhBP4RoA41LmLLC77WLU26uSMXaVnl+Cz64x+RZP0sBzofJWdAOdiwHAzsePFNQETXfyl+HibRiYxxO4Xyxn8ndVQ50f97W3IKkwrO7mySJSXbpe6Yaw/AiPmi4f6VoqQo/+nhRSEbzQpKNQeZAaCcAB/6oxEKOgYbW75AF9JsPbZEOdYE7Kq2JVjyghP2id9xAGKgj3ww3d1UBoRFxlulSUsNJmlpgR2+HPyWDHZKF7ECw==>, workflow-execution <RiskAnalysis-313434142331@22NB47i321UtA7w9dPnUTmmtKMeIP1DWrepdAJb0WdGqc=>, domain <Prod>, workflow-type <[email protected]>
        at com.amazon.metrics.swf.DecisionsMetricsExtractor.internalHandlePollForDecisionTask(DecisionsMetricsExtractor.java:183)
        at com.amazon.metrics.swf.DecisionsMetricsExtractor.handlePollForDecisionTask(DecisionsMetricsExtractor.java:168)
        at com.amazon.metrics.swf.AwsSwfMetricsRequestHandler.handlePollForDecisionTask(AwsSwfMetricsRequestHandler.java:508)
        at com.amazon.metrics.swf.AwsSwfMetricsRequestHandler.extractMetrics(AwsSwfMetricsRequestHandler.java:362)
        at com.amazon.metrics.sdk.AwsSdkMetricsRequestHandler.handleCall(AwsSdkMetricsRequestHandler.java:218)
        at com.amazon.metrics.sdk.AwsSdkMetricsRequestHandler.afterResponse(AwsSdkMetricsRequestHandler.java:196)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.afterResponse(AmazonHttpClient.java:975)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:746)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
        at com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient.doInvoke(AmazonSimpleWorkflowClient.java:3390)
        at com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient.invoke(AmazonSimpleWorkflowClient.java:3366)
        at com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient.executePollForDecisionTask(AmazonSimpleWorkflowClient.java:2112)
        at com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient.pollForDecisionTask(AmazonSimpleWorkflowClient.java:2088)
        at com.amazonaws.services.simpleworkflow.flow.worker.DecisionTaskPoller.poll(DecisionTaskPoller.java:191)
        at com.amazonaws.services.simpleworkflow.flow.worker.DecisionTaskPoller.access$000(DecisionTaskPoller.java:39)
        at com.amazonaws.services.simpleworkflow.flow.worker.DecisionTaskPoller$DecisionTaskIterator.next(DecisionTaskPoller.java:71)
        at com.amazonaws.services.simpleworkflow.flow.worker.DecisionTaskPoller$DecisionTaskIterator.next(DecisionTaskPoller.java:45)
        at com.amazonaws.services.simpleworkflow.flow.worker.HistoryHelper$EventsIterator.<init>(HistoryHelper.java:269)
        at com.amazonaws.services.simpleworkflow.flow.worker.HistoryHelper$SingleDecisionEventsIterator.<init>(HistoryHelper.java:74)
        at com.amazonaws.services.simpleworkflow.flow.worker.HistoryHelper.<init>(HistoryHelper.java:318)
        at com.amazonaws.services.simpleworkflow.flow.worker.AsyncDecisionTaskHandler.handleDecisionTask(AsyncDecisionTaskHandler.java:73)
        at com.amazonaws.services.simpleworkflow.flow.worker.DecisionTaskPoller.pollAndProcessSingleTask(DecisionTaskPoller.java:223)
        at com.amazonaws.services.simpleworkflow.flow.worker.GenericWorker$PollServiceTask.run(GenericWorker.java:85)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Upvotes: 0

Views: 904

Answers (1)

Joshua DeWald
Joshua DeWald

Reputation: 3219

All activities have a startToCloseTimeout when they are created (by the Decider). If not explicitly specified, they use the default from the versioned ActivityType that is defined in SWF for that activity. You are hitting the timeout because your retries are allowing you to go past that configured timeout. If you think you need more time, then when the activity is created, you will need to specify the startToCloseTimeout as a larger value in the scheduleActivityTaskDecision.

Upvotes: 0

Related Questions