Reputation: 379
I am using SWF for one of my applications, where I use ExponentialRetry to poll continuously for an activity result. However, the activity sometimes fails with a START_TO_CLOSE timeout. This does not happen for all my workflows; it happens in roughly 1 in 30 executions, which makes it difficult to debug or reproduce. The decider's logs show the error below. Could someone explain what the issue might be?
com.amazon.metrics.swf.AwsSwfMetricsRequestHandler: An error was thrown in previous decision in the thread <SWF Decider AWSFuncTaskList_110.0 7>, task-token <AAAAKgAAAAIAAAAAAAAAAgwtVps42342343434343A3hvItwKY0Vav/Kpexk2cat5fsWkiN1SxhfeoRwVgl+F2/EZQrhBP4RoA41LmLLC77WLU26uSMXaVnl+Cz64x+RZP0sBzofJWdAOdiwHAzsePFNQETXfyl+HibRiYxxO4Xyxn8ndVQ50f97W3IKkwrO7mySJSXbpe6Yaw/AiPmi4f6VoqQo/+nhRSEbzQpKNQeZAaCcAB/6oxEKOgYbW75AF9JsPbZEOdYE7Kq2JVjyghP2id9xAGKgj3ww3d1UBoRFxlulSUsNJmlpgR2+HPyWDHZKF7ECw==>, workflow-execution <RiskAnalysis-313434142331@22NB47i321UtA7w9dPnUTmmtKMeIP1DWrepdAJb0WdGqc=>, domain <Prod>, workflow-type <[email protected]>
at com.amazon.metrics.swf.DecisionsMetricsExtractor.internalHandlePollForDecisionTask(DecisionsMetricsExtractor.java:183)
at com.amazon.metrics.swf.DecisionsMetricsExtractor.handlePollForDecisionTask(DecisionsMetricsExtractor.java:168)
at com.amazon.metrics.swf.AwsSwfMetricsRequestHandler.handlePollForDecisionTask(AwsSwfMetricsRequestHandler.java:508)
at com.amazon.metrics.swf.AwsSwfMetricsRequestHandler.extractMetrics(AwsSwfMetricsRequestHandler.java:362)
at com.amazon.metrics.sdk.AwsSdkMetricsRequestHandler.handleCall(AwsSdkMetricsRequestHandler.java:218)
at com.amazon.metrics.sdk.AwsSdkMetricsRequestHandler.afterResponse(AwsSdkMetricsRequestHandler.java:196)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.afterResponse(AmazonHttpClient.java:975)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:746)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
at com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient.doInvoke(AmazonSimpleWorkflowClient.java:3390)
at com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient.invoke(AmazonSimpleWorkflowClient.java:3366)
at com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient.executePollForDecisionTask(AmazonSimpleWorkflowClient.java:2112)
at com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient.pollForDecisionTask(AmazonSimpleWorkflowClient.java:2088)
at com.amazonaws.services.simpleworkflow.flow.worker.DecisionTaskPoller.poll(DecisionTaskPoller.java:191)
at com.amazonaws.services.simpleworkflow.flow.worker.DecisionTaskPoller.access$000(DecisionTaskPoller.java:39)
at com.amazonaws.services.simpleworkflow.flow.worker.DecisionTaskPoller$DecisionTaskIterator.next(DecisionTaskPoller.java:71)
at com.amazonaws.services.simpleworkflow.flow.worker.DecisionTaskPoller$DecisionTaskIterator.next(DecisionTaskPoller.java:45)
at com.amazonaws.services.simpleworkflow.flow.worker.HistoryHelper$EventsIterator.<init>(HistoryHelper.java:269)
at com.amazonaws.services.simpleworkflow.flow.worker.HistoryHelper$SingleDecisionEventsIterator.<init>(HistoryHelper.java:74)
at com.amazonaws.services.simpleworkflow.flow.worker.HistoryHelper.<init>(HistoryHelper.java:318)
at com.amazonaws.services.simpleworkflow.flow.worker.AsyncDecisionTaskHandler.handleDecisionTask(AsyncDecisionTaskHandler.java:73)
at com.amazonaws.services.simpleworkflow.flow.worker.DecisionTaskPoller.pollAndProcessSingleTask(DecisionTaskPoller.java:223)
at com.amazonaws.services.simpleworkflow.flow.worker.GenericWorker$PollServiceTask.run(GenericWorker.java:85)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Upvotes: 0
Views: 904
Reputation: 3219
All activities have a startToCloseTimeout when they are created (by the Decider). If it is not explicitly specified, they use the default from the versioned ActivityType that is defined in SWF for that activity. You are hitting the timeout because your retries are allowing you to go past that configured timeout. If you think you need more time, then when the activity is created, you will need to specify the startToCloseTimeout as a larger value in the scheduleActivityTaskDecision.
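To see how a retry schedule can outgrow a fixed timeout, here is a minimal sketch in plain Java. It does not call SWF or the Flow Framework; the class name, method names, and all numbers (5 s initial interval, backoff factor 2, 60 s cap, 120 s timeout) are hypothetical and chosen only for illustration. It also ignores the time each attempt itself takes, so real executions would hit the limit sooner.

```java
public class RetryBudget {

    // Sum of the waits before each retry: initial, initial*backoff, ...,
    // with each wait capped at maxIntervalSeconds.
    static long totalDelaySeconds(long initialSeconds, double backoff,
                                  long maxIntervalSeconds, int retries) {
        long total = 0;
        double interval = initialSeconds;
        for (int i = 0; i < retries; i++) {
            total += Math.min((long) interval, maxIntervalSeconds);
            interval *= backoff;
        }
        return total;
    }

    // Number of retries whose cumulative wait still fits inside the timeout.
    static int retriesWithinTimeout(long timeoutSeconds, long initialSeconds,
                                    double backoff, long maxIntervalSeconds) {
        if (initialSeconds <= 0 || maxIntervalSeconds <= 0 || backoff < 1.0) {
            throw new IllegalArgumentException(
                "intervals must be positive and backoff must be >= 1");
        }
        long elapsed = 0;
        double interval = initialSeconds;
        int retries = 0;
        while (true) {
            long next = Math.min((long) interval, maxIntervalSeconds);
            if (elapsed + next > timeoutSeconds) {
                return retries; // the next wait would cross the timeout
            }
            elapsed += next;
            retries++;
            interval *= backoff;
        }
    }

    public static void main(String[] args) {
        // Hypothetical values, not taken from the question.
        long timeout = 120;
        System.out.println("Wait after 5 retries (s): "
            + totalDelaySeconds(5, 2.0, 60, 5));
        System.out.println("Retries that fit in " + timeout + " s: "
            + retriesWithinTimeout(timeout, 5, 2.0, 60));
    }
}
```

If the retry count the workflow needs is higher than what fits, either the retry settings or the startToCloseTimeout has to change. An intermittent failure rate like the one described would be consistent with only the slowest executions needing that extra retry.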
Upvotes: 0