Reputation: 195
We have a use case where we need to run some steps sequentially at the beginning of a workflow, followed by a set of tasks that need to run in parallel. The catch is that the number of parallel tasks can vary from request to request.
For example:
Request1: Start -> A -> B -> B1, B2 -> C -> C1, C2 -> D -> End
Request2: Start -> A -> B -> B1, B2, B3, B4, B5 -> C -> C1 -> D -> End
Note: Tasks separated with -> are sequential, whereas those separated with , are to be run in parallel.
Is there a way to model this in Step Functions? One option we were considering is creating a state function for every request. Is this recommended? Or should I consider using SWF and maintaining the decider logic on my own?
Upvotes: 2
Views: 3379
Reputation: 21
AWS recently rolled out support for dynamic parallelism. So now, using the Map state, it's possible to pass in an array from an upstream state and iterate over each element in the array, using each item as the input for a subworkflow executed in parallel.
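As a rough illustration of the Map state approach, the sketch below builds a state machine definition as a plain JavaScript object and prints it as JSON. The state names, the `$.bTasks` input path, the `lambdaArn` helper, and the Lambda ARNs are all made up for this example. `Iterator` is the field name from the original Map state release; newer definitions may use `ItemProcessor` instead.

```javascript
// Sketch of an Amazon States Language definition built as a plain object.
// State names, the $.bTasks path and the ARNs are hypothetical placeholders.
const lambdaArn = (name) =>
  `arn:aws:lambda:us-east-1:123456789012:function:${name}`;

const definition = {
  StartAt: 'A',
  States: {
    // ResultPath keeps the original input (including $.bTasks) alongside
    // each task's result instead of replacing it.
    A: { Type: 'Task', Resource: lambdaArn('taskA'), ResultPath: '$.a', Next: 'B' },
    B: { Type: 'Task', Resource: lambdaArn('taskB'), ResultPath: '$.b', Next: 'RunBTasks' },
    RunBTasks: {
      Type: 'Map',
      ItemsPath: '$.bTasks', // array whose length varies per request
      MaxConcurrency: 0,     // 0 places no limit on parallel iterations
      Iterator: {
        StartAt: 'RunOneBTask',
        States: {
          RunOneBTask: { Type: 'Task', Resource: lambdaArn('taskBn'), End: true },
        },
      },
      End: true,
    },
  },
};

console.log(JSON.stringify(definition, null, 2));
```

With a definition along these lines, an execution whose input contains a two-element `bTasks` array would run two iterations, and one with five elements would run five, without changing the state machine.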
Upvotes: 2
Reputation: 91
If there are not too many kinds of requests, separate step functions can be considered. Since tasks A1, B1, etc. only have to be implemented once, each of the step functions can invoke the common tasks.
Otherwise, you could look at Choice states and branch to different states depending on the request.
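A minimal sketch of the branching idea, assuming the request carries a `requestType` field and that each type has a fixed set of parallel tasks. The `parallelOf` helper, the state names, and the ARNs are hypothetical; only the `Choice`, `Parallel`, `Fail`, and `Succeed` state types come from Amazon States Language.

```javascript
// Hypothetical helper: build a Parallel state with one branch per task name.
function parallelOf(taskNames, next) {
  return {
    Type: 'Parallel',
    Branches: taskNames.map((name) => ({
      StartAt: name,
      States: {
        [name]: {
          Type: 'Task',
          Resource: `arn:aws:lambda:us-east-1:123456789012:function:${name}`, // placeholder
          End: true,
        },
      },
    })),
    Next: next,
  };
}

const states = {
  // Route on a field of the input; $.requestType is an assumed field name.
  RouteByType: {
    Type: 'Choice',
    Choices: [
      { Variable: '$.requestType', StringEquals: 'type1', Next: 'BForType1' },
      { Variable: '$.requestType', StringEquals: 'type2', Next: 'BForType2' },
    ],
    Default: 'UnknownType',
  },
  BForType1: parallelOf(['B1', 'B2'], 'C'),
  BForType2: parallelOf(['B1', 'B2', 'B3', 'B4', 'B5'], 'C'),
  UnknownType: { Type: 'Fail', Error: 'UnknownRequestType' },
  C: { Type: 'Succeed' }, // stand-in for the rest of the workflow
};

console.log(JSON.stringify(states, null, 2));
```

This only works when the set of request types is known up front, since every fan-out has to be written into the definition.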
Upvotes: -1
Reputation: 3209
I think this is a pretty straightforward use of SWF: your decider determines which of those steps need to run and schedules the activities as appropriate.
Something along the lines of:
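// Decider sketch. scheduleActivity, scheduleActivities, respondEmpty and
// numberOfRunningActivities are assumed helpers, not SWF SDK calls, and the
// event type strings are illustrative.
const handlers = {};

function startDecision(fooRequest) {
  switch (fooRequest.type) {
    case 'workflowExecutionStarted':
      scheduleActivity({ type: 'A' });
      fooRequest.context.currentState = 'doingA';
      break;
    case 'activityTaskCompleted':
      // Dispatch to the handler for the state the workflow is currently in.
      handlers[fooRequest.context.currentState](fooRequest);
      break;
  }
}

handlers.doingA = function (fooRequest) {
  // The request payload decides how many parallel activities to fan out.
  switch (fooRequest.payloadData.foo) {
    case 'type1':
      fooRequest.context.currentState = 'doingB';
      scheduleActivities([{ type: 'B1' }, { type: 'B2' }]);
      break;
    case 'type2':
      fooRequest.context.currentState = 'doingB';
      scheduleActivities([{ type: 'B1' }, { type: 'B2' }, { type: 'B3' } /* ... */]);
      break;
  }
};

handlers.doingB = function (fooRequest) {
  if (numberOfRunningActivities === 0) { // all of them have finished
    scheduleActivity({ type: 'C' });
    fooRequest.context.currentState = 'doingC';
  } else {
    respondEmpty(); // still waiting
  }
};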
And so forth. The role of the decider is essentially to maintain the state machine that tracks which set of activities to schedule next. So I don't think it's necessary to have a state function (Activity) for each type of request; rather, you put different logic in the decider for each type and its current state.
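To make the fan-out/fan-in bookkeeping concrete, here is a small in-memory sketch that runs the two example requests from the question. Nothing in it calls SWF: `createDecider`, `simulate`, and the plan format (a list of stages, each a list of activities to run in parallel) are assumptions for illustration, and a real decider would typically derive the running count from the workflow history rather than keep it in memory.

```javascript
// Each request carries its own plan: an ordered list of stages, where every
// activity inside a stage runs in parallel. (Hypothetical data shape.)
const request1 = [['A'], ['B'], ['B1', 'B2'], ['C'], ['C1', 'C2'], ['D']];
const request2 = [['A'], ['B'], ['B1', 'B2', 'B3', 'B4', 'B5'], ['C'], ['C1'], ['D']];

function createDecider(plan) {
  let stageIndex = -1; // stage currently running
  let running = 0;     // activities of that stage not yet completed

  function scheduleNextStage() {
    stageIndex += 1;
    if (stageIndex >= plan.length) return { complete: true, schedule: [] };
    running = plan[stageIndex].length;
    return { complete: false, schedule: plan[stageIndex] };
  }

  return {
    decide(eventType) {
      if (eventType === 'workflowExecutionStarted') return scheduleNextStage();
      if (eventType === 'activityTaskCompleted') {
        running -= 1;
        // Fan in: only move on once the whole stage has finished.
        if (running === 0) return scheduleNextStage();
      }
      return { complete: false, schedule: [] }; // still waiting
    },
  };
}

// Drive the decider by completing scheduled activities one at a time.
function simulate(plan) {
  const decider = createDecider(plan);
  const batches = [];
  let decision = decider.decide('workflowExecutionStarted');
  const pending = [...decision.schedule];
  if (decision.schedule.length > 0) batches.push(decision.schedule);
  while (pending.length > 0) {
    pending.shift();
    decision = decider.decide('activityTaskCompleted');
    if (decision.schedule.length > 0) {
      batches.push(decision.schedule);
      pending.push(...decision.schedule);
    }
  }
  return { batches, complete: decision.complete };
}

console.log(simulate(request1).batches.map((b) => b.join(',')).join(' -> '));
console.log(simulate(request2).batches.map((b) => b.join(',')).join(' -> '));
```

The same decider code handles both requests; only the plan differs, which is the point of keeping the per-request logic in the decider rather than in separate workflow definitions.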
Upvotes: 0