Vipul

Reputation: 195

Dynamic states to be run in parallel using Step Functions

We have a use case where we need to run some steps sequentially at the beginning of a workflow, followed by a set of tasks that need to run in parallel. The catch is that the number of parallel tasks can vary from request to request.

For example:

Request1: Start -> A -> B -> B1, B2 -> C -> C1, C2 -> D -> End  
Request2: Start -> A -> B -> B1, B2, B3, B4, B5 -> C -> C1 -> D -> End

Note: tasks separated by -> run sequentially, whereas tasks separated by commas run in parallel.

Is there a way to model this in Step Functions? One option we were considering is creating a state machine for every request. Is this recommended? Or should I consider using SWF and maintaining the decider logic myself?

Upvotes: 2

Views: 3379

Answers (3)

joyrexus

Reputation: 21

AWS recently rolled out support for dynamic parallelism. So now, using the Map state, it's possible to pass in an array from an upstream state and iterate over each element in the array, using each item as the input for a subworkflow executed in parallel.
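As a rough sketch of what that could look like for the workflow in the question, the snippet below builds an Amazon States Language definition with a Map state between B and C. The ARNs, the state names FanOutB and RunBTask, and the `$.bTasks` input path are placeholders I made up for illustration, and it assumes the output of state B contains a `bTasks` array listing the parallel tasks for that request. Older definitions use `Iterator` where this one uses `ItemProcessor`.

```javascript
// Builds an Amazon States Language definition: A -> B -> [B tasks in parallel] -> C.
// All ARNs, state names, and the "$.bTasks" path are placeholders, not values from the question.
function buildDefinition() {
  // Helper: a Task state that either points at the next state or ends the (sub)workflow.
  const task = (name, next) => ({
    Type: "Task",
    Resource: `arn:aws:lambda:REGION:ACCOUNT_ID:function:${name}`, // placeholder ARN
    ...(next ? { Next: next } : { End: true }),
  });

  return {
    StartAt: "A",
    States: {
      A: task("A", "B"),
      B: task("B", "FanOutB"),
      FanOutB: {
        Type: "Map",
        ItemsPath: "$.bTasks", // assumed array in the state input, one element per parallel task
        MaxConcurrency: 0,     // 0 places no explicit limit on concurrent iterations
        ItemProcessor: {
          StartAt: "RunBTask",
          States: { RunBTask: task("RunBTask") },
        },
        Next: "C",
      },
      C: task("C"),
    },
  };
}

console.log(JSON.stringify(buildDefinition(), null, 2));
```

With this shape, the number of parallel iterations is decided by the length of the array at run time, so a single state machine can serve both Request1 and Request2.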

Upvotes: 2

mathpal

Reputation: 91

If there are not too many kinds of request, you could consider a separate state machine for each. Since tasks such as A1, B1, etc. only have to be implemented once, every state machine can invoke the same common tasks.

Otherwise, you could look at Choice states and branch to different states depending on the request.
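A minimal sketch of the Choice approach, assuming the request carries a `requestType` field: the Choice state routes to one fixed Parallel state per request type. The state names, the `$.requestType` path, and the ARNs are all made up for illustration, and this is only a fragment of a `States` map (state C is referenced but not defined here).

```javascript
// Hypothetical names throughout; "$.requestType" is an assumed field in the state input.
// Helper: a single-task branch for use inside a Parallel state.
const branch = (name) => ({
  StartAt: name,
  States: {
    [name]: {
      Type: "Task",
      Resource: `arn:aws:lambda:REGION:ACCOUNT_ID:function:${name}`, // placeholder ARN
      End: true,
    },
  },
});

// Helper: a Parallel state with one branch per task name.
const parallel = (names, next) => ({
  Type: "Parallel",
  Branches: names.map((name) => branch(name)),
  Next: next,
});

const states = {
  PickBTasks: {
    Type: "Choice",
    Choices: [
      { Variable: "$.requestType", StringEquals: "type1", Next: "BTasksType1" },
      { Variable: "$.requestType", StringEquals: "type2", Next: "BTasksType2" },
    ],
    Default: "UnknownRequestType",
  },
  BTasksType1: parallel(["B1", "B2"], "C"),
  BTasksType2: parallel(["B1", "B2", "B3", "B4", "B5"], "C"),
  UnknownRequestType: { Type: "Fail", Error: "UnknownRequestType" },
};

console.log(JSON.stringify(states, null, 2));
```

The trade-off is that every combination of parallel tasks has to be written out in the definition ahead of time.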

Upvotes: -1

Joshua DeWald

Reputation: 3209

I think this is a pretty straightforward use of SWF: your decider determines which of those steps need to run and schedules the activities as appropriate.

Something along the lines of:

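// Sketch only: scheduleActivity, scheduleActivities, respondEmpty and
// numberOfRunningActivities stand in for your own SWF client code.
const handlers = {};

function startDecision(fooRequest) {
   switch (fooRequest.type) {
      case "workflowExecutionStarted":
         scheduleActivity({ type: "A" });
         fooRequest.context.currentState = "doingA";
         break;
      case "activityTaskCompleted":
         // Dispatch to the handler for the state the workflow is currently in
         handlers[fooRequest.context.currentState](fooRequest);
         break;
   }
}

handlers.doingA = function (fooRequest) {
   // A has finished: fan out to however many B tasks this request needs
   switch (fooRequest.payloadData.foo) {
      case "type1":
         fooRequest.context.currentState = "doingB";
         scheduleActivities([{ type: "B1" }, { type: "B2" }]);
         break;
      case "type2":
         fooRequest.context.currentState = "doingB";
         scheduleActivities([{ type: "B1" }, { type: "B2" }, { type: "B3" } /* ... */]);
         break;
   }
};

handlers.doingB = function (fooRequest) {
   if (numberOfRunningActivities === 0) { // all of them have finished
      scheduleActivity({ type: "C" });
      fooRequest.context.currentState = "doingC";
   } else {
      respondEmpty(); // still waiting
   }
};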

And so forth. The decider's role is to maintain the state machine that tracks which set of activities to schedule next. So I don't think it's necessary to have a state function (Activity) for each type of request; instead, the decider holds different logic for each request type and its current state.
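To make the fan-out/fan-in bookkeeping concrete, here is a self-contained simulation that makes no AWS calls. `PLAN`, `decide`, `run`, and the event shapes are hypothetical names for this sketch, not SWF APIs; the plans are simply the two example requests from the question written as data.

```javascript
// In-memory stand-in for the decider loop: no SWF client, just the state tracking.
// Each inner array is one stage; activities within a stage run in parallel.
const PLAN = {
  type1: [["A"], ["B"], ["B1", "B2"], ["C"], ["C1", "C2"], ["D"]],
  type2: [["A"], ["B"], ["B1", "B2", "B3", "B4", "B5"], ["C"], ["C1"], ["D"]],
};

// Called on workflow start and on every activity completion.
// Returns the activities to schedule next (empty array: still waiting, or finished).
function decide(context, event) {
  if (event.type === "activityTaskCompleted") {
    context.running.delete(event.activity);
    if (context.running.size > 0) return []; // fan-in: siblings are still running
    context.stage += 1;
  }
  const next = PLAN[context.requestType][context.stage] || [];
  next.forEach((activity) => context.running.add(activity));
  return next;
}

// Drives one request to completion, completing activities in scheduling order.
function run(requestType) {
  const context = { requestType, stage: 0, running: new Set() };
  const trace = [];
  let queue = decide(context, { type: "workflowExecutionStarted" });
  while (queue.length > 0) {
    trace.push(queue.join(","));
    let scheduled = [];
    for (const activity of queue) {
      const event = { type: "activityTaskCompleted", activity };
      scheduled = scheduled.concat(decide(context, event));
    }
    queue = scheduled;
  }
  return trace.join(" -> ");
}

console.log(run("type1")); // A -> B -> B1,B2 -> C -> C1,C2 -> D
console.log(run("type2")); // A -> B -> B1,B2,B3,B4,B5 -> C -> C1 -> D
```

Driving the stages from data rather than hard-coded handlers is a design choice of this sketch; in a real decider the same bookkeeping would be derived from the workflow history on each decision task.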

Upvotes: 0
