Reputation: 506
I am creating a Step Functions state machine whose input contains an array, for example: {"ids": [1, 2, 3]}. I have 2 Glue jobs that I want to run for these ids. E.g. Glue job 1 runs with id 1 and Glue job 2 runs with id 2; then, once Glue job 1 has finished processing id 1, it runs with id 3. I tried the Parallel state, but it does not work on chunks of the input: each branch receives the complete list of ids. I also considered the Map state, but a Map state runs a single task definition in parallel over the items, whereas I have 2 different Glue jobs.
What could be the resolution for this? Please suggest a solution using Step Functions.
Upvotes: 0
Views: 841
Reputation: 1391
What if you split your ids into two arrays first (as your first step)? So convert
{"ids": [1, 2, 3, 4, 5]}
To
{
"ids1": [1, 3, 5],
"ids2": [2, 4]
}
Then add a Parallel state with two branches, each containing a Map state: one iterates over ids1 and sends each id to Glue job 1, and the other iterates over ids2 and sends each id to Glue job 2.
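A minimal sketch of that first splitting step, assuming it runs as a Lambda task written in Python. The handler name split_ids_handler is hypothetical; the output keys ids1 and ids2 follow the example above, and ids are assigned to the two jobs alternately.

```python
def split_ids_handler(event, context=None):
    """Hypothetical Lambda handler: split event["ids"] into two alternating lists."""
    ids = event["ids"]
    return {
        "ids1": ids[0::2],  # 1st, 3rd, 5th, ... item: intended for Glue job 1
        "ids2": ids[1::2],  # 2nd, 4th, ... item: intended for Glue job 2
    }
```

Each branch of the Parallel state could then point its Map state at one of the two lists.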
Update 1: If you don't want one Glue job to finish sooner and sit idle, then instead of two arrays you can keep a single list and add a status to each row:
{
id:1,
Status: null | job1 | job2
}
And instead of a Map state for each job, create a loop (like a while loop) that first picks an item from the list and then calls the Glue job.
So your Select_an_id state will choose one id from that list and change the status of that record. You need to create a Lambda Task state to do this.
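A rough sketch of what the Lambda behind Select_an_id might look like, assuming Python. The handler name and the event keys "items" and "job", as well as the returned "selected_id" and "done" fields, are assumptions made for illustration; only the id and Status fields come from the record shape above. It also assumes the list travels in the state input. How the list is shared between the two job loops is not specified here.

```python
import copy


def select_an_id_handler(event, context=None):
    """Hypothetical Lambda handler: claim the first unprocessed id for a job."""
    # Work on a copy so the incoming event is left untouched.
    items = copy.deepcopy(event["items"])
    job = event["job"]  # assumed to be "job1" or "job2"

    selected = None
    for item in items:
        if item["Status"] is None:   # not yet picked by any job
            item["Status"] = job     # mark the record as taken by this job
            selected = item["id"]
            break

    return {
        "items": items,
        "job": job,
        "selected_id": selected,     # None when nothing is left to process
        "done": selected is None,    # a Choice state could end the loop on this
    }
```

A Choice state after this task could check the done flag: start the Glue job with selected_id when it is false, and leave the loop when it is true.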
Upvotes: 0