Sarathy Velmurugan

Reputation: 133

How to achieve Parallelism using AWS Batch Multi-node Parallel Jobs

I have an SQS queue that receives a JSON message whenever my S3 bucket has a CREATE event.

The message contains the bucket name and the object name.

I also have a Docker image containing a Python script that reads a message from SQS, downloads the corresponding object from S3, then reads the object and writes some values to DynamoDB.
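To make the message handling concrete, here is a minimal sketch of extracting the bucket and object name from a message body. It assumes the queue receives the standard S3 event notification JSON (a `Records` list with `s3.bucket.name` and `s3.object.key`). The function name `parse_s3_event` and the sample bucket and key are hypothetical, and a custom message format would need different field names.

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus


def parse_s3_event(body: str) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs from an S3 event notification body.

    Assumes the standard S3 notification layout; returns an empty list
    if the body has no "Records" entry.
    """
    event = json.loads(body)
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys are URL-encoded in S3 notifications (a space becomes "+").
        key = unquote_plus(s3["object"]["key"])
        pairs.append((bucket, key))
    return pairs


# Hypothetical sample message, for illustration only.
sample = json.dumps({
    "Records": [
        {"s3": {"bucket": {"name": "my-logger-bucket"},
                "object": {"key": "vehicle+42/run_001.mf4"}}}
    ]
})
print(parse_s3_event(sample))
```

Decoding the key matters here because an undecoded key with spaces or special characters would not match the real object name when downloading.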

1. When I submit this as a single job to AWS Batch, it works, but it is slow because I have 80k objects with an average size of 300 MB.

2. When I submit it as a multi-node parallel job, the job gets stuck in the RUNNING state and the master node goes to the FAILED state.

Note: The objects are MF4 (measurement) files from a vehicle logger, so each one has to be downloaded locally before it can be read with asammdf.

Question 1: How do I use AWS Batch multi-node parallel jobs?

Question 2: Are there other services I could use to achieve parallelism?

Answers with examples would be most helpful.

Thanks😊

Upvotes: 0

Views: 3407

Answers (1)

guest

Reputation: 11

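I think you're looking for AWS Batch array jobs, not multi-node parallel (MNP) jobs. MNP jobs are for spreading a single job across multiple hosts (for example, with MPI or NCCL). An array job instead runs many independent copies of the same job definition, which fits a workload where each object is processed on its own.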
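As a rough sketch of how a worker script could behave inside an array job: AWS Batch sets the `AWS_BATCH_JOB_ARRAY_INDEX` environment variable in each child job, and an array job can have at most 10,000 children, so 80k objects would need several objects per child. The helper names (`current_array_index`, `shard_for_child`), the array size of 4, and the key list below are assumptions for illustration, not part of the original setup.

```python
import os


def current_array_index(default: int = 0) -> int:
    """Read the child index that AWS Batch sets for array jobs.

    Falls back to `default` when the variable is absent (e.g. a local run).
    """
    return int(os.environ.get("AWS_BATCH_JOB_ARRAY_INDEX", default))


def shard_for_child(keys, index, array_size):
    """Return the keys this child should process.

    Child i takes items i, i + array_size, i + 2 * array_size, ...
    so every key is handled by exactly one child.
    """
    return keys[index::array_size]


# Hypothetical object keys and array size, for illustration only.
keys = [f"logs/file_{i:05d}.mf4" for i in range(10)]
array_size = 4

my_keys = shard_for_child(keys, current_array_index(), array_size)
for key in my_keys:
    # Here the real script would download the object, read it with
    # asammdf, and write the extracted values to DynamoDB.
    print("would process", key)
```

The array size is set when the job is submitted, for example through the `arrayProperties` size field of the `SubmitJob` API. If the workers keep pulling messages from the SQS queue until it is empty, the index-based sharding is unnecessary, since the queue already hands each message to one consumer at a time.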

Upvotes: 1
