fagd

Reputation: 103

How do I launch multiple EC2 instances to run a program with different parameters?

I am running a program that takes certain parameters. Because the program runs for a long time, I am thinking of launching multiple EC2 instances so that I can run it with different parameters in parallel. Ideally, I would like to

  1. Launch multiple EC2 instances (say 10 of them) with an identical setup. I understand that a single instance can have many cores, and I am already using one with 96 cores, but that is simply not enough.

  2. Schedule the instances to run my program with different parameters and save the results to a GitHub repository

  3. Have the instances shut down automatically once the program has finished, or be able to shut them all down at once conveniently (to avoid unnecessary costs)

I have looked around and I found

  1. It seems that "Terraform" is a tool that can be used to launch multiple EC2 instances (see here, for example). But I don't know how to make sure the instances have an identical setup, or how to vary the parameters across instances

  2. This one talks about how to set up the instances with different names, but I am not sure whether it can be used in my case to vary the parameters

There are many questions/requests similar to mine, but I couldn't find a solution to exactly my problem, so I would really appreciate it if anyone could help! Thanks!

Upvotes: 0

Views: 493

Answers (1)

John Rotenstein

Reputation: 270294

Rather than specifying parameters when launching the instances, I would recommend a 'message queue' architecture:

  • Create an Amazon SQS queue
  • Send messages to the queue, with each message representing some 'work' to be performed (that is, the parameters for a particular task that needs to be done). Each message can represent a small or a large piece of work; it is up to you.
  • Launch your EC2 instances, each with the same software and the same startup script
  • After boot-up, the startup script on each instance runs a program that keeps looping. Inside the loop it should:
    • Retrieve a message from the SQS queue
    • Perform the work associated with the message
    • If no messages were available in the queue, the instance should terminate itself
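
As a rough illustration of that loop, here is a minimal Python sketch, assuming boto3 is installed on the instance and the instance has permission to read from the queue. The queue name `param-sweep`, the script name `my_program.py`, and the `idle_polls` setting are placeholders I made up, not anything from the question. The sketch also deletes each message after the work succeeds, because SQS otherwise makes the message visible again once its visibility timeout expires.

```python
import json
import subprocess


def run_worker(sqs, queue_url, do_work, idle_polls=3):
    """Process messages until `idle_polls` consecutive receives come back empty.

    `sqs` is a boto3 SQS client (or any object with the same two methods).
    Returns the number of messages processed.
    """
    processed = 0
    empty = 0
    while empty < idle_polls:
        resp = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=1,
            WaitTimeSeconds=20,  # long polling; 20 seconds is the SQS maximum
        )
        messages = resp.get("Messages", [])
        if not messages:
            empty += 1
            continue
        empty = 0
        msg = messages[0]
        do_work(json.loads(msg["Body"]))
        # Delete only after the work succeeded, so a crashed task is retried later.
        sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
        processed += 1
    return processed


def main():
    """Call this from the startup script; needs boto3, a region, and credentials."""
    import boto3  # imported here so run_worker can be exercised without AWS

    sqs = boto3.client("sqs")
    queue_url = sqs.get_queue_url(QueueName="param-sweep")["QueueUrl"]  # placeholder name

    def do_work(params):
        # Placeholder command line: hand the parameters to your program as JSON.
        subprocess.run(["python3", "my_program.py", json.dumps(params)], check=True)

    run_worker(sqs, queue_url, do_work)
    # Powers the machine off. The instance is terminated only if it was launched with
    # its shutdown behavior set to "terminate"; otherwise it is just stopped.
    subprocess.run(["sudo", "shutdown", "-h", "now"], check=False)
```

Because each task runs for a long time, the queue's visibility timeout should be longer than the longest task (SQS allows up to 12 hours); otherwise a second instance may pick up a message that is still being worked on.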

This architecture automatically distributes work amongst the instances, with each instance grabbing more work when it needs it. It also works with any number of instances; more instances simply process the work in a shorter elapsed time.
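
Filling the queue (the first two steps in the list) could be done from your own machine along these lines. This is only a sketch under assumptions: the queue name `param-sweep` and the parameter names `alpha` and `seed` are invented for illustration, and each message is simply a JSON object holding one parameter combination.

```python
import itertools
import json


def build_messages(grid):
    """Expand a dict of parameter lists into one JSON message body per combination."""
    keys = sorted(grid)
    return [
        json.dumps(dict(zip(keys, values)))
        for values in itertools.product(*(grid[k] for k in keys))
    ]


def enqueue(sqs, queue_url, bodies):
    """Send the bodies in batches of 10, the most SQS accepts per batch request."""
    for start in range(0, len(bodies), 10):
        batch = bodies[start:start + 10]
        entries = [
            {"Id": str(start + i), "MessageBody": body}
            for i, body in enumerate(batch)
        ]
        resp = sqs.send_message_batch(QueueUrl=queue_url, Entries=entries)
        if resp.get("Failed"):
            raise RuntimeError(f"{len(resp['Failed'])} messages were not accepted")


def main():
    """Run once before launching the instances; needs boto3 and credentials."""
    import boto3  # imported here so the helpers above can be tried without AWS

    sqs = boto3.client("sqs")
    queue_url = sqs.create_queue(
        QueueName="param-sweep",  # placeholder name
        Attributes={"VisibilityTimeout": "43200"},  # 12 hours, the SQS maximum
    )["QueueUrl"]
    # Made-up parameter grid: 2 x 3 = 6 messages, i.e. 6 independent tasks.
    enqueue(sqs, queue_url, build_messages({"alpha": [0.1, 0.5], "seed": [1, 2, 3]}))
```

Since the parameters live in the queue rather than in the launch configuration, all instances can be launched from the same image with the same startup script, by whatever means you prefer.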

See also: Auto-Stop EC2 instances when they finish a task - DEV Community

Upvotes: 1
