user1472972

Reputation: 105

AWS ECS: unable to run more than 10 tasks

I have an ECS Cluster with say 20 registered instances.

I have 3 task definitions to solve a big data problem.

Task 1: Split Task. This starts a Docker container whose container definition has an entrypoint that runs a script called HPC-Split. The script splits the big data into, say, 5 parts on a mounted EFS volume. The number of tasks (count) for this task is 1.

Task 2: Run Task. This starts another Docker container whose entrypoint runs a script called HPC-script, which processes each split part. The number of tasks selected for this is 5, so that the parts are processed in parallel.

Task 3: Merge Task. This starts a third Docker container whose entrypoint runs a script called HPC-Merge, which merges the outputs from all the parts. Again, the number of tasks (count) that we need to run for this is 1.

Now, the AWS service limits (https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service_limits.html) say that the maximum number of tasks (count) we can run is 10. So at the moment we are able to run only 10 processes in parallel. That is: split the file (1 task on one instance), run the process (tasks on 10 instances), merge the file (1 task on one instance).

The limit of 10 restricts how far we can parallelize our processing, and I don't know how to get around it. I am surprised by this limit, because there is surely a need to run long-running processes on more than 10 instances in a cluster.

Can you please give me some pointers on how to get around this limit, or how to use ECS optimally to run, say, 20 tasks in parallel? The placement I use is 'One task per host' because the process uses all cores on one host.

How can I architect this better with ECS?

Upvotes: 2

Views: 5946

Answers (2)

Brett Green

Reputation: 3765

If the tasks that work on the split parts are architected to wait until such work is available (for example, via a queue system of some kind), I would launch them as a service and simply change the 'Desired Tasks' number from zero to 20 as needed.

When you need the workers, scale the service up to 20 Desired Tasks. Then launch your task to split the work and launch the task that waits for the work to be done. When the workers are all done, you can scale them back down to zero.
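The scale-up and scale-down steps described above could be scripted roughly as follows. This is only a sketch: it assumes boto3's ECS client and its `update_service` call, and the cluster and service names in the usage comment (`hpc-cluster`, `hpc-workers`) are hypothetical placeholders, not names from the question.

```python
def scale_service(ecs_client, cluster, service, desired_count):
    """Set the desired task count of an ECS service and return the new value.

    ecs_client is expected to behave like boto3.client("ecs").
    """
    if desired_count < 0:
        raise ValueError("desired_count must be non-negative")
    response = ecs_client.update_service(
        cluster=cluster,
        service=service,
        desiredCount=desired_count,
    )
    # update_service returns the updated service description.
    return response["service"]["desiredCount"]


# Usage sketch (hypothetical names, requires boto3 and AWS credentials):
#   import boto3
#   ecs = boto3.client("ecs")
#   scale_service(ecs, "hpc-cluster", "hpc-workers", 20)  # before the split task
#   scale_service(ecs, "hpc-cluster", "hpc-workers", 0)   # after the merge task
```

Passing the client in as an argument keeps the helper easy to exercise with a stub, without touching a real cluster.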

This also seems like work better suited to Fargate, unless you have extreme memory or disk size needs. Otherwise, you'll likely want to pair this with scaling the EC2-based cluster up as needed and back down when it is not.

Upvotes: 0

Samuel Karp

Reputation: 4682

Number of tasks launched (count) per run-task

This is the maximum number of tasks that can be launched per invocation of the run-task API. To launch more tasks, call the run-task API again.
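As a rough illustration of calling the API repeatedly, the sketch below splits a total of 20 tasks into calls of at most 10 each. It assumes boto3's ECS client and its `run_task` call. The cluster and task definition names in the usage comment are hypothetical, and the `distinctInstance` placement constraint is my assumption about how the 'One task per host' setting from the question maps onto the API.

```python
def batch_counts(total, max_per_call=10):
    """Split a total task count into per-call counts of at most max_per_call."""
    if total < 0 or max_per_call < 1:
        raise ValueError("total must be >= 0 and max_per_call must be >= 1")
    full, rest = divmod(total, max_per_call)
    return [max_per_call] * full + ([rest] if rest else [])


def run_many_tasks(ecs_client, cluster, task_definition, total,
                   max_per_call=10, **run_task_kwargs):
    """Call run_task as many times as needed to launch `total` tasks.

    ecs_client is expected to behave like boto3.client("ecs").
    Returns (task_arns, failures) collected across all calls.
    """
    task_arns, failures = [], []
    for count in batch_counts(total, max_per_call):
        response = ecs_client.run_task(
            cluster=cluster,
            taskDefinition=task_definition,
            count=count,
            **run_task_kwargs,
        )
        task_arns.extend(task["taskArn"] for task in response.get("tasks", []))
        # Placement failures (for example, no free instance) are reported here.
        failures.extend(response.get("failures", []))
    return task_arns, failures


# Usage sketch (hypothetical names, requires boto3 and AWS credentials):
#   import boto3
#   arns, failures = run_many_tasks(
#       boto3.client("ecs"), "hpc-cluster", "hpc-run", 20,
#       placementConstraints=[{"type": "distinctInstance"}],
#   )
```

It is worth checking the returned `failures` list, since a call can succeed overall while placing fewer tasks than requested.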

Upvotes: 7
