Preethi Vaidyanathan

Reputation: 1322

Celery: dynamically allocate concurrency based on worker memory

My celery use case: spin up a cluster of celery workers and send many tasks to that cluster, and then terminate the cluster when all of the tasks have completed (usually ~2 hrs).

I currently have it set up to use the default concurrency, which is not optimal for my use case. I see it is possible to specify a --concurrency argument in Celery, which sets the number of tasks that a worker will run in parallel. This is also not ideal for my use case, because, for example:

Because I use these clusters very often for very different types of tasks, I don't want to have to profile each task beforehand and manually set the concurrency each time.

My desired behaviour is to have memory thresholds. For example, I could set in a config file:

min_worker_memory = .6
max_worker_memory = .8

Meaning that the worker will increment its concurrency by 1 until it crosses the threshold of using more than 80% memory, at which point it will decrement concurrency by 1. It will keep that concurrency for the lifetime of the cluster unless worker memory falls below 60%, at which point it will start incrementing concurrency by 1 again.
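Concretely, the behaviour I want is roughly this loop (a plain-Python sketch; the function name and the psutil-based memory check are just illustrative):

import psutil

min_worker_memory = 0.6
max_worker_memory = 0.8

def next_concurrency(current):
    # Fraction of total system memory currently in use.
    used = psutil.virtual_memory().percent / 100.0
    if used > max_worker_memory:
        return current - 1  # crossed the 80% ceiling: back off by one
    if used < min_worker_memory:
        return current + 1  # dropped below the 60% floor: grow by one
    return current          # between the thresholds: hold steady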

Are there any existing Celery settings that I can leverage to do this, or will I have to implement this logic on my own? worker_max_memory_per_child seems somewhat close to what I want, but it ends with worker processes being killed, which is not what I want.

Upvotes: 1

Views: 2322

Answers (1)

DejanLekic

Reputation: 19822

Unfortunately, Celery does not provide an Autoscaler that scales up/down depending on memory usage. However, being a well-designed piece of software, it gives you an interface that you can implement however you like. I am sure that, with the help of the psutil package, you can easily create your own autoscaler. Documentation reference.
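For example, a minimal sketch of a memory-based autoscaler could look like the following. The class name, the myapp.autoscale module path, and the 0.6/0.8 thresholds are my own assumptions (mirroring the values in your question); only the Autoscaler base class, scale_up/scale_down, and the processes/min_concurrency/max_concurrency attributes come from Celery:

import psutil

from celery.worker.autoscale import Autoscaler


class MemoryAutoscaler(Autoscaler):
    """Scale down above 80% system memory use, scale back up below 60%."""

    min_worker_memory = 0.6
    max_worker_memory = 0.8

    def _maybe_scale(self, req=None):
        # Fraction of total system memory currently in use, via psutil.
        used = psutil.virtual_memory().percent / 100.0
        if used > self.max_worker_memory and self.processes > self.min_concurrency:
            self.scale_down(1)  # over the ceiling: shed one pool process
            return True         # truthy return makes Celery maintain the pool
        if used < self.min_worker_memory and self.processes < self.max_concurrency:
            self.scale_up(1)    # plenty of headroom: add one pool process
            return True

You would then point the worker at it with worker_autoscaler = 'myapp.autoscale:MemoryAutoscaler' in your configuration and start the worker with --autoscale=<max>,<min> so that the autoscaling component is enabled.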

Upvotes: 1
