Vinod Sai

Reputation: 2122

Celery using only 20% of CPU (at peak)

I'm running a Celery + RabbitMQ app. I start up a bunch of EC2 machines, but I find that my Celery worker machines only use about 15% CPU (peak of 20%). I've configured 2 Celery workers per machine.

Shouldn't celery workers be close to using 100% CPU utilization?

MORE INFO: I am not using the celery --concurrency option or eventlet, even though I am running multiple workers. By default, concurrency is set to 8. My tasks run in php mostly io blocking, so there won't be an issue if we have more processes running in parallel. Is there any way to configure Celery to run more tasks based on the CPU usage?

Upvotes: 3

Views: 3104

Answers (2)

dm03514

Reputation: 55972

Shouldn't celery workers be close to using 100% CPU utilization?

Only if you load them up to utilize 100% CPU :)

My tasks run in php mostly io blocking

If your tasks are primarily making IO calls, then this is most likely the reason why CPU usage isn't high: a process or thread mostly sits idle after making an IO call, waiting for it to complete.
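To make that concrete, here is a minimal sketch that compares CPU time with wall-clock time for a blocking call. `fake_io_task` and `cpu_fraction` are made-up names for illustration, and `time.sleep` is only a stand-in for a real network or disk call:

```python
import time

def fake_io_task(seconds=0.2):
    """Hypothetical stand-in for a blocking IO call; sleeping burns almost no CPU."""
    time.sleep(seconds)

def cpu_fraction(task):
    """Return CPU time divided by wall-clock time for a single call of task."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    task()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used

if __name__ == "__main__":
    # For an IO-style wait this prints a value close to 0%, not 100%.
    print(f"CPU fraction while waiting on IO: {cpu_fraction(fake_io_task):.1%}")
```

A worker whose tasks look like this will show low CPU no matter how busy its queue is.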

It's crucial to benchmark your configuration. In practice this could look like:

  • Choose an initial concurrency level (i.e. the default)
  • Benchmark throughput / resource usage
  • Increase the concurrency level
  • Benchmark throughput / resource usage
  • Continue until increasing concurrency no longer provides any benefit
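The steps above can be sketched roughly as the loop below. This is not Celery code: it assumes a thread pool and a sleeping `fake_io_task` as a stand-in for your workload, and the names `throughput`, `find_concurrency` and `min_gain` are made up for illustration. For a real worker you would restart it with a different `--concurrency` each round and measure tasks per second (plus CPU and memory) instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io_task(_):
    """Hypothetical IO-bound task: just waits, like a slow network call."""
    time.sleep(0.02)

def throughput(concurrency, n_tasks=16):
    """Run n_tasks at the given concurrency and return tasks per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(fake_io_task, range(n_tasks)))
    return n_tasks / (time.perf_counter() - start)

def find_concurrency(levels, min_gain=1.1):
    """Walk up the levels; stop when the next one is not min_gain times faster."""
    best_level, best_rate = levels[0], throughput(levels[0])
    for level in levels[1:]:
        rate = throughput(level)
        if rate < best_rate * min_gain:
            break  # no meaningful benefit any more
        best_level, best_rate = level, rate
    return best_level, best_rate

if __name__ == "__main__":
    level, rate = find_concurrency([1, 2, 4, 8, 16])
    print(f"stopped at concurrency={level} ({rate:.0f} tasks/s)")
```

The `min_gain` threshold is arbitrary; pick whatever improvement is worth the extra resources to you.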

If your worker tasks are IO bound, this is a perfect case for eventlet, since it lets you run a very large number of IO-bound tasks on a single processor. For example, consider the case where your machine has 64 cores. You should easily be able to run some multiple of that many processes for IO-bound tasks, but at some point the majority of resources will go to process accounting, overhead and context switching.

With eventlet, a single processor could handle hundreds or thousands of concurrent workers:

The prefork pool can take use of multiple processes, but how many is often limited to a few processes per CPU. With Eventlet you can efficiently spawn hundreds, or thousands of green threads. In an informal test with a feed hub system the Eventlet pool could fetch and process hundreds of feeds every second, while the prefork pool spent 14 seconds processing 100 feeds. Note that this is one of the applications async I/O is especially good at (asynchronous HTTP requests). You may want a mix of both Eventlet and prefork workers, and route tasks according to compatibility or what works best.
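A mixed setup along those lines might look something like the sketch below. The module name, task names and queue names are all made up for illustration, and the concurrency numbers are placeholders you would tune by benchmarking. The pool is chosen with `-P` on the command line rather than in the config file.

```python
# celeryconfig.py (hypothetical module name)
# Task and queue names are assumptions for illustration only.
task_routes = {
    "proj.tasks.fetch_feed": {"queue": "io"},     # IO-bound: mostly waits on the network
    "proj.tasks.resize_image": {"queue": "cpu"},  # CPU-bound: keeps a core busy
}

# Workers would then be started with one pool per queue, e.g. from a shell:
#   celery -A proj worker -P eventlet -c 500 -Q io
#   celery -A proj worker -P prefork  -c 4   -Q cpu
```

That way the IO-bound tasks get the green-thread pool and anything that is not eventlet-friendly stays on prefork.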

Upvotes: 2

DejanLekic

Reputation: 19822

You have two options: increase the concurrency level (using --concurrency), or use the (deprecated) auto-scaling option. Most of the time we overutilise on AWS by setting concurrency to 2 * N, where N is the number of vCPUs on the instance type of your choice. We do not overutilise nodes that are subscribed to the special queue where we send our CPU-bound tasks.
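As a small sketch of that rule of thumb: `suggested_concurrency` is a made-up helper, and using exactly N for the CPU-bound nodes is my assumption about what "not overutilising" means in numbers.

```python
import os

def suggested_concurrency(vcpus=None, factor=2, cpu_bound=False):
    """Return 2 * N for general nodes, or N for CPU-bound nodes (assumed)."""
    if vcpus is None:
        # os.cpu_count() can return None, so fall back to 1.
        vcpus = os.cpu_count() or 1
    return vcpus if cpu_bound else vcpus * factor

if __name__ == "__main__":
    # The result is what you would pass to the worker via --concurrency.
    print(suggested_concurrency(4))                  # 8
    print(suggested_concurrency(4, cpu_bound=True))  # 4
```

Treat the factor of 2 as a starting point rather than a fixed rule; heavily IO-bound tasks can often go higher.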

Upvotes: 1
