Reputation: 11
I am working on a distributed processing application that I need to be as fast as possible. I have a deployment where a leader application writes Celery tasks to Redis, and N listening workers execute the tasks.
I have benchmarked my process to the point where 1 worker processing 1 task is exactly as fast as I need. I then need to scale this by 100x for the actual distributed processing use case, so I spin up 110 workers to listen for 100 tasks on the queue. I was expecting 100 of those workers to process the 100 tasks entirely in parallel, but in practice I am seeing some workers pick up multiple tasks from the queue while other workers don't process anything. This defeats the purpose of my benchmarking, as I can't make a single worker fast enough to process multiple tasks in the allotted time; I need the tasks processed entirely in parallel.
I am executing my tasks in a group, and the configuration is effectively all defaults aside from an override I found for the prefetch multiplier and late-ack settings (seen below):
app.conf.update(
    worker_prefetch_multiplier=1,  # reserve only one task per worker process
    task_acks_late=True,           # acknowledge only after the task finishes
)
Is there something else I can set so that each of the parallel workers takes exactly one task at the same time, and my total processing time for 100 tasks is approximately my benchmarked time for 1 task (basically within variance)?
Thanks for any insight/help!
I have tried a large number of Celery configuration settings but am struggling to find the right combination.
Upvotes: 1
Views: 38
Reputation: 805
Have you tried running all your workers with the solo pool?
celery --app=worker.app worker --pool=solo
This ensures that each worker processes only one task at a time. Also, please provide more details so we can help you better.
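For example, you could launch each of the 110 workers as its own solo-pool process with a unique node name, something like the sketch below (the app path and node names are just illustrative; adjust them to your setup):

for i in $(seq 1 110); do
    # each worker runs the solo pool, so it handles exactly one task at a time
    celery --app=worker.app worker --pool=solo --hostname=w$i@%h --detach
done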
Upvotes: 0