Reputation: 405
I'm here with a performance issue that I can't seem to figure out.
The problem is that executing tasks is too slow. According to the Celery log, most tasks finish in under 0.3 seconds.
I noticed that if I stop the workers and start them again, performance increases to almost 200 acks/second; then, after a while, it drops to around 40/s.
I'm not sure, but I think it might be a broker issue rather than a Celery issue. Looking at the logs of a couple of workers, I noticed that they all seem to execute tasks, then pause for a bit, and start again.
It feels like receiving tasks is slow.
Any ideas about what might cause this? Thanks!
A log example:
Task drones.tasks.blue_drone_process_task[64c0a826-aa18-4226-8a39-3a455e0916a5] succeeded in 0.18421914400005335s: None
(~10 second break here)
Received task: drones.tasks.blue_drone_process_task[924a1b99-670d-492e-94a1-91d5ff7142b9]
Received task: drones.tasks.blue_drone_process_task[74a9a1d3-aa2b-40eb-9e5a-1420ea8b13d1]
Received task: drones.tasks.blue_drone_process_task[99ae3ca1-dfa6-4854-a624-735fe0447abb]
Received task: drones.tasks.blue_drone_process_task[dfbc0d65-c189-4cfc-b6f9-f363f25f2713]
IMO those tasks should execute so fast that I shouldn't be able to read the log.
My setup is:
I use this setup for web scraping and have two queues; let's call them Requests and Process.
In the Requests queue I put URLs that need to be scraped, and in the Process queue you'll find the URL plus the source code of that page (max 2.5 MB per source page; I drop the page in case it's bigger than that), so all messages in the Process queue are at most 2.5 MB ± 1 KB.
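The 2.5 MB cap on page source can be enforced with a small size check before publishing to the Process queue. A minimal sketch of that gate (the `should_forward` helper and the constant name are my own, not from the original setup):

```python
MAX_SOURCE_BYTES = int(2.5 * 1024 * 1024)  # 2.5 MB cap on page source

def should_forward(source: bytes, limit: int = MAX_SOURCE_BYTES) -> bool:
    """Return True if the scraped page source is small enough to enqueue."""
    return len(source) <= limit
```

Pages over the limit are simply dropped instead of being published to the Process queue.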
To execute tasks from the Requests queue I use Celery with the gevent pool, concurrency 300 (-P gevent -c 300 --without-gossip --without-mingle --without-heartbeat). I run 4-8 workers like this.
To execute tasks from the Process queue I use the prefork pool (the default) with -c 4 --without-gossip --without-mingle --without-heartbeat. I run 30 workers like this.
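For reference, the two worker invocations described above would look roughly like this (the app name `scraper` and the queue names are placeholders; the flags are the ones quoted above):

```shell
# 4-8 of these: gevent pool, concurrency 300, consuming the Requests queue
celery -A scraper worker -Q requests -P gevent -c 300 \
    --without-gossip --without-mingle --without-heartbeat

# 30 of these: default prefork pool, concurrency 4, consuming the Process queue
celery -A scraper worker -Q process -c 4 \
    --without-gossip --without-mingle --without-heartbeat
```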
Other setup info:
RabbitMQ config:
Celery config:
I tried using a higher prefetch (5, 10, and 20) and it did not help.
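For context, the prefetch values tried above map to Celery's `worker_prefetch_multiplier` setting (`CELERYD_PREFETCH_MULTIPLIER` on older versions); the default is 4. A sketch of the relevant config line:

```python
# celeryconfig.py sketch; 5, 10, and 20 were the values tried above
worker_prefetch_multiplier = 10
```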
Upvotes: 1
Views: 1748
Reputation: 405
Managed to figure it out. It was a networking issue: the EC2 instance type I used for the load balancer had low networking performance. I switched to an instance type with better networking performance and it now works amazingly fast.
Upvotes: 1