Erik Oosterwaal

Reputation: 4384

RabbitMQ Queued messages keep increasing

We have a Windows based Celery/RabbitMQ server that executes long-running python tasks out-of-process for our web application.
What this does, for example, is take a CSV file and process each line. For every line it books one or more records in our database.

This seems to work fine; I can see the records being booked by the worker processes. However, when I check the RabbitMQ server with the management plugin (the web-based management tool), I see the number of queued messages increasing and never coming back down. [Queued messages chart]

Under Connections I see 116 connections, about 10-15 per virtual host, all marked "running", but when I click through, most of them show 'idle' as their state. I'm also wondering why these connections are still open, and whether there is something I need to change to make them close themselves.

Under 'Queues' I can see more than 6200 items with state 'idle', and the count is not decreasing.

So concretely, I'm asking whether these are normal statistics, or whether I should worry about the queues increasing without coming back down and about the persistent connections that don't seem to close.

Other than the rather concise help inside the management tool, I can't seem to find any information about what these stats mean and whether they are good or bad.

I'd also like to know why the messages are still visible in the queues and why they are not removed, as the tasks seem to be completed just fine.

Any help is appreciated.

Upvotes: 4

Views: 6267

Answers (2)

Erik Oosterwaal

Reputation: 4384

Answering my own question:

Celery sends a result message back to the calling code for every task. These result messages travel over the same AMQP broker. That is why the tasks were completing fine while the queues kept filling up: we were not consuming these results, nor were we even interested in them.

I added ignore_result=True to the Celery task, so the task no longer sends result messages back into the queue. This was the main solution to the problem.

Furthermore, the configuration option CELERY_SEND_EVENTS=False was added to speed Celery up. When set to True, this option has Celery emit events for external monitoring tools.

On top of that, CELERY_TASK_RESULT_EXPIRES=3600 now makes sure that even if results are sent back, they expire after one hour if not picked up/acknowledged.

Finally, CELERY_RESULT_PERSISTENT was set to False. This configures Celery not to write result messages to disk, so they are lost if the broker restarts or crashes, which is fine in our case, as we don't use them.
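Put together as a config fragment (a sketch of a celeryconfig.py, using the old uppercase setting names that match the options above):

```python
# Sketch of a celeryconfig.py with the three settings discussed above.
CELERY_SEND_EVENTS = False          # don't emit events for external monitoring tools
CELERY_TASK_RESULT_EXPIRES = 3600   # unclaimed results expire after one hour (seconds)
CELERY_RESULT_PERSISTENT = False    # result messages are transient: lost on broker restart
```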

So in short: if your app doesn't need feedback about whether and when the tasks are finished, use ignore_result=True on the Celery task so that no result messages are sent back. If you do need that information, make sure you pick up and handle the results, so that the queue stops filling up.

Upvotes: 6

scytale

Reputation: 12641

If you don't need the reliability, then you can make your queues transient.

http://celery.readthedocs.org/en/latest/userguide/optimizing.html#optimizing-transient-queues

CELERY_DEFAULT_DELIVERY_MODE = 'transient'

Upvotes: 0
