Reputation: 16777
I'm using the latest stable Celery (4) with RabbitMQ in my Django project.
RabbitMQ runs on a separate server on the local network. Periodically, beat simply stops sending tasks to the worker without any errors, and only restarting it resolves the issue.
There are no exceptions in the worker (I checked the logs, and I also use Sentry to catch exceptions). It just stops sending tasks.
Service config:
[Unit]
Description=*** Celery Beat
After=network.target
[Service]
User=***
Group=***
WorkingDirectory=/opt/***/web/
Environment="PATH=/opt/***/bin"
ExecStart=/opt/***/bin/celery -A *** beat --max-interval 30
[Install]
WantedBy=multi-user.target
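Note that this unit never restarts beat if it dies. As a stopgap (it only helps when the beat process actually exits, not when it hangs while staying alive), systemd can be told to restart it automatically by adding standard directives to the [Service] section, e.g.:

```ini
[Service]
# Restart beat automatically whenever the process exits, for any reason.
Restart=always
# Wait 10 seconds between restart attempts.
RestartSec=10
```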
Is it possible to fix this? Or are there any good alternatives? (Cron doesn't seem like the best solution.)
Upvotes: 4
Views: 2629
Reputation: 693
Your description sounds a lot like this open bug: https://github.com/celery/celery/issues/3409
There are a lot of details there, but the high-level description of the bug is that if the connection to RabbitMQ is lost, beat is unable to regain it.
Unfortunately, I can't see that anyone has definitively solved this issue.
You could start by debugging with increased log verbosity:
ExecStart=/opt/***/bin/celery -A *** beat --loglevel DEBUG --max-interval 30
Upvotes: 4