Reputation: 3728
I'm trying to allow users to schedule a periodic task. I'm also running multiple Celery workers in a container. My command for that container used to look like this:
celery worker -c 4 -B -l INFO -A my.celery.app.celery --scheduler my.celery.scheduler.SchedulerClass
However, the scheduled task ran 4 times when it came due.
I then read that beat should have a dedicated worker, so I changed my command to this one:
celery worker -c 4 -l INFO -A my.celery.app.celery
and added another container, identical to that one, which runs the command:
celery -l INFO -B -A my.celery.app.celery --scheduler my.celery.scheduler.SchedulerClass
I hoped that with only one beat there would be no duplicate tasks, but I still get 4 tasks running instead of one.
Any ideas on how this should be done would be helpful.
Upvotes: 4
Views: 4522
Reputation: 81
I know this is an old question but I faced this particular issue recently and I wanted to share my findings in case it helps others.
As mentioned in the question, only one instance of celerybeat should be scheduling tasks; otherwise duplicate tasks will be created. On the other hand, a single celerybeat instance is a single point of failure. The goal is to have an auto-scaling pool of identical hosts, each running celery workers and a single celerybeat instance. Celerybeat should start successfully on all hosts, with all but one acting as standbys. If anything happens to the active celerybeat instance, another node's celerybeat should automatically take over scheduling tasks.
Celerybeat does not have a built-in mechanism to handle this behavior. We can use other solutions such as redbeat and single-beat to implement this behavior.
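To make the active/standby idea concrete, here is a minimal, stdlib-only sketch of lock-based failover. It is a toy simulation, not how redbeat or single-beat are actually implemented: the names TTLLock, BeatNode and simulate, the 3-second TTL and the in-memory lock (standing in for a shared store such as Redis) are all assumptions made for illustration.

```python
import time
from typing import Callable, List, Optional, Tuple


class TTLLock:
    """Hypothetical in-memory stand-in for a shared lock that expires."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.clock = clock
        self.owner: Optional[str] = None
        self.expires_at = 0.0

    def acquire_or_renew(self, node: str) -> bool:
        """Take the lock if it is free, expired, or already held by `node`."""
        now = self.clock()
        if self.owner is None or self.owner == node or now >= self.expires_at:
            self.owner = node
            self.expires_at = now + self.ttl
            return True
        return False


class BeatNode:
    """Hypothetical beat process: schedules only while it holds the lock."""

    def __init__(self, name: str, lock: TTLLock, sent: List[Tuple[str, str]]) -> None:
        self.name = name
        self.lock = lock
        self.sent = sent
        self.alive = True

    def tick(self, task_name: str) -> None:
        if not self.alive:
            return  # a crashed node stops renewing, so its lock will expire
        if self.lock.acquire_or_renew(self.name):
            self.sent.append((self.name, task_name))  # active node sends the task
        # otherwise this node is a standby and sends nothing


def simulate() -> List[Tuple[str, str]]:
    """Run two nodes for 8 ticks; the first node crashes at tick 3."""
    now = [0.0]  # fake clock so the run is deterministic
    lock = TTLLock(ttl=3.0, clock=lambda: now[0])
    sent: List[Tuple[str, str]] = []
    nodes = [BeatNode("host-a", lock, sent), BeatNode("host-b", lock, sent)]
    for step in range(8):
        now[0] = float(step)
        if step == 3:
            nodes[0].alive = False
        for node in nodes:
            node.tick(f"report@{step}")
    return sent


if __name__ == "__main__":
    for node_name, task in simulate():
        print(node_name, task)
```

In this run no tick is ever scheduled twice, and host-b takes over only once host-a's lock has expired, so a couple of ticks are skipped during failover. That trade-off (a short gap rather than duplicates) depends on the TTL chosen here and is not a statement about any particular library's defaults.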
You can find more discussion on this subject at the link below: https://github.com/celery/celery/issues/251
Upvotes: 3
Reputation: 86
From the documentation:
You can also embed beat inside the worker by enabling the workers -B option, this is convenient if you’ll never run more than one worker node, but it’s not commonly used and for that reason isn’t recommended for production use:
$ celery -A proj worker -B
So you most likely need to run beat as its own process, using the beat subcommand (with its options placed after the subcommand):
celery -A my.celery.app.celery beat -l INFO --scheduler my.celery.scheduler.SchedulerClass
Upvotes: 3