Reputation: 42769
I need my celeryq workers to pre-load a bunch of data before they start processing tasks. The tasks read a lot of data from disk, which takes several seconds when it isn't cached, and that is too long. So I want the loading to happen before a worker takes its first task. In gunicorn-land this is easy with a pre-fork or post-fork hook, but AFAICT Celery doesn't offer anything like this. (I really want to use gevent or eventlet because my tasks use the GPU, so the workers can't fork.) Looking through the Celery code, there are a bunch of places where something like this might work, but whenever I try them, the workers don't behave correctly. For example, this seems like it should work:
if __name__ == "__main__":
    logger.info("Pre-loading data")
    warmup_data()
    logger.info("CeleryQ worker starting")
    worker = app.worker_main(
        argv=[
            'worker',
            '--loglevel=DEBUG',
            '--pool=gevent',
            '--concurrency=4',
        ]
    )
and it almost works. The problem is that tasks get accepted but responses never come back. I'm guessing the tasks are crashing silently for some reason, which is odd in its own right. I've tried a bunch of similar variations on this; they don't all fail in the same way, but nothing quite works.
Upvotes: 1
Views: 544
Reputation: 19822
Find the appropriate worker signal and implement a handler for it that does the logic you need. Judging by what you wrote in your question, you should implement either a worker_init or perhaps a celeryd_init handler. I would go for worker_init first, and then see how it goes.
Upvotes: 1