Reputation: 10454
I have two Celery worker nodes on two machines (n1, n2), and tasks are enqueued from a third machine (main). The main machine may not know the names of the available nodes. Is there any guarantee that a chain of tasks will run on a single node?
res = chain(generate.s(filePath1, filePath2), mix.s(), sort.s())
The problem is that the tasks use local data files that are node-specific. My guess is that a chain behaves like a chord, for which the docs explicitly say there is no guarantee of running on a single node. If that guess is right, would the following be a good alternative to chains?
# single task = guaranteed single node
@app.task
def my_chain_of_tasks():
    celery.current_app.send_task('mymodel.tasks.generate', args=[filePath1, filePath2]).get()
    celery.current_app.send_task('mymodel.tasks.mix').get()
    # do these 2 in parallel:
    res1 = celery.current_app.send_task('mymodel.tasks.sort')
    res2 = celery.current_app.send_task('mymodel.tasks.email_in_parallel')
    res1.get()
    return res2.get()
Or is this still going to send the tasks to the message queue and cause the same problem?
Upvotes: 1
Views: 1081
Reputation: 29514
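You are calling .get() on a task inside another task, which is counterproductive: the outer task occupies a worker while it waits, and the Celery documentation warns that synchronous subtasks like this can lead to deadlocks. Also, send_task still publishes each task to the message queue, so there is no guarantee that all of those tasks will be executed on a single node.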
If you want certain tasks to be executed by a particular node, you can route them to a dedicated queue that only that node's worker consumes.
CELERY_ROUTES = {
'mymodel.task.task1': {'queue': 'queue1'},
'mymodel.task.task2': {'queue': 'queue2'}
}
Now you can start two workers, one per queue, to consume them:
celery -A your_proj worker -Q queue1
celery -A your_proj worker -Q queue2
Now all task1 tasks will be executed by the first worker (the one consuming queue1) and all task2 tasks by the second (the one consuming queue2).
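Applied to the question, the same idea can keep a whole chain on one node: route every task in the chain to a single queue and start only that node's worker with -Q for it. A minimal sketch follows. The task names are taken from the question, while the queue name n1_files and the helper routes_for_queue are hypothetical, introduced here only for illustration.

```python
# Task names come from the question; adjust them to your project.
PIPELINE_TASKS = [
    'mymodel.tasks.generate',
    'mymodel.tasks.mix',
    'mymodel.tasks.sort',
]


def routes_for_queue(task_names, queue):
    """Hypothetical helper: build a CELERY_ROUTES-style dict that
    sends every listed task to the same queue."""
    return {name: {'queue': queue} for name in task_names}


# 'n1_files' is an assumed queue name; only the worker on n1 should
# be started with "-Q n1_files" so that every step runs there.
CELERY_ROUTES = routes_for_queue(PIPELINE_TASKS, 'n1_files')
```

With this in place, the chain is still enqueued from main as usual; main only needs to know the queue name, not which machine is consuming it.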
Docs: http://celery.readthedocs.org/en/latest/userguide/routing.html#manual-routing
Upvotes: 2