Reputation: 4488
I'm new to Celery and I'm trying to understand if it can solve my problem. I need to start a number of tasks (A1..An) and then run another task (B) after these are done. The problem is that the An tasks are added sequentially and I don't want to wait for the last one to be added before I start the first one. Can I configure task B to execute after all the An tasks are done?
Now to the real scenario:

- An - process a file uploaded by a user (added after each file is uploaded)
- B - do something with the results of processing all uploaded files

Alternative solutions are welcome as well.
Upvotes: 2
Views: 4663
Reputation: 5807
You can certainly do this: Celery's canvas supports many workflow primitives, including the behaviour you need. Running a task after a group of tasks finishes is called a "chord", e.g.:
from celery import chord
from tasks import task_upload1, task_upload2, task_upload3, final_execution

# The header is a list of signatures; the body runs once all of them finish.
result = chord([task_upload1.s(), task_upload2.s(), task_upload3.s()])(final_execution.s())
get_required_result = result.get()  # blocks until the whole chord completes
you can refer to this link for more details
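Conceptually, a chord just runs every task in the header, collects their results into a list, and hands that list to the body task. A broker-free sketch of that semantics (illustration only, not the Celery API; `process` and `summarize` are hypothetical stand-ins for the upload tasks):

```python
# Minimal sketch of chord semantics: run all "header" tasks,
# collect their results, then pass the result list to one "body" task.
def fake_chord(header_tasks, body_task):
    results = [task() for task in header_tasks]   # tasks A1..An
    return body_task(results)                     # task B sees all results

# Hypothetical stand-ins for the real upload-processing tasks:
def process(name):
    return lambda: f"processed:{name}"

def summarize(results):
    return len(results)

total = fake_chord([process(f) for f in ("a.txt", "b.txt")], summarize)
# total == 2
```

With real Celery, `fake_chord(header, body)` corresponds to `chord(header)(body)`.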
Upvotes: 1
Reputation: 5757
With RabbitMQ you can get the exact behavior you want using manual message acknowledgment and the aggregator pattern.
You start a worker that consumes messages (A) and does some work (processing a file uploaded by a user, in your case), but doesn't send an ack when finished. Instead it takes the next message from the queue, and if it's an A task again, it does the same thing. At some point it will receive task B, and then it can process all of the previous A results at once and send acks for all of them.
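The aggregator pattern above can be sketched without a broker; this is a plain-Python illustration of the control flow (message shapes and the uppercase "processing" step are made up for the example), where A-messages are held unacknowledged until a B-message triggers the aggregate step:

```python
# Broker-free sketch of the aggregator pattern: buffer A-messages
# without acking, then process them all and ack when B arrives.
def consume(queue):
    pending = []    # A-messages held without ack: (delivery_tag, payload)
    acked = []      # delivery tags we acknowledged
    results = None
    for tag, (kind, payload) in enumerate(queue):
        if kind == "A":
            pending.append((tag, payload))             # work, defer the ack
        elif kind == "B":
            results = [p.upper() for _, p in pending]  # aggregate all A's
            acked.extend(t for t, _ in pending)        # ack every A...
            acked.append(tag)                          # ...and B itself
            pending.clear()
    return results, acked

queue = [("A", "file1"), ("A", "file2"), ("B", None)]
results, acked = consume(queue)
# results == ["FILE1", "FILE2"], acked == [0, 1, 2]
```

With real RabbitMQ the deferred ack would be `channel.basic_ack(delivery_tag=...)` issued only after the B step; note that holding many unacked messages is bounded by the channel's prefetch setting.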
Unfortunately, this scenario can't be done with Celery, because you have to specify all of the A tasks and the final B task (chains, chords, callbacks, etc.) at creation time.
Alternatively, you can save the Task.id of each successful A task in a separate queue (not a Celery queue) and process those messages when executing the B task. Celery can fit this algorithm.
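The id-bookkeeping idea can be sketched without Celery at all; here a plain dict stands in for the external store, and `task_a`/`task_b` are hypothetical names for the upload task and the aggregation task:

```python
# Sketch of the bookkeeping approach: each finished A task records its
# id and result in an external store; B later drains the store and
# works on everything collected so far.
finished = {}                          # stand-in for an external queue/store

def task_a(task_id, filename):
    result = f"processed:{filename}"   # pretend file processing
    finished[task_id] = result         # record id -> result on success
    return result

def task_b():
    results = list(finished.values())  # drain whatever A's completed
    finished.clear()
    return results

task_a("id-1", "a.txt")
task_a("id-2", "b.txt")
combined = task_b()
# combined == ["processed:a.txt", "processed:b.txt"]
```

In a real deployment the store would be something durable (e.g. a Redis list or a database table) rather than an in-process dict, and B could use the stored ids to fetch each result via Celery's result backend.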
Upvotes: 0