Sergey Luchko

Reputation: 3356

Celery chain's place of passing arguments

1 ) Celery chain.

In the docs I read this:

Here’s a simple chain, the first task executes passing its return value to the next task in the chain, and so on.

>>> from celery import chain

>>> # 2 + 2 + 4 + 8
>>> res = chain(add.s(2, 2), add.s(4), add.s(8))()
>>> res.get()
16

But where exactly is a chain item's result passed to the next chain item? On the Celery server side, or is it passed back to my app, which then passes it to the next chain item?

This matters to me because my results are too big to pass back to the app, and I want all of this messaging to happen within the Celery server.


2 ) Celery group.

>>> from celery import group

>>> g = group(add.s(i) for i in range(10))
>>> g(10).get()
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19]

Can I be sure that these tasks will be executed together as much as possible? Will Celery give priority to a group once its first task has started executing?

For example, I have 100 requests, and each request runs a group of tasks. I don't want tasks from different groups to be mixed with each other. The first request to start processing could be the last to complete, because its last tasks are waiting for free workers that are busy with tasks from other requests. It seems better if a group's tasks are executed together as much as possible.


I would really appreciate it if you could help me.

Upvotes: 1

Views: 4750

Answers (1)

Sanket Sudake

Reputation: 771

1. Celery Chain

Results are passed on the Celery side through the message broker (such as RabbitMQ): when a task in the chain finishes, the worker sends the next task to the broker with the return value as its argument, so intermediate results do not go through your app. Results are stored in the result backend (which is explicitly required for chord execution). You can verify this by running your Celery worker with log level INFO and watching how the tasks are invoked.

Celery keeps a dependency graph once you invoke the tasks, so it knows exactly how to chain them.

Consider callbacks, where you link two different tasks:

http://docs.celeryproject.org/en/latest/userguide/canvas.html#callbacks

2. Celery Group

When you call tasks in a group, Celery invokes them in parallel. The workers pick them up according to how much work they can take on. If you invoke more tasks than your workers can handle at once, the first few tasks will run first and the workers will pick up the rest gradually.

If you have a very large number of tasks to invoke in parallel, it is better to invoke them in chunks of a certain pool size.
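One way to read "chunks of a certain pool size" is to split the work into batches before submitting it. The sketch below is plain Python and only shows the batching step; the helper name `batches` and the `POOL_SIZE` value are assumptions for illustration, not part of Celery.

```python
from itertools import islice


def batches(items, size):
    """Yield consecutive lists of at most `size` items from `items`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


POOL_SIZE = 4  # assumed worker concurrency; adjust to your deployment

for batch in batches(range(10), POOL_SIZE):
    # With Celery, each batch could be submitted as its own group, e.g.
    # group(add.s(i, 10) for i in batch).apply_async()
    print(batch)
```

Submitting one batch at a time keeps the queue from filling up with tasks from many requests at once, at the cost of the caller having to decide when to submit the next batch.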

You can also specify task priorities, as described in another answer.

When the tasks in a group complete depends on how long each task takes. Celery tries to schedule tasks as fairly as possible.

Upvotes: 3
