user2507194

Reputation: 185

Why does Celery run multiple tasks so slowly in Python?

My Celery worker starts with the AMQP broker:

     -------------- celery@tty-Gazelle-Professional v3.0.19 (Chiastic Slide)
     ---- **** ----- 
     --- * ***  * -- Linux-3.8.0-25-generic-x86_64-with-Ubuntu-13.04-raring
     -- * - **** --- 
     - ** ---------- [config]
     - ** ---------- .> broker:      amqp://guest@localhost:5672//
     - ** ---------- .> app:         proj.celery:0x25ed510
     - ** ---------- .> concurrency: 8 (processes)
     - *** --- * --- .> events:      OFF (enable -E to monitor this worker)
     -- ******* ---- 
     --- ***** ----- [queues]
     -------------- .> celery:      exchange:celery(direct) binding:celery

There is a function:

    def prime(n):
        # body elided in the original post; a naive stand-in
        # that counts the primes below n
        count = 0
        for k in xrange(2, n):
            if all(k % d for d in xrange(2, int(k ** 0.5) + 1)):
                count += 1
        return count

So I made this function a Celery task and compared it to serial computation.
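
For context, the wiring presumably looks roughly like the following. This is my own sketch, not code from the post: the app name, broker URL, and module paths are read off the worker banner above, and the AMQP result backend is an assumption (some result backend must be configured for `res.get()` to work).

    # proj/celery.py -- app name and broker taken from the worker banner above
    from celery import Celery

    app = Celery('proj',
                 broker='amqp://guest@localhost:5672//',
                 backend='amqp')  # assumption: a result backend is needed for get()

    # proj/tasks.py
    from proj.celery import app

    @app.task
    def prime(n):
        pass  # body as shown above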

The serial version:

    [prime(i) for i in xrange(10, 100000)]

The parallel version with Celery:

    from celery import group
    from proj.tasks import prime  # the prime task, per the worker log

    g = group(prime.s(i) for i in xrange(10, 100000))
    res = g.apply_async()

When I call apply_async(), the results appear on the worker's terminal very quickly:

    [2013-06-20 16:34:56,238: INFO/MainProcess] Task proj.tasks.do_work[989be06b-c4f3-4876-9311-2f5f813857d5] succeeded in 0.0166230201721s: 99640324
    [2013-06-20 16:34:56,241: INFO/MainProcess] Task proj.tasks.do_work[6eaa9b85-7ba2-4397-b6ae-cbb5668633d4] succeeded in 0.0123620033264s: 99740169
    [2013-06-20 16:34:56,242: INFO/MainProcess] Task proj.tasks.do_work[1f5f6302-94a3-4937-9914-14690d856a5d] succeeded in 0.00850105285645s: 99780121
    [2013-06-20 16:34:56,244: INFO/MainProcess] Task proj.tasks.do_work[b3735842-a49c-48a3-8a9e-fab24c0a6c23] succeeded in 0.0102620124817s: 99820081
    [2013-06-20 16:34:56,245: INFO/MainProcess] Task proj.tasks.do_work[98eec31a-52eb-4752-92af-6956c0e6f130] succeeded in 0.00973200798035s: 99880036
    [2013-06-20 16:34:56,245: INFO/MainProcess] Task proj.tasks.do_work[011a1e99-b307-480b-9765-b1a472dbfa8c] succeeded in 0.0115168094635s: 99800100
    [2013-06-20 16:34:56,245: INFO/MainProcess] Task proj.tasks.do_work[f3e3a89f-de79-4ab0-aab7-0a71fe2ab2f7] succeeded in 0.010409116745s: 99840064
    [2013-06-20 16:34:56,246: INFO/MainProcess] Task proj.tasks.do_work[61baef04-03c2-4810-bf6a-ae7aa75b80b4] succeeded in 0.0112910270691s: 99860049

But when I try to collect the results with

    res.get()

it runs very slowly, much slower than the serial version. What is the problem? Is retrieving results from a Celery group inherently slow? How can I solve this?
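
For reference, a side-by-side timing harness makes the comparison concrete. This is my sketch, assuming the prime task is importable from proj.tasks as the worker log suggests:

    import time
    from celery import group
    from proj.tasks import prime  # assumed task module, per the worker log

    t0 = time.time()
    serial = [prime(i) for i in xrange(10, 100000)]   # direct calls run in-process
    print('serial:   %.1fs' % (time.time() - t0))

    t0 = time.time()
    res = group(prime.s(i) for i in xrange(10, 100000)).apply_async()
    parallel = res.get()                              # blocks until all results arrive
    print('parallel: %.1fs' % (time.time() - t0))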

Upvotes: 2

Views: 3593

Answers (1)

Artem Mezhenin

Reputation: 5757

If you time the res.get() operation, you'll notice (I hope this holds) that it always takes about 500 ms. That's because AsyncResult.get has to poll for the result every N milliseconds. You can adjust this by passing the additional interval parameter to get:

    res.get(interval=0.005)
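
To make the effect visible, you can time a single get() with the default interval against a tuned one. A sketch, assuming the prime task from the question (in Celery 3.x the default is interval=0.5):

    import time
    from proj.tasks import prime  # assumed task module, per the worker log

    res = prime.delay(100000)
    t0 = time.time()
    res.get()                  # polls every 0.5 s by default
    print('default interval: %.3fs' % (time.time() - t0))

    res = prime.delay(100000)
    t0 = time.time()
    res.get(interval=0.005)    # poll every 5 ms instead
    print('tuned interval:   %.3fs' % (time.time() - t0))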

You can find more information in the documentation and the source, and in my own question on the topic. Be warned: Celery is not the best solution for RPC-like communication, because polling for results causes a big performance hit.

Upvotes: 5
