Michel Müller

Reputation: 5695

celery task registry for asynchronous tasks

I can't figure out how to define celery tasks in a modular way (i.e. not all tasks in the same file) and register them correctly for asynchronous use. I've tried all the options I can think of.

Whatever I do, I always end up with a 'KeyError' thrown by the task registry, but only when executing with apply_async. The synchronous version always works fine.

If anyone could give me a hint on what I should do to fix this, please share.

Here is a minimal example:

minimal2.task1.task

# -*- coding: utf-8 -*-
from celery import Task
from minimal2.celery_app import app
class Task1(Task):
   name = ""
   def run(self, number):
       return number / 2.0

app.tasks.register(Task1())

minimal2.task2.task

# -*- coding: utf-8 -*-
from celery import Task
from minimal2.celery_app import app

class Task2(Task):
    name = "minimal2.task2.task.Task2"
    def run(self, number):
        return number * number

app.tasks.register(Task2())

minimal2.celery_app

# -*- coding: utf-8 -*-
from celery import Celery

app = Celery('minimal', backend='amqp', broker='amqp://')
app.autodiscover_tasks(['task1', 'task2'], 'task')

minimal2/start.sh

#!/bin/bash
set -e

start_celery_service() {
    name=$1
    pid_file_path="$(pwd)/${name}.pid"
    if [ -e "${pid_file_path}" ] ; then
        kill $(cat ${pid_file_path}) && :
        sleep 3.0
        rm -f "${pid_file_path}"  # just in case the file was stale
    fi
    celery -A minimal2.celery_app.app worker -l DEBUG --pidfile=${pid_file_path} --logfile="$(pwd)/${name}.log" &
    sleep 3.0
}

prev_dir=$(pwd)
cd "$(dirname "$0")"
cd ../
rabbitmq-server &
start_celery_service "worker1"
cd $prev_dir

testing

from minimal2.task1.task import Task1
print Task1().apply(args=[], kwargs={'number':2}).get()
> 1.0
print Task1().apply_async(args=[], kwargs={'number':2}).get() # (first time: never comes back -> hitting ctrl-c)
print Task1().apply_async(args=[], kwargs={'number':2}).get() # second time
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "/usr/local/lib/python2.7/site-packages/celery/result.py", line 194, in get
     on_message=on_message,
   File "/usr/local/lib/python2.7/site-packages/celery/backends/base.py", line 470, in wait_for_pending
     return result.maybe_throw(propagate=propagate, callback=callback)
   File "/usr/local/lib/python2.7/site-packages/celery/result.py", line 299, in maybe_throw
     self.throw(value, self._to_remote_traceback(tb))
   File "/usr/local/lib/python2.7/site-packages/celery/result.py", line 292, in throw
     self.on_ready.throw(*args, **kwargs)
   File "/usr/local/lib/python2.7/site-packages/vine/promises.py", line 217, in throw
     reraise(type(exc), exc, tb)
   File "<string>", line 1, in reraise
 celery.backends.base.NotRegistered: ''

#.. same spiel with Task2:
#..
> celery.backends.base.NotRegistered: 'minimal2.task2.task.Task2'

#.. same if I do name = __name__ in Task2:
#..
> celery.backends.base.NotRegistered: 'minimal2.task2.task'

# autodiscover had no effect

I've seen the same behavior on Ubuntu inside a Docker container as well as on macOS, both with the latest celery version from PyPI:

celery report

software -> celery:4.1.0 (latentcall) kombu:4.1.0 py:2.7.13
            billiard:3.5.0.3 py-amqp:2.2.2
platform -> system:Darwin arch:64bit imp:CPython
loader   -> celery.loaders.default.Loader
settings -> transport:amqp results:disabled

Upvotes: 2

Views: 4769

Answers (1)

Farhat Nawaz

Reputation: 212

If I understand the question correctly, you can use the include argument when creating your celery app. It registers all tasks found in the modules listed in the include argument. For example:

# `app` here appears to be a Flask application object (import_name, config)
celery = Celery(app.import_name,
                broker=app.config['CELERY_BROKER_URL'],
                backend=app.config['CELERY_BROKER_URL'],
                include=['minimal.task1', 'minimal.task2'])

Edit by question poster: In addition, to get the correct import naming, the task class's name attribute needs to be set as follows:

class Task1(Task):
   name = __name__

Essentially, the value of name at registration time must exactly match the name under which the task is imported on the client side.
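
To make the name-matching point concrete, here is a toy model in plain Python. It is not Celery's implementation: ToyRegistry, NotRegistered, make_task1 and worker_execute are hypothetical stand-ins that only mimic the idea of a worker looking tasks up by the name carried in the message.

```python
class NotRegistered(KeyError):
    """Raised by the toy worker when a task name is unknown to it."""


class ToyRegistry(dict):
    """Hypothetical name-keyed registry; a stand-in, not Celery's class."""

    def register(self, task):
        self[task.name] = task


def make_task1(module_name):
    """Build a Task1-like class as if its module were imported as module_name."""

    class Task1(object):
        name = module_name  # plays the role of `name = __name__`

        def run(self, number):
            return number / 2.0

    return Task1


def worker_execute(registry, message):
    """Look the task up by the name carried in the message, then run it."""
    try:
        task = registry[message["task"]]
    except KeyError:
        raise NotRegistered(message["task"])
    return task.run(**message["kwargs"])


# Worker process: the module was imported as "minimal2.task1.task".
worker_registry = ToyRegistry()
worker_registry.register(make_task1("minimal2.task1.task")())

# Client imports the module under the same dotted path: the names match.
client_task = make_task1("minimal2.task1.task")()
matched = worker_execute(
    worker_registry, {"task": client_task.name, "kwargs": {"number": 2}}
)

# Client imports it under a different path: the lookup fails on the worker.
other_task = make_task1("task1.task")()
try:
    worker_execute(
        worker_registry, {"task": other_task.name, "kwargs": {"number": 2}}
    )
    mismatch_error = None
except NotRegistered as exc:
    mismatch_error = exc

# A purely local call never consults the registry, so it works either way.
local_result = other_task.run(number=2)
```

In this model the matched call returns 1.0, the mismatched one raises NotRegistered, and the local call succeeds regardless, which mirrors the symptom in the question where apply works but apply_async does not.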

Upvotes: 2
