dengar81

Reputation: 2525

CeleryD seems to ignore concurrency argument

I've recently upgraded my Django project to Celery 4.4.6 and things aren't going well. My number one problem is task concurrency. Because the tasks lock database tables and some are very memory intensive, there is no way I can run eight tasks at the same time, and I only have a two-processor machine available. Yet that is exactly what Celery is doing.

Previously I was able to run just two tasks at the same time.

The worker is daemonised and only one worker is live (one node). I set the concurrency to two. Here's my /etc/default/celeryd:

#   most people will only start one node:
CELERYD_NODES="worker1"
#   but you can also start multiple and configure settings
#   for each in CELERYD_OPTS
#CELERYD_NODES="worker1 worker2 worker3"
#   alternatively, you can specify the number of nodes to start:
#CELERYD_NODES=3

# Absolute or relative path to the 'celery' command:
CELERY_BIN="/home/ubuntu/dev/bin/python -m celery"
#CELERY_BIN="/virtualenvs/def/bin/celery"

# App instance to use
# comment out this line if you don't use an app
CELERY_APP="match2"
# or fully qualified:
#CELERY_APP="proj.tasks:app"

# Where to chdir at start.
export DJANGO_SETTINGS_MODULE="match2.settings"
CELERYD_CHDIR="/home/ubuntu/dev/match2/match2"

# Extra command-line arguments to the worker
CELERYD_OPTS="--concurrency=2"
# Configure node-specific settings by appending node name to arguments:
#CELERYD_OPTS="--time-limit=300 -c 8 -c:worker2 4 -c:worker3 2 -Ofair:worker1"

# Set logging level
CELERYD_LOG_LEVEL="INFO"

# %n will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/%n%I.log"
CELERYD_PID_FILE="/var/run/celery/%n.pid"

# Workers should run as an unprivileged user.
#   You need to create this user manually (or you can choose
#   a user/group combination that already exists, e.g., nobody).
CELERYD_USER="ubuntu"
CELERYD_GROUP="users"

# If enabled, pid and log directories will be created if missing,
# and owned by the userid/group configured.
CELERY_CREATE_DIRS=1

I was very much under the assumption that this line would control how many tasks can be executed concurrently:

CELERYD_OPTS="--concurrency=2"

But the worker still seems to pick up eight items at a time from the RabbitMQ message queue.

Any help appreciated.

Upvotes: 0

Views: 364

Answers (1)

dengar81

Reputation: 2525

So after some back and forth and joining the Google Group, I finally got an answer:

If you want Celery to behave like a good little worker and not take on another task before finishing the old one, you need both of the following in your settings file:

task_acks_late = True
worker_prefetch_multiplier = 1
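For context on the "eight items" in the question: a worker reserves worker_prefetch_multiplier × concurrency messages at once, and Celery's default multiplier is 4. A minimal sketch of that arithmetic, assuming those defaults:

```python
# Why eight tasks were pulled off the queue at once: Celery reserves
# worker_prefetch_multiplier * concurrency messages per worker.
concurrency = 2          # from CELERYD_OPTS="--concurrency=2"
prefetch_multiplier = 4  # Celery's default worker_prefetch_multiplier
print(concurrency * prefetch_multiplier)  # 8 messages reserved at once
```

Setting worker_prefetch_multiplier = 1 drops that reservation to 2, and task_acks_late keeps a task on the queue until it actually finishes.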

If you then use this in a Django project with the old-style uppercase settings (see: https://docs.celeryproject.org/en/stable/userguide/configuration.html#new-lowercase-settings), these become:

CELERY_WORKER_PREFETCH_MULTIPLIER = 1
CELERY_TASK_ACKS_LATE = True
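The CELERY_ prefix only works because the Django integration reads the settings with a namespace. A sketch of the usual celery.py, assuming the app is called "match2" as in the question:

```python
# match2/celery.py (sketch, assuming the project layout from the question).
# With namespace="CELERY", Celery reads any Django setting prefixed with
# CELERY_, which is how the lowercase task_acks_late becomes
# CELERY_TASK_ACKS_LATE in settings.py.
import os

from celery import Celery

os.environ.setdefault("DJANGO_SETTINGS_MODULE", "match2.settings")

app = Celery("match2")
app.config_from_object("django.conf:settings", namespace="CELERY")
app.autodiscover_tasks()
```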

Upvotes: 1
