Andrii

Reputation: 770

Start SQS celery worker on Elastic Beanstalk

I am trying to start a celery worker on EB but get an error which doesn't explain much.

The command in my config file in the .ebextensions directory:

03_celery_worker:
  command: "celery worker --app=config --loglevel=info -E --workdir=/opt/python/current/app/my_project/"

The same command works fine on my local machine (with only the workdir parameter changed).

Errors from EB:

Activity execution failed, because: /opt/python/run/venv/local/lib/python3.6/site-packages/celery/platforms.py:796: RuntimeWarning: You're running the worker with superuser privileges: this is absolutely not recommended!

and

Starting new HTTPS connection (1): eu-west-1.queue.amazonaws.com (ElasticBeanstalk::ExternalInvocationError)

I added the --uid=2 parameter to the celery worker command and the privileges warning disappeared, but the command still fails with

ExternalInvocationError

Any suggestions as to what I am doing wrong?

Upvotes: 0

Views: 1448

Answers (1)

Andrii

Reputation: 770

ExternalInvocationError

As I understand it, this means that the listed command cannot be run from EB container commands. You need to create a script on the server and run celery from that script. This post describes how to do it.

Update: You need to create a config file in the .ebextensions directory; I called mine celery.config. The post linked above provides a script that works almost perfectly, but it needs a few minor additions to work completely correctly. I had issues with scheduling periodic tasks (celery beat). Below are the steps to fix them:

  1. Install django-celery-beat (pip install django-celery-beat) and add it to your requirements, add it to the installed apps, and use the --scheduler parameter when starting celery beat. Instructions are here.

  2. The script specifies the user that runs each program. For the celery worker it is the celery user, which the script creates earlier if it doesn't exist. When I tried to start celery beat I got a PermissionDenied error, which means the celery user doesn't have all the necessary rights. I logged in to the EB instance over ssh, looked at the list of all users (cat /etc/passwd), and decided to use the daemon user.
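The user lookup in step 2 can be scripted. The following is a minimal sketch, not part of the original script: the helper names list_users and user_exists are hypothetical, and daemon is simply the account that happened to work on my instance, so check what exists on yours.

```bash
#!/usr/bin/env bash

# Print the login names defined in /etc/passwd (first colon-separated field).
list_users() {
  cut -d: -f1 /etc/passwd
}

# Succeed (exit status 0) if the named account exists on this machine.
user_exists() {
  id -u "$1" >/dev/null 2>&1
}

# Account chosen in the answer; this is an assumption about your platform.
candidate="daemon"

if user_exists "$candidate"; then
  echo "$candidate exists and can be used in the supervisor 'user=' setting"
else
  echo "$candidate not found; choose another account from:"
  list_users
fi
```

Whether a given account also has the file permissions celery beat needs is a separate question; this only confirms that the account exists.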

These steps resolved the celery beat errors. The updated config file with the script (celery.config) is below:

files:
  "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh":
    mode: "000755"
    owner: root
    group: root
    content: |
      #!/usr/bin/env bash

      # Create required directories
      sudo mkdir -p /var/log/celery/
      sudo mkdir -p /var/run/celery/

      # Create group called 'celery'
      sudo groupadd -f celery
      # add the user 'celery' if it doesn't exist and add it to the group with same name
      id -u celery &>/dev/null || sudo useradd -g celery celery
      # add permissions to the celery user for r+w to the folders just created
      sudo chown -R celery:celery /var/log/celery/
      sudo chown -R celery:celery /var/run/celery/

      # Get django environment variables
      celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/%/%%/g' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g'`
      celeryenv=${celeryenv%?}

      # Create CELERY configuration script
      celeryconf="[program:celeryd]
      directory=/opt/python/current/app
      ; Set full path to celery program if using virtualenv
      command=/opt/python/run/venv/bin/celery worker -A config.celery:app --loglevel=INFO --logfile=\"/var/log/celery/%%n%%I.log\" --pidfile=\"/var/run/celery/%%n.pid\"

      user=celery
      numprocs=1
      stdout_logfile=/var/log/celery-worker.log
      stderr_logfile=/var/log/celery-worker.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 60

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=998

      environment=$celeryenv"


      # Create CELERY BEAT configuration script
      celerybeatconf="[program:celerybeat]
      ; Set full path to celery program if using virtualenv
      command=/opt/python/run/venv/bin/celery beat -A config.celery:app --loglevel=INFO --scheduler django_celery_beat.schedulers:DatabaseScheduler --logfile=\"/var/log/celery/celery-beat.log\" --pidfile=\"/var/run/celery/celery-beat.pid\"

      directory=/opt/python/current/app
      user=daemon
      numprocs=1
      stdout_logfile=/var/log/celerybeat.log
      stderr_logfile=/var/log/celerybeat.log
      autostart=true
      autorestart=true
      startsecs=10

      ; Need to wait for currently executing tasks to finish at shutdown.
      ; Increase this if you have very long running tasks.
      stopwaitsecs = 60

      ; When resorting to send SIGKILL to the program to terminate it
      ; send SIGKILL to its whole process group instead,
      ; taking care of its children as well.
      killasgroup=true

      ; if rabbitmq is supervised, set its priority higher
      ; so it starts first
      priority=999

      environment=$celeryenv"

      # Create the celery supervisord conf script
      echo "$celeryconf" | tee /opt/python/etc/celery.conf
      echo "$celerybeatconf" | tee /opt/python/etc/celerybeat.conf

      # Add configuration script to supervisord conf (if not there already).
      # No -x flag here: "celery.conf" is only part of the "files:" line,
      # so a whole-line match would never succeed.
      if ! grep -Fq "celery.conf" /opt/python/etc/supervisord.conf
        then
          echo "[include]" | tee -a /opt/python/etc/supervisord.conf
          echo "files: uwsgi.conf celery.conf celerybeat.conf" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Enable supervisor to listen for HTTP/XML-RPC requests.
      # supervisorctl will use XML-RPC to communicate with supervisord over port 9001.
      # Source: https://askubuntu.com/questions/911994/supervisorctl-3-3-1-http-localhost9001-refused-connection
      if ! grep -Fxq "[inet_http_server]" /opt/python/etc/supervisord.conf
        then
          echo "[inet_http_server]" | tee -a /opt/python/etc/supervisord.conf
          echo "port = 127.0.0.1:9001" | tee -a /opt/python/etc/supervisord.conf
      fi

      # Reread the supervisord config
      supervisorctl -c /opt/python/etc/supervisord.conf reread

      # Update supervisord in cache without restarting all services
      supervisorctl -c /opt/python/etc/supervisord.conf update

      # Start/Restart celeryd through supervisord
      supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd
      supervisorctl -c /opt/python/etc/supervisord.conf restart celerybeat

commands:
  01_killotherbeats:
    command: "ps auxww | grep 'celery beat' | awk '{print $2}' | sudo xargs kill -9 || true"
    ignoreErrors: true
  02_restartbeat:
    command: "supervisorctl -c /opt/python/etc/supervisord.conf restart celerybeat"
    leader_only: true
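To see what the celeryenv pipeline in the script produces, here is a small sketch that runs the same tr and sed steps against a temporary file instead of /opt/python/current/env. The two sample variables are made up for illustration; the real file on your instance will contain different entries.

```bash
#!/usr/bin/env bash

# Hypothetical sample standing in for /opt/python/current/env.
envfile="$(mktemp)"
cat > "$envfile" <<'EOF'
export DJANGO_SETTINGS_MODULE="config.settings"
export PATH="/opt/python/run/venv/bin/:$PATH"
EOF

# Same steps as in the script:
#   join lines with commas, escape % for supervisor, drop "export ",
#   map $PATH to supervisor's %(ENV_PATH)s, and remove the other
#   shell-style variable references.
celeryenv="$(tr '\n' ',' < "$envfile" \
  | sed 's/%/%%/g' \
  | sed 's/export //g' \
  | sed 's/$PATH/%(ENV_PATH)s/g' \
  | sed 's/$PYTHONPATH//g' \
  | sed 's/$LD_LIBRARY_PATH//g')"

# Strip the trailing comma left by the last newline.
celeryenv="${celeryenv%?}"

echo "$celeryenv"
rm -f "$envfile"
```

With this sample input the result should be a single comma-separated line, DJANGO_SETTINGS_MODULE="config.settings",PATH="/opt/python/run/venv/bin/:%(ENV_PATH)s", which is the form supervisor expects for its environment= setting.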

One thing to note: in my project the celery.py file is in the config directory, which is why I pass -A config.celery:app when starting the celery worker and celery beat.
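The script runs on every deploy, so it guards its appends to supervisord.conf with grep to avoid adding the same section twice. A minimal sketch of that append-if-absent pattern follows. It works on a temporary file, and the function name append_section_once is hypothetical, not something from the original script.

```bash
#!/usr/bin/env bash

# Hypothetical stand-in for /opt/python/etc/supervisord.conf.
conf="$(mktemp)"
printf '[supervisord]\nlogfile=/tmp/supervisord.log\n' > "$conf"

# Append a section only if its header line is not already in the file.
#   -F  treat the pattern as a fixed string (so "[" is not a regex bracket)
#   -x  match only if the whole line equals the pattern
#   -q  print nothing, report the result through the exit status
append_section_once() {
  local file="$1" header="$2" body="$3"
  if ! grep -Fxq "$header" "$file"; then
    printf '%s\n%s\n' "$header" "$body" >> "$file"
  fi
}

# Calling it twice simulates two deploys; the section is written once.
append_section_once "$conf" "[inet_http_server]" "port = 127.0.0.1:9001"
append_section_once "$conf" "[inet_http_server]" "port = 127.0.0.1:9001"

count="$(grep -Fxc "[inet_http_server]" "$conf")"
echo "header occurrences: $count"
rm -f "$conf"
```

Because -x requires a whole-line match, the pattern has to be a complete line such as a section header. A fragment of a longer line will never match with -x.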

Upvotes: 1
