I am getting a WorkerLostError from my worker when calling a Celery task-decorated function with the delay method (do_stuff.delay()) in my web application.
Setup: Django 1.5, Celery 3.1.9, Billiard 3.3.0.16, Heroku, CloudAMQP, and Psycopg2 2.5.1.
Heroku logs
app[worker.1]: [2014-03-04 10:29:00,850: INFO/MainProcess] Received task: my_app.tasks.do_stuff[622fe86c-e5f8-49e5-a5f6-2bde446b709f]
app[cloudamqp]: sample#messages.publish=26 sample#messages.deliver_get=26 sample#messages.deliver_no_ack=26
app[worker.1]: [2014-03-04 10:29:18,939: ERROR/MainProcess] Process 'Worker-3' pid:12 exited with exitcode -11
app[worker.1]: [2014-03-04 10:29:18,976: ERROR/MainProcess] Task my_app.tasks.do_stuff[622fe86c-e5f8-49e5-a5f6-2bde446b709f] raised unexpected: WorkerLostError('Worker exited prematurely: signal 11 (SIGSEGV).',)
app[worker.1]: Traceback (most recent call last):
app[worker.1]: File "/app/.heroku/python/lib/python2.7/site-packages/billiard/pool.py", line 1168, in mark_as_worker_lost
app[worker.1]: human_status(exitcode)),
app[worker.1]: WorkerLostError: Worker exited prematurely: signal 11 (SIGSEGV).
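For reference, the `exitcode -11` in the log follows the usual convention that a negative exit code means the child process was killed by that signal number. A minimal sketch of that mapping, using only the standard library (`describe_exitcode` is a hypothetical helper written for illustration, not Billiard's own function, and it is written for Python 3 rather than the Python 2.7 shown in the logs):

```python
import signal


def describe_exitcode(exitcode):
    """Return a readable description of a child process exit code."""
    if exitcode < 0:
        # A negative exit code means the process was terminated by a signal.
        signum = -exitcode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown"
        return "signal {} ({})".format(signum, name)
    return "exitcode {}".format(exitcode)


print(describe_exitcode(-11))  # on Linux: signal 11 (SIGSEGV)
```

A SIGSEGV points at a crash in native code inside the worker process rather than at a Python exception, which is consistent with the suggestion to look at C extensions such as Psycopg2.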
Then I read this; the last person indicated that upgrading Psycopg2 could fix the issue, so I upgraded from Psycopg2 2.5.1 to 2.5.2. The logs then showed a different error:
app[worker.1]: [2014-03-04 11:25:51,678: INFO/MainProcess] Received task: my_app.tasks.do_stuff[fef47731-3844-4f16-bbfc-0c5c7fd8f8ef]
app[worker.1]: [2014-03-04 11:25:51,705: ERROR/MainProcess] Task my_app.tasks.do_stuff[fef47731-3844-4f16-bbfc-0c5c7fd8f8ef] raised unexpected: OperationalError('could not connect to server: No such file or directory\n\tIs the server running locally and accepting\n\tconnections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?\n',)
app[worker.1]: Traceback (most recent call last):
app[worker.1]: File "/app/.heroku/python/lib/python2.7/site-packages/celery/app/trace.py", line 238, in trace_task
app[worker.1]: R = retval = fun(*args, **kwargs)
app[worker.1]: File "/app/.heroku/python/lib/python2.7/site-packages/celery/app/trace.py", line 416, in __protected_call__
app[worker.1]: return self.run(*args, **kwargs)
app[worker.1]: File "/app/my_project/my_app/tasks.py", line 24, in do_stuff
app[worker.1]: user = User.objects.get(email=email)
app[worker.1]: File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/manager.py", line 143, in get
app[worker.1]: return self.get_query_set().get(*args, **kwargs)
app[worker.1]: File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/query.py", line 382, in get
app[worker.1]: num = len(clone)
app[worker.1]: File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/query.py", line 90, in __len__
app[worker.1]: self._result_cache = list(self.iterator())
app[worker.1]: File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/query.py", line 301, in iterator
app[worker.1]: for row in compiler.results_iter():
app[worker.1]: File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 775, in results_iter
app[worker.1]: for rows in self.execute_sql(MULTI):
app[worker.1]: File "/app/.heroku/python/lib/python2.7/site-packages/django/db/models/sql/compiler.py", line 839, in execute_sql
app[worker.1]: cursor = self.connection.cursor()
app[worker.1]: File "/app/.heroku/python/lib/python2.7/site-packages/django/db/backends/__init__.py", line 326, in cursor
app[worker.1]: cursor = util.CursorWrapper(self._cursor(), self)
app[worker.1]: File "/app/.heroku/python/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 182, in _cursor
app[worker.1]: self.connection = Database.connect(**conn_params)
app[worker.1]: File "/app/.heroku/python/lib/python2.7/site-packages/psycopg2/__init__.py", line 164, in connect
app[worker.1]: conn = _connect(dsn, connection_factory=connection_factory, async=async)
app[worker.1]: OperationalError: could not connect to server: No such file or directory
app[worker.1]: Is the server running locally and accepting
app[worker.1]: connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
app[worker.1]:
Then I looked up this error, but my database stores and retrieves entries fine in my production environment. I only get this problem when the database is used from a Celery worker in production. I also checked this article, and my DATABASE_URL and HEROKU_POSTGRESQL_SILVER_URL are correct and identical.
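A check that might narrow this down: the Unix domain socket path in the error suggests the worker tried to connect without any host, which is what happens when the database HOST setting is empty. The sketch below, under that assumption, shows how a database URL maps to a connection target. The helper names `db_settings_from_url` and `connection_target` and the example URL are hypothetical, and the code is written for Python 3 rather than the Python 2.7 in the logs.

```python
from urllib.parse import urlparse


def db_settings_from_url(url):
    """Split a postgres:// URL into Django-style connection settings."""
    parts = urlparse(url)
    return {
        "NAME": parts.path.lstrip("/"),
        "USER": parts.username or "",
        "PASSWORD": parts.password or "",
        "HOST": parts.hostname or "",
        "PORT": str(parts.port or ""),
    }


def connection_target(settings_dict):
    """Describe where a connection would go for the given settings."""
    host = settings_dict.get("HOST", "")
    # An empty host, or a host that is a filesystem path, means a local
    # Unix domain socket instead of a TCP connection.
    if not host or host.startswith("/"):
        return "unix socket"
    return "tcp://{}:{}".format(host, settings_dict.get("PORT") or "5432")


# Hypothetical URL, only for illustration.
example = db_settings_from_url("postgres://user:secret@db.example.com:5432/mydb")
print(connection_target(example))           # tcp://db.example.com:5432
print(connection_target({"NAME": "mydb"}))  # unix socket
```

Logging the HOST value of `settings.DATABASES['default']` from inside the task would show which of the two cases the worker actually sees.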
Worker command in Procfile
worker: celery -A inonemonth worker -l info
I spun up a dyno for the worker, by the way.
celery.py
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings
app = Celery('my_project')
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)
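For comparison, the Django example in the Celery documentation sets DJANGO_SETTINGS_MODULE before creating the app, which is presumably what the otherwise unused `import os` above was meant for. A minimal sketch of that step, assuming a hypothetical settings module name (`my_project.settings.production` is not taken from this project):

```python
import os

# Assumption: "my_project.settings.production" is a hypothetical module name;
# substitute whichever settings module holds the Heroku database config.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "my_project.settings.production")

# setdefault leaves an already exported value untouched, so a Heroku config
# var with the same name would still take precedence.
print("Settings module in use:", os.environ["DJANGO_SETTINGS_MODULE"])
```

If the worker dyno ends up loading a settings module without the production database configuration, the connection could fall back to a local socket, which would match the error above. This is only a possible cause, not a confirmed one.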
settings/base.py
# ...
# Celery works perfectly locally:
#BROKER_URL = 'amqp://localhost:5672//'
# But this fails on production:
BROKER_URL = "amqp://roorlfoj:[email protected]/roorlfoj"
BROKER_POOL_LIMIT = 1  # I've tried this with value 3 as well
my_app/tasks.py
from __future__ import absolute_import
from my_project.celery import app
from my_app.models import MyModel
@app.task
def do_stuff():
    return MyModel.objects.all()
Any idea what causes the OperationalError and how I can prevent it gracefully?