Reputation: 132
This is my full trace:
Traceback (most recent call last):
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/app/trace.py", line 283, in trace_task
    uuid, retval, SUCCESS, request=task_request,
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/base.py", line 256, in store_result
    request=request, **kwargs)
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/base.py", line 490, in _store_result
    self.set(self.get_key_for_task(task_id), self.encode(meta))
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/redis.py", line 160, in set
    return self.ensure(self._set, (key, value), **retry_policy)
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/redis.py", line 149, in ensure
    **retry_policy
  File "/home/server/backend/venv/lib/python3.4/site-packages/kombu/utils/__init__.py", line 243, in retry_over_time
    return fun(*args, **kwargs)
  File "/home/server/backend/venv/lib/python3.4/site-packages/celery/backends/redis.py", line 169, in _set
    pipe.execute()
  File "/home/server/backend/venv/lib/python3.4/site-packages/redis/client.py", line 2593, in execute
    return execute(conn, stack, raise_on_error)
  File "/home/server/backend/venv/lib/python3.4/site-packages/redis/client.py", line 2447, in _execute_transaction
    connection.send_packed_command(all_cmds)
  File "/home/server/backend/venv/lib/python3.4/site-packages/redis/connection.py", line 532, in send_packed_command
    self.connect()
  File "/home/server/backend/venv/lib/python3.4/site-packages/redis/connection.py", line 436, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 0 connecting to localhost:6379. Error.
[2016-09-21 10:47:18,814: WARNING/Worker-747] Data collector is not contactable. This can be because of a network issue or because of the data collector being restarted. In the event that contact cannot be made after a period of time then please report this problem to New Relic support for further investigation. The error raised was ConnectionError(ProtocolError('Connection aborted.', BlockingIOError(11, 'Resource temporarily unavailable')),).
I searched for this ConnectionError, but none of the existing questions matched my problem.
My platform is Ubuntu 14.04. Below is the relevant part of my Redis config. (I can share the whole redis.conf file if needed. By the way, all the parameters in the LIMITS section are commented out.)
# By default Redis listens for connections from all the network interfaces
# available on the server. It is possible to listen to just one or multiple
# interfaces using the "bind" configuration directive, followed by one or
# more IP addresses.
#
# Examples:
#
# bind 192.168.1.100 10.0.0.1
bind 127.0.0.1

# Specify the path for the unix socket that will be used to listen for
# incoming connections. There is no default, so Redis will not listen
# on a unix socket when not specified.
#
# unixsocket /var/run/redis/redis.sock
# unixsocketperm 755

# Close the connection after a client is idle for N seconds (0 to disable)
timeout 0

# TCP keepalive.
#
# If non-zero, use SO_KEEPALIVE to send TCP ACKs to clients in absence
# of communication. This is useful for two reasons:
#
# 1) Detect dead peers.
# 2) Take the connection alive from the point of view of network
#    equipment in the middle.
#
# On Linux, the specified value (in seconds) is the period used to send ACKs.
# Note that to close the connection the double of the time is needed.
# On other kernels the period depends on the kernel configuration.
#
# A reasonable value for this option is 60 seconds.
tcp-keepalive 60
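Note that tcp-keepalive 60 only enables keepalive on the server side. The redis-py client can do the same on its end; a minimal sketch, assuming the pool is otherwise built as in the wrapper below (socket_keepalive and socket_timeout are real redis-py 2.10 options, the values here are illustrative):

import redis

# Sketch only: redis-py (since 2.10) passes these socket options through the
# pool to every connection it creates. Host/port stand in for
# settings.REDIS_HOST / settings.REDIS_PORT from the wrapper below.
pool = redis.ConnectionPool(
    host='127.0.0.1',
    port=6379,
    socket_keepalive=True,  # client-side counterpart of tcp-keepalive
    socket_timeout=5,       # seconds; dead connections fail fast instead of hanging
)
rs = redis.Redis(connection_pool=pool)
rs.ping()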
This is my mini Redis wrapper:
import redis
from django.conf import settings

# One module-level pool, created at import time and shared by every caller
REDIS_POOL = redis.ConnectionPool(host=settings.REDIS_HOST, port=settings.REDIS_PORT)


def get_redis_server():
    return redis.Redis(connection_pool=REDIS_POOL)
And this is how I use it:
from redis_wrapper import get_redis_server

# view and task run in different, independent processes

def sample_view(request):
    rs = get_redis_server()
    # some get-set stuff with redis

@shared_task
def sample_celery_task():
    rs = get_redis_server()
    # some get-set stuff with redis
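For what it's worth, one common pitfall with this pattern: the module-level pool is created when the Celery master process imports the module, and the forked worker children then inherit its open sockets, which can corrupt connections over time. A minimal fork-aware sketch of the wrapper (the PID check and the _pool/_pool_pid names are illustrative, not from the original post):

import os
import redis
from django.conf import settings

_pool = None
_pool_pid = None

def get_redis_server():
    """Return a Redis client backed by a pool owned by the current process.

    If the process has forked since the pool was created (e.g. a Celery
    worker child inheriting the parent's module state), build a fresh pool
    so sockets are never shared across processes.
    """
    global _pool, _pool_pid
    if _pool is None or _pool_pid != os.getpid():
        _pool = redis.ConnectionPool(host=settings.REDIS_HOST,
                                     port=settings.REDIS_PORT)
        _pool_pid = os.getpid()
    return redis.Redis(connection_pool=_pool)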
Package versions:
celery==3.1.18
django-celery==3.1.16
kombu==3.0.26
redis==2.10.3
So the problem is this: the connection error starts occurring some time after the Celery workers start. Once it first appears, every task fails with this error until I restart all of my Celery workers. (Interestingly, Celery Flower also fails during that problematic period.)
I suspect my Redis connection pool usage, my Redis configuration, or (less likely) network issues. Any ideas about the cause? What am I doing wrong?
(PS: I will add the redis-cli info results the next time I see this error.)
UPDATE:
I temporarily solved this problem by adding the --maxtasksperchild parameter to my worker start command, set to 200. Of course this is not the proper fix, just a symptomatic cure: it recycles each worker instance periodically (the old process is closed and a new one created once it has handled 200 tasks), which also refreshes my global Redis pool and connections. So I think I should focus on how I use the global Redis connection pool, and I'm still waiting for new ideas and comments.
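For reference, the same limit can also be set in Django settings instead of on the command line; in Celery 3.1 the setting name is CELERYD_MAX_TASKS_PER_CHILD:

# settings.py -- equivalent to passing --maxtasksperchild=200 on the
# command line; each worker child is replaced after 200 tasks.
CELERYD_MAX_TASKS_PER_CHILD = 200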
Sorry for my bad English and thanks in advance.
Upvotes: 3
Views: 2731
Reputation: 349
Have you enabled the RDB background save method in Redis? If so, check the size of the dump.rdb file in /var/lib/redis. Sometimes the file grows until it fills the root partition, and the Redis instance can no longer save to it.
You can tell Redis to stop rejecting writes when a background save fails by issuing the following command in redis-cli:
config set stop-writes-on-bgsave-error no
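To verify whether a failed background save is actually the cause, you can inspect the persistence section of INFO; a quick sketch using redis-py (reusing get_redis_server from the question):

from redis_wrapper import get_redis_server

# Check the last background-save status reported by Redis; 'err' here
# means bgsave is failing (e.g. disk full) and writes may be rejected.
rs = get_redis_server()
info = rs.info('persistence')
print(info['rdb_last_bgsave_status'])       # 'ok' or 'err'
print(info['rdb_changes_since_last_save'])  # writes not yet on disk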
Upvotes: -1