Mahmoud Abdelkader

Reputation: 24999

Intermittent issues while accessing external http services using gevent

First off, the versions:

We recently switched our servers, which run behind gunicorn, from the normal sync workers to the gevent asynchronous workers. Everything works great, except that we now see intermittent failures when accessing a third-party service over HTTP, and I have no idea how to track down the cause.

A brief stack trace looks like the following:

File "/home/deploy/.virtualenvs/bapp/lib/python2.7/site-packages/requests/sessions.py", line 295, in post
  return self.request('post', url, data=data, **kwargs)
File "/home/deploy/.virtualenvs/bapp/lib/python2.7/site-packages/requests/sessions.py", line 252, in request
  r.send(prefetch=prefetch)
File "/home/deploy/.virtualenvs/bapp/lib/python2.7/site-packages/requests/models.py", line 625, in send
  raise ConnectionError(sockerr)
ConnectionError: [Errno 66] unknown

Here is a different stack trace, which we think stems from the same issue:

File "/home/deploy/.virtualenvs/bapp/lib/python2.7/site-packages/requests/packages/urllib3/connectionpool.py", line 94, in connect
  sock = socket.create_connection((self.host, self.port), self.timeout)
File "/home/deploy/.virtualenvs/bapp/lib/python2.7/site-packages/gevent/socket.py", line 637, in create_connection
  for res in getaddrinfo(host, port, 0, SOCK_STREAM):
File "/home/deploy/.virtualenvs/bapp/lib/python2.7/site-packages/gevent/socket.py", line 769, in getaddrinfo
  raise
DNSError: [Errno 66] unknown

At first, I thought it could be related to libevent-dns, based on this Google Groups issue. I checked our /etc/resolv.conf, and only one nameserver is configured:

[me@host:~]$ cat /etc/resolv.conf
; generated by /sbin/dhclient-script
nameserver 10.3.0.2
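
To run the same check across several hosts, a small script can list the configured nameservers. This is only a sketch: parse_nameservers is a hypothetical helper name, and it reads just the "nameserver" lines, ignoring options such as search or timeout.

```python
def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses found in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments (resolv.conf accepts ';' and '#').
        if not line or line[0] in ";#":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


# Sample text copied from the output shown above.
SAMPLE = """; generated by /sbin/dhclient-script
nameserver 10.3.0.2
"""

if __name__ == "__main__":
    print(parse_nameservers(SAMPLE))
```

With a single entry there is no second resolver to fall back on, so any hiccup talking to 10.3.0.2 surfaces directly as a lookup failure.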

I looked up what errno 66 means: https://github.com/libevent/libevent/blob/master/include/event2/dns.h#L162, "/** An unknown error occurred */". That is not very helpful. It sounds like it couldn't talk to the DNS server?

I also thought it might have something to do with python-requests (see "how enable requests async mode?"), since python-requests depends on urllib3, which is implemented in terms of httplib. However, it turns out the author of gevent removed the httplib patch in this commit earlier this year, without any comment as to why.

Does anyone have any ideas on how to approach debugging this issue or might shed some light on what's happening here?
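
One way to narrow this down is to take requests out of the picture and hammer the resolver directly, counting how often a lookup fails. The sketch below is an assumption about how such a probe could look, not something taken from gevent's documentation: probe_dns is a hypothetical helper, "example.com" is a placeholder hostname, and the default resolver is the standard library's socket.getaddrinfo. To exercise gevent's resolver instead, pass gevent.socket.getaddrinfo (the function that appears in the second stack trace) as the resolver argument.

```python
import socket
import time


def probe_dns(host, port=80, attempts=100, delay=0.0,
              resolver=socket.getaddrinfo):
    """Resolve `host` repeatedly; return a list of (attempt_index, exception)."""
    failures = []
    for i in range(attempts):
        try:
            # Same argument shape as the call in the stack trace.
            resolver(host, port, 0, socket.SOCK_STREAM)
        except Exception as exc:
            # Record whatever the resolver raised so it can be inspected later.
            failures.append((i, exc))
        if delay:
            time.sleep(delay)
    return failures


if __name__ == "__main__":
    # Placeholder hostname; substitute the third-party service's host.
    failed = probe_dns("example.com", attempts=20)
    print("%d of 20 lookups failed" % len(failed))
    for index, exc in failed:
        print("attempt %d: %r" % (index, exc))
```

If the failures show up here without any HTTP traffic involved, the problem is in name resolution rather than in requests or urllib3.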

Thanks in advance!

Update - 12:50PM PDT

After some conversations on freenode, the #gevent and #gunicorn channels shed some more light on this:

#gevent

#gunicorn

Sounds like the general advice is to ditch gevent v0.13.7 and upgrade to gevent 1.0b.

I'll follow up on whether that fixes the issue. Meanwhile, if anyone can offer advice, I'd much appreciate it.

Update #2 - 4 days in production, 1:15PM PDT

Looks like the upgrade to gevent has solved this issue -- I'll add my answer and accept it if no one else chimes in, but only after a week without incidents in production.

Upvotes: 8

Views: 1231

Answers (1)

Mahmoud Abdelkader

Reputation: 24999

Upgrading to gevent 1.0b has eliminated the issue.

Upvotes: 3
