Shan

Reputation: 1054

Python requests Session fails to read a response after first reading a large (more than 50 MB) response body

I am using a requests Session object to access a REST API. When the first request returns large content (more than 50 MB), the next HTTP request on the same Session object hangs. If I don't use a Session object, everything works fine. The code is below.

import requests       # version 2.3.0  # python version 2.7

headers = {"Authorization":"Bearer sometoken"}

sess = requests.Session()
sess.verify = False
host = "https://somehost/endpoint/"
res = sess.get(url = host+'obj1/28/content', headers = headers)
print res  # this result received successfully with 200 response status code

url = host + 'obj2/1/content'
res = sess.get(url = url, headers=headers)  # hangs here indefinitely; I have to kill the process to exit
print "content ", res.content # this line never gets executed...

Stack trace after killing the process:

  File "/opt/lib/python2.7/site-packages/requests/sessions.py", line 556, in send
    r = adapter.send(request, **kwargs)
  File "/opt/lib/python2.7/site-packages/requests/adapters.py", line 391, in send
    r.content
  File "/opt/lib/python2.7/site-packages/requests/models.py", line 690, in content
    self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
  File "/opt/lib/python2.7/site-packages/requests/models.py", line 628, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/opt/lib/python2.7/site-packages/requests/packages/urllib3/response.py", line 240, in stream
    data = self.read(amt=amt, decode_content=decode_content)
  File "/opt/lib/python2.7/site-packages/requests/packages/urllib3/response.py", line 187, in read
    data = self._fp.read(amt)
  File "/opt/lib/python2.7/httplib.py", line 567, in read
    s = self.fp.read(amt)
  File "/opt/lib/python2.7/httplib.py", line 1313, in read
    return s + self._file.read(amt - len(s))
  File "/opt/lib/python2.7/socket.py", line 380, in read
    data = self._sock.recv(left)
  File "/opt/lib/python2.7/ssl.py", line 242, in recv
    return self.read(buflen)
  File "/opt/lib/python2.7/ssl.py", line 161, in read
    return self._sslobj.read(len)

But the same HTTP requests without a Session object work fine.

print requests.get( host+'obj1/28/content', headers = headers, verify = False)
print requests.get( host+'obj2/1/content', headers = headers, verify = False)

Upvotes: 1

Views: 3878

Answers (1)

Patrick Collins

Reputation: 10574

From the requests docs:

Excellent news — thanks to urllib3, keep-alive is 100% automatic within a session! Any requests that you make within a session will automatically reuse the appropriate connection!

Note that connections are only released back to the pool for reuse once all body data has been read; be sure to either set stream to False or read the content property of the Response object.

Sounds like the large request is holding up that connection, or, as abarnert suggests, there's an issue with the server. Try setting stream=False, or access the content of that first res object so that requests knows that it can free up that connection.
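A minimal sketch of that suggestion: read each response's content before issuing the next request on the same session, so the connection can go back to the pool. The helper name fetch_sequentially and the timeout value are illustrative assumptions, not part of the original post. The timeout is only there so that a stalled read should raise an exception instead of hanging forever.

```python
def fetch_sequentially(session, urls, headers=None, timeout=30):
    """Fetch each URL on one session, draining every body before the next call.

    `session` is expected to behave like requests.Session. The helper name
    and the timeout default are illustrative assumptions.
    """
    bodies = []
    for url in urls:
        res = session.get(url, headers=headers, stream=False, timeout=timeout)
        # Accessing .content forces the whole body to be read, which is what
        # allows the pooled connection to be reused by the next request.
        bodies.append(res.content)
    return bodies
```

With the setup from the question, this would be called as fetch_sequentially(sess, [host + 'obj1/28/content', host + 'obj2/1/content'], headers=headers).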

EDIT: This looks like the issue. When you call requests.get, you set verify=False explicitly, which overrides the default of verify=True, so certificate verification is definitely off for those calls.

However, your lockup is in adapter.send(request, **kwargs). So it looks like an HTTPAdapter object is at fault. adapter.send has the following signature:

 send(request, stream=False, timeout=None, verify=True, cert=None, proxies=None)

with verify=True as the default.

This sounds like a bug in requests, but my guess is that the verify parameter isn't getting passed down from the Session. The signature for sess.request is:

request(method, url, params=None, data=None, headers=None, cookies=None, files=None, auth=None, timeout=None, allow_redirects=True, proxies=None, hooks=None, stream=None, verify=None, cert=None)

where verify=None rather than False, so maybe that means that it's getting overridden somewhere.

Try explicitly setting verify=False in sess.get.
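As a sketch of that last suggestion, the wrapper below passes verify=False on every call instead of relying on the session attribute alone. The name get_unverified is hypothetical, and whether this actually avoids the hang is a guess that has not been tested here.

```python
def get_unverified(session, url, **kwargs):
    """Call session.get with verify=False set explicitly on the request.

    Hypothetical helper; `session` should behave like requests.Session.
    """
    # Set the per-request value so it does not depend on session.verify.
    kwargs["verify"] = False
    return session.get(url, **kwargs)
```

In the question's code this would replace the second call: res = get_unverified(sess, url, headers=headers).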

Upvotes: 2
