Reputation: 128307
I've got some Python code that makes requests using the requests library and occasionally experiences an IncompleteRead error. I'm trying to update this code to handle this error more gracefully and would like to test that it works, so I'm wondering how to actually trigger the conditions under which IncompleteRead is raised.
I realize I can do some mocking in a unit test; I'd just like to actually reproduce the circumstances (if I can) under which this error was previously occurring and ensure my code is able to deal with it properly.
Upvotes: 1
Views: 1309
Reputation: 154715
By looking at the places where raise IncompleteRead appears at https://github.com/python/cpython/blob/v3.8.0/Lib/http/client.py, I think the standard library's http.client module (named httplib back in Python 2) raises this exception in only the following two circumstances:
- when a response body is shorter than the length claimed by its Content-Length header, or
- when a chunked response declares a chunk size larger than the number of bytes remaining in the body.
If you install Flask (pip install Flask), you can paste this into a file to create a test server you can run with endpoints that artificially create both of these circumstances:
from flask import Flask, make_response

app = Flask(__name__)


@app.route('/test')
def send_incomplete_response():
    response = make_response('fourteen chars')
    response.headers['Content-Length'] = '10000'
    return response


@app.route('/test_chunked')
def send_chunked_response_with_wrong_sizes():
    # Example response based on
    # https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Transfer-Encoding
    # but with the stated size of the second chunk increased to 900
    # (chunk sizes are hexadecimal, so this claims 0x900 = 2304 bytes)
    resp_text = """7\r\nMozilla\r\n900\r\nDeveloper\r\n7\r\nNetwork\r\n0\r\n\r\n"""
    response = make_response(resp_text)
    response.headers['Transfer-Encoding'] = 'chunked'
    return response


app.run()
and then test them with http.client:
>>> import http.client
>>>
>>> conn = http.client.HTTPConnection('localhost', 5000)
>>> conn.request('GET', '/test')
>>> response = conn.getresponse()
>>> response.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/http/client.py", line 467, in read
    s = self._safe_read(self.length)
  File "/usr/lib/python3.8/http/client.py", line 610, in _safe_read
    raise IncompleteRead(data, amt-len(data))
http.client.IncompleteRead: IncompleteRead(14 bytes read, 9986 more expected)
>>>
>>> conn = http.client.HTTPConnection('localhost', 5000)
>>> conn.request('GET', '/test_chunked')
>>> response = conn.getresponse()
>>> response.read()
Traceback (most recent call last):
  File "/usr/lib/python3.8/http/client.py", line 571, in _readall_chunked
    value.append(self._safe_read(chunk_left))
  File "/usr/lib/python3.8/http/client.py", line 610, in _safe_read
    raise IncompleteRead(data, amt-len(data))
http.client.IncompleteRead: IncompleteRead(28 bytes read, 2276 more expected)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/http/client.py", line 461, in read
    return self._readall_chunked()
  File "/usr/lib/python3.8/http/client.py", line 575, in _readall_chunked
    raise IncompleteRead(b''.join(value))
http.client.IncompleteRead: IncompleteRead(7 bytes read)
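If you'd rather not depend on Flask for this, the same two malformed responses can be served from a throwaway socket using only the standard library. This is a rough sketch, not a drop-in tool: serve_once and read_body are hypothetical helper names, and the server handles exactly one connection before shutting down.

```python
import http.client
import socket
import threading


def serve_once(raw_response: bytes) -> int:
    """Start a one-shot TCP server that sends raw_response, then closes.

    Hypothetical test helper. Returns the port it is listening on.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    def handle():
        conn, _ = listener.accept()
        with conn:
            # Read the request up to the end of its headers, then ignore it.
            data = b""
            while b"\r\n\r\n" not in data:
                chunk = conn.recv(4096)
                if not chunk:
                    break
                data += chunk
            conn.sendall(raw_response)  # send the canned, malformed reply
        listener.close()

    threading.Thread(target=handle, daemon=True).start()
    return port


def read_body(port: int, path: str = "/") -> bytes:
    """Fetch path from the local server and read the whole body."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        return conn.getresponse().read()
    finally:
        conn.close()


# Case 1: Content-Length claims 10000 bytes, but only 14 are sent.
SHORT_BODY = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 10000\r\n"
    b"\r\n"
    b"fourteen chars"
)

# Case 2: the second chunk claims 0x900 bytes but far fewer follow.
BAD_CHUNKS = (
    b"HTTP/1.1 200 OK\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
    b"7\r\nMozilla\r\n900\r\nDeveloper\r\n7\r\nNetwork\r\n0\r\n\r\n"
)

if __name__ == "__main__":
    for name, raw in [("short body", SHORT_BODY), ("bad chunks", BAD_CHUNKS)]:
        try:
            read_body(serve_once(raw))
        except http.client.IncompleteRead as exc:
            print(f"{name}: {exc!r}")
```

Because the port is chosen by the OS and the server thread cleans up after one request, a helper along these lines can be called from an automated test without a separate server process.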
In real life, the most likely reason this might happen sporadically is if a connection was closed early by the server. For example, you can also try running this Flask server, which sends a response body very slowly, with a total of 20 seconds of sleeping:
from flask import Flask, Response
from time import sleep

app = Flask(__name__)


@app.route('/test_generator')
def send_response_with_delays():
    def generate():
        yield 'foo'
        sleep(10)
        yield 'bar'
        sleep(10)
        yield 'baz'

    response = Response(generate())
    response.headers['Content-Length'] = '9'
    return response


app.run()
If you run that server in a terminal, then initiate a request to it and start reading the response like this...
>>> import http.client
>>> conn = http.client.HTTPConnection('localhost', 5000)
>>> conn.request('GET', '/test_generator')
>>> response = conn.getresponse()
>>> response.read()
... and then flick back to the terminal running your server and kill it (e.g. with CTRL-C, on Unix), then you'll see your .read() call error out with a familiar message:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/http/client.py", line 467, in read
    s = self._safe_read(self.length)
  File "/usr/lib/python3.8/http/client.py", line 610, in _safe_read
    raise IncompleteRead(data, amt-len(data))
http.client.IncompleteRead: IncompleteRead(6 bytes read, 3 more expected)
Other, less probable causes include your server systematically generating an incorrect Content-Length header (maybe due to some broken handling of Unicode), or your Content-Length header (or the lengths included in a chunked message) being corrupted in transit.
Okay, that covers the standard library. What about Requests? Requests by default defers its work to urllib3, which in turn defers to http.client, so you might expect the exception from http.client to simply bubble up when using Requests. However, life is more complicated than that, for two reasons:
1. Both urllib3 and requests catch exceptions in the layer beneath them and raise their own versions. For instance, there are urllib3.exceptions.IncompleteRead and requests.exceptions.ChunkedEncodingError.
2. The current handling of Content-Length checking across all three of these modules is horribly broken, and has been for years. I've done my best to explain it in detail at https://github.com/psf/requests/issues/4956#issuecomment-573325001 if you're interested, but the short version is that http.client won't check Content-Length if you call .read(123) instead of just .read(), that urllib3 may or may not check depending upon various complicated details of how you call it, and that Requests - as a consequence of the previous two issues - currently doesn't check it at all, ever. However, this hasn't always been the case; there have been some attempts to fix it made and unmade, so perhaps at some point in the past - like when this question was asked in 2016 - the state of play was a bit different. Oh, and for extra confusion, while urllib3 has its own version it still sometimes lets the standard library's IncompleteRead exception bubble up, just to mess with you.
Hopefully, point 2 will get fixed in time - I'm having a go right now at nudging it in that direction. Point 1 will remain a complication, but the conditions that trigger these exceptions - whether the underlying http.client.IncompleteRead or the urllib3 or requests alternatives - should remain as I describe at the start of this answer.
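Since the exception can surface under several names depending on the layer, one way to handle it is to catch all of them in one place. The following is only a sketch under assumptions: call_with_retry and TRUNCATION_ERRORS are hypothetical names, the retry policy is deliberately naive, and the Requests and urllib3 exception classes are added only if those packages happen to be importable.

```python
import http.client

# http.client.IncompleteRead always exists; the urllib3 and Requests
# wrappers are added only when those packages can be imported.
_errors = [http.client.IncompleteRead]
try:
    import requests
    import urllib3

    _errors += [
        requests.exceptions.ChunkedEncodingError,
        urllib3.exceptions.IncompleteRead,
    ]
except ImportError:
    pass
TRUNCATION_ERRORS = tuple(_errors)


def call_with_retry(fetch, attempts=3):
    """Call fetch() until it succeeds or the attempts run out.

    Hypothetical helper: fetch is any zero-argument callable that
    performs the HTTP request and returns the body.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except TRUNCATION_ERRORS:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
```

With Requests installed, a call might look like call_with_retry(lambda: requests.get(url).content), where url is whatever endpoint you are testing against. Which of the listed exceptions you actually see will depend on the library versions involved, for the reasons given above.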
Upvotes: 1
Reputation: 76929
Adding a second answer, more to the point this time. I took a dive into some source code and found information that may help.
The IncompleteRead exception bubbles up from httplib, part of the Python standard library. Most likely, it comes from this function:
def _safe_read(self, amt):
    """
    Read the number of bytes requested, compensating for partial reads.

    Normally, we have a blocking socket, but a read() can be interrupted
    by a signal (resulting in a partial read).

    Note that we cannot distinguish between EOF and an interrupt when zero
    bytes have been read. IncompleteRead() will be raised in this
    situation.

    This function should be used when <amt> bytes "should" be present for
    reading. If the bytes are truly not available (due to EOF), then the
    IncompleteRead exception can be used to detect the problem.
    """
So, either the socket was closed before the HTTP response was consumed, or the reader tried to get too many bytes out of it. Judging by search results (so take this with a grain of salt), there is no other arcane situation that can make this happen.
The first scenario can be debugged with strace. If I'm reading this correctly, the second scenario can be caused by the requests module if a Content-Length header is present that exceeds the actual amount of data sent by the server.
For chunked responses, this function raises the exception when the chunk-length line is not valid hexadecimal:
def _update_chunk_length(self):
    # First, we'll figure out length of a chunk and then
    # we'll try to read it from socket.
    if self.chunk_left is not None:
        return
    line = self._fp.fp.readline()
    line = line.split(b';', 1)[0]
    try:
        self.chunk_left = int(line, 16)
    except ValueError:
        # Invalid chunked protocol response, abort.
        self.close()
        raise httplib.IncompleteRead(line)
Try checking the Content-Length header of your buffered responses, or the chunk format of your chunked responses.
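To make the chunk-length parsing concrete, here is a small standalone sketch of the same two steps (strip any chunk extension after the semicolon, then parse the size as hexadecimal). The name parse_chunk_size is made up for illustration; it is not part of any library.

```python
def parse_chunk_size(line: bytes) -> int:
    """Parse a chunk-size line the way the excerpt above does."""
    size_field = line.split(b";", 1)[0]  # drop any chunk extension
    return int(size_field, 16)           # chunk sizes are hexadecimal


if __name__ == "__main__":
    for line in [b"7\r\n", b"900\r\n", b"1a;name=value\r\n", b"", b"<html>\r\n"]:
        try:
            print(line, "->", parse_chunk_size(line))
        except ValueError:
            # An empty line (the socket hit EOF) or non-hex garbage ends up
            # here; the excerpt turns this case into IncompleteRead.
            print(line, "-> invalid chunk size")
```

Note that an empty line, which is what readline() returns once the connection has been closed, fails to parse just like garbage does, so a dropped connection and a malformed chunk header both end in the same exception.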
To produce the error, try sending a response whose Content-Length header exceeds the amount of data actually sent.
Upvotes: 1
Reputation: 76929
When testing code that relies on external behavior (such as server responses, system sensors, etc.), the usual approach is to fake the external factors instead of working to produce them.
Create a test version of the function or class you're using to make HTTP requests. If you're using requests directly across your codebase, stop: direct coupling with libraries and external services is very hard to test.
You mention that you want to make sure your code can handle this exception, and you'd rather avoid mocking for this reason. Mocking is just as safe, as long as you're wrapping the modules you need to mock all across your codebase. If you can't mock to test, you're missing layers in your design (or asking too much of your testing suite).
So, for example:
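import requests
from http.client import IncompleteRead


class FooService:
    def make_request(self, *args, **kwargs):
        # use requests to perform HTTP requests (requests.get is just an example)
        # NOBODY uses requests directly without passing through here
        return requests.get(*args, **kwargs)


class MockFooService(FooService):
    def make_request(self, *args, **kwargs):
        # IncompleteRead requires the partial data as its first argument
        raise IncompleteRead(b'')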
The 2nd class is a testing utility written solely for the purpose of testing this specific case. As your tests grow in coverage and completeness, you may need more sophisticated language (to avoid incessant subclassing and repetition), but it's usually good to start with the simplest code that will read easily and test the desired cases.
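As a rough illustration of how such a mock might be used, here is a self-contained sketch of a test. Everything in it is hypothetical: fetch_or_none stands in for whatever code of yours is supposed to cope with the error, and FooService is reduced to a stub so the example runs without network access or the requests package.

```python
from http.client import IncompleteRead


class FooService:
    """Stub of the wrapper; the real one would call requests."""

    def make_request(self, url):
        raise NotImplementedError


class MockFooService(FooService):
    def make_request(self, url):
        # Pretend 7 bytes arrived and 93 more were expected.
        raise IncompleteRead(b"partial", 93)


def fetch_or_none(service, url):
    """Code under test (hypothetical): treat a truncated response as no data."""
    try:
        return service.make_request(url)
    except IncompleteRead:
        return None


def test_truncated_response_is_handled():
    # The mock always raises, so a None result shows the error was caught.
    assert fetch_or_none(MockFooService(), "http://example.invalid/") is None
```

A test runner such as pytest would pick up the test_ function automatically; the standard library's unittest.mock could replace the hand-written subclass once the number of cases grows.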
Upvotes: 0