Juda Xovex

Reputation: 33

Get the raw (unparsed) HTTP response in http.client or python-requests

I'm using Python to make HTTP requests. I need the raw HTTP response, which looks like this:

HTTP/1.1 200 OK
Date: Mon, 19 Jul 2004 16:18:20 GMT
Server: Apache
Last-Modified: Sat, 10 Jul 2004 17:29:19 GMT
ETag: "1d0325-2470-40f0276f"
Accept-Ranges: bytes
Content-Length: 9328
Connection: close
Content-Type: text/html

<HTML>
<HEAD>
... the rest of the home page...

In python-requests I tried response.raw, but that is NOT the raw HTTP response; it's just the raw body.

Is there any way to achieve this without using a socket directly?

P.S. I don't want to rebuild the raw response from the parsed parts.

Upvotes: 2

Views: 2382

Answers (2)

Alan Hamlett

Reputation: 3274

response.raw does what you want

Answered here:

https://stackoverflow.com/a/56492298/1290627

Upvotes: -1

Martijn Pieters

Reputation: 1121864

requests doesn't keep the status line and headers in raw form. You never need these in raw form; an RFC-compliant response can be reconstructed trivially from the data you do have. requests is built on the urllib3 library, which in turn uses the Python standard library http.client module. That module doesn't give you the raw data either.

Instead, the status line and headers are parsed directly into their constituent parts, in http.client.HTTPResponse._read_status() and http.client.parse_headers() (the latter delegating to the email.parser.Parser().parsestr() method to parse the headers into an http.client.HTTPMessage() instance). Only the results of these parse operations are kept.
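To illustrate what "reconstructed trivially" could look like, here is a minimal sketch. The function name rebuild_response_head and its parameters are made up for this example; it assumes you already hold the parsed protocol version (10 or 11, as http.client reports it), the status code, the reason phrase and the header pairs.

```python
def rebuild_response_head(version, status, reason, headers):
    """Re-serialise a parsed status line and headers into wire format.

    `headers` is an iterable of (name, value) string pairs.
    """
    # http.client (and urllib3 on top of it) report the version as 10 or 11.
    protocol = {10: "HTTP/1.0", 11: "HTTP/1.1"}.get(version, "HTTP/1.1")
    lines = [f"{protocol} {status} {reason}"]
    lines.extend(f"{name}: {value}" for name, value in headers)
    # Each line ends with CRLF; a blank line separates the head from the body.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("iso-8859-1")
```

With requests, the inputs would presumably come from response.raw.version, response.status_code, response.reason and response.raw.headers.items(). Note that the result is equivalent to what the server sent, not guaranteed byte-identical: original whitespace or line folding in the headers is lost during parsing.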

You could try to wrap the urllib3 connection object (via the get_connection() hook implemented on a requests transport adapter). Connection objects have a .connect() method with supporting methods that create socket objects, and if you were to wrap those in a file-like object and then peeked at the .readline() call data, you could capture and store the raw data there.
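A rough sketch of the "wrap it in a file-like object and peek at the reads" idea follows. To keep it short it does not go through requests or urllib3; instead it swaps in a custom response_class on a plain http.client.HTTPConnection. The class names are invented for this example, and it leans on the fp attribute of HTTPResponse, which is an implementation detail of CPython's http.client, so treat it as an assumption rather than a supported API.

```python
import http.client


class _RecordingReader:
    """File-like wrapper that stores every chunk read through it."""

    def __init__(self, fp, sink):
        self._fp = fp
        self._sink = sink

    def readline(self, *args):
        data = self._fp.readline(*args)
        self._sink.append(data)
        return data

    def read(self, *args):
        data = self._fp.read(*args)
        self._sink.append(data)
        return data

    def readinto(self, buf):
        n = self._fp.readinto(buf)
        self._sink.append(bytes(buf[:n]))
        return n

    def __getattr__(self, name):
        # Everything else (close, flush, ...) goes to the real file object.
        return getattr(self._fp, name)


class RecordingResponse(http.client.HTTPResponse):
    """HTTPResponse that keeps the bytes it consumed while parsing."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.raw_chunks = []
        # Assumption: the parser reads the socket only through self.fp.
        self.fp = _RecordingReader(self.fp, self.raw_chunks)

    @property
    def raw_bytes(self):
        return b"".join(self.raw_chunks)


class RecordingConnection(http.client.HTTPConnection):
    response_class = RecordingResponse
```

After conn.request(...), resp = conn.getresponse() and resp.read(), resp.raw_bytes should hold the status line, headers and body as they arrived. Only data that was actually read is recorded, so the body is missing until you consume it.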

However, if you are debugging a broken HTTP server, I'd not bother with trying to bend requests and its stack to your will here. Just use curl --include --raw <url> on the command line instead (with perhaps --verbose added).

Another option is to use the http.client library directly: make the connection, send your request with HTTPConnection.request(), then skip getresponse() and read directly from conn.sock.
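A minimal sketch of that last approach, assuming a plain-HTTP server and a hypothetical helper named fetch_raw_response (the name, default port and buffer size are choices made for this example):

```python
import http.client


def fetch_raw_response(host, port=80, path="/", timeout=10.0):
    """Return the unparsed response bytes: status line, headers and body."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        # "Connection: close" asks the server to end the stream after this
        # response, so reading until EOF tells us where the response stops.
        conn.request("GET", path, headers={"Connection": "close"})
        chunks = []
        while True:
            # Bypass getresponse() and read from the underlying socket.
            chunk = conn.sock.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        conn.close()
```

Something like print(fetch_raw_response("example.com").decode("iso-8859-1")) would then show the response exactly as it came off the wire, including any chunked transfer encoding markers in the body. A server that ignores the Connection: close header would keep the socket open, in which case the loop only ends when the timeout raises.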

Upvotes: 1
