Reputation: 424
I'm working with socket operations and have coded a basic interception proxy in Python. It works fine, but some hosts return 400 Bad Request responses.
These requests do not look malformed though. Here's one:
GET http://www.baltour.it/ HTTP/1.1
Host: www.baltour.it
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:28.0) Gecko/20100101 Firefox/28.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Same request, raw:
GET http://www.baltour.it/ HTTP/1.1\r\nHost: www.baltour.it\r\nUser-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:28.0) Gecko/20100101 Firefox/28.0\r\nAccept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8\r\nAccept-Language: en-US,en;q=0.5\r\nAccept-Encoding: gzip, deflate\r\nConnection: keep-alive\r\n\r\n
The code I use to send the request is the most basic socket operation (though I don't think the problem lies there, since it works fine with most hosts):
socket_client.send(request_raw)
while socket_client.recv is used to get the response (but no problems here, the response is well-formed, though its status is 400).
Any ideas?
Upvotes: 0
Views: 1049
Reputation: 1125138
When not talking to a proxy, you are not supposed to put the http://hostname part in the request line; see section 5.1.2 of the HTTP 1.1 RFC 2616 spec:
The most common form of Request-URI is that used to identify a resource on an origin server or gateway. In this case the absolute path of the URI MUST be transmitted (see section 3.2.1, abs_path) as the Request-URI, and the network location of the URI (authority) MUST be transmitted in a Host header field.
(emphasis mine); abs_path is the absolute path part of the request URI, not the full absolute URI itself.
E.g. the server expects you to send:
GET / HTTP/1.1
Host: www.baltour.it
A receiving server should be tolerant of the incorrect behaviour, however. The server seems to violate the RFC here as well. Further on in the same section it reads:
To allow for transition to absoluteURIs in all requests in future versions of HTTP, all HTTP/1.1 servers MUST accept the absoluteURI form in requests, even though HTTP/1.1 clients will only generate them in requests to proxies.
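Since this particular server rejects the absolute form anyway, the proxy could rewrite the request line to the abs_path form before forwarding it. The following is only a minimal sketch under some assumptions: it uses Python 3 and bytes, the helper name `to_origin_form` is made up for illustration, and it only touches the first line, leaving headers and body exactly as received.

```python
from urllib.parse import urlsplit


def to_origin_form(request_raw: bytes) -> bytes:
    """Rewrite 'GET http://host/path HTTP/1.1' to 'GET /path HTTP/1.1'.

    Hypothetical helper: only the request line is changed; everything
    after the first CRLF (headers, body) is passed through untouched.
    """
    request_line, sep, rest = request_raw.partition(b"\r\n")
    parts = request_line.split(b" ")
    if len(parts) != 3:
        # Not a request line in the expected 'METHOD target VERSION' shape.
        return request_raw

    method, target, version = parts
    url = urlsplit(target)
    if not url.scheme or not url.netloc:
        # Already in abs_path form (or something like '*'); leave it alone.
        return request_raw

    # An absolute URI without a path means the root resource.
    path = url.path or b"/"
    if url.query:
        path += b"?" + url.query

    return b" ".join((method, path, version)) + sep + rest
```

With the request from the question, `to_origin_form(request_raw)` would start with `GET / HTTP/1.1`, and the `Host: www.baltour.it` header already carries the network location, so nothing else needs to change before calling `socket_client.send(...)`.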
Upvotes: 1