Reputation: 21914
I am getting an HTTP 400 - Bad Request error while making an XML-RPC call over HTTPS in Python 3.8. It seems to happen when a Host header is supplied in the HTTPS request without passing skip_host=True in the preceding putrequest call. Are the skip_host argument and an explicit Host header mutually exclusive? If so, which one should I use?
import http.client
connection = http.client.HTTPSConnection("duckduckgo.com", "443")
connection.putrequest("GET", "/") # needs skip_host=True if Host has to be supplied
connection.putheader("User-Agent", "Python/3.8")
connection.putheader("Host", "duckduckgo.com") # needs skip_host=True to work
connection.endheaders()
response = connection.getresponse()
print(response.status, response.reason)
Update: This issue doesn't happen with all HTTPS servers, as mentioned in the official docs.
Upvotes: 2
Views: 992
Reputation: 1979
Leaving skip_host at its default value, False, and adding a Host header with putheader results in the Host header being sent twice (and, in this example, with different values). You can verify this by setting the debuglevel to a positive value.
>>> import http.client
>>> connection = http.client.HTTPSConnection("duckduckgo.com", "443")
>>> connection.set_debuglevel(1)
>>> connection.putrequest("GET", "/")
>>> connection.putheader("User-Agent", "Python/3.8")
>>> connection.putheader("Host", "duckduckgo.com")
>>> connection.endheaders()
send: b'GET / HTTP/1.1\r\nHost: duckduckgo.com:443\r\nAccept-Encoding: identity\r\nUser-Agent: Python/3.8\r\nHost: duckduckgo.com\r\n\r\n'
>>>
>>> response = connection.getresponse()
reply: 'HTTP/1.1 400 Bad Request\r\n'
header: Server header: Date header: Content-Type header: Content-Length header: Connection header: X-XSS-Protection header: X-Content-Type-Options header: Referrer-Policy header: Expect-CT
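The two Host values in the debug output above differ most likely because the port was passed as the string "443": as far as I can tell from the http.client source, putrequest compares the port with the integer default port and appends it to the automatic Host header when the two are not equal. Here is a minimal offline sketch of that behaviour. The host example.com, the plain HTTPConnection on port 80, and the _CaptureSocket helper are illustrative assumptions rather than part of the original code, and no network connection is made.

```python
import http.client


class _CaptureSocket:
    """Hypothetical stand-in for a socket: stores bytes instead of sending them."""

    def __init__(self):
        self.sent = b""

    def sendall(self, data):
        self.sent += data


def automatic_host_header(port):
    """Return the Host line http.client generates for the given port argument."""
    conn = http.client.HTTPConnection("example.com", port)
    conn.sock = _CaptureSocket()  # a socket is already set, so connect() is not called
    conn.putrequest("GET", "/")   # skip_host left at its default (False)
    conn.endheaders()
    lines = conn.sock.sent.decode("iso-8859-1").split("\r\n")
    return next(line for line in lines if line.lower().startswith("host:"))


print(automatic_host_header(80))    # port given as an integer
print(automatic_host_header("80"))  # same port given as a string
```

Passing the port as an integer, or omitting it, should keep the default port out of the automatically generated Host header.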
The http.client source notes that sending the Host header twice can be "confusing" for some web servers. See the following comment in putrequest:
if not skip_host:
    # this header is issued *only* for HTTP/1.1
    # connections. more specifically, this means it is
    # only issued when the client uses the new
    # HTTPConnection() class. backwards-compat clients
    # will be using HTTP/1.0 and those clients may be
    # issuing this header themselves. we should NOT issue
    # it twice; some web servers (such as Apache) barf
    # when they see two Host: headers
Your code will work either with skip_host=True added or without the explicit Host header. Both result in the Host header being sent once.
>>> import http.client
>>> connection = http.client.HTTPSConnection("duckduckgo.com", "443")
>>> connection.putrequest("GET", "/", skip_host=True)
>>> connection.putheader("User-Agent", "Python/3.8")
>>> connection.putheader("Host", "duckduckgo.com")
>>> connection.endheaders()
>>> response = connection.getresponse()
>>> print(response.status, response.reason)
200 OK
>>> # OR
>>> connection = http.client.HTTPSConnection("duckduckgo.com", "443")
>>> connection.putrequest("GET", "/")
>>> connection.putheader("User-Agent", "Python/3.8")
>>> connection.endheaders()
>>> response = connection.getresponse()
>>> print(response.status, response.reason)
200 OK
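The combinations above can also be checked without contacting a server. The sketch below counts how many Host lines http.client would put on the wire for each combination of skip_host and an explicit putheader call. The host example.com and the _CaptureSocket helper are assumptions made for illustration. The helper only records what would have been sent.

```python
import http.client


class _CaptureSocket:
    """Hypothetical stand-in for a socket: stores bytes instead of sending them."""

    def __init__(self):
        self.sent = b""

    def sendall(self, data):
        self.sent += data


def count_host_headers(skip_host, add_host_manually):
    """Build a request offline and count the Host header lines it contains."""
    conn = http.client.HTTPConnection("example.com", 80)
    conn.sock = _CaptureSocket()  # a socket is already set, so connect() is not called
    conn.putrequest("GET", "/", skip_host=skip_host)
    if add_host_manually:
        conn.putheader("Host", "example.com")
    conn.endheaders()
    lines = conn.sock.sent.decode("iso-8859-1").split("\r\n")
    return sum(1 for line in lines if line.lower().startswith("host:"))


for skip in (False, True):
    for manual in (False, True):
        print(f"skip_host={skip}, putheader Host={manual}:",
              count_host_headers(skip, manual))
```

Only the combination of skip_host=False with a manual Host header should produce two Host lines. Note that skip_host=True without a manual header produces none, which is not a valid HTTP/1.1 request either.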
As to which one to use, the docs seem to suggest that unless you have a reason to specify a Host header yourself (using putheader), you can rely on the module sending it automatically: leave skip_host at its default value, False, and do not add a Host header with putheader.
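If you do not need the low-level putrequest and putheader calls, the higher-level request method appears to take care of this for you: judging from the http.client source, it passes skip_host to putrequest whenever the headers mapping already contains a Host key. The sketch below checks that offline. The host names and the _CaptureSocket helper are illustrative assumptions, and nothing is sent over the network.

```python
import http.client


class _CaptureSocket:
    """Hypothetical stand-in for a socket: stores bytes instead of sending them."""

    def __init__(self):
        self.sent = b""

    def sendall(self, data):
        self.sent += data


def host_lines_via_request(headers):
    """Return the Host header lines produced by HTTPConnection.request()."""
    conn = http.client.HTTPConnection("example.com", 80)
    conn.sock = _CaptureSocket()  # a socket is already set, so connect() is not called
    conn.request("GET", "/", headers=headers)
    lines = conn.sock.sent.decode("iso-8859-1").split("\r\n")
    return [line for line in lines if line.lower().startswith("host:")]


print(host_lines_via_request({"User-Agent": "Python/3.8"}))
print(host_lines_via_request({"User-Agent": "Python/3.8",
                              "Host": "custom.example"}))
```

In both cases exactly one Host line should be emitted: the automatic one in the first call and the caller-supplied one in the second.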
Upvotes: 4