Géry Ogam

Reputation: 8027

Missing Host header in HTTP requests from the requests Python library

Where is the HTTP/1.1 mandatory Host header field in HTTP request messages generated by the requests Python library?

import requests

response = requests.get("https://www.google.com/")
print(response.request.headers)

Output:

{'User-Agent': 'python-requests/2.22.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}

Upvotes: 11

Views: 31523

Answers (1)

DeepSpace

Reputation: 81594

requests does not add the Host header to the request itself by default. If you do not supply it explicitly, the decision is delegated to the standard library's http.client module, which requests reaches through urllib3.

See this section of http/client.py:

(If a 'Host' header is passed explicitly to requests.get, skip_host is True and this block is skipped.)

    if self._http_vsn == 11:
        # Issue some standard headers for better HTTP/1.1 compliance

        if not skip_host:
            # this header is issued *only* for HTTP/1.1
            # connections. more specifically, this means it is
            # only issued when the client uses the new
            # HTTPConnection() class. backwards-compat clients
            # will be using HTTP/1.0 and those clients may be
            # issuing this header themselves. we should NOT issue
            # it twice; some web servers (such as Apache) barf
            # when they see two Host: headers

            # If we need a non-standard port,include it in the
            # header.  If the request is going through a proxy,
            # but the host of the actual URL, not the host of the
            # proxy.

            netloc = ''
            if url.startswith('http'):
                nil, netloc, nil, nil, nil = urlsplit(url)

            if netloc:
                try:
                    netloc_enc = netloc.encode("ascii")
                except UnicodeEncodeError:
                    netloc_enc = netloc.encode("idna")
                self.putheader('Host', netloc_enc)
            else:
                if self._tunnel_host:
                    host = self._tunnel_host
                    port = self._tunnel_port
                else:
                    host = self.host
                    port = self.port

                try:
                    host_enc = host.encode("ascii")
                except UnicodeEncodeError:
                    host_enc = host.encode("idna")

                # As per RFC 273, IPv6 address should be wrapped with []
                # when used as Host header

                if host.find(':') >= 0:
                    host_enc = b'[' + host_enc + b']'

                if port == self.default_port:
                    self.putheader('Host', host_enc)
                else:
                    host_enc = host_enc.decode("ascii")
                    self.putheader('Host', "%s:%s" % (host_enc, port)) 

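As a result, the 'Host' header does not appear in response.request.headers: that mapping holds only the headers requests prepared itself, while http.client adds Host later, when the request is actually written to the connection.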
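To make the else branch of the excerpt above more concrete, here is a rough sketch that mirrors its logic for a plain URL. The helper name expected_host_header and the DEFAULT_PORTS table are hypothetical, introduced only for illustration; they are not part of requests or http.client, and the sketch ignores proxies and tunnels.

```python
from urllib.parse import urlsplit

# Hypothetical lookup table for this sketch; http.client stores the
# default port on the connection class instead.
DEFAULT_PORTS = {"http": 80, "https": 443}


def expected_host_header(url):
    """Approximate the Host value http.client would generate for `url`."""
    parts = urlsplit(url)
    host = parts.hostname  # IPv6 literals come back without brackets
    port = parts.port or DEFAULT_PORTS[parts.scheme]

    # Same ASCII-then-IDNA fallback as in the excerpt.
    try:
        host_enc = host.encode("ascii")
    except UnicodeEncodeError:
        host_enc = host.encode("idna")

    # IPv6 addresses are wrapped in [] when used in a Host header.
    if ":" in host:
        host_enc = b"[" + host_enc + b"]"

    host_str = host_enc.decode("ascii")
    # The port is appended only when it differs from the scheme default.
    if port == DEFAULT_PORTS[parts.scheme]:
        return host_str
    return "%s:%s" % (host_str, port)


print(expected_host_header("http://httpbin.org/get"))   # httpbin.org
print(expected_host_header("http://localhost:8080/"))   # localhost:8080
print(expected_host_header("http://[::1]:8080/"))       # [::1]:8080
```

This only predicts the value; the header actually sent is whatever http.client writes at send time.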

If we send a request to http://httpbin.org/get, which echoes back the headers it received, and print the response, we can see that the Host header was indeed sent.

import requests

response = requests.get("http://httpbin.org/get")
print('Response from httpbin/get')
print(response.json())
print()
print('response.request.headers')
print(response.request.headers)

Output:

Response from httpbin/get
{'args': {}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate', 
 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.20.0'},
 'origin': 'XXXXXX', 'url': 'https://httpbin.org/get'}

response.request.headers
{'User-Agent': 'python-requests/2.20.0', 'Accept-Encoding': 'gzip, deflate', 
 'Accept': '*/*', 'Connection': 'keep-alive'}
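The same behaviour can be checked without network access. The sketch below uses only the standard library: a throwaway local server records the Host values it receives, and http.client is called directly (the layer that requests ultimately relies on). The RecordHost handler, the received list, and the example.test host name are assumptions made up for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # Host header values seen by the server, one list per request


class RecordHost(BaseHTTPRequestHandler):
    def do_GET(self):
        # get_all returns every occurrence, so a duplicated Host would show up.
        received.append(self.headers.get_all("Host"))
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example output quiet


server = HTTPServer(("127.0.0.1", 0), RecordHost)  # port 0: OS picks a free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

# 1) No Host header supplied: http.client generates one from host and port.
conn = http.client.HTTPConnection("127.0.0.1", port)
conn.request("GET", "/")
conn.getresponse().read()
conn.close()

# 2) Host supplied by the caller: skip_host is set, so it is sent once, unchanged.
conn = http.client.HTTPConnection("127.0.0.1", port)
conn.request("GET", "/", headers={"Host": "example.test"})
conn.getresponse().read()
conn.close()

server.shutdown()
server.server_close()
print(received)
```

The first entry should be the address with its port (the port is non-default, so it is included), and the second should be the caller-supplied value alone.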

Upvotes: 9
