Reputation: 311
Today I actually needed to retrieve data from the HTTP response headers. But since I've never done it before, and there isn't much you can find about it on Google, I decided to ask my question here.
So, the actual question: how does one print the HTTP response header data in Python? I'm working in Python 3.5 with the requests module and have yet to find a way to do this.
Upvotes: 33
Views: 108841
Reputation: 464
import requests
site = "https://www.google.com"
headers = requests.get(site).headers
print(headers)                    # the full headers dictionary
print(headers["Content-Type"])    # a single header, e.g. Content-Type
Upvotes: 7
Reputation: 737
I'm using the urllib module, with the following code:
from urllib import request

url = "https://www.google.com"
data = None  # optional request body; None means a plain GET
with request.urlopen(url, data) as f:
    print(f.getcode())   # HTTP response code
    print(f.info())      # all header info
    resp_body = f.read().decode('utf-8')  # response body
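If you only need one header from that response, the object returned by urlopen also exposes getheader() and getheaders() (a small sketch, assuming url points at an ordinary HTTP(S) page):
from urllib import request

url = "https://www.google.com"  # assumed example URL
with request.urlopen(url) as f:
    # getheader() returns a single header value (or None if it is missing)
    print(f.getheader('Content-Type'))
    # getheaders() returns all headers as a list of (name, value) tuples
    print(f.getheaders())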
Upvotes: 4
Reputation: 4346
Here's how you get just the response headers using the requests library, as you mentioned (implementation in Python 3):
import requests
url = "https://www.google.com"
response = requests.head(url)
print(response.headers) # prints all the response headers as a dictionary
print(response.headers["Content-Length"]) # prints a specific header from the dictionary
It's important to use .head() instead of .get(); otherwise you will retrieve the whole file/page, as the approaches in the other answers do.
If you wish to retrieve a URL that requires authentication, you can replace the above response with this:
response = requests.head(url, auth=requests.auth.HTTPBasicAuth(username, password))
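Some servers don't handle HEAD requests well; in that case one possible fallback (not part of the original answer) is a streamed GET, which exposes the headers without downloading the body:
import requests

url = "https://www.google.com"  # assumed example URL
# stream=True defers downloading the body; the headers arrive immediately
response = requests.get(url, stream=True)
print(response.headers)
response.close()  # release the connection, since the body was never read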
Upvotes: 14
Reputation: 57
import pprint
import requests
res = requests.request("GET", "https://google.com")
pprint.PrettyPrinter(indent=2).pprint(dict(res.headers))
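The same output can be produced a little more compactly with the module-level pprint.pprint helper (a minor variation on the answer above):
import pprint
import requests

res = requests.get("https://google.com")
# pprint.pprint builds the PrettyPrinter for you behind the scenes
pprint.pprint(dict(res.headers), indent=2)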
Upvotes: 1
Reputation: 37
It's very easy; you can type
print(response.headers)
or, my favorite,
print(requests.get('url').headers)
You can also use
print(requests.get('url').content)
Upvotes: 2
Reputation: 87
Try using req.headers and that's all; you will get the response headers ;)
Upvotes: 0
Reputation: 3335
Update: based on the OP's comment that only the response headers are needed, it's even easier, as described in the Requests module documentation quoted below:
We can view the server's response headers using a Python dictionary:
>>> r.headers
{
'content-encoding': 'gzip',
'transfer-encoding': 'chunked',
'connection': 'close',
'server': 'nginx/1.0.4',
'x-runtime': '148ms',
'etag': '"e1ca502697e5c9317743dc078f67693f"',
'content-type': 'application/json'
}
The documentation notes in particular:
The dictionary is special, though: it's made just for HTTP headers. According to RFC 7230, HTTP Header names are case-insensitive.
So, we can access the headers using any capitalization we want:
and goes on to explain even more cleverness concerning RFC compliance.
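To illustrate that case-insensitivity, here is a quick sketch (the URL is just an assumed example):
import requests

r = requests.get("https://www.google.com")
# All three lookups return the same header, regardless of capitalization
print(r.headers["Content-Type"])
print(r.headers["content-type"])
print(r.headers.get("CONTENT-TYPE"))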
The Requests documentation states:
Using Response.iter_content will handle a lot of what you would otherwise have to handle when using Response.raw directly. When streaming a download, the above is the preferred and recommended way to retrieve the content.
It offers this example:
>>> r = requests.get('https://api.github.com/events', stream=True)
>>> r.raw
<requests.packages.urllib3.response.HTTPResponse object at 0x101194810>
>>> r.raw.read(10)
'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'
But it also offers advice on how to do this in practice, by streaming the body to a file and using a different method:
Using Response.iter_content will handle a lot of what you would otherwise have to handle when using Response.raw directly
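Put together, that advice amounts to something like the following sketch (the URL and output filename are placeholders of my own choosing):
import requests

url = "https://api.github.com/events"  # assumed example URL
with requests.get(url, stream=True) as r:
    print(r.headers)  # the headers are available before the body is consumed
    with open("events.json", "wb") as out:
        # iter_content handles the chunking and decoding that raw leaves to you
        for chunk in r.iter_content(chunk_size=8192):
            out.write(chunk)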
Upvotes: 24
Reputation: 439
How about something like this:
import urllib2
req = urllib2.Request('http://www.google.com/')
res = urllib2.urlopen(req)
print res.info()
res.close()
If you are looking for something specific in the header:
For Date: print res.info().get('Date')
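Note that this answer is written for Python 2; since the question targets Python 3.5, where urllib2 was merged into urllib.request, a roughly equivalent sketch would be:
from urllib import request

req = request.Request('http://www.google.com/')
res = request.urlopen(req)
print(res.info())               # all response headers
print(res.info().get('Date'))   # a specific header, e.g. Date
res.close()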
Upvotes: 11