sthenault

Reputation: 15125

Python requests unexpectedly decodes application/x-gzip content

I'm attempting to retrieve https://donneespubliques.meteofrance.fr/donnees_libres/Txt/Synop/Archive/synop.201803.csv.gz using python requests.

However, when I access the response's content I get the decompressed CSV data, not the gzipped CSV data I expected. It's not clear to me why.

>>> url
'https://donneespubliques.meteofrance.fr/donnees_libres/Txt/Synop/Archive/synop.201803.csv.gz'
>>> resp = requests.get(url)
>>> resp.headers
{'Date': 'Thu, 19 Apr 2018 10:48:11 GMT', 'Server': 'MFWS', 
 'Last-Modified': 'Sat, 31 Mar 2018 21:10:09 GMT', 
 'ETag': '"3066bd-a2dce-568bbc81bee40"', 
 'Accept-Ranges': 'bytes', 
 'Content-Length': '667086', 
 'Content-Type': 'application/x-gzip', 
 'Content-Encoding': 'gzip',
 'Content-Disposition': 'attachment', 
 'Keep-Alive': 'timeout=5, max=300',
 'Connection': 'Keep-Alive'}
>>> resp.content[:100]
b'numer_sta;date;pmer;tend;cod_tend;dd;ff;t;td;u;vv;ww;w1;w2;n;nbas;hbas;cl;cm;ch;pres;niv_bar;geop;te'
>>> requests.__version__
'2.18.4'

If I access the same URL with e.g. curl I get the gzipped content as expected:

$ curl https://donneespubliques.meteofrance.fr/donnees_libres/Txt/Synop/Archive/synop.201803.csv.gz  -s > data
$ file data
data: gzip compressed data, was "synop.201803.csv", last modified: Sat Mar 31 21:10:08 2018, from Unix

Is this a requests feature I'm not aware of, or a server misconfiguration?

Upvotes: 0

Views: 854

Answers (1)

Joost

Reputation: 3729

It is indeed a feature. From the Requests FAQ:

Encoded Data?

Requests automatically decompresses gzip-encoded responses, and does its best to decode response content to unicode when possible.

You can get direct access to the raw response (and even the socket), if needed as well.
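The response headers in the question include 'Content-Encoding': 'gzip', which is what prompts requests to decompress the body. As a minimal sketch of the raw access mentioned above, the following reads the body through resp.raw with decoding turned off. The helper names fetch_undecoded, looks_gzipped and decode_body are hypothetical, chosen only for illustration; they are not part of requests.

```python
import gzip

import requests

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of a gzip stream


def looks_gzipped(data: bytes) -> bool:
    """Return True if data starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def decode_body(data: bytes) -> bytes:
    """Decompress data if it looks gzipped, otherwise return it unchanged."""
    return gzip.decompress(data) if looks_gzipped(data) else data


def fetch_undecoded(url: str) -> bytes:
    """Return the response body as transferred, without Content-Encoding decoding."""
    # stream=True defers the download so the underlying urllib3 response
    # (resp.raw) can still be read directly.
    with requests.get(url, stream=True) as resp:
        resp.raise_for_status()
        # decode_content=False asks urllib3 not to undo the gzip encoding.
        return resp.raw.read(decode_content=False)
```

Calling fetch_undecoded(url) with the URL from the question should then return bytes for which looks_gzipped is true, comparable to what curl saved, and decode_body would recover the CSV text that resp.content returns. For a large file, reading resp.raw in chunks rather than in one call may be preferable.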

Upvotes: 1
