funkifunki

Reputation: 1169

How to download .gz files with requests in Python without decoding them?

I am downloading a file using requests:

import requests

req = requests.get(url, stream=True)
with open(local_filename, 'wb') as f:
    for chunk in req.iter_content(chunk_size=1024):
        if chunk:
            f.write(chunk)
            f.flush()

The problem with gzip files is that they are automatically decoded by requests, so I get the unpacked file on disk, while I need the original file.

Is there a way to tell requests not to do this?

Upvotes: 12

Views: 20585

Answers (2)

Boban P.

Reputation: 233

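import requests

r = requests.get(url, stream=True)
with open(local_filename, 'wb') as f:
    # r.raw is the underlying urllib3 response; decode_content=False
    # yields the bytes as sent by the server, without gunzipping them
    for chunk in r.raw.stream(1024, decode_content=False):
        if chunk:
            f.write(chunk)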

This way, you avoid the automatic decompression of the gzip-encoded response and save it to the file exactly as it is received from the web server, chunk by chunk.
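
As a quick sanity check that the saved file really is still compressed, one can look for the gzip magic bytes (0x1f 0x8b) at its start. This is only a minimal sketch; the helper name looks_like_gzip is hypothetical and not part of requests:

```python
GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(path):
    """Return True if the file at `path` begins with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC
```

Calling looks_like_gzip(local_filename) after the download should return True if the body was written undecoded.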

Upvotes: 12

Dan Lenski

Reputation: 79762

As discussed in the comments above, this seems to have solved the issue:

From the docs for the requests module:

Requests automatically decompresses gzip-encoded responses ... You can get direct access to the raw response (and even the socket), if needed as well.

Searching the docs for "raw responses" yields requests.Response.raw, which gives a file-like representation of the raw response stream.
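
Because the raw stream is file-like, it can be copied to disk with shutil.copyfileobj. The following is a sketch rather than a definitive implementation: the helper name save_raw_body is hypothetical, and it assumes the response was requested with stream=True and that reading from response.raw yields the undecoded bytes (which is how requests sets up the underlying stream, as far as I know):

```python
import shutil


def save_raw_body(response, local_filename):
    """Copy the undecoded body of a streamed response to a local file.

    `response` is assumed to be a requests.Response obtained with
    stream=True; only its file-like `raw` attribute is used here.
    """
    with open(local_filename, "wb") as f:
        # Reading from response.raw bypasses the gzip decoding that
        # iter_content() and .content apply.
        shutil.copyfileobj(response.raw, f)
```

With the names from the question, usage would look like save_raw_body(requests.get(url, stream=True), local_filename).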

Upvotes: 8
