LeDerp

Reputation: 623

Use Python Requests to download a compressed tar.gz file and unzip it using tar

I need to download a tar.gz file with a requests call. I found out that requests.get automatically decompresses the file. I tried the solution given here, but when I decompress the result with tar, it says the file is not in gzip format.

I tried the following approaches:

response = requests.get(url,auth=(user, key),stream=True)
if response.status_code == 200:
    with open(target_path, 'wb') as f:
        f.write(response.raw)

raw = response.raw
with open(target_path, 'wb') as out_file:
    while True:
        chunk = raw.read(1024, decode_content=True)
        if not chunk:
            break
        out_file.write(chunk) 

All of the above produce a file that fails to decompress with this error:

$ tar -xvzf /tmp/file.tar.gz -C /

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
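
One way to narrow this error down is to look at what actually landed on disk. Below is a minimal sketch using only the standard library; `looks_like_gzip` and `describe_archive` are hypothetical helper names chosen for illustration, not part of requests or tar.

```python
import tarfile

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(path):
    """Return True if the file at path starts with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def describe_archive(path):
    """Classify a downloaded file as 'gzip', 'tar', or 'unknown'."""
    # Check gzip first: tarfile.is_tarfile() also accepts compressed tars.
    if looks_like_gzip(path):
        return "gzip"
    # A plain tar here suggests the body was already decompressed.
    if tarfile.is_tarfile(path):
        return "tar"
    return "unknown"
```

If this reports "tar", the gzip layer was probably already removed during the download, and `tar -xvf` without the `z` flag may work. A result of "unknown" suggests that something other than the archive was written, such as an empty file or an error page.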

Note: I cannot use urllib.open, as I need authentication etc. and I have to use the requests library.

Upvotes: 16

Views: 24988

Answers (2)

Intrastellar Explorer

Reputation: 2401

For a built-ins only solution (no 3rd party requests), one can use urllib.request.Request:

from urllib import request

url = "https://pypi.python.org/packages/source/x/xlrd/xlrd-0.9.4.tar.gz"
target_path = "xlrd-0.9.4.tar.gz"

# https://docs.python.org/3/library/urllib.request.html#urllib.request.Request
# NOTE: response.read() loads the entire body into memory before it is written
with request.urlopen(request.Request(url), timeout=15.0) as response:
    if response.status == 200:
        with open(target_path, "wb") as f:
            f.write(response.read())
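
Since the question mentions authentication, here is a hedged sketch of how the same built-ins approach could send HTTP Basic credentials. It assumes the server accepts Basic auth (which is what `auth=(user, key)` sends in requests); `basic_auth_request` and `download` are hypothetical helper names.

```python
import base64
import shutil
from urllib import request


def basic_auth_request(url, user, key):
    """Build a Request carrying an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{key}".encode("utf-8")).decode("ascii")
    return request.Request(url, headers={"Authorization": f"Basic {token}"})


def download(url, target_path, user, key, timeout=15.0):
    """Stream the response body to target_path without decoding it."""
    req = basic_auth_request(url, user, key)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses,
    # so no explicit status check is needed here.
    with request.urlopen(req, timeout=timeout) as response:
        with open(target_path, "wb") as f:
            # Copy in chunks instead of reading the whole body at once.
            shutil.copyfileobj(response, f)
```

Calling `download(url, target_path, user, key)` would then write the bytes as the server sent them, since urllib does not decompress response bodies on its own.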

Upvotes: 0

Frans

Reputation: 837

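You just need to change f.write(response.raw) to f.write(response.raw.read()). response.raw is a file-like object rather than bytes, so it has to be read before its contents can be written to the file.

Try the code below. It should give you a correct tar.gz file.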

import requests

url = 'https://pypi.python.org/packages/source/x/xlrd/xlrd-0.9.4.tar.gz'
target_path = 'xlrd-0.9.4.tar.gz'

response = requests.get(url, stream=True)
if response.status_code == 200:
    with open(target_path, 'wb') as f:
        f.write(response.raw.read())
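
For large archives, response.raw.read() holds the whole file in memory at once. A streaming variant could look like the sketch below; `save_raw_body` is a hypothetical helper name, and it assumes the response was created with stream=True.

```python
import shutil


def save_raw_body(response, target_path, chunk_size=64 * 1024):
    """Copy the undecoded body of a streamed response to target_path."""
    # Fail early on 4xx/5xx instead of saving an error page as the archive.
    response.raise_for_status()
    # Keep any Content-Encoding untouched so the bytes reach disk as sent.
    response.raw.decode_content = False
    with open(target_path, "wb") as out_file:
        # Copy chunk by chunk rather than loading the whole body.
        shutil.copyfileobj(response.raw, out_file, chunk_size)
    return target_path
```

Usage with the question's credentials would be along the lines of `with requests.get(url, auth=(user, key), stream=True) as response: save_raw_body(response, target_path)`.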

Upvotes: 29
