Gustavo Amarante

Reputation: 242

Download ZIP file from the web (Python)

I am trying to download a ZIP file from this website. I have looked at similar questions and tried both requests and urllib, but I get the same error:

urllib.error.HTTPError: HTTP Error 302: The HTTP server returned a redirect error that would lead to an infinite loop. The last 30x error message was: Found

Any ideas on how to open the file straight from the web?

Here is some sample code:

from urllib.request import urlopen
response = urlopen('http://www1.caixa.gov.br/loterias/_arquivos/loterias/D_megase.zip')

Upvotes: 0

Views: 1062

Answers (2)

KevinS

Reputation: 239

This works for me using the Requests library:

import requests

url = 'http://www1.caixa.gov.br/loterias/_arquivos/loterias/D_megase.zip'

response = requests.get(url)

# Unzip it into a local directory if you want
import io
import zipfile

# Named "archive" rather than "zip" to avoid shadowing the built-in zip()
archive = zipfile.ZipFile(io.BytesIO(response.content))
archive.extractall("/path/to/your/directory")
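
If the goal is to read the archive's contents without writing anything to disk, the downloaded bytes can also be opened in memory. This is a minimal sketch, assuming the bytes are already in hand (for example response.content from the request above); the helper name read_zip_members is made up for illustration.

```python
import io
import zipfile


def read_zip_members(data):
    """Return a dict mapping each file name in the archive to its bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return {
            info.filename: archive.read(info)
            for info in archive.infolist()
            if not info.is_dir()  # skip directory entries
        }
```

Calling read_zip_members(response.content) would then give you every file in the archive as bytes, keyed by name.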

Note that sites sometimes answer programmatic requests with a 302 because they only want to be accessed through a web browser. If you need to work around this (don't be abusive), set the 'User-Agent' header to look like a browser's. Here's an example that makes a request look like it comes from Chrome:

user_agent = 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1667.0 Safari/537.36'
headers = {'User-Agent': user_agent}
requests.get(url, headers=headers)

There are several libraries (e.g. https://pypi.org/project/fake-useragent/) to help with this for more extensive scraping projects.
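
The same header can be set with urllib, which the question uses. This is only a sketch: the helper names build_request and download are hypothetical, and whether a browser-like User-Agent is enough to get past the redirect for this particular URL is an assumption, not something the sketch guarantees.

```python
from urllib.request import Request, urlopen

# Browser-like User-Agent string copied from the Requests example above.
USER_AGENT = ('Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/32.0.1667.0 Safari/537.36')


def build_request(url, user_agent=USER_AGENT):
    """Build a urllib Request that carries a browser-like User-Agent."""
    return Request(url, headers={'User-Agent': user_agent})


def download(url):
    """Fetch `url` and return the response body as bytes."""
    with urlopen(build_request(url), timeout=30) as response:
        return response.read()
```

With this in place, download(url) would return the raw ZIP bytes if the server accepts the request.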

Upvotes: 0

vidstige

Reputation: 13085

The linked URL redirects indefinitely; that's why you get the 302 error.

You can examine this yourself over here. As you can see, the linked URL immediately redirects to itself, creating a single-URL loop.
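
To check for such a loop from Python rather than an online tool, you can follow the redirects by hand and stop as soon as a URL repeats. The following is a rough sketch using only the standard library; the function names fetch_location and trace_redirects are invented for illustration, and it deliberately sends no cookies or custom headers, so it shows what a bare client sees.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_location(url):
    """Return the redirect target of `url`, or None if it does not redirect."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # http.client never follows redirects on its own.
        conn.request("GET", path)
        response = conn.getresponse()
        location = response.getheader("Location")
        if response.status in REDIRECT_CODES and location:
            # Location may be relative, so resolve it against the current URL.
            return urljoin(url, location)
        return None
    finally:
        conn.close()


def trace_redirects(url, fetch=fetch_location, max_hops=10):
    """Follow redirects one hop at a time; return (chain, looped)."""
    chain = [url]
    seen = {url}
    for _ in range(max_hops):
        target = fetch(chain[-1])
        if target is None:
            return chain, False  # reached a non-redirect response
        chain.append(target)
        if target in seen:
            return chain, True  # a URL repeated: redirect loop
        seen.add(target)
    return chain, False  # gave up after max_hops without seeing a repeat
```

For a URL that redirects to itself, trace_redirects(url) would return a two-element chain containing the same URL twice, with looped set to True.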

Upvotes: 1
