Boris

Reputation: 41

Downloading large files in Python

In Python 2.7.3, I am trying to create a script to download a file over the Internet. I use the urllib2 module.

Here is what I have done:

import urllib2

HTTP_client = urllib2.build_opener()
# Here I can modify HTTP_client headers
URL = 'http://www.google.com'
data = HTTP_client.open(URL)
with open('file.txt', 'wb') as f:
    f.write(data.read())

OK. That works perfectly.

The problem comes when I want to save big files (hundreds of MB). I think that when I call the 'open' method, it downloads the whole file into memory. But what about large files? It won't hold 1 GB of data in memory! And what happens if I lose the connection? All the data downloaded so far is lost.

How can I download large files in Python the way wget does? wget writes the file 'directly' to disk, and you can see the file growing in size.

I'm surprised there is no 'retrieve' method for doing something like

HTTP_client.retrieve(URL, 'filetosave.ext')

Upvotes: 1

Views: 6219

Answers (1)

user3868300

Reputation:

To resolve this, you can read the response a chunk at a time and write each chunk to the file as it arrives, so only one chunk is ever held in memory.

import urllib2

url = 'http://www.example.com/largefile.bin'  # URL of the file to download
CHUNK = 16 * 1024  # read 16 KiB at a time

req = urllib2.urlopen(url)
with open('filetosave.ext', 'wb') as fp:
    while True:
        chunk = req.read(CHUNK)
        if not chunk:  # an empty string means the download is finished
            break
        fp.write(chunk)
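
If you prefer not to write the read/write loop yourself, a more compact variant is to let shutil.copyfileobj do the chunked copy. This is a minimal sketch under the same assumptions as above (the URL and output filename are placeholders):

import shutil
import urllib2

url = 'http://www.example.com/largefile.bin'  # placeholder URL for illustration

req = urllib2.urlopen(url)
with open('filetosave.ext', 'wb') as fp:
    # copyfileobj reads from req and writes to fp in fixed-size chunks,
    # so the whole file is never held in memory at once.
    shutil.copyfileobj(req, fp, 16 * 1024)

shutil.copyfileobj only needs an object with a read() method, which the urllib2 response provides, so memory use stays bounded just as in the explicit loop.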

Upvotes: 2

Related Questions