Jack

Reputation: 342

Check for `urllib.urlretrieve(url, file_name)` Completion Status

How do I check to see if urllib.urlretrieve(url, file_name) has completed before allowing my program to advance to the next statement?

Take for example the following code snippet:

import time
import traceback
import sys
import Image
from urllib import urlretrieve

try:
    print "Downloading gif....."
    urlretrieve(imgUrl, "tides.gif")
    # Allow time for image to download/save:
    time.sleep(5)
    print "Gif Downloaded."
except:
    print "Failed to Download new GIF"
    raw_input('Press Enter to exit...')
    sys.exit()

try:
    print "Converting GIF to JPG...."
    Image.open("tides.gif").convert('RGB').save("tides.jpg")
    print "Image Converted"
except Exception, e:
    print "Conversion FAIL:", sys.exc_info()[0]
    traceback.print_exc()

When the download of 'tides.gif' via urlretrieve(imgUrl, "tides.gif") takes longer than the time.sleep(seconds) pause, the file on disk is empty or incomplete, and Image.open("tides.gif") raises an IOError (because tides.gif is 0 kB).

How can I check the status of urlretrieve(imgUrl, "tides.gif"), allowing my program to advance only after the statement has been successfully completed?
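For what it's worth, urlretrieve already blocks until the transfer finishes, so by the time the next statement runs the file is as complete as it will get; an empty file usually means the call failed silently at the server end. One defensive pattern is to verify the saved file after the call returns. A minimal Python 3 sketch (where urlretrieve lives in urllib.request; the helper name and the local file:// demo standing in for imgUrl are illustrative):

```python
import os
import pathlib
import tempfile
from urllib.request import urlretrieve

def download_and_verify(url, filename):
    """Download url to filename and fail loudly on an empty or short result."""
    path, headers = urlretrieve(url, filename)  # blocks until the copy is done
    size = os.path.getsize(path)
    if size == 0:
        raise IOError("downloaded file %s is empty" % path)
    expected = headers.get("Content-Length")  # may be absent for some servers
    if expected is not None and int(expected) != size:
        raise IOError("expected %s bytes, got %d" % (expected, size))
    return path

# Demo against a local file so the sketch runs without a network:
src = tempfile.NamedTemporaryFile(delete=False, suffix=".gif")
src.write(b"GIF89a fake image data")
src.close()
dest = src.name + ".copy"
print(os.path.getsize(download_and_verify(pathlib.Path(src.name).as_uri(), dest)))
```

If the size check passes, it is then safe to hand the file to Image.open without the time.sleep guard.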

Upvotes: 5

Views: 7105

Answers (5)

Aominé

Reputation: 500

You can try this:

import time

# ----------------------------------------------------
# Wait until the end of the download
# ----------------------------------------------------

valid = False
while not valid:
    try:
        with open("tides.gif"):
            valid = True
    except IOError:
        time.sleep(1)

print "Got it!"

# ----------------------------------------------------
# //////////////////////////////////////////////////
# ----------------------------------------------------

Upvotes: 0

keithhackbarth

Reputation: 10156

The selected answer doesn't work with big files. Here is the correct solution:

import sys
import time
import urllib


def reporthook(count, block_size, total_size):
    # Fires once the downloaded byte count reaches the reported total;
    # comparing with >= avoids missing the final, possibly oversized block.
    if count * block_size >= total_size:
        print 'Download completed!'

def save(url, filename):
    urllib.urlretrieve(url, filename, reporthook)
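One caveat worth noting: if total_size is not an exact multiple of block_size, a percentage computed from the raw counts overshoots 100 on the last callback, so a test for exactly 100 can never fire. A clamped variant, exercised here without any network (the simulated block sizes are illustrative):

```python
def reporthook(count, block_size, total_size):
    # Percentage downloaded so far, clamped to 100 because the final
    # block can push count * block_size past the reported total.
    percent = min(int(count * block_size * 100 / total_size), 100)
    return percent

# Simulate a 1000-byte download delivered in 300-byte blocks:
print([reporthook(c, 300, 1000) for c in range(1, 5)])  # [30, 60, 90, 100]
```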

Upvotes: 0

stderr

Reputation: 8732

Requests is nicer than urllib but you should be able to do this to synchronously download the file:

import urllib
f = urllib.urlopen(imgUrl)
with open("tides.gif", "wb") as imgFile:
    imgFile.write(f.read())
# you won't get to this print until you've downloaded
# all of the image at imgUrl or an exception is raised
print "Got it!"

The downside is that this buffers the whole file in memory, so if you're downloading many images at once you may end up using a lot of RAM. It's unlikely to matter here, but still worth knowing.
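If that buffering ever does become a concern, one way around it is to stream the response to disk in fixed-size chunks. A Python 3 sketch (urlopen lives in urllib.request there; the function name, chunk size, and the local file:// demo are illustrative):

```python
import os
import pathlib
import shutil
import tempfile
import urllib.request

def stream_to_file(url, filename, chunk_size=64 * 1024):
    # Copy the response to disk in fixed-size chunks so memory use
    # stays bounded regardless of the file's size.
    with urllib.request.urlopen(url) as resp, open(filename, "wb") as out:
        shutil.copyfileobj(resp, out, chunk_size)

# Demo against a local 200 kB stand-in file so the sketch runs offline:
src = tempfile.NamedTemporaryFile(delete=False)
src.write(b"x" * 200000)
src.close()
dest = src.name + ".out"
stream_to_file(pathlib.Path(src.name).as_uri(), dest)
print(os.path.getsize(dest))  # 200000
```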

Upvotes: 5

Meitham

Reputation: 9670

I would use Python Requests from http://docs.python-requests.org/en/latest/index.html instead of plain urllib2. requests is synchronous by default, so it won't progress to the next line of code until it has fetched your image.

Upvotes: 2

tabchas

Reputation: 1402

I found a similar question here: Why is "raise IOError("cannot identify image file")" showing up only part of the time?

To be more specific, look at the answer to the question. The user points to a couple of other threads that explain exactly how to solve the problem in multiple ways. The first one, which you may be interested in, includes a progress bar display.

Upvotes: 0
