JohnJ

Reputation: 7056

gevent and socksipy

I was wondering if anyone has tried using [gevent][1] and [socksipy][2] for concurrent downloads.

Upvotes: 1

Views: 292

Answers (1)

Balthazar Rouberol

Reputation: 7180

I've used gevent to download ~12k pictures from yfrog, instagram, twitpic, etc. The total size of the pictures was around 1.5 GB, and it took ~20 minutes to download them all on my home wifi.

To do so, I implemented an image_download function whose sole purpose was to download a picture from a given URL, and then asynchronously mapped a list of URLs onto the image_download function, using a gevent Pool.

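from gevent import monkey
monkey.patch_socket()  # Patch sockets before anything else uses them; see http://www.gevent.org/gevent.monkey.html
from gevent.pool import Pool  # Pool lives in gevent.pool, not in the top-level gevent package

NB_WORKERS = 50

def image_download(url):
    """Download the picture found at url."""
    # retrieve image
    raise NotImplementedError("retrieve the image at url here")

def parallel_image_download(urls):  # urls is of type list
    """Activate NB_WORKERS greenlets to asynchronously download the images."""
    pool = Pool(NB_WORKERS)  # at most NB_WORKERS downloads run concurrently
    return pool.map(image_download, urls)  # results come back in the order of urls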
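
The body of image_download is left as a placeholder above. One way it might be filled in is sketched below, assuming Python 3 and the standard urllib.request module. The dest_dir parameter, the "unnamed" fallback file name and the 30 second timeout are illustrative choices, not part of the original answer. Because urllib goes through the standard socket module, it should cooperate with monkey.patch_socket(), but that is worth checking against your gevent version.

```python
import os
import urllib.request
from urllib.parse import urlsplit


def image_download(url, dest_dir="."):
    """Save the picture at url into dest_dir; return the saved path, or None on failure."""
    # Derive a local file name from the last path segment of the URL.
    name = os.path.basename(urlsplit(url).path) or "unnamed"
    path = os.path.join(dest_dir, name)
    try:
        with urllib.request.urlopen(url, timeout=30) as response:
            data = response.read()
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses;
        # skip this URL so one bad link does not abort the whole batch.
        return None
    with open(path, "wb") as fh:
        fh.write(data)
    return path
```

Since dest_dir has a default value, this version can still be passed directly to pool.map(image_download, urls). Failed downloads then show up as None entries in the returned list.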

NB: I settled on 50 parallel workers after a couple of tries. Beyond 50, the total runtime did not improve.
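
If you want to repeat that kind of tuning, a rough sketch of a timing harness follows. It assumes gevent is installed; fake_download and time_pool_size are hypothetical helpers, and the 10 ms gevent.sleep only simulates network latency. Swap in your real download function to get meaningful numbers for your own connection.

```python
import time

import gevent
from gevent.pool import Pool


def fake_download(url):
    """Stand-in for image_download: wait cooperatively instead of hitting the network."""
    gevent.sleep(0.01)  # simulated latency, yields to other greenlets
    return url


def time_pool_size(urls, nb_workers, task=fake_download):
    """Return the wall-clock seconds needed to map task over urls with nb_workers greenlets."""
    pool = Pool(nb_workers)
    start = time.perf_counter()
    pool.map(task, urls)  # blocks until every URL has been processed
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder URLs; they are never fetched by fake_download.
    urls = ["http://example.com/%d.jpg" % i for i in range(200)]
    for nb_workers in (10, 25, 50, 100):
        elapsed = time_pool_size(urls, nb_workers)
        print(nb_workers, "workers:", round(elapsed, 3), "s")
```

With real downloads the curve usually flattens once bandwidth or the remote servers become the bottleneck, which is the point at which adding workers stops paying off.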

Upvotes: 3
