Reputation: 351
I have a link to a folder that contains an enormous number of files I want to download. I started downloading them one file at a time, but it's taking very long. Is there a way to spawn multiple threads or processes so that a batch of files is downloaded simultaneously? For example, process 1 downloads the first 20 files in the folder, process 2 downloads the next 20 at the same time, and so on.
Right now, I'm doing as follows:
import urllib, os
os.chdir('/directory/to/save/the/file/to')
url = 'http://urltosite/folderthathasfiles'
urllib.urlretrieve(url)
Upvotes: 4
Views: 6059
Reputation: 5613
You can define a function that takes the link and a list of filenames, loops through the list, and downloads each file. Then create a thread for each list and have it target that function. For example:
import os
import threading
import urllib

def download_files(url, filenames):
    # each thread downloads its batch of files sequentially
    for filename in filenames:
        urllib.urlretrieve(os.path.join(url, filename), filename)

# then create the lists and threads
url = 'test.url'
files = [['file1', 'file2', 'file3', ...], ['file21', 'file22', 'file23', ...], ...]
for lst in files:
    threading.Thread(target=download_files, args=(url, lst)).start()
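If you are on Python 3 (where `urllib.urlretrieve` became `urllib.request.urlretrieve`), a sketch of the same idea with `concurrent.futures.ThreadPoolExecutor` might look like this. The base URL, filenames, and batch size of 20 are placeholders taken from the question, not real values:

```python
import os
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urljoin
from urllib.request import urlretrieve

def chunks(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def download_batch(base_url, filenames, dest_dir):
    """Download one batch of files; runs inside a single worker thread."""
    for name in filenames:
        # urljoin is safer than os.path.join for building URLs
        urlretrieve(urljoin(base_url, name), os.path.join(dest_dir, name))

# Hypothetical usage -- replace with your real URL and file list:
# base_url = 'http://urltosite/folderthathasfiles/'
# all_files = ['file1', 'file2', ...]
# with ThreadPoolExecutor(max_workers=5) as pool:
#     for batch in chunks(all_files, 20):
#         pool.submit(download_batch, base_url, batch, '.')
```

The pool caps how many batches download at once, so a folder with thousands of files won't spawn thousands of threads.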
Upvotes: 4