Reputation: 9
I have written the code below to download files using pySmartDL. I would like to download more than one file at a time. I tried to implement this with multiprocessing, but the second process starts only when the first finishes. The code is below:
import time
from multiprocessing import Process
from pySmartDL import SmartDL, HashFailedException
def down():
    dest = '/home/faheem/Downloads'
    obj = SmartDL(url_100mb_file, dest, progress_bar=False, fix_urls=True)
    obj.start(blocking=False)
    #cnt=1
    while not obj.isFinished():
        print("Speed: %s" % obj.get_speed(human=True))
        print("Already downloaded: %s" % obj.get_dl_size(human=True))
        print("Eta: %s" % obj.get_eta(human=True))
        print("Progress: %d%%" % (obj.get_progress()*100))
        print("Progress bar: %s" % obj.get_progress_bar())
        print("Status: %s" % obj.get_status())
        print("\n"*2+"="*50+"\n"*2)
        print("SIZE=%s" % obj.filesize)
        time.sleep(2)
    if obj.isSuccessful():
        print("downloaded file to '%s'" % obj.get_dest())
        print("download task took %ss" % obj.get_dl_time(human=True))
        print("File hashes:")
        print(" * MD5: %s" % obj.get_data_hash('md5'))
        print(" * SHA1: %s" % obj.get_data_hash('sha1'))
        print(" * SHA256: %s" % obj.get_data_hash('sha256'))
        data = obj.get_data()
    else:
        print("There were some errors:")
        for e in obj.get_errors():
            print(str(e))
    return
if __name__ == '__main__':
    #jobs=[]
    #for i in range(5):
    print 'Link1'
    url_100mb_file = ['https://softpedia-secure-download.com/dl/45b1fc44f6bfabeddeb7ce766c97a8f0/58b6eb0f/100255033/software/office/Text%20Comparator%20(v1.2).rar']
    Process(target=down()).start()
    print 'link2'
    url_100mb_file = ['https://www.crystalidea.com/downloads/macsfancontrol_setup.exe']
    Process(target=down()).start()
Here link2 starts downloading only after link1 finishes, but I need both downloads to run concurrently. I would like to extend this approach to perform up to 10 downloads at a time. So is it good to use multiprocessing? Is there another, more memory-efficient method? I am a beginner at this, so kindly explain the answer simply. Regards
Upvotes: 0
Views: 2132
Reputation: 40884
Since your program is I/O-bound, you can use multi-processing or multi-threading.
Just in case, I'd like to point out the classic pattern for problems like this: have a queue of URLs from which worker processes/threads pull URLs for processing, and a status queue where the workers push their progress reports or errors.
A thread pool or a process pool greatly simplifies things compared to manual control.
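For illustration, here is a minimal sketch of the pool approach. The worker function download_one() and the fixed destination directory are my own inventions, not part of the original code, and the pool's map() stands in for a hand-rolled status queue by collecting each worker's result:
from multiprocessing.pool import ThreadPool
from pySmartDL import SmartDL

def download_one(url):
    # Worker: download a single URL and report the outcome back to the pool.
    obj = SmartDL(url, '/home/faheem/Downloads', progress_bar=False)
    obj.start(blocking=True)  # blocking is fine here; the pool provides the concurrency
    return url, obj.isSuccessful()

urls = ['http://example.com/a.zip', 'http://example.com/b.zip']
pool = ThreadPool(10)  # up to 10 concurrent downloads, as the question asks
for url, ok in pool.map(download_one, urls):
    print("%s -> %s" % (url, "ok" if ok else "failed"))
pool.close()
pool.join()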
Upvotes: 0
Reputation: 4866
You can also use the Python threading module. Here is a little snippet showing how it works:
import threading
import time

def func(i):
    time.sleep(i)
    print i

for i in range(1, 11):
    thread = threading.Thread(target=func, args=(i,))
    thread.start()
    print "Launched thread " + str(i)

print "Done"
Run this snippet and you will get a good idea of how it works.
Knowing that, you can actually run your own code this way, passing the URL to use in each thread as an argument to the function.
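For example, here is a rough sketch of that idea applied to the code in the question. I have rewritten down() to take its URL as a parameter (the destination path and URLs are copied from the question):
import threading
from pySmartDL import SmartDL

def down(url):
    obj = SmartDL(url, '/home/faheem/Downloads', progress_bar=False, fix_urls=True)
    obj.start(blocking=True)  # blocking inside a worker thread is fine
    print("downloaded file to '%s'" % obj.get_dest())

urls = ['https://softpedia-secure-download.com/dl/45b1fc44f6bfabeddeb7ce766c97a8f0/58b6eb0f/100255033/software/office/Text%20Comparator%20(v1.2).rar',
        'https://www.crystalidea.com/downloads/macsfancontrol_setup.exe']
threads = [threading.Thread(target=down, args=(u,)) for u in urls]
for t in threads:
    t.start()  # note: target=down, not target=down() -- the thread makes the call
for t in threads:
    t.join()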
Hope that helps
Upvotes: 1
Reputation: 2121
The particular library you're using appears to already support non-blocking downloads, so why not just do the following? Non-blocking means the download keeps running in the background while your code continues.
from time import sleep
from pySmartDL import SmartDL

links = [['https://softpedia-secure-download.com/dl/45b1fc44f6bfabeddeb7ce766c97a8f0/58b6eb0f/100255033/software/office/Text%20Comparator%20(v1.2).rar'],
         ['https://www.crystalidea.com/downloads/macsfancontrol_setup.exe']]
objs = [SmartDL(link, progress_bar=False) for link in links]
for obj in objs:
    obj.start(blocking=False)
while not all(obj.isFinished() for obj in objs):
    sleep(1)
Upvotes: 0