Evan Hagen

Reputation: 11

How can I download many small files quickly? (Not bandwidth limited)

I need to download ~50 CSV files in python. Based on the Google Chrome network stats, the download takes only 0.1 seconds, while the request takes about 7 seconds to process.

I am currently using headless Chrome to make the requests. I tried multithreading, but from what I can tell the browser doesn't support it (it can't start another request before the first one finishes processing). I don't think multiprocessing is an option either, since this script will be hosted on a virtual server.

My next idea is to use the requests module instead of headless Chrome, but I am having trouble connecting to the company network without a browser. Will that even work? Are there other solutions? Could I do something with multiple driver instances, or with multiple tabs on a single driver (a rough sketch of the multiple-driver idea follows my code below)? Thanks!

Here's my code:

from multiprocessing.pool import ThreadPool  # note: lowercase "multiprocessing"

driver = ChromeDriver()  # my helper that starts headless Chrome
Login(driver)            # my helper that logs the driver in

def getFile(item):
    # All threads share the single driver, so the requests end up serialized.
    driver.get(url.format(item))

updateSet = blah  # placeholder for the ~50 items to fetch
pool = ThreadPool(len(updateSet))
for item in updateSet:
    pool.apply_async(getFile, (item,))

pool.close()
pool.join()
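
For reference, here is roughly what I had in mind with multiple driver instances (just a sketch; ChromeDriver, Login, url, and updateSet are my own helpers/placeholders from above, and each worker thread keeps its own driver):

from multiprocessing.pool import ThreadPool
import threading

local = threading.local()  # one driver per worker thread

def getFileOwnDriver(item):
    # Lazily create and log in a driver the first time this thread runs a job.
    if not hasattr(local, 'driver'):
        local.driver = ChromeDriver()  # my helper from above
        Login(local.driver)            # my helper from above
    local.driver.get(url.format(item))

pool = ThreadPool(8)  # a handful of drivers, not one per file
pool.map(getFileOwnDriver, updateSet)
pool.close()
pool.join()

Would something like that actually run the downloads in parallel, or would the drivers still block each other?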

Upvotes: 1

Views: 144

Answers (1)

pauliesnug

Reputation: 205

For the requests module, maybe try setting the User-Agent header to a browser string such as Chrome's, e.g. Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36, so the server treats the request like it came from a browser.

Some example code:

import requests

url = 'SOME URL'

headers = {
    'User-Agent': 'user agent here',
    'From': '[email protected]'  # This is another valid field
}

response = requests.get(url, headers=headers)
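
If that gets you past the server, you can combine the same headers with a thread pool so the ~50 downloads overlap instead of running one after another. A rough sketch (URL_TEMPLATE, items, and the cookie value are placeholders for your real endpoint and login, so adjust them to however you authenticate):

import requests
from concurrent.futures import ThreadPoolExecutor

URL_TEMPLATE = 'SOME URL WITH {}'  # placeholder, e.g. a report URL ending in {}.csv
items = range(50)                  # placeholder ids for the ~50 files

headers = {
    'User-Agent': 'user agent here',
}
cookies = {'sessionid': 'value copied from the browser'}  # placeholder login cookie

def fetch(item):
    # Each call runs in its own worker thread, so the slow server-side
    # processing for different files overlaps in time.
    resp = requests.get(URL_TEMPLATE.format(item),
                        headers=headers, cookies=cookies, timeout=30)
    resp.raise_for_status()
    with open('{}.csv'.format(item), 'wb') as f:
        f.write(resp.content)

with ThreadPoolExecutor(max_workers=10) as pool:
    list(pool.map(fetch, items))  # list() forces completion and surfaces errors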

Upvotes: 1
