How to find if a file has been downloaded completely using python?

Question

We have a Python script that automates the batch processing of time-series image data downloaded from the internet. The current script requires all the data to be downloaded before it runs, which wastes time. We want to modify it by writing a scheduler that calls the script whenever a single file has been completely downloaded. How can we tell, using Python, that a file has been downloaded completely?

Jack Taylor · Accepted Answer

If you download the file with Python, then you can just do the image processing operation after the file download operation finishes. An example using requests:

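import requests
import mymodule  # The module containing your custom image-processing function

for img in ("foo.png", "bar.png", "baz.png"):
    # requests.get() returns only after the whole response body has been read
    response = requests.get("http://www.example.com/" + img)
    # Raise an exception on 4xx/5xx so an error page is never processed as an image
    response.raise_for_status()
    image_bytes = response.content
    mymodule.process_image(image_bytes)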

However, with the sequential approach above you will spend a lot of time waiting for responses from the remote server. To speed this up, you can download and process multiple files at once using asyncio and aiohttp. There's a good introduction to downloading files this way in Paweł Miech's blog post Making 1 million requests with python-aiohttp. The code you need will look something like the example at the bottom of that blog post (the one with the semaphore).
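The semaphore-bounded pattern can be sketched with only the standard library. Note that this is not the blog post's aiohttp code: it swaps in blocking urllib.request calls run in worker threads via asyncio.to_thread (Python 3.9+), which gives the same "several downloads in flight, process each one as soon as it finishes" behaviour without a third-party dependency. The names fetch_bytes, download_and_process, and max_concurrent are made up for this illustration.

```python
import asyncio
import urllib.request


def fetch_bytes(url):
    """Blocking download; returns only after the whole body has been read."""
    with urllib.request.urlopen(url) as response:
        return response.read()


async def download_and_process(urls, process, fetch=fetch_bytes, max_concurrent=5):
    """Download at most max_concurrent URLs at once; process each when it is done."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def handle(url):
        async with semaphore:
            # Run the blocking fetch in a worker thread so other downloads proceed.
            data = await asyncio.to_thread(fetch, url)
        # The download has finished at this point, so processing can start.
        return process(data)

    # gather() returns the results in the same order as the input URLs.
    return await asyncio.gather(*(handle(url) for url in urls))


# Example call (assumes mymodule.process_image from the snippet above):
# results = asyncio.run(download_and_process(urls, mymodule.process_image))
```

In this sketch process runs on the event loop thread, so a slow image-processing function will delay the start of new downloads; moving it into asyncio.to_thread as well is one possible way around that.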

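If the download has to stay in a separate program and a scheduler only watches a directory, one common convention (an assumption here, not something the question's setup is known to use) is to write each file under a temporary name and rename it once the last byte is on disk. The scheduler then ignores anything that still carries the temporary suffix. The function names and the ".part" suffix below are hypothetical.

```python
import os


def save_when_complete(chunks, final_path):
    """Write chunks to '<final_path>.part', then rename once everything is written."""
    part_path = final_path + ".part"  # hypothetical suffix marking an unfinished file
    with open(part_path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
    # On POSIX systems this rename is atomic when both paths are on the same
    # filesystem, so the final name only ever refers to a fully written file.
    os.replace(part_path, final_path)


def completed_files(directory):
    """Return the names of files in directory that are not marked as partial."""
    return sorted(
        name for name in os.listdir(directory)
        if not name.endswith(".part")
    )
```

A scheduler could call completed_files on each polling pass and hand any name it has not seen before to the processing script.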