JeyJ

Reputation: 4090

python download file into memory and handle broken links

I'm using the following code to download a file into memory:

        if 'login_before_download' in obj:
            # Log in first so the session carries the cookie needed for the download.
            with requests.Session() as session:
                session.post(obj['login_before_download'], verify=False)
                request = session.get(obj['download_link'], allow_redirects=True)

        else:
            request = requests.get(obj['download_link'], allow_redirects=True)

        print("downloaded {} into memory".format(obj['download_link']))
        file_content = request.content

obj is a dict that contains the download_link key and, optionally, a login_before_download key that indicates I need to log in to the page first to create a cookie.

The problem with my code is that if the URL is broken and there isn't any file to download, I still get the HTML content of the page instead of detecting that the download failed.

Is there any way to detect that the file wasn't downloaded?

Upvotes: 0

Views: 204

Answers (1)

JeyJ

Reputation: 4090

I found the following solution at this URL. It sends a HEAD request and treats a text or HTML Content-Type as "not a file":

import requests

def is_downloadable(url):
    """
    Does the url contain a downloadable resource
    """
    # A HEAD request fetches only the headers, not the body.
    h = requests.head(url, allow_redirects=True)
    header = h.headers
    # The header may be missing, so fall back to an empty string.
    content_type = header.get('content-type') or ''
    if 'text' in content_type.lower():
        return False
    if 'html' in content_type.lower():
        return False
    return True

print(is_downloadable('https://www.youtube.com/watch?v=9bZkp7q19f0'))
# >> False
print(is_downloadable('http://google.com/favicon.ico'))
# >> True
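
The same Content-Type check can probably also be applied to the response of the actual GET request, so the logged-in session from the question is reused and no second request is needed. Below is a minimal sketch, not a definitive implementation. `DownloadError` and `extract_file_content` are hypothetical names I made up for illustration, and the function assumes it receives a `requests.Response` (it only reads `status_code`, `headers`, `url` and `content`). Note that this heuristic also rejects legitimate text files such as CSV, so it may need adjusting.

```python
class DownloadError(Exception):
    """Raised when a response does not look like a real file download."""


def extract_file_content(response):
    """Return the body of `response`, or raise DownloadError.

    `response` is assumed to be a requests.Response, for example the
    result of session.get(obj['download_link'], allow_redirects=True).
    """
    # Broken links often answer with a 4xx or 5xx status code.
    if response.status_code >= 400:
        raise DownloadError(
            "{} returned HTTP {}".format(response.url, response.status_code)
        )

    # Some sites serve an HTML error or login page with status 200 instead,
    # so also reject text/HTML content types (same heuristic as above).
    content_type = (response.headers.get('content-type') or '').lower()
    if 'text' in content_type or 'html' in content_type:
        raise DownloadError(
            "{} returned content type {!r}".format(response.url, content_type)
        )

    return response.content
```

In the question's code, the last line would then become `file_content = extract_file_content(request)`, wrapped in a `try`/`except DownloadError` block to handle the broken link.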

Upvotes: 1
