Reputation: 4090
I'm using the following code to download a file into memory :
if 'login_before_download' in obj.keys():
with requests.Session() as session:
session.post(obj['login_before_download'], verify=False)
request = session.get(obj['download_link'], allow_redirects=True)
else:
request = requests.get(obj['download_link'], allow_redirects=True)
print("downloaded {} into memory".format(obj[download_link_key]))
file_content = request.content
obj is a dict that contains the download_link and another key that indicates if I need to login to the page to create a cookie.
The problem with my code is that if the url is broken and there isnt any file to download I'm still getting the html content of the page instead of identifying that the download failed.
Is there any way to identify that the file wasnt downloaded ?
Upvotes: 0
Views: 204
Reputation: 4090
I found the following solution in this url :
import requests
def is_downloadable(url):
"""
Does the url contain a downloadable resource
"""
h = requests.head(url, allow_redirects=True)
header = h.headers
content_type = header.get('content-type')
if 'text' in content_type.lower():
return False
if 'html' in content_type.lower():
return False
return True
print is_downloadable('https://www.youtube.com/watch?v=9bZkp7q19f0')
# >> False
print is_downloadable('http://google.com/favicon.ico')
# >> True
Upvotes: 1