David542

Reputation: 110093

Check if an s3 url exists

I have the following url, which exists:

https://s3-us-west-1.amazonaws.com/premiere-avails/458ca3ce-c51e-4f69-8950-7af3e44f0a3d__chapter025.jpg

But this one does not:

https://s3-us-west-1.amazonaws.com/premiere-avails/459ca3ce-c51e-4f69-8950-7af3e44f0a3d__chapter025.jpg

Is there a way to check whether a URL exists without downloading the file (it may be a 1 GB file)? Note that I do not want to use boto to check whether the key exists; I would like to use a plain HTTP request.

Upvotes: 3

Views: 3631

Answers (3)

Ramon Orraca

Reputation: 111

I'd use the requests Python library. The function would look like this:

import requests

def check_url(url):
    """
    Checks if the S3 link exists.

    Parameters:
        url (str): link to check if exists.

    Returns:
        bool: True if exists, False otherwise
    """
    # HEAD fetches only the status line and headers, so the file body is never downloaded.
    response = requests.head(url)
    return response.status_code == 200

The requests.head() function returns a requests.Response object, which exposes the status code, the headers, and more. To accept any status code below 400 rather than only 200, return response.ok instead of comparing response.status_code == 200. requests.head() also accepts keyword arguments such as timeout; see the requests documentation for details.
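As a rough sketch of the timeout and ok suggestions combined: the function name check_url_with_timeout, the 5-second default, and the choice to follow redirects are my own assumptions, not part of the answer above. It also treats any network failure as "does not exist", which may or may not suit your case.

```python
import requests


def check_url_with_timeout(url, timeout=5.0):
    """Return True if a HEAD request to url yields a status code below 400.

    Hypothetical helper: the name, the default timeout, and the decision to
    report network errors as False are illustrative assumptions.
    """
    try:
        # requests.head() does not follow redirects unless asked to.
        response = requests.head(url, timeout=timeout, allow_redirects=True)
    except requests.exceptions.RequestException:
        # Covers timeouts, DNS failures, refused connections, and similar.
        return False
    # Response.ok is True for any status code below 400.
    return response.ok
```

Calling check_url_with_timeout(url) with each of the two URLs from the question should then give a True/False answer without transferring the file body.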

Upvotes: 1

elyase

Reputation: 40963

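Try this (written for Python 3; in Python 2 the modules were named httplib and urlparse):

import http.client
from urllib.parse import urlparse

def url_exists(url):
    parts = urlparse(url)
    # Pick the connection class from the scheme; the S3 URLs in the question are https.
    if parts.scheme == 'https':
        conn = http.client.HTTPSConnection(parts.netloc)
    else:
        conn = http.client.HTTPConnection(parts.netloc)
    # HEAD asks for the status line and headers only, so no body is downloaded.
    conn.request('HEAD', parts.path or '/')
    status = conn.getresponse().status
    conn.close()
    return status < 400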

Upvotes: 7

garnaat

Reputation: 45846

You could use curl. The --head option sends a HEAD request rather than a GET, so the response contains only the status line and headers and never the body, even if the object exists.

curl --head https://s3-us-west-1.amazonaws.com/premiere-avails/458ca3ce-c51e-4f69-8950-7af3e44f0a3d__chapter025.jpg
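If you want the same information from Python instead of the shell, something like the following might work. It is only a sketch using the standard library: head_info is a hypothetical name and the 5-second timeout is an assumption. It returns the status code plus the Content-Length header when the server sends one, so you can see the size of a large file without fetching it.

```python
import urllib.error
import urllib.request


def head_info(url, timeout=5.0):
    """Send a HEAD request and return (status_code, content_length_or_None).

    Hypothetical helper for illustration. Connection-level failures
    (urllib.error.URLError) are not caught and propagate to the caller.
    """
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            status, headers = resp.status, resp.headers
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, but the exception still carries
        # the status code and the response headers.
        status, headers = err.code, err.headers
    length = headers.get("Content-Length")
    return status, (int(length) if length is not None else None)
```

A status below 400 means the object is reachable; for the second URL in the question you would expect an error status instead.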

Upvotes: 1
