TieDad

Reputation: 9899

Is there a Python method to validate the existence of a file or URL?

I'm writing a Python script that needs to validate the existence of a file. The file can be given either as a full path like /home/xxx/file.txt or as a URL like http://company.com/xxx/file.txt.

Is there a Python method that can validate existence for these different kinds of paths?

Upvotes: 3

Views: 3319

Answers (3)

mhawke

Reputation: 87084

What do you plan to do with the file?

If you need to use the file, you might be better off opening it, lest it disappear before you use it. There can be security issues if you test first and then open, as the two operations cannot be made atomic. It's possible that the file could be removed, created, or otherwise interfered with before your code opens it.

If you simply want to know whether a path exists at the time you test for it, use os.path.exists(). Otherwise, if you want to actually do something with the file, call open() on it and handle the exception if that fails.
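As a rough sketch of the difference between the two approaches: the helper name read_text_if_present is made up for illustration, and the demo uses a temporary directory rather than any real path.

```python
import os
import tempfile


def read_text_if_present(path):
    """Return the file's text, or None if the file does not exist.

    Hypothetical helper: it opens the file directly instead of
    checking first, so there is no gap between the check and the use.
    """
    try:
        with open(path, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return None


# Demo in a temporary directory so the example does not depend on real paths.
with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "file.txt")
    missing = os.path.join(tmp, "missing.txt")
    with open(present, "w", encoding="utf-8") as f:
        f.write("hello")

    exists_result = os.path.exists(present)         # a point-in-time answer only
    present_result = read_text_if_present(present)  # opens and reads in one step
    missing_result = read_text_if_present(missing)

print(exists_result)   # True
print(present_result)  # hello
print(missing_result)  # None
```

Other failures, such as PermissionError, are deliberately left to propagate here, since "exists but is unreadable" is usually a different situation from "does not exist".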

For URLs you need to access the resource: either GET it with urlopen() or use requests. You could also send a HEAD request to determine whether a resource exists without downloading its content. This is useful when checking a resource that returns a lot of data, like an image or music file. The requests module makes that easy:

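import requests

url = 'http://company.com/xxx/file.txt'  # example URL from the question
r = requests.head(url, allow_redirects=True)
if r.status_code == 200:
    # resource apparently exists
    print('resource apparently exists')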

Passing allow_redirects=True is necessary because requests.head() does not follow redirects by default, e.g.

import requests

url = 'http://www.google.com'
r = requests.head(url)
print(r.status_code)
# 302
r = requests.head(url, allow_redirects=True)
print(r.status_code)
# 200

Upvotes: 6

ShadowRanger

Reputation: 155448

I'm answering the question you didn't ask, and telling you: Don't do this.

You rarely want to just validate existence, because usually, if it exists, you want to use it. Checking, then opening is a pattern open to race conditions (you check, the file exists, some other program deletes it, you try to open it for reading, kaboom). Typically, the correct way to check if a file (or any other resource you wish to use) is available is to try to open it, and handle the exception if it turns out not to exist.

The general pattern is called EAFP (easier to ask for forgiveness than permission), and it's much safer for race-prone activities like this than the opposite pattern you're trying to use, LBYL (look before you leap).

So if you want to check if a file exists, call open on it. If you want to check if a URL exists, try to urlopen it. This does more than just validate existence; it also answers important questions like "is it a file-like thing?" and "do I have permission to read the contents?", which otherwise require checking multiple flags and can still give you the wrong answer if you ask the question incorrectly (e.g. it rarely matters if it's a file, as long as you can read data from it, but checking isfile excludes stuff like, say, named pipes created by bash process substitution that mostly act like files).
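A minimal sketch of that idea for both kinds of input, assuming only http and https URLs need special handling and that everything else is a local path. The names open_location and read_or_none are invented for this example, and the 10 second timeout is an arbitrary choice.

```python
from urllib.parse import urlparse
from urllib.request import urlopen


def open_location(location, timeout=10):
    """Open a local path or an http(s) URL and return a binary file-like object.

    Hypothetical helper: nothing is checked up front, the caller
    handles whatever exception the open attempt raises.
    """
    if urlparse(location).scheme.lower() in ("http", "https"):
        # Raises urllib.error.URLError (or its subclass HTTPError) on failure.
        return urlopen(location, timeout=timeout)
    # Raises FileNotFoundError, PermissionError, etc. on failure.
    return open(location, "rb")


def read_or_none(location):
    """Return the content as bytes, or None if the resource cannot be opened."""
    try:
        with open_location(location) as stream:
            return stream.read()
    except OSError as exc:
        # URLError and HTTPError are OSError subclasses in Python 3,
        # so this one clause covers both the file and the URL case.
        print(f"cannot read {location}: {exc}")
        return None
```

A caller would then write something like data = read_or_none('/home/xxx/file.txt') and test the result against None, instead of asking whether the file exists and opening it afterwards.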

Upvotes: 0

TheDmOfJoes

Reputation: 140

Here's what I've used in the past to check whether a URL exists. If you're just looking for a file, use the methods suggested in the comments on your question.

import requests

# A plain GET downloads the whole body; a 200 status means the URL exists.
request = requests.get('http://company.com/')
if request.status_code == 200:
    print('We are dandy.')
else:
    print('No existe.')

Upvotes: 1
