redrah

Reputation: 1204

How to determine if file is remote in Python

I want to determine whether a file is located on a local hard drive or on a drive mounted from the network in OS X. I'm looking to write code a bit like the following:

file_name = '/Somewhere/foo.bar'
if is_local_file(file_name):
    do_local_thing()
else:
    do_remote_thing()

I've not been able to find anything that works like is_local_file() in the example above. Ideally I'd like to use an existing function if there is one, but failing that, how could I implement it myself? The best I've come up with is the following, but it treats mounted DMGs as though they're remote, which isn't what I want. I also suspect I might be reinventing the wheel!

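import os

def is_local_file(path):
    # Treat the file as remote if any directory on its path is a mount point.
    # Known limitation: a mounted disk image (.dmg) also counts as remote.
    parts = path.split('/')[1:]
    for index in range(1, len(parts) + 1):
        if os.path.ismount('/' + '/'.join(parts[:index])):
            return False
    return True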

I have two functions which generate checksums, one of which uses multiprocess which incurs an overhead to start off with but which is faster for large files if the network connection is slow.

Upvotes: 7

Views: 2928

Answers (2)

ecatmur

Reputation: 157374

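You could use your existing code (or try the solution at "How to find the mountpoint a file resides on?") to find the mount point of the file, then read /proc/mounts to find the device and filesystem. Note that /proc/mounts is Linux-specific; OS X has no /proc, so there you would parse the output of the mount command instead. Each line of /proc/mounts has the format: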

device mountpoint filesystem options...

You can use the filesystem field to automatically exclude known network filesystems, e.g. afs, cifs, nfs, smbfs. Otherwise you can look at the device. As a basic heuristic: if the device is a block device node (stat.S_ISBLK) or the literal string "none", the filesystem is probably local; if it is in URI style (host:/path), it is probably remote; if it is a regular file, the filesystem is a disk image and you'll need to recurse to find out where that file itself lives.
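The heuristic above could be sketched roughly as follows. This assumes a Linux-style mounts table (device, mount point, filesystem per line); the helper names find_mount_point, parse_mounts and classify, the NETWORK_FILESYSTEMS set, and the "//host/share" check are illustrative assumptions rather than an existing API, and recursion into disk images is left out.

```python
import os
import stat

# Filesystem types treated as remote. This list is an assumption; extend as needed.
NETWORK_FILESYSTEMS = {"afs", "cifs", "nfs", "nfs4", "smbfs"}


def find_mount_point(path):
    """Walk up from path until os.path.ismount() reports a mount point."""
    path = os.path.realpath(path)
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path


def parse_mounts(text):
    """Map mount point -> (device, filesystem) from /proc/mounts-style text.

    Later lines override earlier ones, matching the most recent mount.
    Mount points containing spaces appear escaped (\\040) and are not decoded here.
    """
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            device, mount_point, filesystem = fields[:3]
            mounts[mount_point] = (device, filesystem)
    return mounts


def classify(device, filesystem):
    """Return 'remote', 'local', 'image' or 'unknown' for one mounts entry."""
    if filesystem in NETWORK_FILESYSTEMS:
        return "remote"
    if device == "none":
        return "local"
    try:
        mode = os.stat(device).st_mode
    except OSError:
        mode = None  # not a path on this machine, e.g. host:/path
    if mode is not None and stat.S_ISBLK(mode):
        return "local"
    if mode is not None and stat.S_ISREG(mode):
        return "image"  # disk image: the caller would recurse on `device`
    if ":" in device or device.startswith("//"):
        return "remote"
    return "unknown"
```

To use it, read the mounts table once, look up the entry for find_mount_point(path), and pass its device and filesystem to classify(). Anything classified as "unknown" is probably best handled by falling back to whichever code path is safer for you.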

Upvotes: 1

msw

Reputation: 43497

"I have two functions which generate checksums, one of which uses multiprocess which incurs an overhead to start off with but which is faster for large files if the network connection is slow."

Then what you are really asking is_local_file() to tell you is only a proxy for "will file access be slower than I'd like?". As a proxy, it is a relatively poor indicator of what you actually want to know, for all the confounding reasons noted above (local but virtualized disks, remote but screamingly fast NAS, etc.).

Since you are asking a question that is nearly impossible to answer programmatically, it is better to provide an option, as with make's -j (--jobs) option, which explicitly says "parallelize this run".
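A minimal sketch of such an option using argparse follows. The option names mirror make's, and build_parser and choose_checksum are hypothetical helpers; the two strategy names merely stand in for the asker's serial and multiprocessing checksum functions.

```python
import argparse


def build_parser():
    """Command-line parser with an explicit, make-style parallelism option."""
    parser = argparse.ArgumentParser(description="Checksum a file.")
    parser.add_argument("path", help="file to checksum")
    parser.add_argument(
        "-j", "--jobs", type=int, default=1,
        help="number of worker processes; 1 selects the serial version",
    )
    return parser


def choose_checksum(jobs):
    """Return which checksum strategy to use for the requested job count."""
    return "parallel" if jobs > 1 else "serial"
```

The user, who knows whether the volume is slow, then decides: running the script with -j 4 selects the parallel version, and omitting the option keeps the serial one.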

Upvotes: 2
