Reputation: 5534
I recently asked this question and got a wonderful answer to it involving the os.walk
command. My script is using this to search through an entire drive for a specific folder using for root, dirs, files in os.walk(drive):
. Unfortunately, on a 600 GB drive, this takes about 10 minutes.
Is there a better way to invoke this or a more efficient command to be using? Thanks!
Upvotes: 1
Views: 1271
Reputation: 5475
scandir.walk(path)
gives 2-20 times faster results then the os.walk(path)
.
you can use this module pip install scandir
here is docs for scandir
Upvotes: 0
Reputation: 365975
If you're just looking for a small constant improvement, there are ways to do better than os.walk
on most platforms.
In particular, walk
ends up having to stat
many regular files just to make sure they're not directories, even though the information is (Windows) or could be (most *nix systems) already available from the lower-level APIs. Unfortunately, that information isn't available at the Python level… but you can get to it via ctypes
or by building a C extension library, or by using third-party modules like scandir
.
This may cut your time to somewhere from 10% to 90%, depending on your platform and the details of your directory layout. But it's still going to be a linear search that has to check every directory on your system. The only way to do better than that is to access some kind of index. Your platform may have such an index (e.g., Windows Desktop Search or Spotlight); your filesystem may as well (but that will require low-level calls, and may require root/admin access), or you can build one on your own.
Upvotes: 3