rob

Reputation: 329

What is the fastest way to check whether a directory is empty in Python?

I work on a Windows machine and want to check whether a directory on a network path is empty.

The first thing that came to mind was calling os.listdir() and checking whether the result has length 0,

i.e.

import os

def dir_empty(dir_path):
    return len(os.listdir(dir_path)) == 0

Because this is a network path where I do not always have good connectivity and because a folder can potentially contain thousands of files, this is a very slow solution. Is there a better one?

Upvotes: 7

Views: 7227

Answers (6)

NOUSER

Reputation: 21

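On Windows there is PathIsDirectoryEmptyA in shlwapi.dll (with a wide-character variant, PathIsDirectoryEmptyW). We can use it to check whether a folder is empty.

def is_dir_empty(path: str) -> bool:
    import ctypes
    # The function returns a Win32 BOOL rather than an HRESULT, so load the DLL with WinDLL.
    shlwapi = ctypes.WinDLL('shlwapi')
    # The W variant takes a Python str directly, avoiding ANSI code-page encoding issues.
    return bool(shlwapi.PathIsDirectoryEmptyW(path))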
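
Since this call exists only on Windows, code that also runs elsewhere needs a fallback. The following is a minimal sketch, not part of the original answer: the name is_dir_empty_portable is hypothetical, it assumes the wide-character variant PathIsDirectoryEmptyW, and it falls back to os.scandir on other platforms.

```python
import os
import sys
import tempfile

def is_dir_empty_portable(path: str) -> bool:
    """Hypothetical helper: shlwapi on Windows, os.scandir elsewhere."""
    if sys.platform == "win32":
        import ctypes
        func = ctypes.WinDLL("shlwapi").PathIsDirectoryEmptyW
        func.argtypes = [ctypes.c_wchar_p]  # LPCWSTR
        func.restype = ctypes.c_int         # Win32 BOOL
        return bool(func(path))
    # Non-Windows fallback: ask the lazy iterator for its first entry only.
    with os.scandir(path) as entries:
        return next(entries, None) is None

# Usage: check a temporary directory before and after creating a file.
with tempfile.TemporaryDirectory() as tmp:
    empty_before = is_dir_empty_portable(tmp)
    open(os.path.join(tmp, "a.txt"), "w").close()
    empty_after = is_dir_empty_portable(tmp)

print(empty_before, empty_after)  # True False
```

Note that the two branches treat bad input differently: the Windows call reports FALSE for a path that is not a directory, while os.scandir raises an OSError.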

Upvotes: 2

rob

Reputation: 329

The fastest solution I have found so far:

import os

def dir_empty(dir_path):
    return not any(True for _ in os.scandir(dir_path))

Or, as proposed in the comments below:

def dir_empty(dir_path):
    return not next(os.scandir(dir_path), None)

On the slow network I was working on, this took seconds instead of minutes (the os.listdir() version took minutes). It is faster because os.scandir() returns a lazy iterator and any() stops at the first truthy item, so the full directory listing never has to be built.
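
The short-circuit behaviour of any() can be checked without touching the filesystem. This is an illustrative sketch only: the counting generator and the pulled counter are hypothetical helpers that stand in for os.scandir and record how many items were actually requested.

```python
def counting(iterable, counter):
    """Yield items from iterable and record how many were pulled."""
    for item in iterable:
        counter[0] += 1
        yield item

pulled = [0]
# Same shape as the dir_empty check, with range() standing in for os.scandir().
result = any(True for _ in counting(range(100_000), pulled))

print(result, pulled[0])  # True 1
```

Only one of the 100,000 items is consumed, which is why the scandir version does not slow down as the directory grows.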

Upvotes: 9

David Pi

Reputation: 172

Since the OP is asking about the fastest way, I expected that using os.scandir and returning as soon as the first entry is found would be the fastest. os.scandir returns an iterator, so we avoid creating a whole list just to check whether it is empty.

The test directory contains about 100 thousand files:

from pathlib import Path
import os

path = 'jav/av'
len(os.listdir(path))

>>> 101204

Then start our test:

def check_empty_by_scandir(path):
    with os.scandir(path) as it:
        return not any(it)
    
def check_empty_by_listdir(path):
    return not os.listdir(path)

def check_empty_by_pathlib(path):
    return not any(Path(path).iterdir())


%timeit check_empty_by_scandir(path)
>>> 179 µs ± 878 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

%timeit check_empty_by_listdir(path)
>>> 28 ms ± 185 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit check_empty_by_pathlib(path)
>>> 27.6 ms ± 140 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

As we can see, check_empty_by_listdir and check_empty_by_pathlib are about 155 times slower than check_empty_by_scandir. The results for os.listdir() and Path.iterdir() are nearly identical because, in the Python version tested, Path.iterdir() calls os.listdir() internally, which builds the whole list in memory.
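
The %timeit lines above need IPython. As a rough sketch for reproducing the comparison in plain Python, the following uses the standard timeit module on a temporary directory. The file count (2000), the repeat count (50), and the make_files helper are arbitrary assumptions for illustration, and the absolute numbers will differ from the ones reported above.

```python
import os
import tempfile
import timeit
from pathlib import Path

def check_empty_by_scandir(path):
    with os.scandir(path) as it:
        return not any(it)

def check_empty_by_listdir(path):
    return not os.listdir(path)

def make_files(directory, count):
    """Hypothetical helper: fill the directory with empty files."""
    for i in range(count):
        Path(directory, f"file_{i}.txt").touch()

REPEATS = 50
results = {}
with tempfile.TemporaryDirectory() as tmp:
    make_files(tmp, 2000)
    for func in (check_empty_by_scandir, check_empty_by_listdir):
        # timeit returns the total time in seconds for REPEATS calls.
        results[func.__name__] = timeit.timeit(lambda: func(tmp), number=REPEATS)

for name, seconds in results.items():
    print(f"{name}: {seconds / REPEATS * 1e6:.1f} µs per call")
```

On a local disk the gap is usually smaller than on a slow network share, where every extra batch of directory entries costs a round trip.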

Additionally, as others have pointed out, checking st_size from os.stat is not an option: on Linux it returns 4096 for an empty directory.
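
To see this on a given system, the sketch below (not from the original answer) prints the size that os.stat reports for a freshly created empty directory next to the result of a scandir-based check. The reported size depends on the platform and filesystem, so no particular value is assumed here.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Size reported for the directory itself, not for its contents.
    reported_size = os.stat(tmp).st_size
    with os.scandir(tmp) as entries:
        really_empty = next(entries, None) is None

print(f"st_size={reported_size}, empty according to scandir: {really_empty}")
```

If reported_size is non-zero while really_empty is True, a check based on st_size == 0 would give the wrong answer on that system.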

Upvotes: 4

MCC

Reputation: 119

Using os.stat:

is_empty = os.stat(dir_path).st_size == 0

Using Python's pathlib:

from pathlib import Path

is_empty = Path(dir_path).stat().st_size == 0

Upvotes: -1

user459872

Reputation: 24582

From Python 3.4 onwards you can use Path.iterdir() from pathlib, which yields path objects for the directory contents:

>>> from pathlib import Path
>>>
>>> def dir_empty(dir_path):
...     path = Path(dir_path)
...     # next() returns the None default only when the iterator yields nothing.
...     return next(path.iterdir(), None) is None

Upvotes: 6

Amadan

Reputation: 198324

listdir gives a list. scandir gives an iterator, which may be more performant.

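import os

def dir_empty(dir_path):
    # The with block closes the scandir iterator even though it is not exhausted.
    with os.scandir(dir_path) as entries:
        try:
            next(entries)
            return False
        except StopIteration:
            return True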

Upvotes: 3
