user5004049

Reputation: 761

Python fnmatch filter for patterns that include subdirectory

I'm trying to use fnmatch.filter to find files that match a given pattern. However, if my pattern is something like subdir/t*.txt, my code

for path, subdirs, files in os.walk(root):
    for fn in fnmatch.filter(files, pattern):
        print('found match')

will never reach the print statement. From what I can see, it'll never find a match because files is only the basename, and will not include subdirectories. Is there a good way to match patterns that include a subdirectory? It should still work for patterns like *.txt though.

The only solutions I have been able to come up with are clunky, with lots of if statements and extra for loops (i.e., checking whether the pattern is a path, then building all the possible paths from the subdirectories, then checking them with fnmatch). I'm wondering if there is an elegant solution. Thanks in advance.

Upvotes: 1

Views: 3539

Answers (2)

martineau

Reputation: 123393

This is possible by having fnmatch.filter() compare the full path of each file to the pattern when the pattern contains subdirectories, provided you add a leading wildcard character to the pattern, as shown. Since doing this requires significantly more processing, it's probably worth first checking whether it's necessary, as the code below does.

import fnmatch
import os

root = ...
pattern = '*/subdir/subdir2/t*.txt'  # note leading wildcard character

if not os.path.dirname(pattern):  # no subdirectories in pattern
    # no need to compare full paths
    for path, subdirs, files in os.walk(root):
        for fn in fnmatch.filter(files, pattern):
            print('found match - path: "{}", fn: "{}"'.format(path, fn))
else:
    for path, subdirs, files in os.walk(root):
        # must compare full file paths to pattern when it contains directories
        filepaths = (os.path.join(path, file) for file in files)  # generator
        for fp in fnmatch.filter(filepaths, pattern):
            fn = os.path.basename(fp)
            print('match found - path: "{}", fn: "{}"'.format(path, fn))
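
To illustrate why the leading wildcard matters, here is a minimal sketch with made-up, POSIX-style paths (the paths and the variable names are assumptions for illustration only). fnmatch patterns are anchored at both ends, and `*` also matches path separators, so `*/` absorbs whatever directories precede subdir. It uses fnmatch.fnmatchcase to skip OS-specific case normalization.

```python
import fnmatch

# Hypothetical full paths, as os.path.join(path, file) might produce them.
paths = [
    "root/subdir/subdir2/test.txt",
    "root/other/subdir/subdir2/t1.txt",
    "root/subdir/test.txt",
    "root/subdir/subdir2/notes.md",
]
pattern = "*/subdir/subdir2/t*.txt"  # leading wildcard absorbs "root", "root/other", ...

# The pattern must match the whole string, not just a suffix of it.
matches = [p for p in paths if fnmatch.fnmatchcase(p, pattern)]

# Without the leading wildcard, the same pattern fails on a full path.
no_wildcard = fnmatch.fnmatchcase(paths[0], "subdir/subdir2/t*.txt")
```

One caveat: because `*` crosses separators, `t*.txt` here would also match something like tmp/deeper/x.txt under subdir2.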

Upvotes: 1

user5004049

Reputation: 761

I realized martineau's answer was still not quite what I needed. It works great when I have a pattern like subdir/pattern, but not when I have something like subdir/subdir2/pattern: subdir = os.path.split(path)[1] only yields the last directory component (in this case subdir2), while subdir_pat is subdir/subdir2, so it won't find any matches.

What I ended up doing was changing this line

subdir = os.path.split(path)[1]  # isolate subdirectory name

to the following:

subdir = path.replace(root, '')  # I'm not too sure what is a better way to replace paths..

since I figured that when I check for subdirectories, it will be under root.
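
One possible alternative to string replacement is os.path.relpath, which returns the current directory relative to root without a leading separator and does not depend on where the root string occurs in the path. The sketch below is only an illustration under that assumption: find_matches is a hypothetical helper, not code from either answer, and it matches each file's path relative to root against the pattern.

```python
import fnmatch
import os


def find_matches(root, pattern):
    """Yield file paths (relative to root) that match pattern.

    Hypothetical helper for illustration only.
    """
    for path, subdirs, files in os.walk(root):
        # Current directory relative to root; os.curdir (".") for root itself.
        subdir = os.path.relpath(path, root)
        for fn in files:
            rel = fn if subdir == os.curdir else os.path.join(subdir, fn)
            # Normalize to forward slashes so patterns read the same everywhere.
            rel = rel.replace(os.sep, "/")
            if fnmatch.fnmatch(rel, pattern):
                yield rel
```

With this approach a pattern such as subdir/subdir2/t*.txt is anchored at root, so no leading wildcard is needed, and a pattern like *.txt still matches files at any depth because `*` also matches path separators.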

Any further feedback is appreciated.

Upvotes: 0
