Reputation: 761
I'm trying to use fnmatch.filter() to find files matching a pattern. However, if my pattern is something like subdir/t*.txt, my code
for path, subdirs, files in os.walk(root):
    for fn in fnmatch.filter(files, pattern):
        print('found match')
will never reach the print statement. From what I can see, it never finds a match because files contains only base names, with no directory component. Is there a good way to match patterns that include a subdirectory? It should still work for patterns like *.txt, though.
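The mismatch can be reproduced in isolation. A minimal sketch with made-up file names: fnmatch.filter() matches the pattern against each whole name, and the names os.walk() yields in files never contain a directory part.

```python
import fnmatch

# Hypothetical names, as os.walk() would yield them in `files` (base names only).
files = ['test.txt', 'notes.txt', 'data.csv']

# A bare pattern is matched against the base names directly.
plain_matches = fnmatch.filter(files, '*.txt')

# A pattern with a directory part cannot match a bare base name.
subdir_matches = fnmatch.filter(files, 'subdir/t*.txt')

print(plain_matches)   # ['test.txt', 'notes.txt']
print(subdir_matches)  # []
```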
The only solutions I have been able to come up with are clunky, with lots of if statements and extra for loops (i.e., checking whether the pattern is a path, then building all the possible paths from the subdirectories, then checking them with fnmatch). I'm wondering if there is a more elegant solution. Thanks in advance.
Upvotes: 1
Views: 3539
Reputation: 123393
It's possible by making fnmatch.filter() compare the full path of each file to the pattern when the pattern contains subdirectories, provided you add a leading wildcard character to the pattern as shown. Since doing this requires significantly more processing, it's probably worth checking first whether it's necessary, also as shown.
import fnmatch
import os
root = ...
pattern = '*/subdir/subdir2/t*.txt'  # note leading wildcard character
if not os.path.dirname(pattern):  # no subdirectories in pattern
    # no need to compare full paths
    for path, subdirs, files in os.walk(root):
        for fn in fnmatch.filter(files, pattern):
            print('found match - path: "{}", fn: "{}"'.format(path, fn))
else:
    for path, subdirs, files in os.walk(root):
        # must compare full file paths to pattern when it contains directories
        filepaths = (os.path.join(path, file) for file in files)  # generator
        for fp in fnmatch.filter(filepaths, pattern):
            fn = os.path.basename(fp)
            print('match found - path: "{}", fn: "{}"'.format(path, fn))
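The same logic can be wrapped in a function for reuse. This is only a sketch: the name find_matches and the choice to return a list of full paths are assumptions made for illustration, not part of the code above. As above, a pattern that contains a directory part is expected to start with a wildcard.

```python
import fnmatch
import os


def find_matches(root, pattern):
    """Return full paths of files under root that match pattern.

    Hypothetical helper: patterns with a directory part are compared
    against full file paths, so they need a leading wildcard such as '*/'.
    """
    has_dir = bool(os.path.dirname(pattern))
    matches = []
    for path, subdirs, files in os.walk(root):
        if has_dir:
            # Compare the full path of each file to the pattern.
            candidates = [os.path.join(path, name) for name in files]
            matches.extend(fnmatch.filter(candidates, pattern))
        else:
            # Base names are enough; rebuild the full path for the result.
            for name in fnmatch.filter(files, pattern):
                matches.append(os.path.join(path, name))
    return matches
```

With this sketch, find_matches(root, '*/subdir/t*.txt') and find_matches(root, '*.txt') go through the same call, and the cheaper base-name comparison is still used when the pattern has no directory part.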
Upvotes: 1
Reputation: 761
I realized martineau's answer was still not quite what I needed. It works great when I have a pattern like subdir/pattern, but with something like subdir/subdir2/pattern it doesn't quite work, because subdir = os.path.split(path)[1] gets only the last directory component (in this case subdir2) while subdir_pat is subdir/subdir2, so it won't find any matches.
What I ended up doing was changing this line
subdir = os.path.split(path)[1] # isolate subdirectory name
to the following:
subdir = path.replace(root, '') # I'm not too sure what is a better way to replace paths..
since I figured that any subdirectory I check will be under root.
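One possible alternative to string replacement, offered as a sketch rather than a drop-in fix: os.path.relpath() returns the directory relative to root with no leading separator, which can then be compared to the directory part of the pattern. The helper name rel_subdir and the sample paths below are made up for illustration.

```python
import fnmatch
import os


def rel_subdir(path, root):
    """Return path relative to root (hypothetical helper name)."""
    return os.path.relpath(path, root)


# Made-up example values; in the real loop, `path` comes from os.walk(root).
root = os.path.join('project', 'data')
path = os.path.join(root, 'subdir', 'subdir2')

subdir = rel_subdir(path, root)
subdir_pat = os.path.dirname('subdir/subdir2/t*.txt')  # 'subdir/subdir2'

# fnmatch.fnmatch() normalizes both arguments with os.path.normcase().
is_match = fnmatch.fnmatch(subdir, subdir_pat)
print(is_match)  # True
```

Note that for the root directory itself os.path.relpath() returns '.', so that case may need separate handling.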
Any further feedback is appreciated.
Upvotes: 0