Reputation: 2488
I have a bunch of directories that follow this naming convention:
foo
foo.v2
foo_v01
foo_v02
foo_v03
bar
bar.v3
bar_v01
bar_v02
I am looking for a regular expression that matches only the original directories (foo and foo_v01; bar and bar_v01). I'm using Path.glob(pattern) from pathlib to glob the files. I would like to select the original directories specifically by name, not by timestamp.
Upvotes: 0
Views: 53
Reputation: 6877
Glob patterns (which use fnmatch under the hood) are not regular expressions and are far more limited: they offer wildcards and character classes, but no alternation or lookahead.
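To illustrate that limitation, here is a small sketch that applies fnmatch-style patterns to the directory names from the question. The names are held in a plain list (an assumption made so the snippet runs without touching the filesystem). A wildcard can pick out the _v01 directories, but a pattern such as foo* also pulls in every versioned variant.

```python
from fnmatch import fnmatchcase

# Directory names copied from the question; no filesystem access needed.
names = ["foo", "foo.v2", "foo_v01", "foo_v02", "foo_v03",
         "bar", "bar.v3", "bar_v01", "bar_v02"]

# A wildcard pattern can select the "_v01" directories...
v01_only = [n for n in names if fnmatchcase(n, "*_v01")]

# ...but "foo*" matches every versioned variant as well, and a single
# wildcard pattern cannot say "foo, or foo_v01, and nothing else".
foo_star = [n for n in names if fnmatchcase(n, "foo*")]

print(v01_only)
print(foo_star)
```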
Here's an alternative approach, actually using regular expressions to perform the filtering:
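import os
import re
ROOT_DIR = "./dirs"
# Matches a ".vN" or "_vN" suffix whose number is anything other than 1 (so "_v10" is filtered too).
FILTER_RE = r"[._]v(?!0*1$)\d+$"
# Keep only directories whose names do not carry such a suffix.
filtered_dirs = [
    d for d in os.listdir(ROOT_DIR)
    if os.path.isdir(os.path.join(ROOT_DIR, d)) and not re.search(FILTER_RE, d)
]
print(sorted(filtered_dirs))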
And here's the output:
$ ls dirs
bar bar.v3 bar_v01 bar_v02 foo foo.v2 foo_v01 foo_v02 foo_v03
$ python3 filter_dirs.py
['bar', 'bar_v01', 'foo', 'foo_v01']
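Since the question uses pathlib, the same idea could be sketched with Path.glob("*") followed by a regex filter. The function name original_dirs and the pattern VERSIONED_RE are my own choices for illustration. The pattern assumes that "original" means either no version suffix or a version number equal to 1, so adjust it if your naming scheme differs.

```python
import re
from pathlib import Path

# Assumption: a versioned name ends in ".vN" or "_vN" where N is not 1
# (leading zeros allowed, so "_v01" still counts as version 1).
VERSIONED_RE = re.compile(r"[._]v(?!0*1$)\d+$")


def original_dirs(root):
    """Return sorted names of directories under root that are unversioned or version 1."""
    return sorted(
        p.name
        for p in Path(root).glob("*")  # glob everything, then filter by name
        if p.is_dir() and not VERSIONED_RE.search(p.name)
    )
```

Calling original_dirs("./dirs") on the layout shown above should give the same four names as the os.listdir version.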
Upvotes: 0
Reputation: 817
This works for your examples (if it doesn't work for others, please add them to your question):
r'^(?!\w+_v0[2-9])(\w+)$'
Explanation:
(\w+)
matches any combination of letters, digits, and underscores, one or more times. Because \w does not match a dot, names such as foo.v2 are rejected as well.
(?!\w+_v0[2-9])
is a negative lookahead: if the name is such a combination followed by _v0 and a digit from 2 to 9 (versions above 1), the match is discarded.
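A quick way to check the pattern against the names from the question is sketched below. The list of names is typed in by hand rather than read from disk. Note that the lookahead only covers _v02 through _v09, so a name like foo_v10 (not among your examples) would still match.

```python
import re

PATTERN = re.compile(r'^(?!\w+_v0[2-9])(\w+)$')

# Directory names copied from the question.
names = ["foo", "foo.v2", "foo_v01", "foo_v02", "foo_v03",
         "bar", "bar.v3", "bar_v01", "bar_v02"]

# Keep only the names the pattern accepts.
matched = [n for n in names if PATTERN.match(n)]
print(matched)  # ['foo', 'foo_v01', 'bar', 'bar_v01']
```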
Upvotes: 1