cooldood3490

Reputation: 2488

Python Glob Original Directories Using Regex

I have a bunch of directories that follow this naming convention:

foo
foo.v2
foo_v01
foo_v02
foo_v03
bar
bar.v3
bar_v01
bar_v02

I am looking for a regular expression that globs only the original directories (foo and foo_v01; bar and bar_v01). I'm using Path.glob(pattern) from pathlib to glob the files. I want to select the original directories by name, not by timestamp.

Upvotes: 0

Views: 53

Answers (2)

chuckx

Reputation: 6877

Glob patterns (which use fnmatch under the hood) are not regular expressions and are much more limited.
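To illustrate the limitation, here is a small sketch using fnmatch directly on the directory names from the question. The list of names is hard-coded for illustration. fnmatch syntax has no alternation or lookahead, so as far as I can tell a single pattern cannot say "no version suffix, or suffix _v01":

```python
import fnmatch

# Directory names copied from the question (hard-coded for illustration).
names = ["foo", "foo.v2", "foo_v01", "foo_v02", "foo_v03",
         "bar", "bar.v3", "bar_v01", "bar_v02"]

# One glob-style pattern can pick out the "_v01" directories...
v01_only = fnmatch.filter(names, "*_v01")

# ...but the unversioned names need something else; "*" alone matches everything.
everything = fnmatch.filter(names, "*")

print(v01_only)
print(len(everything) == len(names))
```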

Here's an alternative approach, actually using regular expressions to perform the filtering:

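import os
import re

ROOT_DIR = "./dirs"
# Matches a "." or "_" version suffix whose last digit is 2-9 (e.g. ".v2", "_v03").
FILTER_RE = r"[._]v\d*[2-9]$"

# Keep every entry whose name does not end in a filtered version suffix.
filtered_dirs = [d for d in os.listdir(ROOT_DIR) if not re.search(FILTER_RE, d)]

print(sorted(filtered_dirs))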

And here's the output:

$ ls dirs
bar  bar.v3  bar_v01  bar_v02  foo  foo.v2  foo_v01  foo_v02  foo_v03

$ python3 filter_dirs.py
['bar', 'bar_v01', 'foo', 'foo_v01']
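Since the question uses pathlib, the same idea could be sketched with Path.iterdir(). This variant parses the version number instead of checking only its last digit, so a name like foo_v10 or foo_v11 is also excluded (the pattern above would keep those, because their last digit is not in 2-9). The helper names is_original and original_dirs are hypothetical, and treating any suffix that parses to version 1 (including ".v1") as original is an assumption about the naming scheme:

```python
import re
from pathlib import Path

# Assumption: a version suffix is "." or "_" followed by "v" and digits.
VERSION_RE = re.compile(r"[._]v(\d+)$")


def is_original(name):
    """Return True when the name has no version suffix or its version is 1."""
    match = VERSION_RE.search(name)
    return match is None or int(match.group(1)) == 1


def original_dirs(root):
    """Return the sorted names of the original directories directly under root."""
    return sorted(
        p.name for p in Path(root).iterdir()
        if p.is_dir() and is_original(p.name)
    )
```

Calling original_dirs("./dirs") on the layout shown above should give the same four names as the os.listdir version.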

Upvotes: 0

Lucas Abbade

Reputation: 817

This works for your examples (if it doesn't work for others, please add them to your question):

r'^(?!\w+_v0[2-9])(\w+)$'

Explanation:

(\w+) matches one or more word characters (letters, digits, and underscores). Because . is not a word character, names such as foo.v2 and bar.v3 never match.

(?!\w+_v0[2-9]) is a negative lookahead: if the name starts with word characters followed by _v0 and a digit from 2 to 9 (versions above 1), the match is discarded.
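A quick way to check the pattern is to run it against the names from the question. This is only a sketch: the names are hard-coded rather than read from disk, and ORIGINAL_RE is just a label for the pattern above:

```python
import re

# The pattern from this answer, compiled once.
ORIGINAL_RE = re.compile(r'^(?!\w+_v0[2-9])(\w+)$')

# Directory names copied from the question (hard-coded for illustration).
names = ["foo", "foo.v2", "foo_v01", "foo_v02", "foo_v03",
         "bar", "bar.v3", "bar_v01", "bar_v02"]

# Keep only the names the pattern accepts.
originals = [n for n in names if ORIGINAL_RE.match(n)]

print(originals)  # ['foo', 'foo_v01', 'bar', 'bar_v01']
```

Note that the lookahead only covers _v02 through _v09, so a name such as foo_v10 would still be accepted by this pattern.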

Upvotes: 1
