Jabda

Reputation: 1792

Python regex optional capture group with positive lookahead

I'm trying to match certain folder name patterns. I could use a simple if statement, but now I'm wondering: can this be done with a single regex pattern?

folders:

name
name_a01
name_a02
..
name_a20

name_dontuse_a10 < don't want this pattern

pattern = re.match(".*name(_a[0-9])?", dir)

The above matches the folders I want, but it matches both name and name_dontuse_a10.

pattern.group(1)

returns None for both of those folders, so that doesn't help me much.

I can't predict what the unwanted folders will be named, but I want both the base name folder and any name_a## folder. I think I need a positive lookahead, but I'm unsure how to use one with an optional capture group.

Upvotes: 1

Views: 589

Answers (2)

okovko

Reputation: 1901

The best solution is to first extract the filename out of the path, so you don't have to deal with it in your regex. normpath removes any trailing / and basename extracts the filename. So for dir1/dir2/name/ you get name.

import os, re

dir = ...
name = os.path.basename(os.path.normpath(dir))

pattern = re.fullmatch(r"name(_a\d+)?", name)

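Note that in your original pattern nothing anchored the match at the end of the string: re.match only requires a match starting at the beginning, so for name_dontuse_a10 the optional group was simply skipped and the match still succeeded on name alone. That is also why group(1) was None in both cases.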
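As a minimal sketch of the approach above, the following checks a few paths against the folder names from the question. The helper name wanted and the sample paths are hypothetical, and \d{2} assumes the suffix is always exactly two digits (as in a01 .. a20); use \d+ if the digit count can vary.

```python
import os
import re

# Hypothetical pattern: "name" optionally followed by "_a" and two digits.
# The two-digit requirement is an assumption based on the a01..a20 examples.
FOLDER_RE = re.compile(r"name(_a\d{2})?")


def wanted(path):
    """Return True if the last path component is 'name' or 'name_aNN'."""
    # normpath drops a trailing separator; basename keeps the last component.
    folder = os.path.basename(os.path.normpath(path))
    # fullmatch requires the whole folder name to match, not just a prefix.
    return FOLDER_RE.fullmatch(folder) is not None


# Hypothetical sample paths for illustration only.
paths = [
    "dir1/dir2/name/",
    "dir1/name_a01",
    "dir1/name_a20",
    "dir1/name_dontuse_a10",
]
results = {p: wanted(p) for p in paths}
```

Using fullmatch (or an explicit $ anchor) is what rejects name_dontuse_a10 here; with plain re.match the optional group would be skipped and the prefix name would still match.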

Upvotes: 1

Mako212

Reputation: 7292

Try using this one:

pattern = re.match(".*name(_a[0-9]*)?$", dir)

I added $ so that the pattern has to match all the way to the end of the string, which rules out name_dontuse_a10. I also changed [0-9] to [0-9]* so the group matches any number of digits after _a (including none).

Live Example:

https://regex101.com/r/MSldc6/2/
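A quick way to check the anchored pattern locally is to run it over the folder names from the question. This is only a sketch: the variable names are hypothetical, and the folder list is copied from the question rather than read from disk.

```python
import re

# The anchored pattern from this answer, written as a raw string.
PATTERN = re.compile(r".*name(_a[0-9]*)?$")

# Folder names taken from the question.
folders = ["name", "name_a01", "name_a20", "name_dontuse_a10"]

# Map each matching folder to its captured suffix (None for the bare name).
suffixes = {}
for folder in folders:
    m = PATTERN.match(folder)
    if m:
        suffixes[folder] = m.group(1)
```

After this runs, suffixes holds entries for name, name_a01 and name_a20 only. Because of the leading .*, the pattern would also accept names such as other_name_a01, so drop the .* if the folder name must start with name.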

Upvotes: 1
