Matching filenames with regex in Python

Question

I'm looking for a regex to match file names in a folder. I already have all the filenames in a list. Now I want to match a pattern in a loop (file is the string to match):

./test1_word1_1.1_1.2_1.3.csv

with:

match = re.search(r'./{([\w]+)}_word1_{([0-9.]+)}_{([0-9.]+)}_{([0-9.]+)}*',file)

I usually get regexes working, but in this special case it simply doesn't work. Can you help me with that?

I want to continue working with the regex match as follows (this is the outcome I expect):

match[0] = test1
match[1] = 1.1
match[2] = 1.2
match[3] = 1.3

The curly brackets are my fault. They don't make sense at all. Sorry

Best regards, sebastian

Wiktor Stribiżew · Accepted Answer

You may use

r'\./([^\W_]+)_word1_([0-9.]+)_([0-9.]+)_([0-9]+(?:\.[0-9]+)*)'

See the regex demo

Details:

\. - a literal dot (if it is unescaped it matches any char other than a line break char)
/ - a / symbol (no need to escape it in a Python regex pattern)
([^\W_]+) - Group 1 matching 1 or more letters or digits (if you want to match a chunk containing _, keep your original (\w+) pattern)
_word1_ - a literal substring
([0-9.]+) - Group 2 matching 1 or more digits and/or . symbols
_ - an underscore
([0-9.]+) - Group 3 matching 1 or more digits and/or . symbols
_ - an underscore
([0-9]+(?:\.[0-9]+)*) - Group 4 matching 1 or more digits, then zero or more sequences of a . and 1 or more digits
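
To see why the last capturing group is written differently from the two before it, here is a small comparison on the filename from the question. The variable names loose and strict are only illustrative. With a plain [0-9.]+ at the end, the group also swallows the dot that starts the .csv extension:

```python
import re

s = "./test1_word1_1.1_1.2_1.3.csv"

# Loose last group: [0-9.]+ also consumes the dot that starts the extension.
loose = re.search(r"\./([^\W_]+)_word1_([0-9.]+)_([0-9.]+)_([0-9.]+)", s)

# Strict last group: a dot is only consumed when digits follow it.
strict = re.search(r"\./([^\W_]+)_word1_([0-9.]+)_([0-9.]+)_([0-9]+(?:\.[0-9]+)*)", s)

print(loose.group(4))   # 1.3.
print(strict.group(4))  # 1.3
```

The same stricter form could be used for the other numeric groups too if stray dots are a concern there; in this filename they are delimited by underscores, so [0-9.]+ is enough.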

Python demo:

import re
rx = r"\./([^\W_]+)_word1_([0-9.]+)_([0-9.]+)_([0-9]+(?:\.[0-9]+)*)"
s = "./test1_word1_1.1_1.2_1.3.csv"
m = re.search(rx, s)
if m:
    print("Part1: {}\nPart2: {}\nPart3: {}\nPart4: {}".format(m.group(1), m.group(2), m.group(3), m.group(4)))

Output:

Part1: test1
Part2: 1.1
Part3: 1.2
Part4: 1.3
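
Since the question mentions looping over a list of filenames and indexing the result from 0, here is a minimal sketch of that loop. The files list is hypothetical (only the first name comes from the question). Note that in Python m.group(0) is the whole match, so the 0-based indexing from the question corresponds to the tuple returned by m.groups():

```python
import re

# Hypothetical file list; only the first name comes from the question.
files = [
    "./test1_word1_1.1_1.2_1.3.csv",
    "./test2_word1_2.0_2.5_3.csv",
    "./notes.txt",
]

# Compile once, since the same pattern is applied to every filename.
rx = re.compile(r"\./([^\W_]+)_word1_([0-9.]+)_([0-9.]+)_([0-9]+(?:\.[0-9]+)*)")

results = []
for file in files:
    m = rx.search(file)
    if m is None:
        continue  # skip names that do not follow the pattern
    parts = m.groups()  # tuple of the four captures, indexed from 0
    results.append(parts)

for parts in results:
    print(parts[0], parts[1], parts[2], parts[3])
```

Here parts[0] is test1 and parts[1] through parts[3] are the three numbers for the first file; names that do not match are simply skipped.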
