a1426

Reputation: 256

re.MULTILINE flag is interfering with the end of line $ operator

Sorry if this is a duplicate or basic question; I couldn't find any similar questions.

I have the following multiline string:

my_txt = """
foo.exe\n
bar.exec\n
abab.exe\n
"""

(The newlines aren't actually written in my code; I put them there for clarity.) I want to match every file name that ends with .exe (not .exec). My regex was initially:

import re

my_reg = re.compile(".+[.](?=exe$)")
my_matches = my_reg.finditer(my_txt)

I hoped that it would first match every character on the line, backtrack until it found the ., and then check that the characters exe and a newline followed. Only one match was found, on the abab.exe line. I experimented a bit and changed the first line to my_reg = re.compile(".+[.](?=exe$)", flags=re.MULTILINE). This time it matched on both .exe lines, returning

foo.
abab.

I thought re.MULTILINE wasn't supposed to interfere with the $ operator. Am I wrong about how $ works, or am I misusing something? Thanks in advance!

Upvotes: 0

Views: 23

Answers (1)

jdaz

Reputation: 6063

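You do need the multiline flag; without it, $ only matches at the very end of your input (or just before a newline that ends the input). With re.MULTILINE, $ also matches just before every newline. A lookahead is zero-width, so exe is checked but never becomes part of the match, which is why you got foo. and abab. back. You just need to match exe instead of using a lookahead: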

my_reg = re.compile(".+[.]exe$", re.MULTILINE)

Output:

['foo.exe', 'abab.exe']
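To make the effect of the flag concrete, here is a small sketch that runs the same pattern with and without re.MULTILINE. It assumes the question's text is written as a single string with real newlines, and the variable names default_matches and multiline_matches are only illustrative:

```python
import re

# The text from the question, one file name per line.
my_txt = "\nfoo.exe\nbar.exec\nabab.exe\n"

# Without re.MULTILINE, $ matches only at the end of the string
# (or just before a newline that ends the string).
default_matches = re.findall(r".+[.]exe$", my_txt)

# With re.MULTILINE, $ also matches just before every newline.
multiline_matches = re.findall(r".+[.]exe$", my_txt, flags=re.MULTILINE)

print(default_matches)    # ['abab.exe']
print(multiline_matches)  # ['foo.exe', 'abab.exe']
```

bar.exec is skipped in both cases because the character after exe is c, not a line ending.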


If you are trying to match the filename without the extension, you can put the period inside the lookahead:

my_reg = re.compile(r".+(?=\.exe$)", re.MULTILINE)

Output:

['foo', 'abab']
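For comparison, this sketch puts the question's pattern next to the one with the period inside the lookahead, and adds a capturing group as one possible alternative (the group version is my own suggestion, not something the question asked for). The names with_dot, stem_only, and names are just for illustration:

```python
import re

my_txt = "\nfoo.exe\nbar.exec\nabab.exe\n"

# Pattern from the question: the dot is consumed, "exe" is only looked at,
# so each match ends with the dot.
with_dot = [m.group() for m in re.finditer(r".+[.](?=exe$)", my_txt, re.MULTILINE)]

# Dot moved inside the lookahead: neither the dot nor "exe" is consumed.
stem_only = [m.group() for m in re.finditer(r".+(?=\.exe$)", my_txt, re.MULTILINE)]

# Alternative sketch: match the whole line and capture only the part
# before ".exe". With one group, findall returns the group contents.
names = re.findall(r"^(.+)\.exe$", my_txt, re.MULTILINE)

print(with_dot)   # ['foo.', 'abab.']
print(stem_only)  # ['foo', 'abab']
print(names)      # ['foo', 'abab']
```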


Upvotes: 1
