Max Coplan

Reputation: 1503

Python regex works without optional group, but breaks with optional group

Given an input:

line = " say hi /* comment"

and a regex:

import re

regex = re.compile(r'\s*(?P<command>.*?)/[/*]')
result = regex.search(line)
print(result.group('command'))

This will successfully print say hi.

However, switching the last part to an optional group:

regex = re.compile(r'\s*(?P<command>.*?)(/[/*])?')

now prints nothing. It isn't that the regex fails to match at all, because result isn't None.

Why does it work when the trailing group is required but stop working when it is optional, and how would I go about solving it?

Upvotes: 1

Views: 221

Answers (3)

Jacky Wang

Reputation: 3490

In the example above, the group (?P<command>.*?) uses the non-greedy quantifier *?. The non-greedy quantifiers *?, +?, ??, and {m,n}? match as little text as possible. See Greedy versus Non-Greedy for more details.

And since (/[/*])? is optional, the command group can match nothing at all: an empty command followed by zero occurrences of the delimiter is already a successful match.

If you also want to match a line that has no comment, use the following:
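A minimal sketch of that difference, using the line and the two patterns from the question. The variable names (required, optional, and so on) are only illustrative.

```python
import re

line = " say hi /* comment"

# Required delimiter: the lazy group has to grow until "/*" or "//" follows.
required = re.compile(r'\s*(?P<command>.*?)/[/*]')
# Optional delimiter: the lazy group may stop at zero characters.
optional = re.compile(r'\s*(?P<command>.*?)(/[/*])?')

required_command = required.search(line).group('command')
optional_command = optional.search(line).group('command')

print(repr(required_command))  # 'say hi '
print(repr(optional_command))  # ''
```

The second print shows an empty string rather than None, which is why the script appears to print nothing even though the search succeeded.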

\s*(?P<command>.*?)(?:/[/*]|$)

to match

" say hi /* comment"
" say hi ..."
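As a rough sketch, the pattern above could be wrapped like this. The helper name extract_command and the .strip() call are assumptions added for illustration, not part of the answer's pattern.

```python
import re

# Stop the lazy group at a comment opener ("//" or "/*") or at the end of the line.
pattern = re.compile(r'\s*(?P<command>.*?)(?:/[/*]|$)')

def extract_command(line):
    """Return the text before a // or /* comment, trimmed of whitespace."""
    match = pattern.search(line)
    # strip() removes the space left in front of the comment opener.
    return match.group('command').strip()

with_comment = extract_command(" say hi /* comment")
without_comment = extract_command(" say hi ...")

print(repr(with_comment))     # 'say hi'
print(repr(without_comment))  # 'say hi ...'
```

The alternation gives the lazy group something it must reach, so it can no longer stop at zero characters.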

Upvotes: 0

pwxcoo

Reputation: 3273

regex.search() returns only the first matching substring. You can use regex.findall() in this case.

The regex.search() docs say:

If there is more than one match, only the first occurrence of the match will be returned

Because /[/*] is optional, nothing forces the pattern to reach a / character. It can succeed on just the leading space, or on any other part of the string.

You can test this regular expression online at regex101, which shows the matching steps and the results.
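One way to see this, as a small sketch on the question's input, is to inspect where the first match starts and ends. The variable names are illustrative.

```python
import re

line = " say hi /* comment"
regex = re.compile(r'\s*(?P<command>.*?)(/[/*])?')
result = regex.search(line)

match_span = result.span()      # start and end offsets of the whole match
matched_text = result.group(0)  # the text the whole pattern consumed
delimiter = result.group(2)     # the optional comment opener, if it took part

print(match_span)          # (0, 1)
print(repr(matched_text))  # ' '
print(delimiter)           # None
```

The whole match is the single leading space: \s* consumes it, the lazy command group takes nothing, and the optional delimiter group does not take part.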

Upvotes: 1

YusufUMS

Reputation: 1493

Maybe it is not the answer you are looking for, but it may help:

import re

line = " say hi /* comment"
regex = re.compile(r'\s*(?P<command>.*?)/[/*](?P<optional>.*)')
result = regex.search(line)
print(result.group('command','optional'))

output:

('say hi ', ' comment')

For details click here

Upvotes: 1
