Reputation: 7155
>>> import re
>>> re.findall(r"(?:do|re|mi)+", "mimi")
['mimi']
>>> re.findall(r"(do|re|mi)+", "mimi")
['mi']
According to my understanding of the definitions, both should produce the same answer. The only difference between (...) and (?:...) should be whether or not we can use back-references later. Am I missing something?
(...)
Matches whatever regular expression is inside the parentheses, and indicates the start and end of a group; the contents of a group can be retrieved after a match has been performed, and can be matched later in the string with the \number special sequence, described below. To match the literals '(' or ')', use \( or \), or enclose them inside a character class: [(], [)].
(?:...)
A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.
Upvotes: 1
Views: 626
Reputation: 36282
The (?:...) groups the alternatives but does not capture them. So re.findall(r"(?:do|re|mi)+", "mimi") returns one value for each match of the whole regular expression. Here the whole expression matches once, consuming mi twice, so the result is a list with one element, mimi.
The (...) does capture, and when the pattern contains a capturing group, findall() returns the text of that group instead of the whole match. In re.findall(r"(do|re|mi)+", "mimi") the group matches mi and saves it as group 1, then the + repeats and the group matches mi again. Because it is the same parenthesized group, the second match overwrites group 1, so in the end findall() returns the value of group 1, which is only the second mi.
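A minimal sketch you can run to check this. The variable names are only for illustration, and the third pattern (an outer capturing group wrapped around the non-capturing one) is my own addition, shown as one way to get the whole run back while still using a capture:

```python
import re

text = "mimi"

# No capturing group: findall returns each whole match.
whole = re.findall(r"(?:do|re|mi)+", text)

# One capturing group: findall returns group 1, which keeps only
# the last repetition matched by the "+" quantifier.
last = re.findall(r"(do|re|mi)+", text)

# Outer capturing group around a repeated non-capturing group:
# group 1 now spans the whole run.
wrapped = re.findall(r"((?:do|re|mi)+)", text)

# A match object exposes both views at once.
m = re.search(r"(do|re|mi)+", text)

print(whole)       # ['mimi']
print(last)        # ['mi']
print(wrapped)     # ['mimi']
print(m.group(0))  # mimi  (the whole match)
print(m.group(1))  # mi    (last value captured by group 1)
```

So the two patterns match the same text; only what findall() chooses to report differs.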
Upvotes: 4
Reputation: 425378
Although the matching is the same, the non-capturing version (?:...):
Upvotes: 1
Reputation: 43942
No, you're not missing anything. The use of (?:...) is to be able to group things without putting unneeded items in the backreferences/matched substrings.
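As a rough illustration of that point, here is a small sketch comparing the captured groups of two otherwise identical patterns. The patterns and the sample string "re-42" are made up for this example:

```python
import re

# Same alternation, once capturing and once non-capturing.
capturing = re.compile(r"(do|re|mi)-(\d+)")
non_capturing = re.compile(r"(?:do|re|mi)-(\d+)")

m1 = capturing.match("re-42")
m2 = non_capturing.match("re-42")

# Pattern.groups is the number of capturing groups in the pattern.
print(capturing.groups, m1.groups())      # 2 ('re', '42')
print(non_capturing.groups, m2.groups())  # 1 ('42',)
```

With (?:...) the number ends up in group 1 instead of group 2, and the note name is not stored at all.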
Upvotes: 1