tsh

Reputation: 967

python regex: duplicate names in named groups

Is there a way to use the same name for more than one named group in a Python regex, e.g. (?P<n>foo)|(?P<n>bar)?

Use case: I am trying to capture type and id with this regex:
/(?=videos)((?P<type>videos)/(?P<id>\d+))|(?P<type>\w+)/?(?P<v>v)?/?(?P<id>\d+)?
from these strings:

For now I am getting the error: redefinition of group name 'id' as group 6; was group 3

Upvotes: 15

Views: 6967

Answers (2)

SCGH

Reputation: 967

You could simply split the alternation into separate calls, transforming

re.match(r'(?P<n>foo)|(?P<n>bar)', s)

into

re.match(r'(?P<n>foo)', s) or re.match(r'(?P<n>bar)', s)
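
The same idea can be generalized to any number of alternatives. This is only a sketch using the standard re module; the helper name first_match is hypothetical, and it assumes the alternatives can simply be tried in order:

```python
import re

def first_match(patterns, s):
    """Return the first successful re.match among patterns, or None."""
    for pattern in patterns:
        m = re.match(pattern, s)
        if m is not None:
            return m
    return None

# Each pattern is compiled separately, so reusing the name "n" is fine.
m = first_match([r"(?P<n>foo)", r"(?P<n>bar)"], "bar")
print(m.group("n"))  # bar
```

Because each alternative is its own pattern, the name n never appears twice in one regex, which is what avoids the redefinition error.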

Upvotes: 1

Wiktor Stribiżew

Reputation: 626950

The answer is: Python re does not support identically named groups.

The regex module from PyPI does support identically named capturing groups:

The same name can be used by more than one group, with later captures ‘overwriting’ earlier captures. All of the captures of the group will be available from the captures method of the match object.

And here is a live Python 2.7 demo:

import regex
s = "foo bar"
rx = regex.compile(r"(?P<n>foo)|(?P<n>bar)")
print([x.group("n") for x in rx.finditer(s)])
# => ['foo', 'bar']
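
If installing a third-party module is not an option, one possible workaround with the standard re module is to give each group a unique numeric suffix and merge the results afterwards. This is a sketch only: the names n_1, n_2 and the helper merged_groupdict are hypothetical, and it assumes no "real" group name ends in an underscore followed by digits.

```python
import re

# The standard library rejects duplicate group names at compile time.
try:
    re.compile(r"(?P<n>foo)|(?P<n>bar)")
    duplicate_allowed = True
except re.error:
    duplicate_allowed = False

# Workaround: unique suffixes in the pattern, merged after matching.
rx = re.compile(r"(?P<n_1>foo)|(?P<n_2>bar)")

def merged_groupdict(m):
    """Collapse n_1, n_2, ... into a single key n; later non-None values win."""
    merged = {}
    for name, value in m.groupdict().items():
        base = re.sub(r"_\d+$", "", name)  # strip the numeric suffix
        if value is not None or base not in merged:
            merged[base] = value
    return merged

print([merged_groupdict(m)["n"] for m in rx.finditer("foo bar")])
# => ['foo', 'bar']
```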

Also, in other cases, when you want to match several alternatives and capture just a part of each into one group, you may resort to the regex module's branch reset feature:

Branch reset

(?|...|...)

Capture group numbers will be reused across the alternatives, but groups with different names will have different group numbers.

Examples:

>>> regex.match(r"(?|(first)|(second))", "first").groups()
('first',)
>>> regex.match(r"(?|(first)|(second))", "second").groups()
('second',)

Note that there is only one group.
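
The standard re module has no (?|...) syntax, but a similar effect can be approximated by taking the first group that actually participated in the match. A rough sketch, with the helper name first_group being hypothetical:

```python
import re

def first_group(pattern, s):
    """Return the first non-None captured group of re.match, or None."""
    m = re.match(pattern, s)
    if m is None:
        return None
    # Each alternative has its own group number here; pick whichever matched.
    return next((g for g in m.groups() if g is not None), None)

print(first_group(r"(first)|(second)", "first"))   # first
print(first_group(r"(first)|(second)", "second"))  # second
```

Unlike a real branch reset, the pattern here still has two groups; the helper only hides that from the caller.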

Upvotes: 26
