Reputation: 51

Capture multiple named groups in any order with regex

I have some regexes in named groups such as (?P<a>A), (?P<b>B), (?P<c>C). Then I have a sentence like some_word A C B, where A, B and C can appear in any order. I need to match those groups only if some_word appears in front of them. If it does, I would like output like this: {a : "A", b : "B", c : "C"}.

I tried the regex some_word ((?P<a>A)\s|(?P<b>B)\s|(?P<c>C)\s){3}, but it does not work, as the group names have to be unique.

The only solution I have found is the regex some_word (?P<a>A|B|C)\s(?P<b>A|B|C)\s(?P<c>A|B|C). It handles the permutations of A, B and C, but I lose the link {a : "A", b : "B", c : "C"}.

Thank you for your help !

Upvotes: 0

Answers (3)

sln

Reputation: 2706

If you want to match from some_word up to the point where A, B and C
have each been seen at least once, in any order, something like this works.
It matches the shortest string after some_word that contains
A, B and C at least once each.

some_word(?:(?=(?P<a>A)()|(?P<b>B)()|(?P<c>C)()|.).)+?(?=\2\4\6)

https://regex101.com/r/Gu5TnB/1
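
A minimal sketch of how that pattern could be used from Python's re module. The empty groups after each named group (numbered 2, 4 and 6) act as flags: a backreference to a group that never participated fails, so the final lookahead (?=\2\4\6) only succeeds once A, B and C have all been seen. The helper name capture_any_order is hypothetical and only for illustration.

```python
import re

# Groups 2, 4 and 6 are the empty "flag" groups that follow a, b and c.
pattern = re.compile(
    r"some_word(?:(?=(?P<a>A)()|(?P<b>B)()|(?P<c>C)()|.).)+?(?=\2\4\6)"
)

def capture_any_order(text):
    """Return the named groups as a dict, or None if the pattern does not match."""
    match = pattern.search(text)
    return match.groupdict() if match else None

print(capture_any_order("some_word A C B"))
print(capture_any_order("some_word A C"))  # B is missing, so no match is expected
```

Because each named group is tied to its own literal, the dictionary keys keep the a/b/c link regardless of the order in the text.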

Upvotes: 0

Wiktor Stribiżew

Reputation: 626691

You can use the second approach but restrict each group pattern with a negative lookahead to avoid matching repeated contents:

import re
text = 'some_word B C A'
for x in re.finditer(r'some_word\s+(?:(?P<a>A|B|C)\s+(?!(?P=a))(?P<b>A|B|C)\s+(?!(?P=a)|(?P=b))(?P<c>A|B|C))', text):
    print( x.group("a") )
    print( x.group("b") )
    print( x.group("c") )

See the Python demo, output:

B
C
A

To make sure the lookaheads compare whole values rather than just a shared prefix (which matters once A, B and C stand for longer alternatives), add whitespace boundaries to the lookaheads:

r'some_word\s+(?:(?P<a>A|B|C)\s+(?!(?P=a)(?!\S))(?P<b>A|B|C)\s+(?!(?:(?P=a)|(?P=b))(?!\S))(?P<c>A|B|C))'
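
A short sketch of why the boundary matters, using hypothetical tokens AB, A and C (not from the question) so that one value is a prefix of another. Without (?!\S), the lookahead rejects AB merely because it starts with the already captured A.

```python
import re

# Hypothetical tokens: "A" is a prefix of "AB", which trips up the plain lookahead.
TOKENS = r"AB|A|C"

loose = re.compile(
    rf"some_word\s+(?P<a>{TOKENS})\s+(?!(?P=a))(?P<b>{TOKENS})\s+"
    rf"(?!(?P=a)|(?P=b))(?P<c>{TOKENS})"
)
strict = re.compile(
    rf"some_word\s+(?P<a>{TOKENS})\s+(?!(?P=a)(?!\S))(?P<b>{TOKENS})\s+"
    rf"(?!(?:(?P=a)|(?P=b))(?!\S))(?P<c>{TOKENS})"
)

text = "some_word A AB C"
loose_match = loose.search(text)    # rejected: "AB" begins with the captured "A"
strict_match = strict.search(text)  # accepted: "A" followed by "B" is not a whole token
print(loose_match)
print(strict_match.groupdict())
```

Note that in both variants the groups a, b and c are positional: they hold the first, second and third value found, not a fixed letter.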

Upvotes: 1

Alireza

Reputation: 2123

You can use this pattern: (?<=some_word)(?=.*(?P<a>A).*)(?=.*(?P<b>B).*).*(?P<c>C).*

See Regex Demo

Code:

import re

pattern = "(?<=some_word)(?=.*(?P<a>A).*)(?=.*(?P<b>B).*).*(?P<c>C).*"
text = "some_word A C B"
matches = re.search(pattern, text)
print(matches.groupdict())

Output:

{'a': 'A', 'b': 'B', 'c': 'C'}
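
The same lookahead idea can be sketched for an arbitrary set of groups by building one lookahead per name. The function build_any_order_pattern and its parameters are hypothetical, and the sketch assumes the prefix is a fixed string (Python's re requires fixed-width lookbehinds). Since each lookahead scans with .*, a value counts wherever it appears on the rest of the line.

```python
import re

def build_any_order_pattern(prefix, groups):
    """Build one lookahead per named group so the groups may appear in any order."""
    lookaheads = "".join(
        f"(?=.*(?P<{name}>{sub}))" for name, sub in groups.items()
    )
    # The lookbehind anchors the match right after the literal prefix.
    return re.compile(f"(?<={re.escape(prefix)}){lookaheads}")

pattern = build_any_order_pattern("some_word", {"a": "A", "b": "B", "c": "C"})

for text in ("some_word A C B", "some_word C B A", "other_word A B C"):
    m = pattern.search(text)
    print(text, "->", m.groupdict() if m else None)
```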

Upvotes: 1
