Reputation: 9745
I have a string with repeated parts:
s = '[1][2][5] and [3][8]'
I want to group the numbers into two lists using re.match. The expected result is:
{'x': ['1', '2', '5'], 'y': ['3', '8']}
I tried this expression that gives a wrong result:
re.match(r'^(?:\[(?P<x>\d+)\])+ and (?:\[(?P<y>\d+)\])+$', s).groupdict()
# {'x': '5', 'y': '8'}
It looks like re.match keeps only the last match. How do I collect all the parts into a list instead of just the last one?
Of course, I know that I could split the line on the ' and ' separator and use re.findall on the parts instead, but that approach is not general enough: it causes problems with more complex strings, so I would have to work out the correct splitting separately each time.
Upvotes: 1
Views: 59
Reputation: 163362
If you want to use named capture groups, put the repetition of the bracketed digits inside the named group, so that each group captures the whole run (for example [1][2][5]) rather than only its last item.
After checking that the pattern matched, you can run re.findall on each value of the groupdict to extract the digits:
^(?P<x>(?:\[\d+])+) and (?P<y>(?:\[\d+])+)$
Example
import re
s = '[1][2][5] and [3][8]'
m = re.match(r'^(?P<x>(?:\[\d+])+) and (?P<y>(?:\[\d+])+)$', s)
if m:
    dct = {k: re.findall(r"\d+", v) for k, v in m.groupdict().items()}
    print(dct)
Output
{'x': ['1', '2', '5'], 'y': ['3', '8']}
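To make the difference between the two patterns concrete, here is a minimal sketch comparing what groupdict() returns in each case. The variable names (inner, outer, last_only, whole_runs) are only illustrative. With the group inside the repetition, each iteration overwrites the previous capture; with the repetition inside the group, the group spans the whole run of brackets.

```python
import re

s = '[1][2][5] and [3][8]'

# Group inside the repetition: every iteration overwrites the previous capture,
# so only the last number of each run survives.
inner = re.match(r'^(?:\[(?P<x>\d+)\])+ and (?:\[(?P<y>\d+)\])+$', s)
last_only = inner.groupdict()

# Repetition inside the group: the group captures the whole run of brackets,
# which can then be split further with re.findall.
outer = re.match(r'^(?P<x>(?:\[\d+])+) and (?P<y>(?:\[\d+])+)$', s)
whole_runs = outer.groupdict()

print(last_only)   # {'x': '5', 'y': '8'}
print(whole_runs)  # {'x': '[1][2][5]', 'y': '[3][8]'}
```

Because the second pattern is still anchored with ^ and $, the overall structure of the string is validated before any digits are extracted.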
Upvotes: 1
Reputation: 521259
We can use regular expressions here. First, iterate over the input string looking for matches of the form [3][8]. For each match, use re.findall to generate a list of number strings. Then add a key whose value is that list. Note that we maintain a list of keys and pop each one as we use it.
import re
s = '[1][2][5] and [3][8]'
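keys = ['x', 'y']
d = {}
# Raw string avoids invalid-escape warnings for \[ and \d
for m in re.finditer(r'(?:\[\d+\])+', s):
    # Take the next unused key and store the digits of this run under it
    d[keys.pop(0)] = re.findall(r'\d+', m.group())
print(d) # {'x': ['1', '2', '5'], 'y': ['3', '8']}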
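One possible variation, as a sketch only: pairing keys and matches with zip avoids mutating the key list and avoids an IndexError from pop when the string contains more bracket runs than there are keys. The helper name collect_runs is hypothetical, not part of the answer above. Note that, like the loop above, this does not check the ' and ' separator; it only picks up runs of brackets wherever they occur.

```python
import re

def collect_runs(text, keys):
    """Map each key to the digits of the corresponding bracket run (hypothetical helper)."""
    runs = re.finditer(r'(?:\[\d+\])+', text)
    # zip stops at the shorter input, so surplus runs or surplus keys are ignored
    return {key: re.findall(r'\d+', m.group()) for key, m in zip(keys, runs)}

result = collect_runs('[1][2][5] and [3][8]', ['x', 'y'])
print(result)  # {'x': ['1', '2', '5'], 'y': ['3', '8']}
```

If silently ignoring surplus runs is not acceptable, the anchored pattern with named groups is the safer choice, since it rejects strings that do not have the expected shape.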
Upvotes: 1