Reputation: 14982
I have a regex like this: '^(a|ab|1|2)+$'
and I want to get every sub-sequence that the group matched.
For example, for re.search(reg, 'ab1') I want to get ('ab', '1').
I can get an equivalent result with the pattern '^(a|ab|1|2)(a|ab|1|2)$',
but I don't know in advance how many blocks will be matched by (pattern)+.
Is this possible, and if so, how?
Upvotes: 5
Views: 1403
Reputation: 21
I think you don't need regexes for this problem; you need a recursive graph search function.
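A minimal sketch of what such a recursive search could look like, assuming the alternatives are plain literal strings. The function name split_into_tokens is made up for illustration; it tries each alternative as a prefix and backtracks when the rest of the string cannot be split.

```python
def split_into_tokens(text, tokens):
    """Return one list of tokens that concatenates to `text`, or None."""
    if not text:
        # Nothing left to split: the empty string is trivially covered.
        return []
    for token in tokens:
        if text.startswith(token):
            # Try to split the remainder; fall through (backtrack) on failure.
            rest = split_into_tokens(text[len(token):], tokens)
            if rest is not None:
                return [token] + rest
    return None


result = split_into_tokens('ab1', ['a', 'ab', '1', '2'])
print(result)
```

Unlike a left-to-right scan, this does not depend on the order of the alternatives, at the cost of possibly exploring many branches on long inputs.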
Upvotes: 2
Reputation: 458
Your original expression does match the way you want it to; it just matches the entire string and doesn't capture a separate group for each individual match. When a group is under a repetition operator ('+', '*', '{m,n}'), it gets overwritten on each repetition, and only the final match is saved. This is alluded to in the documentation:
If a group matches multiple times, only the last match is accessible.
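To see this for yourself, here is a small check using the pattern and input from the question (the variable names are mine):

```python
import re

match = re.search(r'^(a|ab|1|2)+$', 'ab1')

whole = match.group(0)   # the entire matched string
last = match.groups()    # only the final repetition of the group survives

print(whole)
print(last)
```

The whole string 'ab1' is matched, but the group only holds '1', the last repetition.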
Upvotes: 3
Reputation: 5289
Try this:
import re
# 'ab' is listed before 'a' so the longer alternative is tried first
r = re.compile('(ab|a|1|2)')
for i in r.findall('ab1'):
    print(i)  # prints ab, then 1
The 'ab' option has been moved to be first, so it will match 'ab' in preference to just 'a'.
The findall method applies your regular expression repeatedly across the string and returns a list of the matched groups. In this simple example, with a single group, you get back a list of strings, one per match. If the pattern had more than one group, you would get back a list of tuples, each containing one string per group.
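A quick illustration of the two return shapes. The second pattern and its input are made up just to show the multi-group case:

```python
import re

# One group: findall returns a list of strings.
one_group = re.findall(r'(ab|a|1|2)', 'ab1')

# Two groups (hypothetical pattern): findall returns a list of tuples.
two_groups = re.findall(r'([a-z]+)(\d)', 'ab1cd2')

print(one_group)
print(two_groups)
```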
This should work for your second example:
import re
pattern = '(7325189|7325|9087|087|18)'
str = '7325189087'  # note: this name shadows the built-in str
res = re.compile(pattern).findall(str)
print(pattern, str, res)  # res is ['7325189', '087']
I removed the ^ and $ anchors from the pattern because, if findall has to find more than one substring, it must be able to search anywhere in str. I also removed the + so that each match is a single occurrence of one of the options in the pattern.
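One thing to keep in mind: without the anchors, findall silently skips characters that match none of the options. If you still want the all-or-nothing behaviour of the original pattern, a rough sketch is to validate first and split afterwards. The helper name split_if_valid is made up, and this assumes longer alternatives are listed before their prefixes; it returns None rather than a partial split when the left-to-right scan does not cover the whole string.

```python
import re

# Longer alternatives first, as in the answer above.
ALTERNATIVES = r'ab|a|1|2'


def split_if_valid(text):
    """Return the list of pieces, or None if text is not fully covered."""
    # Same check as the original anchored pattern '^(...)+$'.
    if re.fullmatch(r'(?:%s)+' % ALTERNATIVES, text) is None:
        return None
    parts = re.findall(ALTERNATIVES, text)
    # findall does not backtrack across matches, so confirm nothing was skipped.
    if ''.join(parts) != text:
        return None
    return parts


good = split_if_valid('ab1')
bad = split_if_valid('abX1')
print(good, bad)
```

For sets of options where no ordering makes the left-to-right scan safe, a backtracking search is the more general tool.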
Upvotes: 4