Reputation: 67
When I run
re.compile(r"(.+?)\1+").findall('44442(2)2(2)44')
I get
['4', '2(2)', '4']
but how can I get
['4444', '2(2)2(2)', '44']
using a regular expression?
Thanks
Upvotes: 3
Views: 91
Reputation:
No change to your pattern is needed; you just need to use the right function for the job. re.findall
returns a list of the captured groups, not the full matches, when the pattern contains capturing groups. To get the entire match, use re.finditer instead, which yields match objects from which you can extract the full match.
import re
pattern = re.compile(r"(.+?)\1+")
# group(0) is the entire match, regardless of capturing groups
[match.group(0) for match in pattern.finditer('44442(2)2(2)44')]
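To make the difference concrete, here is a minimal sketch that runs both calls on the string from the question. The variable names (text, units, pairs) are only illustrative and not part of the original code.

```python
import re

pattern = re.compile(r"(.+?)\1+")
text = '44442(2)2(2)44'

# findall returns only what group 1 captured (the repeated unit).
units = pattern.findall(text)

# finditer yields match objects: group(0) is the whole match,
# group(1) is the repeated unit captured by the parentheses.
pairs = [(m.group(0), m.group(1)) for m in pattern.finditer(text)]

print(units)  # ['4', '2(2)', '4']
print(pairs)  # [('4444', '4'), ('2(2)2(2)', '2(2)'), ('44', '4')]
```

The first element of each pair is the full run the question asks for; the second is what findall was returning.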
Upvotes: 4
Reputation: 198314
With minimal change to OP's regular expression:
[m[0] for m in re.compile(r"((.+?)\2+)").findall('44442(2)2(2)44')]
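findall
returns the full match when the pattern has no capturing groups, and the groups' contents when it has any (one tuple per match if there is more than one group). Since the backreference needs a group to work, we simply add an outer group that encloses the full match and extract it afterwards with m[0]. The original group becomes group 2, which is why the backreference changes from \1 to \2.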
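As a rough illustration of how the shape of the result depends on the number of capturing groups, the sketch below calls re.findall with zero, one, and two groups. The first pattern, 4+, is an assumption chosen only for demonstration and is not from the question.

```python
import re

text = '44442(2)2(2)44'

# No capturing groups: findall returns the full matches as strings.
no_groups = re.findall(r"4+", text)

# One capturing group: findall returns that group's text only.
one_group = re.findall(r"(.+?)\1+", text)

# Two capturing groups: findall returns (group 1, group 2) tuples.
two_groups = re.findall(r"((.+?)\2+)", text)

# Group 1 is the outer group, i.e. the full match.
full_matches = [m[0] for m in two_groups]

print(no_groups)     # ['4444', '44']
print(one_group)     # ['4', '2(2)', '4']
print(two_groups)    # [('4444', '4'), ('2(2)2(2)', '2(2)'), ('44', '4')]
print(full_matches)  # ['4444', '2(2)2(2)', '44']
```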
Upvotes: 3
Reputation: 41987
You can do:
[i[0] for i in re.findall(r'((\d)(?:[()]*\2*[()]*)*)', s)]
Here the Regex is:
((\d)(?:[()]*\2*[()]*)*)
With this pattern, re.findall returns a list of tuples containing the two captured groups; we are only interested in the first one, hence i[0].
Example:
In [15]: s
Out[15]: '44442(2)2(2)44'
In [16]: [i[0] for i in re.findall(r'((\d)(?:[()]*\2*[()]*)*)', s)]
Out[16]: ['4444', '2(2)2(2)', '44']
Upvotes: 0