Reputation: 11
I have the following string:
This$#is% Matrix# %!
I am trying to match substrings where special symbols or whitespace occur between alphanumeric characters. For example, my goal is to find these two substrings: This$#is (special symbols $ and # between 'This' and 'is') and is% Matrix (special symbol % and a space between 'is' and 'Matrix').
My regex findall is as follows:
match = re.findall(r'([\w]{1,})([\s\W]{1,})([\w]{1,})', temp)
It returns [('This', '$#', 'is')] but not the second part ('is% Matrix'). Is there anything I am doing wrong?
If I change my string to 'is% Matrix' and apply the same regex pattern, I get this: [('is', '% ', 'Matrix')].
Upvotes: 0
Views: 953
Reputation: 106455
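re.findall returns only non-overlapping matches. Your first match consumes the trailing 'is', so the search resumes after it and 'is' can no longer start the second match. You can use a positive lookahead on the part you would like to have overlapping matches, so that it is captured without being consumed: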
match = re.findall(r'([\w]{1,})([\s\W]{1,})(?=([\w]{1,}))', temp)
match becomes:
[('This', '$#', 'is'), ('is', '% ', 'Matrix')]
Demo: https://regex101.com/r/2PJmlX/1
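For reference, here is a minimal self-contained sketch comparing the two patterns on the string from the question. It assumes the patterns can be shortened: `[\w]{1,}` is equivalent to `\w+`, and `[\s\W]` is equivalent to `\W` because `\W` already matches whitespace. The variable names `consuming`, `overlapping` and `substrings` are only illustrative and are not from the original post.

```python
import re

temp = 'This$#is% Matrix# %!'

# Original approach: the third group consumes 'is', so the next
# search starts after it and the second pair is never found.
consuming = re.findall(r'(\w+)(\W+)(\w+)', temp)

# Lookahead approach: the trailing word is captured but not consumed,
# so it can also serve as the leading word of the next match.
overlapping = re.findall(r'(\w+)(\W+)(?=(\w+))', temp)

# Join each tuple back into the full substring, if that is what you need.
substrings = [''.join(parts) for parts in overlapping]

print(consuming)    # [('This', '$#', 'is')]
print(overlapping)  # [('This', '$#', 'is'), ('is', '% ', 'Matrix')]
print(substrings)   # ['This$#is', 'is% Matrix']
```

The trailing `# %!` produces no match because no word character follows it, which appears to be the behaviour the question wants.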
Upvotes: 1