Reputation: 367
I have a simple regex query.
Here is the input:
DLWLALDYVASQASV
The desired output is the positions of the bolded characters (the D, Y and S of the DYVAS substring): DLWLALDYVASQASV
So it would be D:6, Y:7, S:10.
I am using Python, so I know I can use span() or start() to obtain the start positions of a match. But if I try to use something like DY.{2}S, it will match the characters in between and only give me the position of the first (and, in the case of span(), last) character of the match.
Is there a function or a way to retrieve the position of each specified character, not including the characters in-between?
Upvotes: 0
Views: 36
Reputation: 601
import re

# Wrap each character of interest in its own capturing group.
match = re.search(r'(D)(Y)..(S)', 'DLWLALDYVASQASV')

# Group 0 is the whole match; groups 1-3 are D, Y and S.
print([match.group(i) for i in range(4)])
# ['DYVAS', 'D', 'Y', 'S']
print([match.span(i) for i in range(4)])
# [(6, 11), (6, 7), (7, 8), (10, 11)]
print([match.start(i) for i in range(4)])
# [6, 6, 7, 10]
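You can wrap the subexpressions of the regular expression in parentheses (capturing groups) and then access the corresponding substrings and their positions via the match object. Group 0 is the whole match, so the positions you want are those of groups 1 to 3. See the documentation of the Match object for more details.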
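If the pattern can occur more than once in the input, the same capturing-group idea can probably be combined with re.finditer(). The following is only a sketch: group_positions is a hypothetical helper name, not something provided by the re module, and it assumes that every character of interest has its own capturing group.

```python
import re

def group_positions(pattern, text):
    """Hypothetical helper: for each match, collect (substring, start)
    for every capturing group, skipping group 0 (the whole match)."""
    results = []
    for m in re.finditer(pattern, text):
        # m.re.groups is the number of capturing groups in the pattern.
        # Note: a group that did not participate in the match would give
        # (None, -1) here.
        results.append(
            [(m.group(i), m.start(i)) for i in range(1, m.re.groups + 1)]
        )
    return results

positions = group_positions(r'(D)(Y).{2}(S)', 'DLWLALDYVASQASV')
print(positions)
# [[('D', 6), ('Y', 7), ('S', 10)]]
```

Each inner list corresponds to one match, so an input containing the pattern several times would yield several inner lists.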
Upvotes: 1