Reputation: 487
I would like to perform regex matching but exclude the characters used for the matching from the result. Consider this sample code:
import re
string="15 83(B): D17"
result=re.findall(r'\(.+\)',string)
print(result)
Here's what I get: ['(B)']
Here's what I'd like to have: ['B']
I'm after a generic solution that excludes the characters used to start and end a match from the result, not just a fix for this precise case. For example, instead of ( and ) I could have used more complex patterns to delimit the match, and I still would not want them to appear in the result.
Upvotes: 0
Views: 39
Reputation: 7840
Lookaheads and lookbehinds can be used to assert that characters are present before and after a certain position without including them in a match.
>>> string = "15 83(B): D17"
>>> re.findall(r'(?<=\().+(?=\))', string)
['B']
Here, (?<=\() is a positive lookbehind that asserts that an open parenthesis character comes immediately before this position. (?=\)) is a positive lookahead that asserts that a close parenthesis character comes immediately after.
Search the re module documentation for the terms "lookahead" and "lookbehind" for more information.
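As a small sketch of how the lookaround pattern behaves when the string contains more than one parenthesized part, the example below compares the greedy .+ with a lazy .+? variant. The sample string with a second "(C)" is an assumption added for illustration; it is not from the question.

```python
import re

# Hypothetical input with two parenthesized parts (not from the question).
text = "15 83(B): D17 (C)"

# Greedy .+ runs as far as possible, up to the last closing parenthesis.
greedy = re.findall(r'(?<=\().+(?=\))', text)

# Lazy .+? stops at the first closing parenthesis after each opener.
lazy = re.findall(r'(?<=\().+?(?=\))', text)

print(greedy)  # ['B): D17 (C']
print(lazy)    # ['B', 'C']
```

If each delimited part should come back as its own result, the lazy quantifier is probably the one you want.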
Upvotes: 0
Reputation: 824
The correct regex is:
r"(?<=\()(.*)(?=\))"
You can replace ( and ) with any other delimiters.
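One caveat when swapping in other delimiters: Python's re module only accepts lookbehind patterns that match a fixed number of characters, so a variable-length start delimiter is rejected at compile time. The sketch below checks this; the helper name compiles and the <b+> delimiters are made up for illustration.

```python
import re

def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Fixed-width lookbehind: a single literal "(".
print(compiles(r'(?<=\()(.*)(?=\))'))       # True

# Variable-width lookbehind: "<b+>" can match 3 or more characters.
print(compiles(r'(?<=<b+>)(.*)(?=</b+>)'))  # False
```

The lookahead has no such restriction, so only the start delimiter is affected.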
Upvotes: 0
Reputation: 784998
Use a capturing group for the text you want in the output, like this:
>>> string="15 83(B): D17"
>>> print(re.findall(r'\((.*?)\)', string))
['B']
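(.*?) is a capturing group that matches and captures 0 or more characters, non-greedily. When the pattern contains exactly one capturing group, re.findall returns only the text captured by that group.
In general, you can replace the starting ( and the ending ) with whatever patterns delimit your match.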
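To make the "replace the delimiters with anything" idea concrete, here is a minimal sketch of a reusable helper. The name find_between is hypothetical, and it assumes the start and end patterns contain no capturing groups of their own (otherwise re.findall would return tuples instead of strings).

```python
import re

def find_between(start, end, text):
    """Return the text found between matches of the start and end patterns.

    Assumes start and end are regex fragments without capturing groups.
    """
    # (?:...) groups each delimiter without capturing it, so findall
    # returns only the single capturing group in the middle.
    pattern = re.compile(r'(?:%s)(.*?)(?:%s)' % (start, end))
    return pattern.findall(text)

print(find_between(r'\(', r'\)', "15 83(B): D17"))          # ['B']
print(find_between(r'<b+>', r'</b+>', "x <bb>bold</bb> y"))  # ['bold']
```

Because the delimiters are matched normally rather than asserted, they may have variable length, as in the second call.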
Upvotes: 2