Reputation: 604
I am working on a regular expression in Python to extract groups. I am correctly extracting the 3 groups I want (symbol, num, atom). However, I expected the 'symbol' group not to include the '[' or ']', since I am using the non-capturing notation (?:...) per Python's docs (https://docs.python.org/3/library/re.html).
Am I misunderstanding non-capturing groups, or is this a bug?
Thanks!
import re
result = re.match(r'(?P<symbol>(?:\[)(?P<num>[0-9]{0,3})(?P<atom>C)(?:\]))', '[12C]')
print(result.groups())
# ('[12C]', '12', 'C')
# expected: ('12C', '12', 'C')
Upvotes: -1
Views: 53
Reputation: 23674
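Move the checks for \[ and \] outside of the capture for (?P<symbol>...). A non-capturing group only avoids creating a group of its own; whatever it matches is still part of any capturing group that encloses it, so this is expected behaviour rather than a bug. Moving the brackets out of the capture also means you no longer need the non-capturing group notation, e.g.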
>>> import re
>>> result = re.match(r'\[(?P<symbol>(?P<num>[0-9]{0,3})(?P<atom>C))]', '[12C]')
>>> result.groups()
('12C', '12', 'C')
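To compare the two patterns side by side, here is a minimal sketch using the same input string. The variable names (inner, outer and so on) are only for illustration and are not from the question. It uses groupdict() so each result is labelled by group name, and group(0) to show that the overall match still covers the brackets.

```python
import re

# Brackets inside the outer capture, wrapped in non-capturing groups.
# (?:...) does not create its own group, but the text it matches still
# falls within the span of the enclosing (?P<symbol>...) group.
inner = re.match(
    r'(?P<symbol>(?:\[)(?P<num>[0-9]{0,3})(?P<atom>C)(?:\]))', '[12C]'
)

# Brackets matched outside the capture, so 'symbol' excludes them.
outer = re.match(
    r'\[(?P<symbol>(?P<num>[0-9]{0,3})(?P<atom>C))\]', '[12C]'
)

inner_groups = inner.groupdict()
outer_groups = outer.groupdict()
whole_match = outer.group(0)  # the full match still includes the brackets

print(inner_groups)
print(outer_groups)
print(whole_match)
```

Only the span of 'symbol' differs between the two; 'num' and 'atom' are the same either way.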
Upvotes: 2