Reputation: 14768
I'm working on an HTML extraction task and using regular expressions for a few minor jobs. I'm using Python's re
module, and I'd like to keep inner groups from being returned when defining the group is necessary, or at least seems to be.
As an example, consider the string:
line = u" 07.49 (43 votes) "
And the expression:
expr = lambda x: re.findall(r"(\d+(\.\d{1,2})?)\D*(\d+)", x)
Calling it returns:
expr(line)
[(u'07.49', u'.49', u'43')]
And I'd like to get the following result instead:
expr(line)
[(u'07.49', u'43')]
But I need to define the inner group (\.\d{1,2})? in "(\d+(\.\d{1,2})?)\D*(\d+)", as the decimal part of the number may not appear.
Is there a way to avoid this extra group?
Upvotes: 0
Views: 87
Reputation: 798626
Absolutely. Use a non-capturing group, (?:...), instead.
(\d+(?:\.\d{1,2})?)\D*(\d+)
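As a quick check, here is a minimal sketch comparing the two patterns on the string from the question. It assumes Python 3 (so the results print without the u prefix), and the variable names with_inner and without_inner are just illustrative.

```python
import re

line = " 07.49 (43 votes) "

# Capturing inner group: findall returns three items per match,
# including the optional decimal part on its own.
with_inner = re.findall(r"(\d+(\.\d{1,2})?)\D*(\d+)", line)

# Non-capturing inner group (?:...): the decimal part is still optional,
# but it no longer shows up as a separate item in the result.
without_inner = re.findall(r"(\d+(?:\.\d{1,2})?)\D*(\d+)", line)

print(with_inner)     # [('07.49', '.49', '43')]
print(without_inner)  # [('07.49', '43')]
```

The (?:...) form still groups the sub-pattern so that ? applies to all of it; it just doesn't create a numbered group, so findall only returns the two groups you care about.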
Upvotes: 2