Rubens

Reputation: 14768

Avoiding inner group in regex return

I'm working on HTML extraction and using regular expressions for a few minor tasks. I'm using Python's re module, and I'd like to keep inner groups out of the result when defining the group is necessary, or at least seems to be.

As an example, consider the string:

line = u" 07.49 (43 votes) "

And the expression:

expr = lambda x: re.findall(r"(\d+(\.\d{1,2})?)\D*(\d+)", x)

Calling it returns:

expr(line)
[(u'07.49', u'.49', u'43')]

And I'd like to get the following result instead:

expr(line)
[(u'07.49', u'43')]

But I need to define the inner group (\.\d{1,2})? in "(\d+(\.\d{1,2})?)\D*(\d+)", as the decimal part of the number may not appear.

Is there a way to avoid this extra group?

Upvotes: 0

Views: 87

Answers (1)

Ignacio Vazquez-Abrams

Reputation: 798626

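Absolutely. Use a non-capturing group, (?:...), instead. It still groups the optional decimal part for the ? quantifier, but re.findall does not return it.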

(\d+(?:\.\d{1,2})?)\D*(\d+)
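To see the effect, here is a minimal sketch that applies the pattern above to the sample string from the question. The names PATTERN and extract are illustrative only and do not come from the original post; the expected output assumes Python 3, where the u prefix is accepted but not shown in the repr.

```python
import re

# Raw string so the backslashes reach the regex engine unchanged.
# (?:...) groups the optional decimal part without capturing it,
# so findall returns only the two capturing groups.
PATTERN = re.compile(r"(\d+(?:\.\d{1,2})?)\D*(\d+)")


def extract(text):
    """Return (number, count) tuples found in text (hypothetical helper)."""
    return PATTERN.findall(text)


line = u" 07.49 (43 votes) "
print(extract(line))
# -> [('07.49', '43')]  (the leading zero is kept, since \d+ matches "07")
```

The same pattern still matches when the decimal part is absent: for example, extract("7 (43 votes)") gives [('7', '43')].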

Upvotes: 2
