Chris Cris

Reputation: 215

Nested regular expression

I have a string made up of alphabetic and numeric parts. The numeric parts vary. The alphabetic parts are always 'abc' and 'ghi', but I don't know in which order they appear. Each numeric part always comes directly after an alphabetic part.

Valid examples of this kind of string are:

a = 'abc10ghi1450'
b = 'abc11ghi9285'
c = 'ghi1abc9'
...

I want to store the numbers that follow 'abc' and 'ghi' in separate variables. This is what I'm doing now:

>>> import re
>>> string = 'abc10ghi44'
>>> abc = re.search(r'abc\d+', string).group(0)
>>> abc = re.search(r'\d+', abc).group(0)
>>> ghi = re.search(r'ghi\d+', string).group(0)
>>> ghi = re.search(r'\d+', ghi).group(0)
>>> print(abc, ghi)
10 44

For each variable I'm using 2 regexes and I don't like it; is there a smarter way to do the same thing?

Upvotes: 3

Views: 36

Answers (1)

jonrsharpe

Reputation: 122032

Yes, make a capturing group around the digits and use that:

>>> import re
>>> string = 'abc10ghi44'
>>> re.search(r'abc(\d+)', string).group(1)
'10'

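Note the parentheses around \d+ and the 1 passed to group(): group(1) returns only the text matched by the first capturing group, whereas group(0) returns the whole match ('abc10').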
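
Applied to both prefixes from the question, the capturing-group approach might look like the sketch below. The helper name number_after is hypothetical (it is not part of re), and returning None when the prefix is missing is an assumption about the behaviour you want.

```python
import re

def number_after(prefix, text):
    """Return the digits that directly follow `prefix` in `text`, or None."""
    # re.escape guards against prefixes that contain regex metacharacters.
    match = re.search(re.escape(prefix) + r'(\d+)', text)
    # group(1) is the captured digits; group(0) would include the prefix.
    return match.group(1) if match else None

string = 'abc10ghi44'
abc = number_after('abc', string)
ghi = number_after('ghi', string)
print(abc, ghi)
```

Because each search looks for its own prefix, the order of 'abc' and 'ghi' in the string does not matter.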


Alternatively, use a positive lookbehind:

>>> re.search(r'(?<=abc)\d+', string).group(0)
'10'
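
If you want both numbers in a single pass, one option is to capture the prefix and the digits together with re.findall and build a dict from the pairs. This is only a sketch: the function name numbers_by_prefix is hypothetical, and it assumes each prefix appears at most once in the string (a repeated prefix would keep only its last value).

```python
import re

def numbers_by_prefix(text):
    """Map each prefix ('abc' or 'ghi') to the digits that follow it."""
    # With two capturing groups, findall returns a list of
    # (prefix, digits) tuples, which dict() accepts directly.
    return dict(re.findall(r'(abc|ghi)(\d+)', text))

values = numbers_by_prefix('ghi1abc9')
abc = values['abc']
ghi = values['ghi']
print(abc, ghi)
```

The values are strings; wrap them in int() if you need numbers.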

Upvotes: 5
