Guillaume

Reputation: 1032

repeating previous regex

I have a line (and an arbitrary number of them) like this: 0 1 1 75 55

I can get it by doing

x = re.search(r"\d+\s+\d+\s+(\d+)\s+(\d+)\s+(\d+)", line)
if x is not None:
    print(x.group(1))
    print(x.group(2))
    print(x.group(3))

But there must be a neater way to write this. I was looking at the docs for something to repeat the previous expression and found (exp){m}, which repeats exp m times.

So I tried

x = re.search(r"(\d+\s+){5}", line)

and then expected x.group(1) to be 0, x.group(2) to be 1, x.group(3) to be 1 and so on, but x.group(1) outputs 55 (the last number). I'm somewhat confused. Thanks.

Also, on a side note: do you have any recommendations for online tutorials (or free-to-download books) on regex?

Upvotes: 1

Views: 147

Answers (4)

eyquem

Reputation: 27575

import re

line = '0 1 2 75 55'

x = re.search('\\s+'.join(5*('(\\d+)',)), line)

if x:
    print('\n'.join(x.group(3, 4, 5)))

Meh.

Or, using Sven Marnach's idea:

print('\n'.join(line.split()[2:5]))

Upvotes: 0

Spaceghost

Reputation: 6985

Have you considered findall, which repeats the search until the input string is exhausted and returns all matches in a list?

>>> import re
>>> line = '0 1 1 75 55'
>>> x = re.findall(r"(\d+)", line)
>>> print(x)
['0', '1', '1', '75', '55']

Upvotes: 2

Sven Marnach

Reputation: 601539

In your regular expression, there is only one group, since you have only one pair of parentheses. This group will return the last match, as you found out yourself.
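To see this concretely, here is a small sketch (Python 3). The trailing newline on the sample line is an assumption on my part, added so that the pattern from the question can match at all:

```python
import re

# Assumed sample line; the trailing "\n" stands in for the end of a line read from a file.
line = "0 1 1 75 55\n"

m = re.search(r"(\d+\s+){5}", line)

# One pair of parentheses means one group, however often it repeats.
n_groups = m.re.groups
# The group holds only what its last repetition matched.
last = m.group(1)

print(n_groups)    # 1
print(repr(last))  # '55\n'
```

Asking for m.group(2) here would raise IndexError, since no second group exists.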

If you want to use regular expressions, and you know the number of integers in a line in advance, I would go for

x = re.search(r"\s+".join([r"(\d+)"] * 5), line)

in this case.

(Note that

x = re.search(r"(\d+\s+){5}", line)

requires a space after the last number.)
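A quick check of that note, as a sketch (Python 3; the sample strings and variable names are mine, not from the question):

```python
import re

repeated = re.compile(r"(\d+\s+){5}")
joined = re.compile(r"\s+".join([r"(\d+)"] * 5))

# No whitespace after the last number: the repeated form finds nothing.
miss = repeated.search("0 1 1 75 55")
# With a trailing newline the repeated form does match.
hit = repeated.search("0 1 1 75 55\n")
# The joined form has five separate groups and needs no trailing whitespace.
fields = joined.search("0 1 1 75 55").groups()

print(miss)             # None
print(hit is not None)  # True
print(fields)           # ('0', '1', '1', '75', '55')
```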

But for the example you gave I'd actually use

line = "0 1 1 75 55"
int_list = list(map(int, line.split()))

Upvotes: 1

PleaseStand

Reputation: 32072

Repetition of capturing groups does not work, and won't any time soon, in the sense that you cannot access each repetition's match individually. You will just have to write the regex the long way or use a string method such as .split(), avoiding regex altogether.
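One way to combine the two options is to let a regex check only the shape of the line and let .split() do the extraction. This is a minimal sketch for Python 3; the helper name last_three and the pattern LINE_RE are hypothetical, not something from the question:

```python
import re

# Hypothetical validator: exactly five whitespace-separated integers,
# with optional whitespace at either end. No capturing groups needed.
LINE_RE = re.compile(r"\s*\d+(?:\s+\d+){4}\s*")

def last_three(line):
    """Return the last three fields of a well-formed line, else None."""
    if LINE_RE.fullmatch(line) is None:
        return None
    return line.split()[2:5]

print(last_three("0 1 1 75 55"))  # ['1', '75', '55']
print(last_three("0 1 x 75 55"))  # None
```

The repetition sits in a non-capturing group, so the "last match only" behaviour never comes into play.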

Upvotes: 3
