Joel G Mathew

Reputation: 8061

Python capture group from string, with regex

I'm new to Python, coming from a basic knowledge of Perl. I'm trying to capture a substring with a regex.

>>> a='Question 73 of 2943'
>>> import re
>>> re.match("Question.*(\d+)\s+of", a).group(0)
'Question 73 of'
>>> re.match("Question.*(\d+)\s+of", a).group(1)
'3'

What I wanted was to capture 73 in the group. I assumed that the parentheses would do that.

Upvotes: 1

Views: 94

Answers (3)

rsiemens

Reputation: 615

.* is greedy. This means it matches as many characters as it can (any character except line terminators, 0 or more times) and only gives characters back when the rest of the pattern cannot match otherwise. Here .* swallows " 7" and leaves only the final digit, so the (\d+) capture group gets '3' instead of '73'. What you can do is make the .* part lazy by adding a ?, so your regex would look like...

re.match(r"Question.*?(\d+)\s+of", a)

The difference between lazy and greedy regex is well explained here
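To make the difference concrete, here is a small sketch that runs both patterns against the string from the question. The variable names greedy and lazy are only illustrative.

```python
import re

a = 'Question 73 of 2943'

# Greedy: .* takes as much as it can, leaving only the last digit for (\d+).
greedy = re.match(r"Question.*(\d+)\s+of", a)

# Lazy: .*? takes as little as it can, so (\d+) starts at the first digit.
lazy = re.match(r"Question.*?(\d+)\s+of", a)

print(greedy.group(1))  # 3
print(lazy.group(1))    # 73
```

Both patterns match the same overall text ('Question 73 of'); only the split between .* and the capture group changes.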

Upvotes: 1

digitake

Reputation: 856

Your .* part will match any character, including digits. It is better to use a negated character class that excludes digits.

Question[^\d]*(\d+)\s+of

That should give you 73.
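A quick way to check this, as a sketch using the string from the question (the number variable and the None fallback are my own additions, not part of the pattern):

```python
import re

a = 'Question 73 of 2943'

# [^\d]* can only consume non-digits, so it stops right before the first digit.
# [^\d] is equivalent to the shorthand \D.
m = re.match(r"Question[^\d]*(\d+)\s+of", a)

# re.match returns None when the pattern does not match, so guard before .group().
number = int(m.group(1)) if m else None
print(number)  # 73
```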

Upvotes: 0

Nuruddin Iminokhunov

Reputation: 789

If you only want to capture 73, you can do re.search(r'\d+', a).group(). Unlike re.match, re.search scans the whole string and returns the first match it finds, then stops.
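As a rough illustration with the string from the question, this compares re.search with re.findall, which I am adding here only to show what "first match" means in contrast to all matches:

```python
import re

a = 'Question 73 of 2943'

# re.search returns a match object for the first run of digits only.
# Note: it returns None if the string has no digits, so .group() would fail then.
first = re.search(r'\d+', a).group()

# re.findall, by contrast, returns every non-overlapping match as a list of strings.
all_numbers = re.findall(r'\d+', a)

print(first)        # 73
print(all_numbers)  # ['73', '2943']
```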

Upvotes: 0
