Reputation: 53853
I am just learning regex and I'm a bit confused here. I've got a string from which I want to extract an int with at least 4 digits and at most 7 digits. I tried it as follows:
>>> import re
>>> teststring = 'abcd123efg123456'
>>> re.match(r"[0-9]{4,7}$", teststring)
I was expecting 123456, but unfortunately this returns nothing at all. Could anybody help me out a little bit here?
Upvotes: 14
Views: 45075
Reputation: 2019
You can also use:
re.findall(r"[0-9]{4,7}", teststring)
Which will return a list of all substrings that match your regex, in your case ['123456']
If you're interested in just the first matched substring, then you can write this as:
next(iter(re.findall(r"[0-9]{4,7}", teststring)), None)
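A minimal sketch of how the two calls above behave, assuming the same teststring as in the question. The names all_matches, first and missing are just for illustration, and the extra inputs 'abc12' and 'abcd123efg123456789' are made up for this example:

```python
import re

pattern = r"[0-9]{4,7}"
teststring = 'abcd123efg123456'

# findall returns every non-overlapping match as a list of strings.
all_matches = re.findall(pattern, teststring)
print(all_matches)  # ['123456']

# next(iter(...), None) yields the first list element, or None for an empty list.
first = next(iter(re.findall(pattern, teststring)), None)
missing = next(iter(re.findall(pattern, 'abc12')), None)
print(first)    # 123456
print(missing)  # None

# Without any boundary constraint, a longer run of digits is cut at 7 digits.
print(re.findall(pattern, 'abcd123efg123456789'))  # ['1234567']
```

The last line shows that this pattern alone does not reject runs longer than 7 digits; it just takes the first 7.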
Upvotes: 3
Reputation: 30273
@ExplosionPills is correct, but there would still be two problems with your regex.
First, $ matches the end of the string. I'm guessing you'd like to be able to extract an int in the middle of the string as well, e.g. abcd123456efg789 to return 123456. To fix that, you want this:
r"[0-9]{4,7}(?![0-9])"
            ^^^^^^^^^
The added portion is a negative lookahead assertion, meaning, "...not followed by any more numbers." Let me simplify that by the use of \d though:
r"\d{4,7}(?!\d)"
That's better. Now, the second problem. You have no constraint on the left side of your regex, so given a string like abcd123efg123456789, you'd actually match 3456789. So, you need a negative lookbehind assertion as well:
r"(?<!\d)\d{4,7}(?!\d)"
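To check the final pattern against the example strings from this answer, here is a short sketch. BOUNDED_DIGITS and extract_ints are names made up for this example, not anything from the re module:

```python
import re

# Hypothetical name: 4 to 7 digits, with no digit directly before or after.
BOUNDED_DIGITS = re.compile(r"(?<!\d)\d{4,7}(?!\d)")

def extract_ints(text):
    """Return every standalone run of 4 to 7 digits in text as an int."""
    return [int(m) for m in BOUNDED_DIGITS.findall(text)]

print(extract_ints('abcd123efg123456'))     # [123456]
print(extract_ints('abcd123456efg789'))     # [123456]
# The 9-digit run is rejected entirely instead of yielding 3456789.
print(extract_ints('abcd123efg123456789'))  # []
```

The lookbehind and lookahead consume no characters, so the match itself still contains only the digits.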
Upvotes: 25
Reputation: 191729
.match will only match if the string starts with the pattern. Use .search.
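A quick sketch of the difference, using the pattern and teststring from the question unchanged (found is just an illustrative variable name):

```python
import re

teststring = 'abcd123efg123456'
pattern = r"[0-9]{4,7}$"

# re.match only tries the pattern at position 0, where the string has 'a'.
print(re.match(pattern, teststring))  # None

# re.search scans forward until the pattern fits somewhere in the string.
found = re.search(pattern, teststring)
print(found.group() if found else None)  # 123456
```

Note that re.search returns a match object (or None), so the matched text comes from .group().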
Upvotes: 9