kramer65

Reputation: 53853

Python regex for int with at least 4 digits

I am just learning regex and I'm a bit confused here. I've got a string from which I want to extract an int with at least 4 digits and at most 7 digits. I tried it as follows:

>>> import re
>>> teststring = 'abcd123efg123456'
>>> re.match(r"[0-9]{4,7}$", teststring)

I was expecting 123456, but this returns nothing at all. Could anybody help me out a little bit here?

Upvotes: 14

Views: 45075

Answers (3)

galarant

Reputation: 2019

You can also use:

re.findall(r"[0-9]{4,7}", teststring)

This returns a list of all non-overlapping substrings that match your regex; in your case, ['123456'].

If you're interested in just the first matched substring, then you can write this as:

next(iter(re.findall(r"[0-9]{4,7}", teststring)), None)
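As a rough sketch of both calls, using the teststring from the question (the second input, "abc12", is made up here only to show the None fallback):

```python
import re

teststring = 'abcd123efg123456'

# Every run of 4 to 7 digits found in the string, as a list of strings.
all_matches = re.findall(r"[0-9]{4,7}", teststring)

# First element of that list, or None when the list is empty.
first_match = next(iter(all_matches), None)

# Hypothetical input with no 4-digit run, to show the default kicking in.
no_match = next(iter(re.findall(r"[0-9]{4,7}", "abc12")), None)

print(all_matches, first_match, no_match)
```

Note that findall returns plain strings rather than match objects, so there is no .group() call involved.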

Upvotes: 3

Andrew Cheong

Reputation: 30273

@ExplosionPills is correct, but there would still be two problems with your regex.

First, $ matches the end of the string. I'm guessing you'd like to be able to extract an int in the middle of the string as well, e.g. abcd123456efg789 to return 123456. To fix that, you want this:

r"[0-9]{4,7}(?![0-9])"
            ^^^^^^^^^

The added portion is a negative lookahead assertion, meaning, "...not followed by another digit." Let me simplify that by using \d, though (be aware that on Python 3 str patterns \d also matches non-ASCII Unicode digits unless re.ASCII is set):

r"\d{4,7}(?!\d)"

That's better. Now, the second problem. You have no constraint on the left side of your regex, so given a string like abcd123efg123456789, you'd actually match 3456789. So, you need a negative lookbehind assertion as well:

r"(?<!\d)\d{4,7}(?!\d)"
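To see the difference the lookbehind makes, here is a small sketch that runs the lookahead-only pattern and the final pattern against the example strings mentioned above. The variable names are only illustrative, and re.search is assumed as the way to scan the string:

```python
import re

# Lookahead only: nothing stops the match from starting mid-number.
right_only = re.compile(r"\d{4,7}(?!\d)")
# Lookbehind and lookahead: the digit run must be bounded on both sides.
both_sides = re.compile(r"(?<!\d)\d{4,7}(?!\d)")

# A 6-digit number in the middle of the string is found.
mid = right_only.search("abcd123456efg789")

# A 9-digit run: the lookahead-only pattern grabs its last 7 digits.
leaky = right_only.search("abcd123efg123456789")

# With the lookbehind, the 9-digit run is rejected entirely.
strict = both_sides.search("abcd123efg123456789")

# The original test string still yields the expected number.
ok = both_sides.search("abcd123efg123456")

print(mid.group(), leaky.group(), strict, ok.group())
```

The strict pattern returns None for the 9-digit run because no substring of 4 to 7 digits within it is free of neighbouring digits on both sides.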

Upvotes: 25

Explosion Pills

Reputation: 191729

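re.match only matches at the beginning of the string, so it fails here because the string starts with letters. Use re.search instead, which scans the string for the first position where the pattern matches.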
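A minimal sketch of the difference, reusing the pattern and teststring from the question unchanged:

```python
import re

teststring = 'abcd123efg123456'
pattern = r"[0-9]{4,7}$"

# re.match anchors at position 0, where the string has 'a', so it fails.
matched = re.match(pattern, teststring)

# re.search tries every position and finds the digits at the end.
searched = re.search(pattern, teststring)

print(matched)
print(searched.group())
```

Both functions return None on failure, so it is worth checking the result before calling .group() on it.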

Upvotes: 9
