Reputation: 17511
I want to match only strings that represent numbers between 0 and 9999.
import re
NUMERIC = re.compile(r"\d{,4}")
nr = NUMERIC.match("324234")
nr.group(0)  # '3242'
I tried the above, but it matches the first 4 digits of the string even when the string has 5 or more digits.
What regex matches only a string representation of an integer that has between 1 and 4 digits?
Upvotes: 0
Views: 5123
Reputation: 363517
Anchors do the trick of not matching too much:
>>> pattern = re.compile(r"^\d{1,4}$")
>>> pattern.match("0").group()
'0'
>>> pattern.match("42").group()
'42'
>>> pattern.match("777").group()
'777'
>>> pattern.match("2012").group()
'2012'
>>> pattern.match("65535").group()
------------------------------------------------------------
Traceback (most recent call last):
File "<ipython console>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
Note the {1,4}: I'm assuming you don't want to match the empty string. However, this will not match 00001, which certainly is in range.
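If leading zeros should count as in range, one option is to allow any number of them in front of the significant digits. This is only a sketch, assuming Python 3.4+ for re.fullmatch; the names LEADING_ZEROS_OK and is_in_range are made up for illustration:

```python
import re

# Hypothetical name: any number of leading zeros, then 1 to 4 digits.
LEADING_ZEROS_OK = re.compile(r"0*\d{1,4}")

def is_in_range(s):
    # fullmatch requires the whole string to match, so no ^...$ anchors are needed.
    return LEADING_ZEROS_OK.fullmatch(s) is not None

for candidate in ["0", "00001", "9999", "65535", ""]:
    print(repr(candidate), is_in_range(candidate))
```

Unlike the $ anchor, fullmatch also rejects a string that ends in a newline, such as "42\n".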
A more robust alternative to regular expressions is to leverage Python's built-in integer parsing:
def parse_4digit_number(s):
    i = int(s)
    if 0 <= i <= 9999:
        return i
    else:
        raise ValueError("{0} is out of range".format(i))
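When an exception is inconvenient, the same idea can be wrapped so that invalid input yields None. A minimal sketch; try_parse_4digit_number is a hypothetical name, not part of any library:

```python
def try_parse_4digit_number(s):
    """Return the integer value of s if it lies in 0..9999, otherwise None."""
    try:
        i = int(s)
    except (TypeError, ValueError):
        # Not an integer literal at all (e.g. "abc", "", or None).
        return None
    return i if 0 <= i <= 9999 else None

print(try_parse_4digit_number("00001"))  # 1
print(try_parse_4digit_number("65535"))  # None
```

Keep in mind that int() is more lenient than the regex: it accepts surrounding whitespace and a leading sign, so " 7 " and "+42" both parse.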
Upvotes: 3
Reputation: 4572
^ matches the start of a line and $ matches the end of a line.
You likely want to match words rather than whole lines, so:
\< = start of word
\> = end of word
\b is a word boundary.
\< and \> aren't supported in many languages, so use \b instead:
\b(\d{1,4})\b
However, that will match 22.33 as two separate numbers, 22 and 33.
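The word-boundary behaviour can be checked in Python, whose re module supports \b. A small sketch; the name WORD_BOUNDED and the second sample string are made up for illustration:

```python
import re

# Hypothetical name for the word-boundary pattern discussed above.
WORD_BOUNDED = re.compile(r"\b(\d{1,4})\b")

# "." is not a word character, so there is a boundary on each side of it.
print(WORD_BOUNDED.findall("22.33"))                 # ['22', '33']

# A 5-digit run has no boundary after its 4th digit, so it is skipped entirely.
print(WORD_BOUNDED.findall("room 101, code 12345"))  # ['101']
```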
You could avoid that by doing something like this.
(?:^|\s)(\d{1,4})(?:\s|$)
However, that would miss the number in
super duper 3333,and
So you would have to add "," or other punctuation to the list of trailing characters:
(?:^|\s)(\d{1,4})(?:\s|$|[,:;?])
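To compare the two whitespace-delimited patterns, here is a rough Python sketch; the names SPACE_DELIMITED and WITH_PUNCTUATION are invented for this example:

```python
import re

# Hypothetical names for the two patterns above.
SPACE_DELIMITED = re.compile(r"(?:^|\s)(\d{1,4})(?:\s|$)")
WITH_PUNCTUATION = re.compile(r"(?:^|\s)(\d{1,4})(?:\s|$|[,:;?])")

text = "super duper 3333,and"
print(SPACE_DELIMITED.findall(text))     # []  (the "," blocks the match)
print(WITH_PUNCTUATION.findall(text))    # ['3333']
print(SPACE_DELIMITED.findall("22.33"))  # []  (no longer split in two)
```

One caveat with this style: the pattern consumes the whitespace on both sides, so in "1 2 3" the space after 1 is already used up and findall returns only 1 and 3. Lookarounds avoid that, at the cost of a less readable pattern.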
However, that brings us back to the original problem:
There were people numbering 5. Today...
The 5 would be missed, because "." is not among the trailing characters. But if you add it, how do you tell the difference between that and "there were 55.55 percent of people"?
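One possible heuristic, offered as a sketch rather than a complete answer, is to treat a "." as part of the number only when a digit follows it, using lookarounds. The name STANDALONE_INT is hypothetical, and the rule is an assumption that will not suit every text (it splits "1,000" into 1 and 000, for instance):

```python
import re

# Assumed heuristic: the number must not touch a word character or "." on the
# left, and must not be followed by a word character or by "." plus a digit.
STANDALONE_INT = re.compile(r"(?<![\w.])\d{1,4}(?!\w|\.\d)")

samples = [
    "There were people numbering 5. Today...",
    "there were 55.55 percent of people",
    "super duper 3333,and",
]
for sample in samples:
    print(STANDALONE_INT.findall(sample))
```

Because lookarounds consume no characters, adjacent numbers such as "1 2 3" are all found.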
Upvotes: 2