Eduard Florinescu

Reputation: 17511

Regex to match a range of digit counts in a string representing an integer number?

I want to match only the strings that represent numbers between 0 and 9999.

import re
NUMERIC = re.compile(r"\d{,4}")
NUMERIC.match("324234")
nr = NUMERIC.match("324234")
nr.group(0)

I tried the above, but it matches the first 4 digits of the string even when the string has more than 4 digits.

What regex matches only the numbers that have between 1 and 4 digits in this string representation of an integer number?

Upvotes: 0

Views: 5123

Answers (2)

Fred Foo

Reputation: 363517

Anchors do the trick of not matching too much:

>>> pattern = re.compile(r"^\d{1,4}$")
>>> pattern.match("0").group()
'0'
>>> pattern.match("42").group()
'42'
>>> pattern.match("777").group()
'777'
>>> pattern.match("2012").group()
'2012'
>>> pattern.match("65535").group()
------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'

Note the {1,4} -- I'm assuming you don't want to match the empty string. However, this will not match 00001, which certainly is in range.
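If leading zeros such as 00001 should be accepted, one option is to allow any run of zeros before the 1 to 4 digits. The following is only a sketch, assuming Python 3.4+ (for re.fullmatch); the names IN_RANGE and in_range are made up for illustration and are not part of the answer above.

```python
import re

# Hypothetical name: optional leading zeros, then 1 to 4 digits.
IN_RANGE = re.compile(r"0*\d{1,4}")

def in_range(s):
    """Return True if s is a digit string whose value is in 0..9999."""
    # fullmatch requires the whole string to match, so no ^ or $ is needed.
    # Unlike $ with match(), it also rejects a trailing newline.
    return IN_RANGE.fullmatch(s) is not None

for s in ["0", "2012", "00001", "65535", ""]:
    print(repr(s), in_range(s))
```

Returning a boolean instead of calling .group() on the result also avoids the AttributeError when nothing matches.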

A more robust alternative to regular expressions is to leverage Python's built-in integer parsing:

def parse_4digit_number(s):
    i = int(s)
    if 0 <= i <= 9999:
        return i
    else:
        raise ValueError("{0} is out of range".format(i))
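When a yes/no answer is more convenient than an exception, the same int() approach can be wrapped in a predicate. This is a sketch under that assumption; is_4digit_number is a hypothetical name. Note that int() is more lenient than the regex: it accepts surrounding whitespace and a sign.

```python
def is_4digit_number(s):
    """Return True if s parses as an integer in 0..9999, False otherwise."""
    try:
        i = int(s)
    except (TypeError, ValueError):
        # Not a string, or not parseable as an integer at all.
        return False
    return 0 <= i <= 9999

for s in ["00001", "65535", "abc", " 42 ", "-1"]:
    print(repr(s), is_4digit_number(s))
```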

Upvotes: 3

John Sobolewski

Reputation: 4572

^ is the start of a line and $ is the end of a line.

You likely want to match words, not whole lines, so:

\< = start of word
\> = end of word
\b = word boundary

\< and \> aren't supported in many regex flavors, so use \b:

\b(\d{1,4})\b

However, that will match 22.33 as two separate matches.
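A quick way to see this behaviour in Python is to run the \b pattern over a few inputs with findall. This is only an illustration; WORD_BOUNDED is a made-up name.

```python
import re

# The word-boundary pattern from above, compiled as a raw string.
WORD_BOUNDED = re.compile(r"\b(\d{1,4})\b")

# A 1 to 4 digit number followed by punctuation is found.
print(WORD_BOUNDED.findall("super duper 3333,and"))  # ['3333']

# The "." counts as a word boundary, so a decimal splits in two.
print(WORD_BOUNDED.findall("22.33"))  # ['22', '33']

# Five digits in a row have no boundary after the fourth, so nothing matches.
print(WORD_BOUNDED.findall("65535"))  # []
```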

You could avoid that by doing something like this:

(?:^|\s)(\d{1,4})(?:\s|$)

However, that would miss the number in

super duper 3333,and

So you would have to add "," or other punctuation to the list of trailing characters:

(?:^|\s)(\d{1,4})(?:\s|$|[,:;?])

However, that brings us back to the period:

There were people numbering 5. Today...

The 5 would get missed, because "." is not in the list of trailing characters. And if you add it, how do you tell the difference between that and "there were 55.55 percent of people"?
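One possible way to separate the two cases is to use lookarounds: treat a "." as part of a number only when a digit sits on the other side of it. The sketch below assumes Python's re module (which needs fixed-width lookbehinds, hence two of them); STANDALONE_INT and find_small_ints are hypothetical names, and the pattern only reasons about digits and decimal points, not letters or other separators.

```python
import re

# Hypothetical pattern: 1 to 4 digits that are
#   - not preceded by a digit, nor by "digit." (the tail of a decimal), and
#   - not followed by a digit, nor by ".digit" (the head of a decimal).
STANDALONE_INT = re.compile(r"(?<!\d)(?<!\d\.)\d{1,4}(?!\d)(?!\.\d)")

def find_small_ints(text):
    """Return all standalone 1 to 4 digit integers found in text."""
    return STANDALONE_INT.findall(text)

print(find_small_ints("There were people numbering 5. Today..."))  # ['5']
print(find_small_ints("there were 55.55 percent of people"))       # []
print(find_small_ints("super duper 3333,and"))                     # ['3333']
```

Because lookarounds do not consume characters, no list of allowed trailing punctuation has to be maintained.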

Upvotes: 2
