Woody Pride

Reputation: 13955

Regular expression to check string is integer

An HTML form returns a string containing a number entered by a user. How do I use a regular expression to check whether the string is a valid number? I do not simply want to strip the commas and see whether the result can be cast to int, nor do I like the locale.atoi method, because strings evaluate to numbers even when they are nonsense (e.g. locale.atoi('01,0,0') evaluates to 100).

NB: this validation only occurs if the string contains commas.

The re pattern should be:

  • The 1st character is 1-9 (not zero)
  • The 2nd and 3rd characters are 0-9
  • Then 3 digits 0-9 and a comma, repeated between 0 and 2 times (999,999,999,999 is the largest number possible in the program)
  • Then finally 3 digits 0-9

compiled = re.compile("[1-9][0-9]{0,2},(\d\d\d,){0,2}[0-9]{3}")

which is not matching the end of the string correctly, for example:

re.match(compiled, '123,456,78') 

is matching. What have I done wrong?

Upvotes: 3

Views: 2328

Answers (2)

zx81

Reputation: 41838

More Compact

I would suggest something more compact:

^[1-9][0-9]{0,2}(?:,[0-9]{3}){0,3}$


  • The ^ asserts that we are at the beginning of the string
  • [1-9] matches our first digit
  • [0-9]{0,2} matches up to two additional digits
  • (?:,[0-9]{3}) matches a comma and three digits...
  • between 0 and three times
  • $ asserts that we are at the end of the string

To validate, you could do:

import re

subject = "123,456,789"  # example input
if re.search(r"^[1-9][0-9]{0,2}(?:,[0-9]{3}){0,3}$", subject):
    print("valid")    # Successful match
else:
    print("invalid")  # Match attempt failed
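
The check above can be wrapped in a small helper and tried against a few inputs from the question. This is only a sketch: the names GROUPED_INT and is_grouped_int are made up for illustration, and the sample strings are assumptions chosen to cover the cases discussed here.

```python
import re

# Pattern from this answer: 1 to 3 leading digits (no leading zero),
# followed by up to three groups of a comma and exactly three digits.
GROUPED_INT = re.compile(r"^[1-9][0-9]{0,2}(?:,[0-9]{3}){0,3}$")


def is_grouped_int(text):  # hypothetical helper name
    """Return True if text is a correctly comma-grouped positive integer."""
    return GROUPED_INT.search(text) is not None


# Sample inputs (assumed for illustration), including the two from the question.
samples = ["123,456,789", "999,999,999,999", "123,456,78", "01,0,0", "1,000"]
results = {s: is_grouped_int(s) for s in samples}

for text, ok in results.items():
    print(text, "valid" if ok else "invalid")
```

Note that a four-digit string without commas, such as "1000", is rejected by this pattern, which fits the question's note that the check is only applied when the string contains commas.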

Upvotes: 1

aquavitae

Reputation: 19114

If you want to match the full string, make sure to specify the start and end in your regex, i.e.:

re.compile(r"^[1-9][0-9]{0,2},(\d\d\d,){0,2}[0-9]{3}$")

Also, as you will notice, I used a raw string (r prefix) to avoid escaping \.
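
As an alternative sketch, on Python 3.4 or later re.fullmatch applies both anchors for you, so the pattern can be left without ^ and $. The helper name matches_whole_string below is made up for illustration. One difference worth knowing: $ also matches just before a trailing newline, whereas fullmatch does not accept one.

```python
import re

# The question's pattern as a raw string, without ^ and $;
# fullmatch requires the whole string to match.
pattern = re.compile(r"[1-9][0-9]{0,2},(\d\d\d,){0,2}[0-9]{3}")


def matches_whole_string(text):  # hypothetical helper name
    """Return True only if the entire text matches the pattern."""
    return pattern.fullmatch(text) is not None


print(matches_whole_string("123,456,789"))  # whole string fits the pattern
print(matches_whole_string("123,456,78"))   # last group is too short
print(matches_whole_string("1,000\n"))      # trailing newline is rejected
```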

Edit

Just to explain what's going on with your regex: re.match only anchors the pattern at the start of the string, so it succeeds as soon as some prefix of the input matches. The shortest text the pattern can match is where the optional digits and the repeated group are both matched zero times, i.e. "[1-9][0-9]{0},(\d\d\d,){0}[0-9]{3}", which is the same as [1-9],[0-9]{3}. For '123,456,78' the pattern matches the prefix '123,456', and the trailing ',78' is simply ignored.
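
To see which part of the input the unanchored pattern actually consumes, you can inspect the match object. A minimal sketch using the pattern and input from the question (the variable names are made up for illustration):

```python
import re

# The question's pattern, written as a raw string, with no end anchor.
unanchored = re.compile(r"[1-9][0-9]{0,2},(\d\d\d,){0,2}[0-9]{3}")
# The same pattern with start and end anchors added.
anchored = re.compile(r"^[1-9][0-9]{0,2},(\d\d\d,){0,2}[0-9]{3}$")

text = "123,456,78"

m = unanchored.match(text)
matched_prefix = m.group(0) if m else None
print(matched_prefix)        # 123,456

print(anchored.match(text))  # None
```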

Upvotes: 1
