Leo

Reputation: 1934

Parsing a number with commas with Javascript regex

I'm trying to parse numbers between 1 and 10,000,000, which can be straight digits (e.g. 123456) or have separating commas (1,234,567) between groups of 3 digits. The commas could also be spaces (1 234 567) or periods (1.234.567), but the same separator must be used consistently. I have written the following:

<script type="text/javascript">
  var re = /(\d{1,3})[ |\,|\.]?(\d{3})(?:[ |\,|\.]?(\d{3}))?/i;
  function testStr(input) {
    var str = input.value;
    var newstr = str.replace(re, '[1]: $1\n[2]: $2\n[3]: $3');
    alert(newstr);
  }
</script>

This works well, except that it also parses input such as 1234,567,890 or 1,234,5678. Groups of 4 consecutive digits should not be allowed. Why is this happening? Thanks for any help.

Upvotes: 2

Views: 103

Answers (1)

CertainPerformance

Reputation: 371193

One option is

^(\d{1,3})(?:([ ,.]?)(\d{3})(?:\2(\d{3}))?)?$

The idea is to capture the separator used, if any (with no separator, the empty string is captured). Later, at the point where a second separator is expected, backreference the separator that was captured earlier. That ensures all separators are the same, whether they're spaces, commas, periods, or nothing at all. Also, if you need to parse numbers between 1 and 10,000,000, put everything past the initial (\d{1,3}) in an optional group, so that numbers below 1,000 match too.
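As a quick check of the backreference behaviour, here is a minimal sketch that runs the pattern against a few inputs. The name `numberRe` and the sample strings are picked for illustration only; they are not from the question.

```javascript
// Pattern from this answer; `numberRe` is a name chosen for this sketch.
const numberRe = /^(\d{1,3})(?:([ ,.]?)(\d{3})(?:\2(\d{3}))?)?$/;

// Illustrative inputs: the first four use one separator consistently
// (or none), the last three mix separators or contain a 4-digit group.
const samples = [
  '123456',
  '1,234,567',
  '1 234 567',
  '1.234.567',
  '1,234.567',
  '1234,567,890',
  '1,234,5678',
];

// Pair each input with whether the whole string matches.
const results = samples.map(s => [s, numberRe.test(s)]);

for (const [s, ok] of results) {
  console.log(JSON.stringify(s), ok);
}
```

The first four inputs should be accepted and the last three rejected, since \2 only matches the exact separator that group 2 captured.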

Note that commas and periods do not need to be escaped in a character set, and | in a character set indicates a literal pipe character - just use [ ,.] instead.
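To see the difference, a small sketch (the variable names are made up for this example) compares the character set from the question with the simplified one:

```javascript
// Character set as written in the question: the | is a literal pipe here.
const questionSet = /[ |\,|\.]/;
// Simplified set: space, comma, or period only.
const simplifiedSet = /[ ,.]/;

const pipeInQuestionSet = questionSet.test('|');     // pipe is accepted
const pipeInSimplifiedSet = simplifiedSet.test('|'); // pipe is rejected

console.log(pipeInQuestionSet, pipeInSimplifiedSet);
// The unescaped comma and period still match themselves inside the set.
console.log(simplifiedSet.test(','), simplifiedSet.test('.'), simplifiedSet.test(' '));
```

So with the original set, an input like 1|234 would have been treated as having a valid separator.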

Also use ^ and $ anchors to ensure you start at the very beginning of the string and match till the end of the string (forcing the match to fail otherwise).
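This is also why the pattern in the question appears to accept 1234,567,890: without anchors, it is free to match just part of the string. A rough sketch of that, using variable names invented for this example:

```javascript
// Pattern from the question (no anchors).
const unanchoredRe = /(\d{1,3})[ |\,|\.]?(\d{3})(?:[ |\,|\.]?(\d{3}))?/;
// Anchored pattern from this answer.
const anchoredRe = /^(\d{1,3})(?:([ ,.]?)(\d{3})(?:\2(\d{3}))?)?$/;

const input = '1234,567,890';

// Without anchors the engine backtracks until group 1 is just "1",
// then matches "234" and ",567", and ignores the trailing ",890".
const partial = input.match(unanchoredRe);
console.log(partial[0]);                         // matched substring
console.log(partial[1], partial[2], partial[3]); // captured digit groups

// With ^ and $ the whole string has to fit, so there is no match at all.
const whole = input.match(anchoredRe);
console.log(whole); // null
```

The unanchored match covers only 1234,567, which is why replace() in the question fills in $1 to $3 and leaves the rest of the input untouched.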

https://regex101.com/r/2dFk0f/1

(\d{1,3}) - One to three digits, followed by an optional big non-capturing group of

([ ,.]?)(\d{3})(?:\2(\d{3}))?, which is:

([ ,.]?) - Capture the separator used

(\d{3}) - Repeat three digits

(?:\2(\d{3}))? - If the number is 1,000,000 or greater, a second separator is expected, so backreference the separator that was captured before, followed by three more digits. (If the number is less than 1,000,000, then this optional group won't match.)
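Putting the pieces together, one way to turn a match into an actual number could look like the sketch below. `parseGroupedNumber` is a hypothetical helper, not something from the question. The explicit range check is an addition here, because the pattern alone also accepts 0 and values up to 999,999,999.

```javascript
const groupedRe = /^(\d{1,3})(?:([ ,.]?)(\d{3})(?:\2(\d{3}))?)?$/;

// Hypothetical helper: returns the parsed integer, or null if the input
// is malformed or outside the range 1 to 10,000,000.
function parseGroupedNumber(str) {
  const m = groupedRe.exec(str);
  if (m === null) return null;

  // m[1], m[3] and m[4] hold the digit groups; m[2] is the separator.
  // Groups inside an optional part that did not match are undefined.
  const digits = [m[1], m[3], m[4]].filter(g => g !== undefined).join('');
  const value = Number(digits);

  // Range check added for this sketch; the regex does not enforce it.
  return value >= 1 && value <= 10000000 ? value : null;
}

console.log(parseGroupedNumber('1,234,567'));
console.log(parseGroupedNumber('10.000.000'));
console.log(parseGroupedNumber('1,234.567'));
```

Keeping the range check outside the regex keeps the pattern readable; encoding the upper bound of 10,000,000 in the pattern itself would be possible but considerably messier.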

Upvotes: 1
