carlcaroline

Reputation: 65

Regex to match prices with different numbers of digits

I am trying to write a regex in Go to match prices that consist of at least five digits, with commas in the price.

For example:

"10,400,000","10,900,000","500,000",

I tried the following expression: "((\d+)(,)(\d+))(,)..."(,) which only matches sequences that have eight digits (two commas).

For example:

"10,400,000", valid
"10,900,000", valid
"500,000", invalid

I don't think it would be efficient to process the input twice (once for numbers with two commas and once for numbers with one comma). How can I write a single expression that covers the whole pattern?

Thank you.

Upvotes: 1

Views: 144

Answers (4)

Reilas

Reputation: 6266

Possibly I've misunderstood the question.
Why would "500,000" be invalid? It has "... at least five digits with comma on the price ...".

Try the following match pattern.

\"\d\d\d?(?:,\d{3})+\"
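As a rough sketch of how this pattern could be used from Go, the snippet below runs it over the sample line from the question with `regexp.FindAllString`. The names `quotedPrice` and `findQuotedPrices` are made up for illustration. The pattern is kept exactly as written above, including the escaped quotes.

```go
package main

import (
	"fmt"
	"regexp"
)

// quotedPrice is the pattern from this answer, written as a Go raw string.
// (Hypothetical variable name, chosen for this sketch.)
var quotedPrice = regexp.MustCompile(`\"\d\d\d?(?:,\d{3})+\"`)

// findQuotedPrices returns every quoted price found in s, quotes included.
func findQuotedPrices(s string) []string {
	return quotedPrice.FindAllString(s, -1)
}

func main() {
	input := `"10,400,000","10,900,000","500,000",`
	for _, m := range findQuotedPrices(input) {
		fmt.Println(m)
	}
}
```

Because the pattern needs two digits before the first comma, a four-digit value such as "1,000" is skipped, but leading zeroes are still accepted.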

Upvotes: 0

Darin

Reputation: 2368

Assuming your example "10,400,000","10,900,000","500,000", is typical of what you are working with, then it looks like you probably have 2 options.

  1. You can match each number together with its surrounding double quotes, see this Regex101.
    • "(?:[1-9]\d{0,2}(?:,\d{3}){2,}|[1-9]\d{1,2},\d{3})"
  2. The second option depends on what can be done in the Go language. If you can capture the groups instead of the matches, then the double quotes can be excluded, see this Regex101.
    • (?:")([1-9]\d{0,2}(?:,\d{3}){2,}|[1-9]\d{1,2},\d{3})(?:")
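Go's standard `regexp` package does expose capture groups, so the second option should be workable. A minimal sketch, assuming the input is a single string like the one in the question (`capturedPrice` and `extractPrices` are illustrative names, not part of any library):

```go
package main

import (
	"fmt"
	"regexp"
)

// capturedPrice is the second pattern from this answer; group 1 holds the
// number without the surrounding double quotes.
var capturedPrice = regexp.MustCompile(`(?:")([1-9]\d{0,2}(?:,\d{3}){2,}|[1-9]\d{1,2},\d{3})(?:")`)

// extractPrices collects group 1 of every match in s.
func extractPrices(s string) []string {
	var prices []string
	for _, m := range capturedPrice.FindAllStringSubmatch(s, -1) {
		prices = append(prices, m[1]) // m[0] is the whole match, m[1] the first group
	}
	return prices
}

func main() {
	input := `"10,400,000","10,900,000","500,000","1,000",`
	fmt.Println(extractPrices(input))
}
```

The extra "1,000" entry in the input is only there to show that values with fewer than five digits are left out.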

Upvotes: 1

Cary Swoveland

Reputation: 110735

Observe that we need to either match a number between 10,000 and 999,999 or a number 1,000,000 or larger, with commas in the right places.

To match a number between 10,000 and 999,999 we can use

^[1-9]\d{1,2},\d{3}$

and to match a number 1,000,000 or larger we can use

^[1-9]\d{0,2}(?:,\d{3}){2,}$

We therefore merely need to construct an alternation of these two expressions.

^(?:[1-9]\d{1,2},\d{3}|[1-9]\d{0,2}(?:,\d{3}){2,})$

Demo


This expression can be broken down as follows.

^          # match beginning of the string
(?:        # non-capture group
  [1-9]    # match the leading digit, any digit other than zero
  \d{1,2}  # match between 1 and 2 digits
  ,\d{3}   # match a comma followed by 3 digits
  |        # or
  [1-9]    # match the leading digit, any digit other than zero
  \d{0,2}  # match between 0 and 2 digits
  (?:      # begin a non-capture group
    ,\d{3} # match a comma followed by 3 digits
  ){2,}    # end the non-capture group and execute it >= 2 times
)          # end the non-capture group 
$          # match the end of the string

Lastly, we can factor out the match of the leading digit.

^[1-9](?:\d{1,2},\d{3}|\d{0,2}(?:,\d{3}){2,})$
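Since the pattern is anchored with ^ and $, it validates one price at a time rather than scanning a list. A small sketch of how it might be checked in Go follows; `isPrice` is a hypothetical helper name, and the test values are chosen here for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// price is the factored pattern from this answer.
var price = regexp.MustCompile(`^[1-9](?:\d{1,2},\d{3}|\d{0,2}(?:,\d{3}){2,})$`)

// isPrice reports whether s is a comma-grouped number of at least
// five digits with no leading zero.
func isPrice(s string) bool {
	return price.MatchString(s)
}

func main() {
	for _, s := range []string{
		"10,400,000", // two commas
		"500,000",    // one comma, six digits
		"10,000",     // smallest accepted value
		"9,999",      // only four digits
		"010,000",    // leading zero
		"1000000",    // no commas
	} {
		fmt.Printf("%-12s %v\n", s, isPrice(s))
	}
}
```

The first three values are accepted and the last three are rejected.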

Upvotes: 2

The fourth bird

Reputation: 163577

You might use:

^(?:\d{2,3}|\d{1,3},\d{3})(?:,\d{3})+$

The pattern matches:

  • ^ Start of string
  • (?: Non capture group for the alternatives
    • \d{2,3} Match 2-3 digits
    • | Or
    • \d{1,3} Match 1-3 digits
    • ,\d{3} Match a comma and 3 digits
  • ) Close the non capture group
  • (?:,\d{3})+ Repeat 1+ times matching , and 3 digits
  • $ End of string

See a regex demo.


To reject leading zeroes, you can start the match with [1-9] and account for that digit in the quantifier that follows:

^(?:[1-9]\d{1,2}|[1-9]\d{0,2}(?:,\d{3})+)(?:,\d{3})+$

See another regex demo
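To see how the two patterns differ, here is a short Go sketch that runs both against the same inputs. The variable and function names are invented for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// withZeroes is the first pattern, which accepts leading zeroes.
	withZeroes = regexp.MustCompile(`^(?:\d{2,3}|\d{1,3},\d{3})(?:,\d{3})+$`)
	// noZeroes is the second pattern, which requires a leading [1-9].
	noZeroes = regexp.MustCompile(`^(?:[1-9]\d{1,2}|[1-9]\d{0,2}(?:,\d{3})+)(?:,\d{3})+$`)
)

// matchesLoose reports whether s matches the first pattern.
func matchesLoose(s string) bool { return withZeroes.MatchString(s) }

// matchesStrict reports whether s matches the second pattern.
func matchesStrict(s string) bool { return noZeroes.MatchString(s) }

func main() {
	for _, s := range []string{"10,000", "1,000,000", "01,000", "1,000"} {
		fmt.Printf("%-10s loose=%v strict=%v\n", s, matchesLoose(s), matchesStrict(s))
	}
}
```

Both patterns accept "10,000" and "1,000,000" and reject "1,000". Only the first accepts "01,000".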

Upvotes: 2
