makoto

Reputation: 23

Using a regex to match only decimal numbers, but it also matches multi-digit integers

  check_dec=^[0-9]*\.+[0-9]+
  input=0
  
  echo "Please enter a digit: "
  read input
 
  if [[ $input =~ $check_dec ]]
  then
    echo "The value is a decimal"
  else
    echo "The value is an integer"
  fi

I am trying to store a regex in the variable "check_dec" to validate the user's input. I don't understand why any value with more than one digit takes the "if" branch and prints "The value is a decimal".

For example, when I type 44, 100, 1000, etc., the output is "The value is a decimal". However, when I enter a single-digit number, the output is "The value is an integer".

I'm brand new to regex/bash and don't fully understand them yet. Any help would be appreciated.

Upvotes: 2

Views: 1228

Answers (3)

anubhava

Reputation: 785196

Your regex:

check_dec=^[0-9]*\.+[0-9]+

has 2 main problems:

  1. It is not quoted, so the shell removes the backslash and the stored value is effectively ^[0-9]*.+[0-9]+. In a regex an unescaped dot matches any character, so this (incorrect) regex means:
  • Match 0 or more digits, i.e. [0-9]*
  • Match 1 or more of any character, i.e. .+
  • Match 1 or more digits, i.e. [0-9]+

So this regex needs at least 2 characters of input, and at least one character after the first must be a digit. That is why 44 and 100 match while a single digit does not.

  2. Quoting aside, the regex itself is not correct: the + after the dot allows more than one dot.

Correct regex to match decimal numbers like 123.45 would be:

check_dec='^[0-9]*\.[0-9]+'

Note that there is no + quantifier after the dot, so only one dot is allowed, and that the regex is quoted.
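
As a rough sketch of how the corrected pattern behaves, the snippet below runs it against a few sample values instead of reading from the keyboard. The helper name is_decimal and the sample values are made up for illustration. The trailing $ is also an addition that is not in the regex above: it is there on the assumption that input such as 1.5abc should be rejected.

```bash
#!/usr/bin/env bash

# Quoted, so the shell stores the backslash literally.
# The trailing $ is an addition (assumption): it rejects trailing junk like "1.5abc".
check_dec='^[0-9]*\.[0-9]+$'

# is_decimal is a hypothetical helper name, not part of the original script.
# It succeeds (exit status 0) when its argument matches the pattern.
is_decimal() {
    [[ $1 =~ $check_dec ]]
}

# Try a few sample inputs.
for value in 4 44 1000 123.45 .5 1..5 1.5abc; do
    if is_decimal "$value"; then
        echo "$value: decimal"
    else
        echo "$value: not a decimal"
    fi
done
```

With this pattern only 123.45 and .5 should be reported as decimals. In the original script the same test would simply be if [[ $input =~ $check_dec ]].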

Upvotes: 1

user1934428

Reputation: 22217

There are several problems with how you set your regex: one is biting you now, the others could bite you later. You wrote

check_dec=^[0-9]*\.+[0-9]+

If you had run

echo $check_dec 

afterwards, you would have seen that check_dec contains the string ^[0-9]*.+[0-9]+ instead of ^[0-9]*\.+[0-9]+. The reason is that bash treats the unquoted \ as an (unnecessary) attempt to escape the following character, and then removes it.

An unquoted pattern is also exposed to pathname expansion, since * and [...] are bash wildcards. The assignment itself is not glob-expanded, but an unquoted $check_dec in a command is: if you by chance had a file named ^7xxx.+8+ in your working directory, echo $check_dec would print this filename instead of your carefully crafted pattern.

You need to quote the regexp to ensure that bash keeps its hands off it:

check_dec='^[0-9]*[.]+[0-9]+'

What is still odd is that you seem to allow more than one decimal point. For instance, the string ......22 would qualify as a decimal according to your pattern, while the string foo_bar would be reported as an integer.
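
One possible way to address both oddities is sketched below, assuming that only a single decimal point is wanted and that non-numeric input should get its own answer. The names check_int and classify, the "neither" label, and the end anchors ($) are my own additions for illustration, not something taken from your script.

```bash
#!/usr/bin/env bash

# Both patterns are quoted and anchored at both ends (the $ anchors are an assumption).
check_int='^[0-9]+$'            # digits only
check_dec='^[0-9]*[.][0-9]+$'   # optional digits, exactly one dot, then digits

# classify is a hypothetical helper: it prints which kind of value it was given.
classify() {
    if [[ $1 =~ $check_int ]]; then
        echo "integer"
    elif [[ $1 =~ $check_dec ]]; then
        echo "decimal"
    else
        echo "neither"
    fi
}

# Sample inputs, including the two odd cases mentioned above.
for value in 44 1.5 ......22 foo_bar; do
    printf '%s -> %s\n' "$value" "$(classify "$value")"
done
```

Here ......22 and foo_bar both fall through to "neither" instead of being labelled as a decimal or an integer.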

Upvotes: 0

slebetman

Reputation: 113906

You need to protect the \ from the shell, either by escaping it:

check_dec=^[0-9]*\\.+[0-9]+

or by quoting the whole pattern:

check_dec='^[0-9]*\.+[0-9]+'

The reason is that the \ character has special meaning in bash. Unquoted, it escapes the next character and is then removed:

$ echo ^[0-9]*\.+[0-9]+
^[0-9]*.+[0-9]+               <---- note: missing \
$ echo ^[0-9]*\\.+[0-9]+
^[0-9]*\.+[0-9]+
$ echo '^[0-9]*\.+[0-9]+'
^[0-9]*\.+[0-9]+
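
To tie this back to the symptom in the question, here is a small sketch that stores the pattern both ways and tests a few inputs against each. The variable names unquoted and quoted and the helper matches are made up for this illustration.

```bash
#!/usr/bin/env bash

unquoted=^[0-9]*\.+[0-9]+     # the shell removes the backslash here
quoted='^[0-9]*\.+[0-9]+'     # single quotes keep the backslash

# Show what actually ended up in each variable.
printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"

# matches is a hypothetical helper: matches REGEX STRING
matches() {
    [[ $2 =~ $1 ]]
}

# Compare how both stored patterns treat the same inputs.
for value in 4 44 1000 1.5; do
    if matches "$unquoted" "$value"; then u="match"; else u="no match"; fi
    if matches "$quoted" "$value"; then q="match"; else q="no match"; fi
    printf '%-5s unquoted: %-9s quoted: %s\n' "$value" "$u" "$q"
done
```

With the unquoted assignment the dot matches any character, so 44 and 1000 match while a lone 4 does not, which is the behaviour described in the question. With the quoted assignment only 1.5 matches.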

Upvotes: 0
