vdenotaris

Reputation: 13627

Regex validation doesn't work in bash

I'd like to use the following regex in order to validate project version numbers:

(?!\.)(\d+(\.\d+)+)([-.][A-Z]+)?(?![\d.])$

Valid inputs:

I'm trying to use a script as follows:

#!/bin/bash

r=true;
p="(?!\.)(\d+(\.\d+)+)([-.][A-Z]+)?(?![\d.])$"
while [ $r == true ]; do
    echo "get_v: "
    read v;
    if [[ $v =~ $p ]]; then
        echo "ok";
        r=false
    else
        echo "nok"
    fi
done

But it prints "nok" even for valid inputs.

What am I doing wrong?

Upvotes: 0

Views: 296

Answers (2)

Bohemian

Reputation: 424973

Your regex uses look-arounds, which bash doesn't support. But you don't need look-arounds, once your regex has some "problems" fixed.

The look-arounds can be removed without changing what matches.

The leading look ahead is impossible to fail:

(?!\.)(\d+...

Because the regex starts with a digit, it's unnecessary to assert that the first character isn't a dot.

The trailing look ahead is also impossible to fail:

 (?![\d.])$

Because $ follows immediately, the next position must be end-of-input, which can't be a digit or a dot.

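You also have unnecessary brackets. With all the unnecessary parts removed (but keeping the $ anchor, which the pattern still needs), we get:

\d+(\.\d+)+([-.][A-Z]+)?$

But bash doesn't support \d, so use [0-9] instead of \d:

[0-9]+(\.[0-9]+)+([-.][A-Z]+)?$

That should work. Note that =~ succeeds if any part of the string matches, so add a leading ^ if the whole input must be a version number.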
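
As a minimal sketch of how the asker's loop could use this pattern: the function name is_version and the sample inputs below are illustrative assumptions, not part of the original script. The pattern is kept in a variable and used unquoted on the right-hand side of =~, because quoting it would make bash compare it as a literal string. The sketch keeps the $ anchor from the original pattern and adds ^, assuming the whole input has to be a version number.

```shell
#!/bin/bash

# ERE translation of the original pattern: no look-arounds, [0-9] instead of \d.
# The leading ^ is an added assumption: the whole input must be a version.
p='^[0-9]+(\.[0-9]+)+([-.][A-Z]+)?$'

# is_version is a hypothetical helper: returns status 0 if $1 matches $p.
is_version() {
    [[ $1 =~ $p ]]
}

# Non-interactive stand-in for the "read v" loop in the question.
for v in 1.2 1.0.0-RELEASE 1.2.RELEASE 1 1.2-dev1 v1.3-SNAPSHOT; do
    if is_version "$v"; then
        echo "$v: ok"
    else
        echo "$v: nok"
    fi
done
```

Without the leading ^, v1.3-SNAPSHOT would be accepted as well, because the trailing part 1.3-SNAPSHOT matches on its own.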

Upvotes: 1

clt60

Reputation: 63892

Bash doesn't support PCRE (Perl-compatible regular expressions). It supports POSIX extended regular expressions (ERE).
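
One way to see this from inside bash is to look at the exit status of [[ =~ ]], which is 0 for a match, 1 for no match, and 2 when bash cannot compile the pattern. The sketch below is only illustrative: match_status is a hypothetical helper, and the exact nonzero status for the PCRE pattern may depend on the system's regex library.

```shell
#!/bin/bash

# The PCRE pattern from the question, unchanged.
pcre='(?!\.)(\d+(\.\d+)+)([-.][A-Z]+)?(?![\d.])$'
# A hand-translated ERE: look-arounds dropped, \d replaced by [0-9].
ere='[0-9]+(\.[0-9]+)+([-.][A-Z]+)?$'

# match_status is a hypothetical helper: prints the exit status of [[ =~ ]]
# (0 = match, 1 = no match, 2 = pattern could not be compiled).
match_status() {
    [[ $1 =~ $2 ]]
    echo $?
}

echo "pcre: $(match_status 1.2 "$pcre")"   # nonzero, although 1.2 is a valid input
echo "ere:  $(match_status 1.2 "$ere")"    # 0
```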

You can check your string with grep -P (GNU grep), like this:

while read -r ver
do
    res=$(grep -oP '(?!\.)(\d+(\.\d+)+)([-.][A-Z]+)?(?![\d.])$' <<<"$ver")
    echo "ver:$ver status:$?  result:=$res="
done <<EOF | column -t
1
1-release
1.2
1.2-dev1
1.0.0-release
1.0.0.3
Q
x-release
1.x
z.2
1.x-dev1
1.x.0-dev3
.1.0-dev3
v1
v1.3-SNAPSHOT
EOF

However, recheck your regex, because the above prints:

ver:1              status:1  result:==
ver:1-release      status:1  result:==
ver:1.2            status:0  result:=1.2=
ver:1.2-dev1       status:1  result:==
ver:1.0.0-release  status:1  result:==
ver:1.0.0.3        status:0  result:=1.0.0.3=
ver:Q              status:1  result:==
ver:x-release      status:1  result:==
ver:1.x            status:1  result:==
ver:z.2            status:1  result:==
ver:1.x-dev1       status:1  result:==
ver:1.x.0-dev3     status:1  result:==
ver:.1.0-dev3      status:1  result:==
ver:v1             status:1  result:==
ver:v1.3-SNAPSHOT  status:0  result:=1.3-SNAPSHOT=

I would use

r='((?<=\A)|(?<=\s))v?\d+(\.\d+)*(-\w+)?(?=(\s|\z))'

e.g.:

r='((?<=\A)|(?<=\s))v?\d+(\.\d+)*(-\w+)?(?=(\s|\z))'
while IFS= read -r ver
do
    res=$(grep -oP "$r" <<<"$ver")
    printf "ver:%-15.15s status:%s result:=%s=\n" "$ver" $?  "$res"
done <<EOF
1
 1-release
  1.2
1.2-dev1
1.0.0-release
1.0.0.3
Q
x-release
1.x
z.2
1.x-dev1
1.x.0-dev3
.1.0-dev3
v1
v1.3-SNAPSHOT
EOF

prints:

ver:1               status:0 result:=1=
ver: 1-release      status:0 result:=1-release=
ver:  1.2           status:0 result:=1.2=
ver:1.2-dev1        status:0 result:=1.2-dev1=
ver:1.0.0-release   status:0 result:=1.0.0-release=
ver:1.0.0.3         status:0 result:=1.0.0.3=
ver:Q               status:1 result:==
ver:x-release       status:1 result:==
ver:1.x             status:1 result:==
ver:z.2             status:1 result:==
ver:1.x-dev1        status:1 result:==
ver:1.x.0-dev3      status:1 result:==
ver:.1.0-dev3       status:1 result:==
ver:v1              status:0 result:=v1=
ver:v1.3-SNAPSHOT   status:0 result:=v1.3-SNAPSHOT=

If you don't want to allow a leading v (e.g. v1.1), remove the v? from the regex.

If you want a more restrictive regex, use

r='((?<=\A)|(?<=\s))\d+(\.\d+){1,2}(-[A-Z]+)?(?=(\s|\z))'

which allows only 2 or 3 numbers, and only uppercase letters after the -.

And finally, if you want pure bash, use an ERE like this:

r='^[0-9]+(\.[0-9]+)+(-[A-Z]+)?$'
while read -r ver
do
    [[ $ver =~ $r ]] && echo "$ver: ok" || echo "$ver: no"
done <<EOF | column -t
1
1-RELEASE
1.2
1.2-DEV
1.2-DEV2
1.0.0-RELEASE
1.0.0  
1.0.0.3
Q
x-RELEASE
1.x
z.2
1.x-DEV
1.x.0-DEV
.1.0-DEV
v1
v1.3-SNAPSHOT
EOF

prints

1:              no
1-RELEASE:      no
1.2:            ok
1.2-DEV:        ok
1.2-DEV2:       no
1.0.0-RELEASE:  ok
1.0.0:          ok
1.0.0.3:        ok
Q:              no
x-RELEASE:      no
1.x:            no
z.2:            no
1.x-DEV:        no
1.x.0-DEV:      no
.1.0-DEV:       no
v1:             no
v1.3-SNAPSHOT:  no
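
The stricter grep -P pattern shown earlier (only 2 or 3 numbers, uppercase suffix) can be written as an ERE too, since ERE supports {m,n} intervals. A minimal sketch, assuming that anchoring the whole string with ^ and $ is an acceptable replacement for the \A, \s and \z look-arounds; the name is_strict_version is hypothetical:

```shell
#!/bin/bash

# ERE counterpart of \d+(\.\d+){1,2}(-[A-Z]+)?, anchored to the whole string.
strict='^[0-9]+(\.[0-9]+){1,2}(-[A-Z]+)?$'

# Hypothetical helper: returns status 0 if $1 has 2 or 3 numeric parts
# and, optionally, an uppercase suffix after a dash.
is_strict_version() {
    [[ $1 =~ $strict ]]
}

for v in 1.2 1.0.0-RELEASE 1.0.0.3 1 1.2-DEV2; do
    is_strict_version "$v" && echo "$v: ok" || echo "$v: no"
done
```

Here 1.2 and 1.0.0-RELEASE are accepted, while 1.0.0.3 is rejected because it has four numeric parts.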

Upvotes: 5
