aarelovich

Reputation: 5576

Unexpected behavior in a regular expression in bash

I created this regular expression and tested it successfully:

https://regex101.com/r/a7qvuw/1

However, the regular expression behaves differently in this Bash code that I wrote:

# Splitting by semicolon
IFS=';' read -ra statements <<< $contents

# Splitting by the = sign.
regex="\s*(.*?)\s*=\s*(.*)\b"
for i in "${statements[@]}"; do
   if [[ $i =~ $regex ]]; then
      key=${BASH_REMATCH[1]}
      params=${BASH_REMATCH[2]}
      echo "KEY: $key| PARAMS: $params"
   fi
done

The variable $contents holds the same text that is used in the link. The problem is that $key ends with a space, whereas on regex101 the same expression captures the word without the trailing space.

I get output like this:

KEY: vclock_spec | PARAMS: clk_i 1 1

As you can see, there is a space between vclock_spec and the |, which should not be there. What am I doing wrong?

Upvotes: 1

Views: 119

Answers (1)

Wiktor Stribiżew

Reputation: 627400

As @Cyrus mentioned, lazy quantifiers such as *? are not supported in Bash regex, which uses POSIX ERE syntax. They act as greedy ones, so (.*?) also captures the space before the = sign.

You may fix your pattern to work in Bash using

regex="\s*([^=]*\S)\s*=\s*(.*)\b"
           ^^^^^^^

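The [^=]* matches zero or more characters other than =, and \S matches any non-whitespace character, so the captured key cannot end with a space. A stricter last element would be a class that excludes both whitespace and =. Note that inside a POSIX bracket expression a backslash is literal, so [^\s=] does not exclude whitespace there; the class should be spelled [^[:space:]=], as in regex="\s*([^=]*[^[:space:]=])\s*=\s*(.*)\b".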
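For illustration, here is a minimal sketch of the loop with a corrected pattern. The sample value of $contents is hypothetical (the real text is only on the linked regex101 page), and the sketch deliberately swaps \s, \S and \b for POSIX character classes, on the assumption that those are more portable than the GNU-style escapes. It is one possible spelling, not the only fix.

```shell
#!/usr/bin/env bash
# Hypothetical sample input; the asker's real $contents is not shown in the post.
contents='vclock_spec = clk_i 1 1; vreset_spec = rst_i 0'

# Split on semicolons into an array of statements.
IFS=';' read -ra statements <<< "$contents"

# Group 1: key, forced to end in a character that is neither whitespace nor '='.
# Group 2: params, forced to end in a non-whitespace character.
# POSIX classes are used here instead of \s, \S and \b (an assumption for portability).
regex='^[[:space:]]*([^=]*[^[:space:]=])[[:space:]]*=[[:space:]]*(.*[^[:space:]])'

output=()
for stmt in "${statements[@]}"; do
    if [[ $stmt =~ $regex ]]; then
        key=${BASH_REMATCH[1]}
        params=${BASH_REMATCH[2]}
        output+=("KEY: $key| PARAMS: $params")
    fi
done

printf '%s\n' "${output[@]}"
```

With the sample input above, this prints "KEY: vclock_spec| PARAMS: clk_i 1 1" and "KEY: vreset_spec| PARAMS: rst_i 0", with no space before the |.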

Upvotes: 1
