Kartik M

Reputation: 109

How to extract variable from string in Bash using Regex?

The string looks like this:

str1="the value of var1=test, the value of var2=testing, the final value of var3=testing1"

So far I have split the string on commas using IFS=',':

IFS="," read -r -a final <<< "$str1"

Then I assign the values to variables:

var1="${final[0]#*var1=}"

How can I assign the variable values in the shortest way using a regex?

Expected Output

var1=test
var2=testing
var3=testing1

Upvotes: 4

Views: 9813

Answers (2)

Charles Duffy

Reputation: 295291

#!/usr/bin/env bash
str1="the value of var1=test, the value of var2=testing, the final value of var3=testing1"
re='(^|[[:space:]])([[:alpha:]][[:alnum:]]*)=([^, ]+)([, ]|$)(.*)'

remaining=$str1
while [[ $remaining =~ $re ]]; do
  varname=${BASH_REMATCH[2]}
  value=${BASH_REMATCH[3]}
  remaining=${BASH_REMATCH[5]}
  printf -v "$varname" %s "$value"
done

# show current values to demonstrate that variables were really assigned
declare -p var1 var2 var3

This works because =~ stores each capture group of the regex at its own index in the BASH_REMATCH array, so we can pick out the groups holding the names and values and perform an indirect assignment (printf -v "$varname" %s "$value" stores the value in the variable whose name is held in varname).
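As a small illustration of those two pieces in isolation, here is a minimal sketch. The sample string, the simplified pattern, and the names sample, name, val, and color are made up for this example and do not come from the question.

```bash
#!/usr/bin/env bash
# Hypothetical input: a single name=value pair.
sample="color=blue"
re='^([[:alpha:]][[:alnum:]]*)=(.*)$'

if [[ $sample =~ $re ]]; then
  # BASH_REMATCH[0] is the whole match; [1] and [2] are the capture groups.
  name=${BASH_REMATCH[1]}
  val=${BASH_REMATCH[2]}
  # printf -v assigns to the variable whose name is stored in $name.
  printf -v "$name" %s "$val"
fi

echo "$color"   # prints: blue
```

Keeping the pattern in a variable and expanding it unquoted on the right-hand side of =~ makes bash treat it as a regex rather than a literal string.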

The regex has a fair bit going on, so let's break it down piece-by-piece:

  • (^|[[:space:]]) ensures that we only match content at the beginning of the string or preceded by a space.
  • ([[:alpha:]][[:alnum:]]*)=([^, ]+) matches only assignments where the left-hand side is a valid variable name (a letter, optionally followed by characters that can be either letters or numbers). Because your sample data has commas following values, we know a comma can't be allowed in a value, so we disallow both commas and spaces from being considered part of a value.
  • ([, ]|$) allows a value to end either with a comma or space following it, or at the end of input.
  • (.*) matches any remaining content we haven't yet processed, so that content can be run against the regex on the next cycle through the loop.
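To see how the groups described above behave on each pass, the following sketch runs the same regex over the sample string and records groups 2, 3, and 5 instead of assigning variables. The trace array and the output format are assumptions added for illustration only.

```bash
#!/usr/bin/env bash
str1="the value of var1=test, the value of var2=testing, the final value of var3=testing1"
re='(^|[[:space:]])([[:alpha:]][[:alnum:]]*)=([^, ]+)([, ]|$)(.*)'

remaining=$str1
trace=()   # one entry per loop iteration: name|value|rest
while [[ $remaining =~ $re ]]; do
  trace+=("${BASH_REMATCH[2]}|${BASH_REMATCH[3]}|${BASH_REMATCH[5]}")
  # Feed only the unprocessed tail back into the next iteration.
  remaining=${BASH_REMATCH[5]}
done

printf '%s\n' "${trace[@]}"
```

The loop should run three times, once per assignment. The rest field gets shorter on each pass and is empty after var3, at which point the regex no longer matches and the loop ends.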

Upvotes: 6

David C. Rankin

Reputation: 84521

This is a case where grep with -E (extended regex matching) and -o (print only the matched text) can help, e.g.

grep -E -o 'var[0-9]*[[:blank:]]*=[[:blank:]]*[^,[:blank:]]+' <<< "$str"

Results in:

var1=test
var2=testing
var3=testing1

The [[:blank:]]* on either side of the '=' just allows for spaces on either side, if present. If there is never a chance of that, you can shorten it to grep -E -o 'var[0-9]*=[^,[:blank:]]+'.
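A quick way to see the difference between the two patterns is to run both against input that has spaces around one of the '=' signs. The spaced string below is a made-up example, and loose and strict are hypothetical array names. mapfile needs bash 4 or later.

```bash
#!/usr/bin/env bash
# Hypothetical input: the first assignment has spaces around '='.
spaced='var1 = test, var2=testing'

# Pattern that tolerates blanks around '='.
mapfile -t loose < <(grep -E -o 'var[0-9]*[[:blank:]]*=[[:blank:]]*[^,[:blank:]]+' <<< "$spaced")

# Shortened pattern that requires '=' directly after the name.
mapfile -t strict < <(grep -E -o 'var[0-9]*=[^,[:blank:]]+' <<< "$spaced")

printf 'loose:  %s\n' "${loose[@]}"
printf 'strict: %s\n' "${strict[@]}"
```

With this input the tolerant pattern should report both assignments (the first one including its spaces), while the shortened pattern only finds var2=testing.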

Edit Per-Comment

To store the output in var1 (every match, one per line), simply:

var1=$(grep -E -o 'var[0-9]*[[:blank:]]*=[[:blank:]]*[^,[:blank:]]+' <<< "$str")
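The command substitution above captures everything grep prints. If only the value belonging to var1 is wanted, one possible variation (a sketch, not part of the original answer) is to match that one name and strip everything up to the '=' with parameter expansion. The match variable is a hypothetical helper.

```bash
#!/usr/bin/env bash
str="the value of var1=test, the value of var2=testing, the final value of var3=testing1"

# Match only the var1 assignment, assuming no blanks around '='.
match=$(grep -E -o 'var1=[^,[:blank:]]+' <<< "$str")

# Drop the shortest prefix ending in '=' to leave just the value.
var1=${match#*=}

echo "$var1"   # prints: test
```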

Better still, store each combination in an array, or build an associative array from the variable names and values themselves. For example, to store all of the var=val combinations in an associative array, you could do:

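str="the value of var1=test, the value of var2=testing, the final value of var3=testing1"
declare -A array    # associative array (requires bash 4+)
while read -r line; do
    # key: text before the last '='; value: text after the first '='
    array[${line%=*}]=${line#*=}
done < <(grep -E -o 'var[0-9]*[[:blank:]]*=[[:blank:]]*[^,[:blank:]]+' <<< "$str")
# quote the key expansion so keys are not word-split or globbed
for i in "${!array[@]}"; do
    echo "$i => ${array[$i]}"
done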

Example Output

var1 => test
var3 => testing1
var2 => testing
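The keys appear as var1, var3, var2 because bash does not keep associative-array keys in insertion order. If a predictable order matters, one option is to sort the keys before looping. This is a minimal sketch: the array is filled in directly to keep the example self-contained, and keys and k are hypothetical names.

```bash
#!/usr/bin/env bash
declare -A array=([var1]=test [var2]=testing [var3]=testing1)

# Sort the keys first, since "${!array[@]}" yields them in no guaranteed order.
mapfile -t keys < <(printf '%s\n' "${!array[@]}" | sort)

for k in "${keys[@]}"; do
    echo "$k => ${array[$k]}"
done
```

This prints the pairs in the order var1, var2, var3.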

Upvotes: 3
