android_su

Reputation: 1687

regex in bash expression

I have two questions about using regex in bash expressions.

1. Non-greedy mode

local temp_input='"a1b", "d" , "45"'
if [[ $temp_input =~ \".*?\" ]]
then
    echo ${BASH_REMATCH[0]}
fi

The result is

"a1b", "d" , "45"

In Java:

String str = "\"a1b\", \"d\" , \"45\"";
Matcher m = Pattern.compile("\".*?\"").matcher(str);
while (m.find()) {
    System.out.println(m.group());
}

I can get the result below.

"a1b"
"d"
"45"

But how can I use non-greedy mode in bash?
I understand why \"[^\"]*\" works,
but I don't understand why \".*?\" does not.

2. Global matches

local temp_input='abcba'
if [[ $temp_input =~ b ]]
then
    # I want to echo both b matches here.
    # How can I set the global flag?
fi

How can I get all the matches?
PS: I only want to use regex.

For the second question, sorry for the confusion.
I want to echo "b" and "b", not count the occurrences of "b".

Help!

Upvotes: 1

Views: 7127

Answers (3)

user251764

Reputation: 11

This is my first post, and I am very amateur at bash, so apologies if I haven't understood the question, but I wrote a function for non-greedy regex using entirely bash:

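regex_non_greedy () {
    local string="$1"
    local regex="$2"
    local replace="$3"

    # Replace the leftmost match on each pass until the regex no longer matches.
    # Note: this loops forever if the replacement itself matches the regex.
    while [[ $string =~ $regex ]]; do
        local search=${BASH_REMATCH[0]}
        # Quote the match so it is treated as literal text, not a glob pattern.
        string=${string/"$search"/$replace}
    done

    printf "%s" "$string"
}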

Example invocation:

regex_non_greedy "all cats are grey and green" "gre+." "white"

Which returns:

all cats are white and white
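
The loop above rescans the whole string after every substitution, so it never finishes when the replacement text itself matches the regex (for example, replacing "a" with "aa"). A minimal sketch of a variant that consumes the input from left to right, so replaced text is never scanned again, might look like the following. The name replace_all is hypothetical, and the sketch assumes the regex has no anchors (^ or $), since it locates each match by its literal text.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace every regex match, scanning left to right.
replace_all() {
    local rest=$1 regex=$2 replace=$3 out=
    while [[ -n $rest && $rest =~ $regex ]]; do
        local match=${BASH_REMATCH[0]}
        # An empty match would never consume input, so stop instead of spinning.
        [[ -z $match ]] && break
        # Text before the first literal occurrence of the match.
        local before=${rest%%"$match"*}
        out+=$before$replace
        # Drop everything up to and including the match; only the tail is rescanned.
        rest=${rest#*"$match"}
    done
    printf '%s' "$out$rest"
}

replace_all "all cats are grey and green" "gre+." "white"
echo
```

Because the already-processed part is moved into out, a call such as replace_all aaa a aa terminates, whereas the original loop would not.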

Upvotes: 1

konsolebox

Reputation: 75478

For your first question, an alternative is this:

[[ $temp_input =~ \"[^\"]*\" ]]

For your second question, you can do this:

temp_input=abcba
t=${temp_input//b}
echo "$(( (${#temp_input} - ${#t}) / 1 )) b"

Or, for convenience, put it in a function:

function count_matches {
    local -i c1=${#1} c2=${#2}
    if [[ c2 -gt 0 && c1 -ge c2 ]]; then
        local t=${1//"$2"}
        echo "$(( (c1 - ${#t}) / c2 )) $2"
    else
        echo "0 $2"
    fi
}

count_matches abcba b

Both produce the output:

2 b

Update:

If you want to see the matches themselves, you can use a function like this. It also works with other regular expressions, not just literals.

function find_matches {
    MATCHES=() 
    local STR=$1 RE="($2)(.*)"
    while [[ -n $STR && $STR =~ $RE ]]; do
        MATCHES+=("${BASH_REMATCH[1]}")
        STR=${BASH_REMATCH[2]}
    done
}

Example:

> find_matches abcba b
> echo "${MATCHES[@]}"
b b

> find_matches abcbaaccbad 'a.'
> echo "${MATCHES[@]}"
ab aa ad
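
As a sketch of how the two questions combine, the same function can be fed the negated-class pattern to list each quoted string from the original input. find_matches is repeated here only so the snippet runs on its own. Two caveats follow from how the function is written: a pattern that contains its own capture groups shifts the index of the "rest" group, and a pattern that can match the empty string never consumes input, so the loop would not end.

```shell
#!/usr/bin/env bash
# find_matches as defined in the answer, repeated so this runs standalone.
find_matches() {
    MATCHES=()
    local STR=$1 RE="($2)(.*)"
    while [[ -n $STR && $STR =~ $RE ]]; do
        MATCHES+=("${BASH_REMATCH[1]}")   # the leftmost match
        STR=${BASH_REMATCH[2]}            # continue with the text after it
    done
}

# The pattern is single-quoted so the double quotes reach the regex unchanged.
find_matches '"a1b", "d" , "45"' '"[^"]*"'
printf '%s\n' "${MATCHES[@]}"
```

This prints the three quoted strings on separate lines, which is the output the Java loop in the question produces.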

Upvotes: 3

nickie

Reputation: 5808

  1. Your regular expression matches the string starting with the first quotation mark (before a1b) and ending with the last quotation mark (after 45). This is greedy, even though your intention was to use a non-greedy match (*?). Bash's =~ operator uses POSIX extended regular expressions (see man 7 regex), which do not support a non-greedy Kleene star.

    If you want just "a1b", I'd suggest a different regular expression:

    if [[ $temp_input =~ \"[^\"]*\" ]]
    

    which explicitly says that you don't want quotation marks inside your strings.

  2. I don't understand what you mean. If you want to find all matches (and there are two occurrences of b here), I think you cannot do it with a single =~ match.
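
To see the difference described in point 1, the two patterns can be run side by side. This is a small sketch; the variable names are made up for illustration. Keeping each regex in a variable avoids escaping the quotation marks inside [[ ]]. What the ".*?" attempt returns depends on the system's regex library: the question reports the whole input, but other libraries may behave differently, so no output is claimed for it here.

```shell
#!/usr/bin/env bash
temp_input='"a1b", "d" , "45"'

lazy_attempt='".*?"'     # Perl-style lazy quantifier, not defined by POSIX ERE
negated_class='"[^"]*"'  # portable: a quote, any non-quote characters, a quote

lazy_result=
negated_result=
[[ $temp_input =~ $lazy_attempt ]] && lazy_result=${BASH_REMATCH[0]}
[[ $temp_input =~ $negated_class ]] && negated_result=${BASH_REMATCH[0]}

printf 'lazy attempt:  %s\n' "$lazy_result"
printf 'negated class: %s\n' "$negated_result"
```

The negated class stops at the first closing quotation mark by construction, so it does not rely on any lazy-quantifier support.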

Upvotes: 2
