pihentagy

Reputation: 6285

multiline regexp matching in bash

I would like to do some multiline matching with bash's =~

#!/bin/bash
str='foo = 1 2 3
bar = what about 42?
boo = more words
'
re='bar = (.*)'
if [[ "$str" =~ $re ]]; then
        echo "${BASH_REMATCH[1]}"
else
        echo no match
fi

Almost there, but if I use ^ or $, it will not match, and if I don't use them, . eats newlines too.

EDIT:

Sorry, the values after = can be multi-word.

Upvotes: 21

Views: 26739

Answers (2)

x-yuri

Reputation: 18873

To anchor a match to a line inside a multiline string, put literal newlines in the regex:

s=$'aa\nbb\ncc'
nl=$'\n'
if [[ $s =~ (^|$nl)cc($|$nl) ]]; then
    echo found
fi
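
Applied to the string from the question, the same idea might look like the sketch below. The newline alternation anchors the key at the start of a line, and a negated bracket expression keeps the capture from running past the end of that line. Note that get_value is a hypothetical helper name, and the sketch assumes the key contains no regex metacharacters.

```shell
#!/bin/bash
# Hypothetical helper: print the value of the "KEY = value" line in a multiline string.
# Assumption: the key contains no regex metacharacters.
get_value() {
    local key=$1 text=$2
    local nl=$'\n'
    # (^|$nl) anchors the key at the start of a line;
    # [^$nl]* stops the capture at the end of that line.
    local re="(^|$nl)$key = ([^$nl]*)"
    if [[ $text =~ $re ]]; then
        # Group 1 is the line-start anchor, so the value is group 2.
        printf '%s\n' "${BASH_REMATCH[2]}"
    else
        return 1
    fi
}

str='foo = 1 2 3
bar = what about 42?
boo = more words
'
get_value bar "$str"    # prints: what about 42?
```

Because the first group is spent on the line-start alternation, the captured value moves to BASH_REMATCH[2].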

Upvotes: 5

I could be wrong, but after a quick read from here, especially Note 2 at the end of the page, it seems the dot operator in bash's regex matching can also match the newline character. Therefore, a quick solution would be:

#!/bin/bash
str='foo = 1
bar = 2
boo = 3
'
# $'...' turns \n into a real newline, so the bracket expression excludes only newlines
re=$'bar = ([^\n]*)'
if [[ "$str" =~ $re ]]; then
        echo "${BASH_REMATCH[1]}"
else
        echo no match
fi

Notice that I now ask it to match anything except newlines. Hope this helps =)

Edit: Also, if I understood correctly, ^ and $ match the start and the end of the whole string, respectively, not of each line. It would be better if someone else could confirm this, but if that is the case and you do want to match line by line, you'll need to write a while loop that reads each line individually.
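
A minimal sketch of such a loop, assuming the string from the question and that only the first matching line is wanted (the variable names line and result are just illustrative). Since each iteration sees a single line, ^ and $ line up with the line boundaries and . has no newline to run across:

```shell
#!/bin/bash
str='foo = 1 2 3
bar = what about 42?
boo = more words
'
re='^bar = (.*)$'
result=''
# Feed the string to the loop one line at a time via a here-string.
# The loop runs in the current shell, so result is still set afterwards.
while IFS= read -r line; do
    if [[ $line =~ $re ]]; then
        result=${BASH_REMATCH[1]}
        break    # stop at the first matching line
    fi
done <<< "$str"
echo "${result:-no match}"    # prints: what about 42?
```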

Upvotes: 17
