texasflood

Reputation: 1633

Lazy regex operator doesn't work in bash

echo "$(expr "title: Purple Haze       artist: Jimi Hendrix" : 'title:\s*\(.*\?\)\s*artist.*' )"

prints

Purple Haze             

The output still includes the trailing whitespace, even though I am using the ? lazy operator.

I've tested this on https://regex101.com/ and it works as expected. What's different about bash?

Upvotes: 4

Views: 1671

Answers (2)

Tom Fenech

Reputation: 74685

As Gilles points out, you're not using bash regular expressions. To do so, you could use the regex match operator =~ like this:

re='title:[[:space:]]*(.*[^[:space:]])[[:space:]]*artist.*'
details='title: Purple Haze       artist: Jimi Hendrix'
[[ $details =~ $re ]] && echo "${BASH_REMATCH[1]}"

Rather than using a lazy match, this requires the capture group to end with a non-space character, so the trailing whitespace is left out of the capture. The first capture group is stored in ${BASH_REMATCH[1]}.
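For reuse, the same match can be wrapped in a small function. This is only a sketch: extract_title is a hypothetical name chosen for illustration, and the function assumes the input always has the "title: ... artist: ..." shape used in the question.

```bash
#!/usr/bin/env bash

# extract_title is a hypothetical helper name, not something from bash itself.
# It prints the title without surrounding whitespace, or returns 1 on no match.
extract_title() {
    # The capture group must end in a non-space character, so the greedy .*
    # cannot swallow the whitespace before "artist".
    local re='title:[[:space:]]*(.*[^[:space:]])[[:space:]]*artist.*'
    [[ $1 =~ $re ]] || return 1
    printf '%s\n' "${BASH_REMATCH[1]}"
}

title=$(extract_title 'title: Purple Haze       artist: Jimi Hendrix')
printf '[%s]\n' "$title"   # prints [Purple Haze]
```

The brackets in the last line are only there to make any leftover whitespace visible.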

At the expense of cross-platform portability, it is also possible to use the shorthand \s and \S instead of [[:space:]] and [^[:space:]]:

re='title:\s*(.*\S)\s*artist.*'

Upvotes: 2

You aren't using bash's regexp matching; you're using expr. expr does not have a “? lazy operator”. It only implements basic regular expressions (with a few extensions in the GNU version found on Linux, such as \s for whitespace, but not Perl-like lazy operators). In GNU basic regular expressions, \? means "zero or one of the preceding item", so .*\? is still a greedy match. Bash's own regex matching doesn't support lazy operators either.

If you don't want .* to include trailing space, specify that it must end with a character that isn't a space:

'title:\s*\(.*\S\)\s*artist.*'
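As a complete command, the same idea might look like the sketch below. It swaps \s and \S for the POSIX bracket classes [[:space:]] and [^[:space:]], on the assumption that these are more portable across expr implementations. The variable names details and title are only illustrative.

```bash
#!/usr/bin/env bash

details='title: Purple Haze       artist: Jimi Hendrix'

# Basic regular expression: \( \) delimit the capture group, and the group
# must end in a non-space character, so trailing whitespace is excluded.
title=$(expr "$details" : 'title:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*artist.*')

printf '[%s]\n' "$title"   # prints [Purple Haze]
```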

Upvotes: 6
