PerseP

Reputation: 1197

Regular expression with POSIX bracket expressions not working in bash

I have this regular expression, which works in Rubular:

value[[:space:]]*=([[[:digit:]]\.]+)>([[[:alpha:]][[:space:]]*\/]+)

on this text:

<option value =12.34.567>London</option>
<option value =89.12.345>New York / San Francisco</option>

It gives the result:

Match 1
1.  12.34.567
2.  London
Match 2
1.  89.12.345
2.  New York / San Francisco

That is what I want. But when I use the same regular expression in a bash script:

#!/usr/bin/env bash

regex="value[[:space:]]*=([[[:digit:]]\.]+)>([[[:alpha:]][[:space:]]*\/]+)"

while read line
do
    echo $line
    if [[ $line =~ $regex ]]; then
        echo ${BASH_REMATCH}
    fi
done < test.html

it doesn't match anything (test.html contains the HTML sample from above).

From testing, I think it gets stuck on this grouping:

[[[:digit:]]\.]+

Does bash treat regular expressions differently from Ruby?

Upvotes: 0

Views: 471

Answers (1)

Avinash Raj

Reputation: 174716

I suggest changing the regex to:

regex="value[[:space:]]*=([[:digit:].]+)>([[:alpha:][:space:]*/]+)"
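For reference, here is a minimal sketch of the script from the question using this regex. The helper name parse_option and the "value|label" output format are my own assumptions, not something from the question, and the sample lines are fed through a here-document instead of test.html so the script is self-contained. Note that ${BASH_REMATCH} on its own expands to element 0 (the whole match); the capture groups are in elements 1 and 2.

```bash
#!/usr/bin/env bash

# Bracket expressions are flat here: classes and literal characters sit side
# by side inside one pair of brackets. The * in the second group is just a
# literal asterisk, kept as in the regex above.
regex="value[[:space:]]*=([[:digit:].]+)>([[:alpha:][:space:]*/]+)"

# parse_option (hypothetical helper): prints "value|label" for a matching
# line and returns non-zero when the line does not match.
parse_option() {
    local line=$1
    if [[ $line =~ $regex ]]; then
        printf '%s|%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
    else
        return 1
    fi
}

# Sample lines from the question, read the same way test.html would be.
while IFS= read -r line; do
    parse_option "$line"
done <<'EOF'
<option value =12.34.567>London</option>
<option value =89.12.345>New York / San Francisco</option>
EOF
```

With the two sample lines this should print 12.34.567|London and 89.12.345|New York / San Francisco. To read from the file instead, replace the here-document with "< test.html" as in the original script.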


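    [[:digit:].]
    ^^        ^^
    ||        |+-- ] ends the bracket expression
    ||        +--- . matches a literal dot
    |+------------ [:digit:] matches any digit
    +------------- [ starts the bracket expression

So the whole bracket expression matches a single character that is either a digit OR a dot.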

In PCRE, the above would be written as [\d.]
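As for why the original grouping fails: bash hands the pattern to the system's POSIX extended regex engine, where bracket expressions cannot be nested. As far as I can tell, [[[:digit:]]\.]+ is read as the bracket expression [[[:digit:]] (a literal [ or a digit), followed by a literal dot, followed by one or more literal ] characters. Ruby's engine accepts nested character classes, which is why the same pattern works in Rubular. The sketch below checks this; the anchors and the helper name matches are additions of mine for illustration.

```bash
#!/usr/bin/env bash

nested='^[[[:digit:]]\.]+$'   # the grouping from the question, anchored
flat='^[[:digit:].]+$'        # the flat form from this answer, anchored

# matches (hypothetical helper): prints yes or no depending on whether
# the string in $1 matches the regex in $2.
matches() {
    if [[ $1 =~ $2 ]]; then
        echo yes
    else
        echo no
    fi
}

matches '12.34.567' "$flat"     # flat form accepts digits and dots
matches '12.34.567' "$nested"   # nested form rejects the same string
matches '1.]' "$nested"         # but accepts digit, dot, closing bracket
```

On a system with a POSIX-conforming regex library I would expect this to print yes, no, yes.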

Upvotes: 2
