Reputation: 1197
I have this regular expression that works in Rubular:
value[[:space:]]*=([[[:digit:]]\.]+)>([[[:alpha:]][[:space:]]*\/]+)
on this text:
<option value =12.34.567>London</option>
<option value =89.12.345>New York / San Francisco</option>
It gives the result:
Match 1
1. 12.34.567
2. London
Match 2
1. 89.12.345
2. New York / San Francisco
Which is what I want. But when I use the regular expression in a bash script:
#!/usr/bin/env bash
regex="value[[:space:]]*=([[[:digit:]]\.]+)>([[[:alpha:]][[:space:]]*\/]+)"
while read line
do
    echo $line
    if [[ $line =~ $regex ]]; then
        echo ${BASH_REMATCH}
    fi
done < test.html
It doesn't work (test.html contains the HTML sample from above).
From testing, I think it gets stuck on the grouping:
[[[:digit:]]\.]+
Does bash treat regular expressions differently from Ruby?
Upvotes: 0
Views: 471
Reputation: 174716
I suggest changing the regex to:
regex="value[[:space:]]*=([[:digit:].]+)>([[:alpha:][:space:]*/]+)"
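[[:digit:].]
^^       ^^^
||       ||+-> end of char class
||       |+--> DOT
|+-------+---> [:digit:] (digit)
+------------> start of char class
That is, one char class that matches a digit OR a dot.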
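To see why the nested form fails, here is a minimal sketch comparing the two patterns in bash. It assumes bash hands the pattern to a POSIX ERE engine (as on typical glibc systems), where a char class cannot contain another bracketed class and `[[[:digit:]]` already ends at the first `]`. The variable names and sample strings are only for illustration.

```shell
#!/usr/bin/env bash
# Illustration only: how the nested and the flat bracket expression behave in [[ =~ ]].
# Assumed POSIX ERE reading of the nested form:
#   [[[:digit:]]  -> one char: a literal '[' or a digit
#   \.            -> a literal dot
#   ]+            -> one or more literal ']'
nested='^[[[:digit:]]\.]+$'
flat='^[[:digit:].]+$'        # one char class: digit or dot, repeated

for sample in '12.34.567' '1.]' '[.]]'; do
  [[ $sample =~ $nested ]] && n='match' || n='no match'
  [[ $sample =~ $flat ]] && f='match' || f='no match'
  printf '%-10s nested: %-8s flat: %s\n' "$sample" "$n" "$f"
done
```

Under that assumption, only the flat pattern should accept 12.34.567, while the nested one accepts odd strings such as `1.]`.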
In PCRE, the above would be written as [\d.]
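For completeness, a sketch of the whole loop using the regex suggested above, printing both capture groups from `BASH_REMATCH`. The function name `extract_options` and the here-document are hypothetical stand-ins for reading test.html; the regex is copied as suggested, so note that the `*` inside the second char class is just a literal asterisk there.

```shell
#!/usr/bin/env bash
# Sketch: the question's loop with flat char classes and both groups printed.
# The '*' inside the second class is a literal asterisk, kept as in the suggestion.
regex='value[[:space:]]*=([[:digit:].]+)>([[:alpha:][:space:]*/]+)'

# Hypothetical helper: reads HTML lines on stdin, prints "value<TAB>label"
# for every line the regex matches.
extract_options() {
  local line
  while IFS= read -r line; do
    if [[ $line =~ $regex ]]; then
      # BASH_REMATCH[0] is the whole match; [1] and [2] are the groups.
      printf '%s\t%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}"
    fi
  done
}

# Sample input from the question, inlined instead of test.html.
extract_options <<'EOF'
<option value =12.34.567>London</option>
<option value =89.12.345>New York / San Francisco</option>
EOF
```

To run it against the real file, replace the here-document with `extract_options < test.html`.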
Upvotes: 2