Reputation: 2843
I am trying to parse a subtitle file. A sample line looks like this:
00:00:01,000 --> 00:00:04,074
I have this regex :
#!/bin/bash
while read line
do
if [[ "$line" =~ ^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}* ]]
then
echo $line
fi
done < $1
This regex works and echoes the line. But when I extend the pattern in the if statement to :
if [[ "$line" =~ ^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}*--* ]]
then it doesn't work anymore.
Likewise, this regex works :
while read line
do
if [[ "$line" =~ [0-9]{2}*[0-9]{2}*[0-9]{2}*[0-9]{3}*--\>*[0-9]{2}*[0-9]{2}*[0-9]{2}*[0-9]{3}* ]]
then
echo $line
fi
done < $1
But if I place ^ at the beginning of the pattern (as in the first case), or if I use the :s and the ,s, it doesn't work any more.
I don't understand why it exhibits such strange behavior. Can anyone help?
Upvotes: 0
Views: 3169
Reputation: 70085
* doesn't work quite like it does for file matching at the command line. It means "0 or more of the previous character" rather than "0 or more of any character." You need to precede it with . to have it match 0 or more of any character (because . is a special character in regex that matches any character).
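To make the difference concrete, here is a minimal sketch using the sample line from the question. The helper name matches is made up for illustration, and the first pattern is a simplified version of the failing one (the stray * after {3} is dropped so that only the --* part is under test).

```shell
#!/bin/bash
# Sample line taken from the question.
line='00:00:01,000 --> 00:00:04,074'

# matches (hypothetical helper): report whether a regex matches the sample.
matches() {
    local re=$1
    if [[ $line =~ $re ]]; then echo yes; else echo no; fi
}

# '--*' means one '-' followed by zero or more '-', so a '-' must come
# directly after the three digits; the sample has a space there.
matches '^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}--*'     # prints "no"

# '.*' means zero or more of any character, so the space is absorbed.
matches '^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}.*--'    # prints "yes"
```

This is probably also why the unanchored pattern in the question appears to work: with * after every piece, each piece may match zero times, so the regex only really needs to find --> somewhere in the line.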
This will match your line and is perhaps the regex you ultimately want:
if [[ "$line" =~ ^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}\ ?--\>\ ?[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}$ ]];
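If it helps, here is one way the whole loop might look with that pattern. It is only a sketch: the function name print_timestamps is invented for the example, the regex is kept in a variable so the space and > need no backslashes, and the carriage-return stripping assumes the subtitle file may have Windows (CRLF) line endings, which would otherwise defeat the $ anchor.

```shell
#!/bin/bash
# Timestamp pattern from the answer, kept in a variable so that the space
# and '>' need no backslash escaping inside [[ ... ]].
ts='[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}'
re="^$ts ?--> ?$ts\$"

# print_timestamps (hypothetical helper): print every timing line from stdin.
print_timestamps() {
    local line
    # '|| [[ -n $line ]]' keeps a final line that lacks a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        line=${line%$'\r'}   # assumption: input may have CRLF line endings
        if [[ $line =~ $re ]]; then
            printf '%s\n' "$line"
        fi
    done
}

# Small demonstration on inline input; only the timing line is printed.
printf '1\n00:00:01,000 --> 00:00:04,074\nHello\n' | print_timestamps
```

To run it on a file as in the question, replace the last line with print_timestamps < "$1".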
Upvotes: 3