gaganbm

Reputation: 2843

Bash regex to match colon separated integers

I am trying to parse a subtitle file. A sample line looks like this:

00:00:01,000 --> 00:00:04,074

I have this script:

#!/bin/bash
# Print every line of the file given as $1 that matches the pattern.
while IFS= read -r line
do
    if [[ "$line" =~ ^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}* ]]
    then
        echo "$line"
    fi
done < "$1"

This regex works and echoes the line. But when I extend the pattern in the if statement to:

if [[ "$line" =~ ^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}*--* ]]

then it doesn't work anymore.

Likewise, this regex works:

while IFS= read -r line
do
    if [[ "$line" =~ [0-9]{2}*[0-9]{2}*[0-9]{2}*[0-9]{3}*--\>*[0-9]{2}*[0-9]{2}*[0-9]{2}*[0-9]{3}* ]]
    then
        echo "$line"
    fi
done < "$1"

But if I place ^ at the beginning of the pattern (as in the first case), or if I use the : and , separators, it no longer works.

I don't understand why it behaves so strangely. Can anyone help?

Upvotes: 0

Views: 3169

Answers (1)

Trott

Reputation: 70085

In a regex, * doesn't work the way it does for file matching at the command line. It means "0 or more of the preceding item" rather than "0 or more of any character." To match 0 or more of any character, precede it with . (a special character in regex that matches any single character), giving .* instead. Your extended pattern fails because the line has a space between the first timestamp and -->, and nothing in the pattern matches that space.
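
To make the difference concrete, here is a minimal sketch that tries three variations of the pattern against the sample line from the question. The helper name matches is made up for this illustration, and the pattern is passed in a variable so that it needs no shell escaping.

```bash
#!/bin/bash
# Sample line copied from the question.
line='00:00:01,000 --> 00:00:04,074'

# matches (hypothetical helper): print "match" or "no match" for LINE and PATTERN.
matches() {
    local re=$2
    # $re is left unquoted on purpose so that bash treats it as a regex.
    if [[ $1 =~ $re ]]; then echo "match"; else echo "no match"; fi
}

# No match: the pattern wants "--" right after the digits, but a space comes first.
matches "$line" '^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}--'

# Match: ".*" means "0 or more of any character", so it consumes the space.
matches "$line" '^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}.*--'

# Match: " *" means "0 or more spaces", which is stricter than ".*".
matches "$line" '^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} *--'
```

The last form is usually the safer choice, since .* would also accept arbitrary junk between the timestamp and the arrow.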

This will match your line and is perhaps the regex you ultimately want:

if [[ "$line" =~ ^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}\ ?--\>\ ?[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}$ ]];
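
For completeness, here is one way that pattern might be dropped into the loop from the question. This is a sketch, not the only way to do it: the function name print_timing_lines and the variables ts and re are made up here, and stripping a trailing carriage return is an assumption about the input (subtitle files often use Windows line endings, which would otherwise defeat the $ anchor).

```bash
#!/bin/bash
# One timestamp, e.g. 00:00:01,000
ts='[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}'
# Same pattern as above; kept in a variable so the spaces and ">" need no backslashes.
re="^$ts ?--> ?$ts\$"

# print_timing_lines (hypothetical name): print each stdin line that is a timing line.
print_timing_lines() {
    local line
    while IFS= read -r line; do
        line=${line%$'\r'}   # assumption: drop a trailing CR from CRLF files
        if [[ $line =~ $re ]]; then
            printf '%s\n' "$line"
        fi
    done
}

# Usage: ./script.sh subtitles.srt
if [[ -n ${1-} ]]; then
    print_timing_lines < "$1"
fi
```

Storing the regex in a variable and leaving it unquoted on the right-hand side of =~ avoids most of the escaping problems that come with writing the pattern inline.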

Upvotes: 3
