Enes

Reputation: 33

Subtitle's Time Editor with Regular Expressions

I have the contents of a subtitle file in a string:

string subtitle = Encoding.ASCII.GetString(srt_text);

srt_text is a byte array that I convert to a string, as shown above. The subtitle starts and ends like this:

Starts:
1
00:00:40,152 --> 00:00:43,614
Out west there was this fella,

2
00:00:43,697 --> 00:00:45,824
fella I want to tell you about,

Ends:
1631
01:52:17,016 --> 01:52:20,019
Catch ya later on
down the trail.

1632
01:52:20,102 --> 01:52:24,440
Say, friend, you got any more
of that good Sarsaparilla?

Now I want to extract the times and put them into an array. I tried:

Regex rgx = new Regex(@"^(?:[01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9],[0-9][0-9][0-9]$", RegexOptions.IgnoreCase);
Match m = rgx.Match(subtitle);

I think this should find the times, but I don't know how to put them into an array. Assume 'times' is my string array. I want the array to look like this:

    times[0] = "00:00:40,152"
    times[1] = "00:00:43,614"
    ...
    times[n-1] = "01:52:20,102"
    times[n] = "01:52:24,440"

It has to keep going until the subtitle ends, so that every time is included.

I am open to your advice. How can I do this? I am new to this and have probably made a lot of mistakes, for which I apologize. I hope you can understand and help me.

Upvotes: 1

Views: 503

Answers (2)

Rajshekar Reddy

Reputation: 18987

Using Regular Expressions


You can do this with Regex.Matches, which returns every match of the pattern instead of only the first one.

The regex used is

(\d{2}:\d{2}:\d{2},\d+)
  • \d matches a single digit
  • {2} repeats the preceding element exactly twice
  • + repeats the preceding element one or more times
  • : and , are literal characters with no special meaning here.

Here is the code.

// Requires: using System.Linq; using System.Text.RegularExpressions;
var matchList = Regex.Matches(subtitle, @"(\d{2}:\d{2}:\d{2},\d+)", RegexOptions.Multiline);
var times = matchList.Cast<Match>().Select(match => match.Value).ToList();

With this your times variable will be filled with all the time substrings.
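
Since the question asks for a string array, here is a minimal self-contained sketch of the same idea, written with top-level statements (C# 9 or later). The sample text is just the first two cues from the question, joined with "\n" as an assumption about the line endings, and the capture group is dropped because only match.Value is read.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Sample input: the first two cues from the question.
// A real program would use the string decoded from srt_text instead.
string subtitle =
    "1\n00:00:40,152 --> 00:00:43,614\nOut west there was this fella,\n\n" +
    "2\n00:00:43,697 --> 00:00:45,824\nfella I want to tell you about,\n";

// Regex.Matches returns all non-overlapping matches in order of appearance,
// so start and end times alternate in the result.
string[] times = Regex.Matches(subtitle, @"\d{2}:\d{2}:\d{2},\d+")
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

for (int i = 0; i < times.Length; i++)
{
    Console.WriteLine($"times[{i}] = \"{times[i]}\"");
}
```

For this sample the array holds four entries, from "00:00:40,152" to "00:00:45,824". Even indexes are start times and odd indexes are end times.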


Also note: RegexOptions.Multiline is optional in this scenario. It only changes where ^ and $ match, and this pattern uses neither.
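
To illustrate the note about RegexOptions.Multiline, the sketch below counts matches with and without the option, first for the unanchored pattern and then for a hypothetical variant that starts with ^. The variable names and the one-cue sample string are made up for the demonstration. It also hints at why the fully anchored pattern in the question finds nothing: a timing line holds two times, so no single time spans a whole line.

```csharp
using System;
using System.Text.RegularExpressions;

// One cue from the question, with "\n" line endings assumed.
string subtitle = "1\n00:00:40,152 --> 00:00:43,614\nOut west there was this fella,\n";

// Unanchored pattern: Multiline makes no difference.
string pattern = @"\d{2}:\d{2}:\d{2},\d+";
int plainCount = Regex.Matches(subtitle, pattern).Count;
int multilineCount = Regex.Matches(subtitle, pattern, RegexOptions.Multiline).Count;

// Anchored variant (hypothetical, for illustration only).
string anchored = @"^\d{2}:\d{2}:\d{2},\d+";
// Without Multiline, ^ matches only at the very start of the string ("1"), so nothing is found.
int anchoredDefault = Regex.Matches(subtitle, anchored).Count;
// With Multiline, ^ also matches at the start of each line, so the start time is found.
int anchoredMultiline = Regex.Matches(subtitle, anchored, RegexOptions.Multiline).Count;

Console.WriteLine($"{plainCount} {multilineCount} {anchoredDefault} {anchoredMultiline}");
```

The anchored variant with Multiline still misses the end time, because that time does not sit at the start of a line. That is one reason to leave the anchors out entirely.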

Upvotes: 1

Mohit S

Reputation: 14044

This might help you get the times from the string you have.

string subtitle = @"1
00:00:40,152 --> 00:00:43,614
Out west there was this fella,

2
00:00:43,697 --> 00:00:45,824
fella I want to tell you about,";
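// Requires: using System; using System.Collections.Generic; using System.Linq;
List<string> timestrings = new List<string>();
List<string> splittedtimestrings = new List<string>();
// Split on both Windows ("\r\n") and Unix ("\n") line endings.
List<string> splittedstring = subtitle.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries).ToList();
foreach (string st in splittedstring)
{
    // Timing lines are the ones that contain the " --> " separator.
    // Checking for "00" would miss lines such as "01:52:17,016 --> 01:52:20,019".
    if (st.Contains(" --> "))
    {
        timestrings.Add(st);
    }
}

foreach (string s in timestrings)
{
    // Split each timing line into its start time and end time.
    string[] foundstr = s.Split(new string[] { " --> " }, StringSplitOptions.RemoveEmptyEntries);
    splittedtimestrings.Add(foundstr[0]);
    splittedtimestrings.Add(foundstr[1]);
}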

I split the string to get the times instead of using Regex, because I think Regex should be used to process text based on pattern matches rather than to compare and match literal text.


Upvotes: 1
