Reputation: 33
I have the contents of a subtitle file in a string:
string subtitle = Encoding.ASCII.GetString(srt_text);
srt_text is a byte array, which I convert to a string as shown above. The subtitle text starts and ends like this:
Starts:
1
00:00:40,152 --> 00:00:43,614
Out west there was this fella,
2
00:00:43,697 --> 00:00:45,824
fella I want to tell you about,
Ends:
1631
01:52:17,016 --> 01:52:20,019
Catch ya later on
down the trail.
1632
01:52:20,102 --> 01:52:24,440
Say, friend, you got any more
of that good Sarsaparilla?
Now I want to extract the times and put them into an array. I tried:
Regex rgx = new Regex(@"^(?:[01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9],[0-9][0-9][0-9]$", RegexOptions.IgnoreCase);
Match m = rgx.Match(subtitle);
I think this can find the times, but I could not get them into an array. Assume 'times' is my string array. I want the array to look like this:
times[0] = "00:00:40,152"
times[1] = "00:00:43,614"
...
times[n-1] = "01:52:20,102"
times[n] = "01:52:24,440"
It has to keep going until the end of the subtitle, so that all of the times are included.
I am open to your advice. How can I do this? I am new to this and have probably made a lot of mistakes, for which I apologize. I hope you can understand and help me.
Upvotes: 1
Views: 503
Reputation: 18987
Using Regular Expressions
You can do this with a regular expression, collecting every match with Regex.Matches.
The regex used is:
(\d{2}:\d{2}:\d{2},\d+)
\d matches a digit.
{2} is an exact repetition count.
+ means one or more repetitions.
: and , are plain characters with no special meaning.
Here is the syntax:
var matchList = Regex.Matches(subtitle, @"(\d{2}:\d{2}:\d{2},\d+)",RegexOptions.Multiline);
var times = matchList.Cast<Match>().Select(match => match.Value).ToList();
With this, your times variable will be filled with all of the time substrings.
Also note: the RegexOptions.Multiline part is optional in this scenario, because the pattern contains no ^ or $ anchors for the option to affect.
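For completeness, here is a minimal self-contained sketch of the same idea that returns a string array, as the question asks. The class name SubtitleTimes and the method name Extract are made up for illustration, and the sample text is just the first two entries from the question. It also tightens \d+ to \d{3}, on the assumption that the milliseconds field always has three digits. Note that the pattern in the question finds nothing because its ^ and $ anchors require the whole input (or, with Multiline, a whole line) to be a single time.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SubtitleTimes
{
    // Hypothetical helper: returns every "hh:mm:ss,fff" timestamp
    // in the order it appears in the subtitle text.
    public static string[] Extract(string subtitle)
    {
        // No ^ or $ anchors, so a match can start anywhere in a line.
        return Regex.Matches(subtitle, @"\d{2}:\d{2}:\d{2},\d{3}")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        // Sample taken from the start of the subtitle in the question.
        string subtitle =
            "1\r\n00:00:40,152 --> 00:00:43,614\r\nOut west there was this fella,\r\n" +
            "2\r\n00:00:43,697 --> 00:00:45,824\r\nfella I want to tell you about,\r\n";

        string[] times = Extract(subtitle);
        for (int i = 0; i < times.Length; i++)
        {
            Console.WriteLine($"times[{i}] = \"{times[i]}\"");
        }
    }
}
```

Start and end times alternate in the result, so even indexes hold start times and odd indexes hold end times.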
Upvotes: 1
Reputation: 14044
This might help you get the times from the string you have.
string subtitle = @"1
00:00:40,152 --> 00:00:43,614
Out west there was this fella,
2
00:00:43,697 --> 00:00:45,824
fella I want to tell you about,";
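List<string> timestrings = new List<string>();
List<string> splittedtimestrings = new List<string>();
// Split into lines, accepting both Windows and Unix line endings.
List<string> splittedstring = subtitle.Split(new string[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries).ToList();
foreach (string st in splittedstring)
{
    // Timing lines are the ones containing the " --> " separator.
    // (Checking for "00" would miss lines such as 01:52:17,016 --> 01:52:20,019.)
    if (st.Contains(" --> "))
    {
        timestrings.Add(st);
    }
}
foreach (string s in timestrings)
{
    // Split each timing line into its start and end time.
    string[] foundstr = s.Split(new string[] { " --> " }, StringSplitOptions.RemoveEmptyEntries);
    splittedtimestrings.Add(foundstr[0]);
    splittedtimestrings.Add(foundstr[1]);
}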
I split the string to get the times instead of using Regex, because I think Regex should be used to process text based on pattern matches rather than for comparing and matching literal text.
Upvotes: 1