Reputation: 81
I have a problem. I'm trying to count the number of subtitle lines with PHP. As you might know, a subtitle file looks like this:
1
00:00:00,984 --> 00:00:03,503
All right, guys, let's get to it.
2
00:00:03,587 --> 00:00:04,821
What's that button?
3
00:00:04,872 --> 00:00:07,590
It's something designed
to help you get healthy.
4
00:00:07,658 --> 00:00:09,676
Just ignore it.
5
00:00:09,760 --> 00:00:12,962
So, Patrick, did you take the high road
I tried to put the content of a subtitle file into an array, like this:
$f = fopen($file, 'rb');
$read = fread($f, filesize($file));
fclose($f);
$array = explode("\n",$read);
With this code:
$array = array_filter($array,'trim');
foreach($array as $key => $value) {
if(preg_match('/\d+/',$value)) {
unset($array[$key]);
}
}
$array = array_values($array);
echo '<pre>';
print_r($array);
echo '</pre>';
I get:
Array
(
[0] => All right, guys, let's get to it.
[1] => What's that button?
[2] => It's something designed
[3] => to help you get healthy.
[4] => Just ignore it.
[5] => So, Patrick, did you take the high road
[6] => and congratulate Wendy on that promotion
[7] => that you were supposed to get?
[8] => Yes, I did. I even bought her flowers.
[9] => Liar!
)
which is not OK, because
It's something designed
to help you get healthy.
should be in a single element of the array.
I've also tried to match everything between, for example:
1
00:00:00,984 --> 00:00:03,503
and
2
00:00:03,587 --> 00:00:04,821
with:
(\d+\n)([0-9][0-9]:[0-9][0-9]:[0-9][0-9],\d+ --> [0-9][0-9]:[0-9][0-9]:[0-9][0-9],\d+\n).*\n
but it doesn't work and I'm out of ideas.
What I'm trying to output:
Array
(
[0] => All right, guys, let's get to it.
[1] => What's that button?
[2] => It's something designed to help you get healthy.
[3] => Just ignore it.
[4] => So, Patrick, did you take the high road
[5] => and congratulate Wendy on that promotion that you were supposed to get?
[6] => Yes, I did. I even bought her flowers.
[7] => Liar!
)
echo count($array); // for the previous array, this should echo 8
Any help will be appreciated.
Upvotes: 1
Views: 99
Reputation: 4150
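You can do it with the library https://github.com/mantas-done/subtitles, like this:
use Done\Subtitles\Subtitles; // namespace as I recall it from the library's README; check it against your installed version
$subtitles = Subtitles::load('subtitles.srt');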
$blocks = $subtitles->getInternalFormat();
$array = [];
foreach ($blocks as $block) {
$array[] = implode(' ', $block['lines']);
}
print_r($array);
Upvotes: 0
Reputation: 1204
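After reading in the file, you can use the PCRE multiline modifier to split the content into blocks at the blank lines, and then keep only the lines in each block that do not start with a digit. Note that a subtitle text line that happens to begin with a digit would be dropped as well: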
$file = "./subtitles.txt";
$content = file_get_contents($file);
$blocks = preg_split('/^\s*$/m', $content);
// var_export($blocks);
$subtitles = array();
for ($i=0; $i < count($blocks); $i++) {
$lines = explode("\n", $blocks[$i]);
$matches = preg_grep("/^[^\d]/", $lines);
array_push($subtitles, implode(' ', $matches));
}
print_r($subtitles);
Which gives you the following output:
Array
(
[0] => All right, guys, let's get to it.
[1] => What's that button?
[2] => It's something designed to help you get healthy.
[3] => Just ignore it.
[4] => So, Patrick, did you take the high road
)
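If a subtitle's text can start with a digit, or the file uses Windows line endings, filtering on "starts with a digit" is fragile. Below is a minimal sketch of an alternative that relies on the block structure instead: find the timing line in each block and treat everything after it as text. The function name parseSrtText() and the sample string are made up for illustration, and the timing pattern assumes the comma-separated millisecond format shown in the question.

```php
<?php
// Sketch: parse SRT text by block structure rather than by leading digits.
// parseSrtText() is a hypothetical helper, not part of any library.
function parseSrtText(string $content): array
{
    // Normalise Windows (\r\n) and old Mac (\r) line endings to \n.
    $content = str_replace(["\r\n", "\r"], "\n", $content);

    // Cues are separated by one or more blank lines.
    $blocks = preg_split('/\n\s*\n/', trim($content));

    $timing = '/^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}/';
    $subtitles = [];
    foreach ($blocks as $block) {
        $lines = explode("\n", trim($block));
        foreach ($lines as $i => $line) {
            if (preg_match($timing, $line)) {
                // Everything after the timing line is subtitle text.
                $text = array_map('trim', array_slice($lines, $i + 1));
                $text = array_filter($text, 'strlen'); // drop empty lines
                if ($text) {
                    $subtitles[] = implode(' ', $text);
                }
                break;
            }
        }
    }
    return $subtitles;
}

// Made-up sample; the third cue starts with a digit on purpose.
$sample = "1\n00:00:00,984 --> 00:00:03,503\nAll right, guys, let's get to it.\n\n"
        . "2\n00:00:04,872 --> 00:00:07,590\nIt's something designed\nto help you get healthy.\n\n"
        . "3\n00:00:07,658 --> 00:00:09,676\n2 plus 2 is 4.\n";

$subtitles = parseSrtText($sample);
print_r($subtitles);
echo count($subtitles), "\n"; // prints 3
```

With a real file you would pass file_get_contents($file) instead of $sample; count($subtitles) then gives the number of subtitles, which is what the question asks for.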
Upvotes: 2
Reputation: 12089
Here's a mock-up:
$array = array(1, '00', 'one', 2, '00', 'two', 'abc', 3, '00', 'three', 4, '00', 'four', 'five', 5, '00', 'six', 6, '00', 'seven');
$string_last = false; // true when the previous element was a text line
$string_array = array(); // new array holding only the elements I want to keep
foreach ($array as $value) {
    if (preg_match('/^\d/', (string) $value)) { // first character is a digit: index or timing line
        $string_last = false; // skip it and remember that the run of text has ended
    } else { // we have a text line
        if ($string_last) {
            // previous element was text too: append to it, separated by a space
            $string_array[count($string_array) - 1] .= ' ' . $value;
        } else {
            // first text line after a digit line: start a new element
            $string_array[] = $value;
        }
        $string_last = true;
    }
}
echo '<pre>';
print_r($string_array);
echo '</pre>';
Rather than unsetting the elements I don't want, I add the elements I do want to a new array. That way I can merge consecutive string elements into a single element of the new array.
Upvotes: 0