1
00:00:00,074 --> 00:00:02,564
Previously on Breaking Bad...
2
00:00:02,663 --> 00:00:04,393
Words...
I need to parse SRT files with PHP and print all the subtitles in the file using variables.
I couldn't find the right regular expressions. I need to capture the ID, the time, and the subtitle text in separate variables. The printed output must not contain any Array() wrappers or similar; it must look the same as the original file.
That is, the output should look like this:
$number <br> (e.g. 1)
$time <br> (e.g. 00:00:00,074 --> 00:00:02,564)
$subtitle <br> (e.g. Previously on Breaking Bad...)
By the way, I have this code, but it doesn't match the lines. How should it be edited?
$srt_file = file('test.srt',FILE_IGNORE_NEW_LINES);
$regex = "/^(\d)+ ([\d]+:[\d]+:[\d]+,[\d]+) --> ([\d]+:[\d]+:[\d]+,[\d]+) (\w.+)/";
foreach($srt_file as $srt){
preg_match($regex,$srt,$srt_lines);
print_r($srt_lines);
echo '<br />';
}
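Your regex expects the number, the time, and the text on a single line, but file() returns each line of the file as a separate array element, so the pattern never matches. Here is a short and simple state machine for parsing the SRT file line by line: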
define('SRT_STATE_SUBNUMBER', 0);
define('SRT_STATE_TIME', 1);
define('SRT_STATE_TEXT', 2);
define('SRT_STATE_BLANK', 3);
$lines = file('test.srt');
$subs = array();
$state = SRT_STATE_SUBNUMBER;
$subNum = 0;
$subText = '';
$subTime = '';
foreach($lines as $line) {
    switch($state) {
        case SRT_STATE_SUBNUMBER:
            $subNum = trim($line);
            $state = SRT_STATE_TIME;
            break;
        case SRT_STATE_TIME:
            $subTime = trim($line);
            $state = SRT_STATE_TEXT;
            break;
        case SRT_STATE_TEXT:
            if (trim($line) == '') {
                $sub = new stdClass;
                $sub->number = $subNum;
                list($sub->startTime, $sub->stopTime) = explode(' --> ', $subTime);
                $sub->text = $subText;
                $subText = '';
                $state = SRT_STATE_SUBNUMBER;
                $subs[] = $sub;
            } else {
                $subText .= $line;
            }
            break;
    }
}
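if ($state == SRT_STATE_TEXT) {
    // if file was missing the trailing newlines, we'll be in this
    // state here. Build the last sub from the pending values and add it
    // (reusing the previous $sub object would overwrite the sub before it).
    $sub = new stdClass;
    $sub->number = $subNum;
    list($sub->startTime, $sub->stopTime) = explode(' --> ', $subTime);
    $sub->text = $subText;
    $subs[] = $sub;
}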
print_r($subs);
Result:
Array
(
[0] => stdClass Object
(
[number] => 1
[stopTime] => 00:00:24,400
[startTime] => 00:00:20,000
[text] => Altocumulus clouds occur between six thousand
)
[1] => stdClass Object
(
[number] => 2
[stopTime] => 00:00:27,800
[startTime] => 00:00:24,600
[text] => and twenty thousand feet above ground level.
)
)
You can then loop over the array of subs or access them by array offset:
echo $subs[0]->number . ' says ' . $subs[0]->text . "\n";
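If you need to compare or shift the times, it may help to turn the startTime and stopTime strings into numbers first. The following is only a sketch: srtTimeToMs is a hypothetical helper name, not part of the parser above, and it assumes the usual hours:minutes:seconds,milliseconds layout.

```php
<?php
// Hypothetical helper (not part of the parser above): converts an SRT
// timestamp such as "00:00:02,564" into a number of milliseconds.
function srtTimeToMs($time) {
    // Expect hours:minutes:seconds,milliseconds; return null for anything else.
    if (!preg_match('/^(\d+):(\d{2}):(\d{2}),(\d{3})$/', trim($time), $m)) {
        return null;
    }
    // Fold hours and minutes into seconds, then scale to milliseconds.
    return (((int)$m[1] * 60 + (int)$m[2]) * 60 + (int)$m[3]) * 1000 + (int)$m[4];
}

// Duration of the second subtitle from the question's sample.
echo srtTimeToMs('00:00:04,393') - srtTimeToMs('00:00:02,663'); // prints 1730
```

With the parsed array you would call it as srtTimeToMs($subs[0]->startTime).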
To show all subs by looping over each one and displaying it:
foreach($subs as $sub) {
    echo $sub->number . ' begins at ' . $sub->startTime .
        ' and ends at ' . $sub->stopTime . '. The text is: <br /><pre>' .
        $sub->text . "</pre><br />\n";
}
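To get output closer to the layout asked for in the question (number, time, and subtitle, each followed by a line break), something along these lines might work. It is a sketch under assumptions: formatSub is a hypothetical helper name, and the sample object stands in for the $subs array built by the parser, using values from the question's excerpt.

```php
<?php
// Sample data standing in for the $subs array built by the parser;
// the values come from the SRT excerpt in the question.
$first = new stdClass;
$first->number = '1';
$first->startTime = '00:00:00,074';
$first->stopTime = '00:00:02,564';
$first->text = "Previously on Breaking Bad...\n";
$subs = array($first);

// Hypothetical helper: renders one sub as number, time, and text,
// each followed by <br />.
function formatSub($sub) {
    // Rebuild the original time line and escape it for HTML output.
    $time = htmlspecialchars($sub->startTime . ' --> ' . $sub->stopTime);
    // Escape the text, then turn line breaks inside it into <br />.
    $text = nl2br(htmlspecialchars(rtrim($sub->text)));
    return $sub->number . "<br />\n" . $time . "<br />\n" . $text . "<br />\n";
}

foreach ($subs as $sub) {
    echo formatSub($sub);
}
```

Escaping the text means any markup inside a subtitle (such as italics tags) is shown literally rather than interpreted; drop the htmlspecialchars call on the text if you would rather let it through.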
Further reading: SubRip Text File Format