Dobbelina

Reputation: 186

How to merge segmented webvtt subtitle files and output a single file?

How can I merge segmented WebVTT subtitle files and output a single file? The m3u8 playlist looks like this example:

#EXTM3U
#EXT-X-VERSION:4
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-MEDIA-SEQUENCE:1
#EXT-X-INDEPENDENT-SEGMENTS
#EXT-X-TARGETDURATION:4
#USP-X-TIMESTAMP-MAP:MPEGTS=900000,LOCAL=1970-01-01T00:00:00Z
#EXTINF:4, no desc
0ghzi1b2cz5(11792107_ISMUSP)-textstream_swe=2000-1.webvtt
#EXTINF:4, no desc
0ghzi1b2cz5(11792107_ISMUSP)-textstream_swe=2000-2.webvtt
#EXTINF:4, no desc
0ghzi1b2cz5(11792107_ISMUSP)-textstream_swe=2000-3.webvtt
#EXTINF:4, no desc
0ghzi1b2cz5(11792107_ISMUSP)-textstream_swe=2000-4.webvtt
#EXTINF:4, no desc
0ghzi1b2cz5(11792107_ISMUSP)-textstream_swe=2000-5.webvtt
#EXTINF:4, no desc
0ghzi1b2cz5(11792107_ISMUSP)-textstream_swe=2000-6.webvtt
#EXT-X-ENDLIST

I noticed that the cues in each segment are not timed against the total playing time, but against the individual ts segments. If ffmpeg can be used to do this, what magic input do I need to give it?

A single, correctly cued vtt or srt file is what I want.

I have a great appetite and don't like chunks, lol!

Thanks for any replies you lovely people!


With this I get a merged vtt file, but the cues are all wrong:

ffmpeg -i "https://cmoreseusphlsvod60.akamaized.net/vod/bea44/0ghzi1b2cz5(11792107_ISMUSP).ism/0ghzi1b2cz5(11792107_ISMUSP)-textstream_swe=2000.m3u8" -f segment -segment_time 4 -segment_format webvtt -scodec copy out-%05d.vtt

The cues in each segment are timed against the individual ts segment rather than against the total playing time. Example output of the above command:

WEBVTT

00:00.000 --> 00:03.040
Du har aktier i ett företag
som saknar framtid.

00:00.000 --> 00:03.280
De vill ha aktierna.
Du känner dem inte, Olga.

00:00.000 --> 00:01.720
De som får Kastrups aktier vinner.

All cues start like this, which isn't very helpful: 00:00.000

Some segments contain no cues, for example segment 15: https://cmoreseusphlsvod60.akamaized.net/vod/bea44/0ghzi1b2cz5(11792107_ISMUSP).ism/0ghzi1b2cz5(11792107_ISMUSP)-textstream_swe=2000-15.webvtt
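
If the cues really are relative to each segment, one way to get usable timings is to add each segment's start offset to its cues while merging. The following is only a rough sketch under several assumptions: the segments are already downloaded as local files, every segment lasts exactly 4 seconds (as the EXTINF lines suggest), the header of each segment ends at the first blank line, and the X-TIMESTAMP-MAP header is simply dropped rather than interpreted. The function names rebase_segment and merge_rebased are made up for illustration.

```shell
# Hypothetical helper: print one segment without its header block and with
# every cue timing shifted by OFFSET whole seconds.
# usage: rebase_segment OFFSET FILE
rebase_segment() {
  awk -v off="$1" '
    # Convert "MM:SS.mmm" or "HH:MM:SS.mmm" to milliseconds.
    function to_ms(t,    p, n, s) {
      n = split(t, p, ":")
      s = (n == 3) ? p[1] * 3600 + p[2] * 60 : p[1] * 60
      return int((s + p[n]) * 1000 + 0.5)
    }
    # Format milliseconds as "HH:MM:SS.mmm".
    function fmt(ms,    h, m, s) {
      h = int(ms / 3600000); ms -= h * 3600000
      m = int(ms / 60000);   ms -= m * 60000
      s = int(ms / 1000);    ms -= s * 1000
      return sprintf("%02d:%02d:%02d.%03d", h, m, s, ms)
    }
    { sub(/\r$/, "") }                        # tolerate CRLF line endings
    !body { if ($0 == "") body = 1; next }    # skip header up to first blank line
    / --> / {                                 # cue timing line: shift start and end
      $1 = fmt(to_ms($1) + off * 1000)
      $3 = fmt(to_ms($3) + off * 1000)
    }
    { print }
  ' "$2"
}

# Hypothetical helper: merge segments given in playback order, assuming each
# one lasts SEG_SECONDS (an integer) and starts where the previous one ended.
# usage: merge_rebased SEG_SECONDS FILE...
merge_rebased() {
  dur="$1"; shift
  printf 'WEBVTT\n\n'
  i=0
  for f in "$@"; do
    rebase_segment "$((i * dur))" "$f"
    echo                                      # keep cues of adjacent segments apart
    i=$((i + 1))
  done
}
```

It could be called as merge_rebased 4 seg-1.webvtt seg-2.webvtt seg-3.webvtt > out.vtt, with the files listed in playlist order (the file names here are placeholders). A segment that contains only a header contributes nothing but a blank line, so cue-less segments still advance the offset correctly.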

"A WebVTT Segment MAY contain no cues; this indicates that no subtitles are to be displayed during that period."

Upvotes: 2

Views: 3613

Answers (1)

Aquarius Power

Reputation: 3985

With this Linux shell command (which can also be run on Windows using Cygwin) you can merge several local files into a single .vtt file:

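# Keep the header of the first segment, then append all segments in numeric order without their header lines.
(head -n 2 2000-1.webvtt; cat $(ls -1 2000-*.webvtt | sort -t- -k2,2n) | grep -Ev "WEBVTT|X-TIMESTAMP-MAP") > out.vtt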

The head -n 2 just keeps the two header lines of the first file, 2000-1.webvtt (the ones containing WEBVTT and X-TIMESTAMP-MAP), at the top of the final merged file.

Although I sorted the files, neither the order of the entries nor duplicated entries (which may occur at the beginning and the end of each .vtt segment) seems to matter. I loaded the result in android MXPlayer and it worked perfectly (my first test .vtt was totally unordered and still worked!).

By the way, I collected the segments using firefox/inspect/networkMonitor/filter:vtt, but I had to click on each segment and save it as a single file. I think there may be an easier way (I would like to know if you have one).
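
One possible shortcut, as an untested-against-this-server sketch: save the .m3u8 playlist locally, read the segment names out of it, and fetch each one with curl. This assumes the segment URIs in the playlist are relative to the directory the playlist was served from, as in the question's example. The function names list_segments and fetch_segments and the BASE_URL argument are made up for illustration.

```shell
# Hypothetical helper: print the segment URIs of a saved HLS playlist, i.e.
# every non-empty line that is not a tag or comment (those start with '#').
# usage: list_segments PLAYLIST_FILE
list_segments() {
  awk '{ sub(/\r$/, "") } /^[^#]/ { print }' "$1"
}

# Hypothetical helper: download every segment into the current directory.
# BASE_URL is assumed to be the playlist URL without its final path component.
# usage: fetch_segments BASE_URL PLAYLIST_FILE
fetch_segments() {
  list_segments "$2" | while IFS= read -r seg; do
    # -g disables curl URL globbing, -f fails on HTTP errors, -sS hides the
    # progress meter but still shows error messages.
    curl -g -f -sS -o "$seg" "$1/$seg"
  done
}
```

The downloaded files keep the names from the playlist, so they can then be passed to the merge command in playlist order.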

Upvotes: 0
