GNOME Bugzilla – Bug 707032
qtdemux: add support for webvtt subtitles
Last modified: 2018-11-03 14:49:44 UTC
Add support for WebVTT subtitles
Created attachment 253483 [details] [review] qtdemux: add support for webvtt subtitles
*** Bug 707030 has been marked as a duplicate of this bug. ***
The caps are wrong btw, they should be application/x-subtitle-webvtt
Created attachment 253490 [details] [review] qtdemux: add support for webvtt subtitles
Review of attachment 253490 [details] [review]:

::: gst/isomp4/qtdemux.c
@@ +3910,3 @@
   if (G_UNLIKELY (stream->subtype != FOURCC_text
+          && stream->subtype != FOURCC_sbtl &&
+          stream->subtype != FOURCC_wvtt)) {

Maybe it's time for a is_subtitle_fourcc() macro?

@@ +10769,3 @@
+    case GST_MAKE_FOURCC ('w', 'v', 't', 't'):
+      _codec ("WebVTT subtitle");
+      caps = gst_caps_new_empty_simple ("application/x-subtitle");

I'd prefer subtitle/x-webvtt :)
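For illustration, the is_subtitle_fourcc() helper suggested in the review could look roughly like the sketch below. This is not code from the patch. MAKE_FOURCC is a local stand-in that mirrors the byte layout of GST_MAKE_FOURCC so the snippet compiles without GStreamer, the three FOURCC_* constants are redefined locally for the same reason, and an inline function is used instead of a macro so the argument is evaluated only once.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for GST_MAKE_FOURCC (assumption: same layout, first character
 * in the least significant byte), so the sketch builds without GStreamer. */
#define MAKE_FOURCC(a, b, c, d) \
  ((uint32_t) (a) | ((uint32_t) (b) << 8) | \
   ((uint32_t) (c) << 16) | ((uint32_t) (d) << 24))

/* Local copies of the three subtitle subtypes checked in the reviewed hunk. */
#define FOURCC_text MAKE_FOURCC ('t', 'e', 'x', 't')
#define FOURCC_sbtl MAKE_FOURCC ('s', 'b', 't', 'l')
#define FOURCC_wvtt MAKE_FOURCC ('w', 'v', 't', 't')

/* Hypothetical helper: true if the stream subtype is one of the
 * subtitle types named in the review. */
static inline bool
is_subtitle_fourcc (uint32_t fourcc)
{
  return fourcc == FOURCC_text
      || fourcc == FOURCC_sbtl
      || fourcc == FOURCC_wvtt;
}
```

With such a helper, the condition in the first hunk would reduce to something like G_UNLIKELY (!is_subtitle_fourcc (stream->subtype)).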
> +      caps = gst_caps_new_empty_simple ("application/x-subtitle");
>
> I'd prefer subtitle/x-webvtt :)

or subtitle/x-vtt? (that matches text/vtt a bit more, and we might see this used in non-web contexts).

Do you have a sample file for this? Is the timing info expressed in the text chunks, or as timestamp/duration on the buffers? (or both?)
At first I thought it would be packaged raw, one sample per cue point as is done for HLS, but it seems to be packed in a different way for MP4. I have found a proposal for the standard for WebVTT here: http://biblio.telecom-paristech.fr/cgi-bin/download.cgi?id=13809

If the output of the demuxer should be parsable by a webvtt parser, I don't see why it needs to be packed/unpacked at the muxer/demuxer level.

A sample file can be found here: http://download.tsi.telecom-paristech.fr/gpac/webvtt/counter-vtt.mp4
So the webvtt inside MP4 is different from raw webvtt, which is also different from webvtt inside MPEGTS? We would need different caps for all three variants then, probably something like subtitle/x-webvtt,variant={mp4,mpegts,raw}
For webvtt in MP4, a sample would contain either one VTTEmptyCueBox box (for a period of non-zero duration without cue data) or one or more VTTCueBox boxes with the same start and stop time. The cue id goes in the 'iden' box, the cue point configuration in the 'sttg' box and the text string in the 'payl' box.

For a cue point like:

1
00:00:10.000 --> 00:00:15.000 line:0 position:20% size:60% align:start
<b>First line.</b>

We would have a sample with:

sample PTS: 00:00:10.000 DUR: 00:00:5.000
vttc
  iden=1
  sttg=line:0 position:20% size:60% align:start
  payl=<b>First line.</b>

It then seems appropriate to use subtitle/x-webvtt,variant={mp4,mpegts,raw} for the different variants, where subtitle/x-webvtt,variant=mp4 would consist of buffers with the same timing info as the cue point, the payload as data, and the cue point configuration in a GstBufferVTTMeta.

In case cue points overlap and the duration of the sample is not representative, the CueDurationBox should be used to set the real duration of the cue point.
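As a rough sketch of what unpacking such a sample involves, the function below walks the boxes of one sample and pulls the 'iden', 'sttg' and 'payl' contents out of the first 'vttc' box. It is not taken from the patch. It assumes the usual ISO BMFF box header (32-bit big-endian size that includes the 8-byte header, followed by a four-character type), assumes the child boxes hold raw text with no terminator, and ignores 64-bit box sizes and any further 'vttc' boxes. VttCue and parse_vttc_sample are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type: pointers into the sample data, with explicit
 * lengths because the text is not assumed to be NUL-terminated. */
typedef struct {
  const uint8_t *id;       size_t id_len;        /* 'iden' */
  const uint8_t *settings; size_t settings_len;  /* 'sttg' */
  const uint8_t *payload;  size_t payload_len;   /* 'payl' */
} VttCue;

static uint32_t
read_be32 (const uint8_t *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
      | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}

/* Returns 1 if a 'vttc' box was found and parsed, 0 if the sample holds
 * no 'vttc' box (for example an empty-cue sample), -1 if a box size is
 * smaller than its header or runs past the available data. */
static int
parse_vttc_sample (const uint8_t *data, size_t size, VttCue *cue)
{
  size_t off = 0;

  memset (cue, 0, sizeof (*cue));

  while (size - off >= 8) {
    uint32_t box_size = read_be32 (data + off);

    if (box_size < 8 || box_size > size - off)
      return -1;

    if (memcmp (data + off + 4, "vttc", 4) == 0) {
      const uint8_t *p = data + off + 8;
      size_t left = box_size - 8;

      /* Walk the child boxes of the cue box. */
      while (left >= 8) {
        uint32_t child_size = read_be32 (p);

        if (child_size < 8 || child_size > left)
          return -1;

        if (memcmp (p + 4, "iden", 4) == 0) {
          cue->id = p + 8;
          cue->id_len = child_size - 8;
        } else if (memcmp (p + 4, "sttg", 4) == 0) {
          cue->settings = p + 8;
          cue->settings_len = child_size - 8;
        } else if (memcmp (p + 4, "payl", 4) == 0) {
          cue->payload = p + 8;
          cue->payload_len = child_size - 8;
        }
        /* Unknown child boxes are skipped. */
        p += child_size;
        left -= child_size;
      }
      return 1;
    }
    off += box_size;
  }
  return 0;
}
```

The timing is deliberately absent from the parsed result: as described above, start time and duration come from the sample's PTS and duration rather than from the box contents.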
Andoni, can you rebase your patch against current master? It would still be good to have this.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/92.