GNOME Bugzilla – Bug 793333
matroskademux: Allow Matroska headers to be read more than once
Last modified: 2018-10-05 12:04:14 UTC
MSE has the concepts of "initialization segment" and "media segment", whose definitions are specific to MSE and depend on the container format. In the case of WebM a "MSE media segment" is a Cluster element and a "MSE initialization segment" is the portion of a WebM file that defines the header before clusters can appear. MSE segments may appear in the stream in almost any order, with some limitations: the first ever MSE segment must be an initialization segment. Further MSE initialization segments in the stream are limited to have the same number and type of tracks, allowing very little variation. An noticeable exception is codec data (e.g. PPS and SPS for MP4/h264). Usually applications insert a second initialization segment after a quality change. Support of multiple MSE WebM segments in matroskademux revolves about the following question: When working in push mode, what should the demuxer do if it finds a new Matroska header? Note the difference with the Multisegment Matroska Movie playback problem exposed above. In the above case, the demuxer searched everywhere (including in the same file, past the current Matroska Segment boundary) for an specific Matroska segment with a given SegmentUID. Here, it's the other way around. The demuxer was happy parsing the current Matroska Segment when suddenly it ended and a new one took its place. What should happen in this case? There are several options: a) EOS. Stop reading. No more playback. This is the current behavior of matroskademux. Incompatible with MSE. b) Exactly what MSE wants: read the new header, but do not allow changing the number and type of tracks and do not recreate pads. This is the current behavior of qtdemux. c) Send EOS or something similar downstream, delete all the pads and signal the end of the Matroska segment to the application, but keep accepting data upstream, restart most of the demuxer except for the input buffer and demux the new segment from scratch. The new segment could have a completely diferent set of tracks. MSE implementations would need to handle the EOS event and map the old tracks with the new pads. The third solution looks like what Tim suggests and certainly it is more elegant in that it does not tie GStreamer to MSE requirements, but it has its own challenges: * How could it be implemented in GStreamer? There is the STREAM_START event counterpart to EOS, but it's only sent downstream and the documentation does not say it's supposed to be posted to the bus, so it is not helpful when you start demuxing a new segment and you still have no srcpads. A new kind of event would be needed. The `no-more-pads` signal would no longer signal that the element will not generate more "sometimes" pads, but rather that it will only stop doing so for the duration of the current Matroska segment. * Consistency with other demuxers. Other demuxers have similar headers (e.g. MP4 movies start with a `moov` box) and may find them within the stream. It would make sense that they all implemented the same behavior to the extent possible. If that behavior is (c), they should use the same event to signal the end of a movie.
This is necessary for MSE, where a new MSE initialization segment may be appended at any point. These MSE initialization segments consist of an entire WebM file until the first Cluster element (not included). [1] Note that track definitions are ignored on successive headers, they must match, but this is not checked by matroskademux (look for `(!demux->tracks_parsed)` in the code). Source pads are not altered when the new headers are read. This patch has been splitted from the original patch from eocanha in [2]. [1] https://www.w3.org/TR/mse-byte-stream-format-webm/ [2] https://bug334082.bugzilla-attachments.gnome.org/attachment.cgi?id=362212
Created attachment 368183 [details] [review] matroskademux: Allow Matroska headers to be read more than once This is necessary for MSE, where a new MSE initialization segment may be appended at any point. These MSE initialization segments consist of an entire WebM file until the first Cluster element (not included). [1] Note that track definitions are ignored on successive headers, they must match, but this is not checked by matroskademux (look for `(!demux->tracks_parsed)` in the code). Source pads are not altered when the new headers are read. This patch has been splitted from the original patch from eocanha in [2]. [1] https://www.w3.org/TR/mse-byte-stream-format-webm/ [2] https://bug334082.bugzilla-attachments.gnome.org/attachment.cgi?id=362212
Maybe they must match in MSE but I think it's a perfectly valid use case to have them change during normal streaming (even if rarely supported)?
Could you provide an example? qtdemux already works this way.
QuickTime is a different format. All I'm saying is that we probably shouldn't make this assumption, but check and error out if not.
It may be a different format, but we're talking about the same behavior: finding metadata again at some point in the stream. Shouldn't both be consistent? For instance, it would be a bit weird if in WebKit code where I need to support both I end up having to handle stream events just for one of the formats while it's handled by the demuxer in the other.
This is a duplicate of https://bugzilla.gnome.org/show_bug.cgi?id=334082 or not? Also has a patch, also mentions MSE
(In reply to Sebastian Dröge (slomo) from comment #6) > This is a duplicate of https://bugzilla.gnome.org/show_bug.cgi?id=334082 or > not? Also has a patch, also mentions MSE Not really, because the original problem in that patch is mostly unrelated to MSE even if the discussion turned that way. I'll post a mega-reply there explaining it soon.
Do you have any opinions on this patch? This is blocking https://bugs.webkit.org/show_bug.cgi?id=185731
Created attachment 373712 [details] MSE WebM segments that can be used to reproduce the bug I'm attaching a compressed folder containing MSE WebM segments from the YouTube tests that can be used to reproduce the bug. $ gst-launch-1.0 pushfilesrc location=<(cat {init,media1,init,media2}.webm) ! matroskademux ! fakesink silent=false -v Expected result: Last demuxed frame should be PTS=0:00:09.977000000. Actual result (without the patch): Last demuxed frame is PTS=0:00:04.972000000. (Matroskademux gives up on the second `init.webm`, having parsed only up to `media1.webm`) Note: Several init.webm is something that will only happen in MSE (typically on seeks or changes of quality), not in regular video files. Also, for the MSE use case the running time of the frames emitted by matroskademux is not important: only stream time matters.
Review of attachment 368183 [details] [review]: OK, this change makes a sense even if ideally we should verify that everything matches and fail in a clean way. With that patch we just get a failure later on, before we were just ignoring the content following the second header. Without that we can't play YouTube anymore with WebKitGTK since they are now using MSE/matroska and it is **really** not ideal, while that patch is maybe not perfect I think it is a good starting point to get things working and in the future we might want to add more checks to handle the case where the track topology changes with new headers, even if the behaviour in that case is not specified in matroska itself.
Created attachment 373731 [details] [review] matroskademux: Refactor track parsing out from adding tracks This splits gst_matroska_demux_add_stream() into: * gst_matroska_demux_parse_stream(): will read the Matroska bytestream and fill a GstMatroskaTrackContext. * gst_matroska_demux_parse_tracks(): will check there are no repeated tracks. * gst_matroska_demux_add_stream(): creates and sets up the pad for the track.
Created attachment 373732 [details] [review] matroskademux: Parse successive Tracks elements This patch allows matroskademux to parse a second Tracks element, erroring out if the tracks are not compatible (different number, type or codec) and emitting new caps and tag events should they have changed.
These two patches together make matroskademux reject immediately succesive headers with different tracks and emit new caps and/or tags (including codec-data) should they have changed.
Review of attachment 373731 [details] [review]: Makes sense to me. ::: gst/matroska/matroska-demux.c @@ +651,3 @@ GST_ERROR_OBJECT (demux, "Invalid TrackNumber 0"); ret = GST_FLOW_ERROR; break; Why can't you keep doing the unique check here? @@ +1514,3 @@ + break; + + case GST_MATROSKA_TRACK_TYPE_COMPLEX: just `default:` then? @@ +3112,3 @@ + if (gst_matroska_read_common_tracknumber_unique (&demux->common, + track->num)) + gst_matroska_demux_add_stream (demux, track); Please add brackets around that if too
Review of attachment 373732 [details] [review]: Looks good otherwise. ::: gst/matroska/matroska-demux.c @@ +3138,3 @@ +static GstFlowReturn +gst_matroska_demux_reparse_tracks (GstMatroskaDemux * demux, GstEbmlRead * ebml) update_tracks? @@ +3178,3 @@ + gst_matroska_read_common_stream_from_num (&demux->common, + new_track->num); + g_assert (old_track_index != -1); That should not be an assertion, if there is a mismatch the track might not be found, and we should cleanly error out.
Review of attachment 373731 [details] [review]: ::: gst/matroska/matroska-demux.c @@ +651,3 @@ GST_ERROR_OBJECT (demux, "Invalid TrackNumber 0"); ret = GST_FLOW_ERROR; break; Note this is gst_matroska_demux_parse_stream() [The review tool includes a header for the function name in the old version but not in the new one] After the refactor gst_matroska_demux_parse_stream() is supposed to be reasonably stateless and depend as little as possible on the bigger matroskademux state. Therefore, it makes no sense for it to reject tracks on the basis of being it the first time they appear or not. This will be very important in the next patch, where we will want the parsed track to have an existing TrackNumber. It makes much more sense to check that outside. @@ +1514,3 @@ + break; + + g_assert (demux->common.src->len == demux->common.num_streams); This was present in the original version. I guess the intention is to explicitly list all that is not supported, which is fine for me. @@ +3112,3 @@ + if (gst_matroska_read_common_tracknumber_unique (&demux->common, + track->num)) + gst_matroska_demux_add_stream (demux, track); With pleasure, I don't like ifs without braces, they are dangerous.
Review of attachment 373732 [details] [review]: ::: gst/matroska/matroska-demux.c @@ +3138,3 @@ +static GstFlowReturn +gst_matroska_demux_reparse_tracks (GstMatroskaDemux * demux, GstEbmlRead * ebml) OK. @@ +3178,3 @@ + gst_matroska_read_common_stream_from_num (&demux->common, + new_track->num); + switch (id) { Mismatches are already checked in the previous if(). In fact, gst_matroska_read_common_stream_from_num() and gst_matroska_read_common_tracknumber_unique() are very similar, apart from their return values and the fact that gst_matroska_read_common_stream_from_num() does some warning reporting that is redundant in this case.
Review of attachment 373732 [details] [review]: ::: gst/matroska/matroska-demux.c @@ +3178,3 @@ + gst_matroska_read_common_stream_from_num (&demux->common, + new_track->num); + g_assert (old_track_index != -1); Right, I missed that point, good then :-)
Created attachment 373733 [details] [review] matroskademux: Refactor track parsing out from adding tracks This splits gst_matroska_demux_add_stream() into: * gst_matroska_demux_parse_stream(): will read the Matroska bytestream and fill a GstMatroskaTrackContext. * gst_matroska_demux_parse_tracks(): will check there are no repeated tracks. * gst_matroska_demux_add_stream(): creates and sets up the pad for the track.
Created attachment 373734 [details] [review] matroskademux: Parse successive Tracks elements This patch allows matroskademux to parse a second Tracks element, erroring out if the tracks are not compatible (different number, type or codec) and emitting new caps and tag events should they have changed.
Attachment 368183 [details] pushed as 9dc7859 - matroskademux: Allow Matroska headers to be read more than once Attachment 373733 [details] pushed as f279bc5 - matroskademux: Refactor track parsing out from adding tracks Attachment 373734 [details] pushed as 0e60076 - matroskademux: Parse successive Tracks elements
I know this is basically a new feature but since Youtube is broken in WebKit without that, I backported it (as discussed on IRC). I think (and hope :-)) it is safe enough!
I'm a bit concerned about this commit and wonder if it needs more discussion: matroskademux: Parse successive Tracks elements This patch allows matroskademux to parse a second Tracks element, erroring out if the tracks are not compatible (different number, type or codec) and emitting new caps and tag events should they have changed. So now we push potentially different caps (even if same codec) on an existing source pad, right? Is that something we do anywhere else? My understanding is that we have explicitly not been doing this because the old decodebin couldn't handle this. We have a different mechanism for all of this using the GstStream API and adding new pad, which is implemented in tsdemux, for example.
https://gstreamer.freedesktop.org/documentation/design/stream-selection.html#decodebin3-usage-and-example
(In reply to Tim-Philipp Müller from comment #26) > I'm a bit concerned about this commit and wonder if it needs more discussion: > > matroskademux: Parse successive Tracks elements > > This patch allows matroskademux to parse a second Tracks element, > erroring out if the tracks are not compatible (different number, type or > codec) and emitting new caps and tag events should they have changed. > > So now we push potentially different caps (even if same codec) on an > existing source pad, right? > > Is that something we do anywhere else? > > My understanding is that we have explicitly not been doing this because the > old decodebin couldn't handle this. I had checked and we do this in qtdemux... if the parent is stream aware, but I missed that second part reading the code. I guess we should add that check.
Created attachment 373737 [details] Quality change test vectors for WebM and MP4 (for comparison) This is something we have already been doing for very long in qtdemux. $ formatter=(perl -ne 'if (/E \(type: ((stream-start|caps) .*);\) 0x[a-f0-9]+$/) { $event = $1; $event =~ s/\\//g; $event =~ s/(codec_data|stream-id)=.*?,/\1=(...),/; print "* $event\n" }') # qtdemux, no MSE initialization segment (moov) changes. $ gst-launch-1.0 pushfilesrc location=<(cat car-20120827-85.mp4/{init,media1}.mp4) ! qtdemux ! fakesink silent=false -v | "${formatter[@]}" * stream-start (10254), GstEventStreamStart, stream-id=(...), flags=(GstStreamFlags)GST_STREAM_FLAG_NONE, group-id=(uint)1 * caps (12814), GstEventCaps, caps=(GstCaps)"video/x-h264, stream-format=(string)avc, alignment=(string)au, level=(string)2.1, profile=(string)main, codec_data=(...), width=(int)426, height=(int)240, pixel-aspect-ratio=(fraction)1/1" * caps (12814), GstEventCaps, caps=(GstCaps)"video/x-h264, stream-format=(string)avc, alignment=(string)au, level=(string)2.1, profile=(string)main, codec_data=(...), width=(int)426, height=(int)240, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)24000/1001" # qtdemux, with a MSE initialization segment (moov) change. $ gst-launch-1.0 pushfilesrc location=<(cat car-20120827-85.mp4/{init,media1}.mp4 car-20120827-86.mp4/{init,media1}.mp4) ! qtdemux ! fakesink silent=false -v | "${formatter[@]}" * stream-start (10254), GstEventStreamStart, stream-id=(...), flags=(GstStreamFlags)GST_STREAM_FLAG_NONE, group-id=(uint)1 * caps (12814), GstEventCaps, caps=(GstCaps)"video/x-h264, stream-format=(string)avc, alignment=(string)au, level=(string)2.1, profile=(string)main, codec_data=(...), width=(int)426, height=(int)240, pixel-aspect-ratio=(fraction)1/1" * caps (12814), GstEventCaps, caps=(GstCaps)"video/x-h264, stream-format=(string)avc, alignment=(string)au, level=(string)2.1, profile=(string)main, codec_data=(...), width=(int)426, height=(int)240, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)24000/1001" * caps (12814), GstEventCaps, caps=(GstCaps)"video/x-h264, stream-format=(string)avc, alignment=(string)au, level=(string)3, profile=(string)main, codec_data=(...), width=(int)640, height=(int)360, pixel-aspect-ratio=(fraction)1/1" * caps (12814), GstEventCaps, caps=(GstCaps)"video/x-h264, stream-format=(string)avc, alignment=(string)au, level=(string)3, profile=(string)main, codec_data=(...), width=(int)640, height=(int)360, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)24000/1001" Now matroskademux is doing the same: # matroskademux, no MSE initialization segment (Tracks element) changes. $ gst-launch-1.0 pushfilesrc location=<(cat feelings_vp9-20130806-242.webm/{init,media1}.webm) ! matroskademux ! fakesink silent=false -v | "${formatter[@]}" * stream-start (10254), GstEventStreamStart, stream-id=(...), flags=(GstStreamFlags)GST_STREAM_FLAG_SELECT, group-id=(uint)1 * caps (12814), GstEventCaps, caps=(GstCaps)"video/x-vp9, width=(int)426, height=(int)240, framerate=(fraction)0/1" # matroskademux, MSE initialization segment (Tracks element) changes $ gst-launch-1.0 pushfilesrc location=<(cat feelings_vp9-20130806-242.webm/{init,media1}.webm feelings_vp9-20130806-247.webm/{init,media1}.webm) ! matroskademux ! fakesink silent=false -v | "${formatter[@]}" * stream-start (10254), GstEventStreamStart, stream-id=(...), flags=(GstStreamFlags)GST_STREAM_FLAG_SELECT, group-id=(uint)1 * caps (12814), GstEventCaps, caps=(GstCaps)"video/x-vp9, width=(int)426, height=(int)240, framerate=(fraction)0/1" * caps (12814), GstEventCaps, caps=(GstCaps)"video/x-vp9, width=(int)1280, height=(int)720, framerate=(fraction)0/1"
Ok well. I don't think this is right behaviour and if we ever support matroskademux with adaptivedemux we will have to do it right (and differentiate the MSE use case then somehow).
(In reply to Tim-Philipp Müller from comment #30) > Ok well. I don't think this is right behaviour and if we ever support > matroskademux with adaptivedemux we will have to do it right (and > differentiate the MSE use case then somehow). Why are you saying that if we just showed the behaviour is shared with qtdemux? (which works well with adaptivedemux...)