GNOME Bugzilla – Bug 791858
GstTocSetter: proposal to help standardize elements behaviour
Last modified: 2018-11-03 12:43:59 UTC
TOC Setter handling needs improvement, both to ease development of TOC-related applications and to standardize the behaviour of Elements that implement the GstTocSetter interface. This is a continuation of the discussion started in (0) and an attempt to trace and formalize what has been considered so far. It could also serve as guidelines for Elements that implement GstTocSetter.

Please correct me where I'm wrong and review my proposal for a simple enhancement. My idea is to extract parts of this description for the documentation of the solution.

Context
-------

A Table Of Contents (1) is a tree structure which represents the subdivisions of a media. Each TOC Entry is defined by its start and end positions, what it refers to (a chapter, an alternative angle, ...) and optional tags such as the title or the artist. An application can select a TOC Entry to reach the particular subdivision or alternative stream it refers to. TOCs can be Global to a source or relative (Current) to the currently playing stream. An application can listen to the message bus of the Pipeline and get TOC messages (2).

From an Element's perspective, we can distinguish two main types of TOCs depending on their origin:

- Upstream TOCs are received on sink pads via TOC Events (3).
- User defined TOCs are assigned to the Element using the GstTocSetter interface (4). The User defined TOC should persist unaltered unless the user replaces it, resets it or the Element is set back to NULL.

TOC Events also indicate an "updated" status which may be used by Elements to decide what to do with the received TOC.

Some media formats may not support all the features of the TOC model. It may be the case for alternative angles or the ability to define a Current TOC. An Element responsible for encoding to a format may need to adapt the upstream TOC to the format's capabilities. Moreover, there are multiple ways of representing a TOC structure.
The Element should be flexible enough to map the input TOC's structure to fit its format's constraints.

Use Cases
---------

The following use cases are identified for an Element which implements the GstTocSetter interface. Consider a multiplexer (muxer) such as the Matroska muxer. This muxer is responsible for encoding a Matroska compliant output stream from multiple audio, video or subtitle source streams. Each source stream may send a TOC. The application requirements can be split into the following modes:

- Use the Upstream TOC as the Output TOC. This is the usual behaviour when an application doesn't need to modify the TOC. Note the following variations which might require specific processing:
  - One Global Upstream TOC is received. The Element only needs to adapt the TOC to fit the format's capabilities.
  - One Current Upstream TOC is received. If the Element doesn't support this kind of TOC, it may need to merge it into a Global TOC.
  - Multiple Global / Current Upstream TOCs are received, possibly on different sink pads. How the Output TOC should be built is context specific. An Element may favor a particular sink pad, ignore subsequent TOC Events unless they are marked as "updated", or perform some sort of conflict resolution.
- Use the User TOC as the Output TOC. In this mode, the Element must ignore Upstream TOCs and use the User TOC.
  - Just like for the Upstream TOC, the Element might implement specific processing to match the format's capabilities.
  - The application may need to use the Upstream TOC(s) to build the User TOC.
- No Output TOC. The Element must not output any TOC, whatever the Upstream TOC(s) received or the User defined TOC.

Current Status
--------------

GstTocSetter allows storing a user TOC and resetting the stored TOC. Each GstTocSetter implementer handles the use cases in its own way. E.g.:

- wavenc & flacenc use a dedicated field for the Upstream TOC.
When the output stream is encoded, the User TOC is selected only if no Upstream TOC has been received. The user can't prevent an Upstream TOC from being encoded.
- matroskamux stores the Upstream TOC in the GstTocSetter interface. In the proper state, the user can access the Upstream TOC, apply any modification or reset it. However, the User TOC can't persist unaltered.

As a workaround, it is possible to define a sink pad probe (5) with GST_PAD_PROBE_TYPE_EVENT_DOWNSTREAM, then filter TOC events and return GST_PAD_PROBE_DROP to prevent the Upstream TOC from reaching the Element. The application can then set a User TOC, or do nothing if no TOC should be written. Another solution is to build a new TOC in the pad probe and replace the event with an event containing the new TOC.

Proposal
--------

The proposal for this enhancement is to introduce a "selection_mode" field (with the associated getter and setter) and a GstTocSelectionMode enum with the following possible values:

- GST_TOC_SELECTION_UPSTREAM. This is the default when the GstTocSetter is created, and the selection_mode will be set back to this upon reset. Elements should use the Upstream TOC if any and shouldn't encode any TOC otherwise.
- GST_TOC_SELECTION_USER. The selection_mode switches to this value when a TOC is set on the TOC Setter. Elements should use the TOC Setter's TOC if any and shouldn't encode any TOC otherwise.
- GST_TOC_SELECTION_NONE. In this mode, Elements shouldn't encode any TOC.

The workaround described in "Current Status" could be a solution for applications which need to process the Upstream TOC(s) on the fly.
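The three proposed modes can be illustrated with a small, self-contained C model. This is only a sketch of the selection rule as described in the proposal, not GStreamer code: the Toc struct is a hypothetical stand-in for GstToc, select_output_toc() is an illustrative helper, and the enum is defined locally using the names from the proposal (they are not part of any released GStreamer API as far as this thread shows).

```c
#include <stddef.h>

/* Hypothetical stand-in for GstToc (the real type is an opaque,
 * refcounted structure). */
typedef struct {
  const char *label;
} Toc;

/* Enum values named after the proposal; defined here for illustration
 * only. */
typedef enum {
  GST_TOC_SELECTION_UPSTREAM,   /* default: use the TOC received from events */
  GST_TOC_SELECTION_USER,       /* use the TOC set via the TOC Setter */
  GST_TOC_SELECTION_NONE        /* never encode a TOC */
} GstTocSelectionMode;

/* Illustrative helper: returns the TOC an Element should encode, or NULL
 * if it shouldn't encode any. In UPSTREAM and USER modes a missing TOC
 * simply yields NULL; there is no fallback to the other source. */
const Toc *
select_output_toc (GstTocSelectionMode mode, const Toc *upstream,
    const Toc *user)
{
  switch (mode) {
    case GST_TOC_SELECTION_UPSTREAM:
      return upstream;
    case GST_TOC_SELECTION_USER:
      return user;
    case GST_TOC_SELECTION_NONE:
    default:
      return NULL;
  }
}
```

Note the difference with the wavenc & flacenc behaviour described in "Current Status": in this model, USER mode never falls back to the Upstream TOC, and UPSTREAM mode never falls back to the User TOC.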
---
(0) https://bugzilla.gnome.org/show_bug.cgi?id=791736
(1) https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer/html/GstToc.html
(2) https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer/html/GstMessage.html#gst-message-new-toc
(3) https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer/html/GstEvent.html#gst-event-new-toc
(4) https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer/html/gstreamer-GstTocSetter.html
(5) https://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer/html/GstPad.html#gst-pad-add-probe
Just one comment: if I remember correctly how this is supposed to work: an element such as a muxer should never look at the global TOC. The global TOC is for applications, so they know what can be selected (e.g. different titles on a DVD). Elements are only concerned with the current TOC which will represent the TOC of the stream to follow (e.g. the chapters of the DVD title being played back).
That has problems with transmuxing (the TOC would get lost), but now that you say it, this sounds like what the original idea was.
How so? If you transmux a file with a TOC for a single DVD title (or a CD audio disc with multiple tracks) then the global toc emitted by the demuxer should be the same as the current toc, no? It's not quite clear to me how one would cover the scenario where you want to transmux a whole DVD with all titles, I think this would require application-side intervention/co-ordination then, and we don't support that in the muxer or demuxer anyway.
Indeed
The purpose of my application (1) is to build or modify TOCs. For this, I proceed in three phases:

1. The application builds a playback pipeline and sets it to Paused. The global TOC is retrieved from the message bus. No problem here.
2. The user can play, pause and seek, and build or edit a chapter list using the GUI.
3. When the user wants to export the result, another pipeline is built to export the media with the new TOC. I guess this is the kind of pipeline Sebastian refers to as transmuxing.

First, I focused on using matroskamux and setting the new Global TOC using this muxer (2). It finally worked OK, though it doesn't comply with the transient nature of the GstTocSetter (see "Current Status" above).

Then, I wanted to export audio only as individual tracks (one per chapter) to wave or flac files. This is when I found out that GstTocSetter could neither override nor reset the Upstream TOC (3). So the individual audio (part) files contained the whole source media TOC. I solved this using pad probes, but I thought GstTocSetter should allow this. Besides, I still use the muxer/encoder to write the global TOC.

What would be the proper way of achieving this? Should the GstTocSetter only be used for Current TOCs? Could you point me to an element that implements TOC handling the way it is supposed to work, so that I can get it right?

---
(1) https://github.com/fengalin/media-toc
(2) https://bugzilla.gnome.org/show_bug.cgi?id=790686
(3) https://bugzilla.gnome.org/show_bug.cgi?id=791736
I'd be glad to continue working on this issue if you consider it could be of any use. Otherwise feel free to close it. For my application, I was able to solve TOC related issues by defining a sink pad probe.
From your comment it seems like there is some uncertainty about how the interface should work and how it's implemented in various elements, and we should definitely clear that up somehow. And make the implementations consistent. Also your use-case sounds perfectly normal, so we should find a way for doing that with the current interface. AFAIU the TOC event should only contain the TOC that applies to the following stream itself, so in Tim's example if you play one track of a DVD then you have the whole DVD in the global TOC message but only the chapters of the current track in the TOC event. Nonetheless it seems like there needs to be a way to override/merge/etc the TOCs from events in TOC setter elements, similar to what you suggested before. Otherwise your use-case would not work. What am I missing?
Thanks Sebastian, I misunderstood Tim's conclusion. I'll update my description with the message / event perspectives.
Created attachment 366679
Add TOC selection mode

This is an implementation of the basic proposal above. It raises additional questions and comments.

1. Naming
---------

I used the terms UPSTREAM for the TOC received from events and USER for the TOC defined by the application. Whenever possible, I specified that the upstream TOC is the one received from events. Is this acceptable or should I change these to EVENT and APPLICATION?

2. Defaults
-----------

I decided to use GST_TOC_SELECTION_MODE_UPSTREAM as the default to reflect the current behaviour of elements that implement the GstTocSetter interface. If the user defines a TOC using gst_toc_setter_set_toc(), the mode automatically switches to GST_TOC_SELECTION_MODE_USER. If the user calls gst_toc_setter_set_toc() with a NULL TOC, the mode automatically switches to GST_TOC_SELECTION_MODE_NONE. Users can force the TOC selection mode using the dedicated setter. The documentation is explicit about these. Is this OK?

3. Comment in GstTocSetter's description
----------------------------------------

AFAICT, the following comment in GstTocSetter's description doesn't match the current behaviour of some implementers:

 * Elements implementing the #GstTocSetter interface can extend existing TOC
 * by getting extend UID for that (you can use gst_toc_find_entry() to retrieve it)
 * with any TOC entries received from downstream.

In wavenc & flacenc, the upstream TOC is not accessible from the GstTocSetter interface. In matroskamux, the upstream TOC replaces the user TOC if it was set before, so it matches the description; however, the user can't define a TOC beforehand.

To standardize the GstTocSetter further, I think we should add an upstream_toc field to the GstTocSetter's internal data (and the associated setters/getters). The elements would use this field instead of defining their own, and the application would be able to access the upstream TOC without the need to define a pad probe.
The above comment would be replaced by an explanation about the upstream / user TOC getters. Shall I go on with these modifications?

4. gst_toc_setter_reset()
-------------------------

The documentation for this function states:

 * Reset the internal TOC. Elements should call this from within the
 * state-change handler.

As expected, it resets the user provided TOC. This is done in different state transitions depending on the element:

- matroskamux: calls it in Paused to Ready. Since the upstream TOC and the user TOC are stored in the same field, this makes sense, but it defeats the idea of a persistent user TOC.
- wavenc: calls it in Ready to NULL.
- flacenc is a GstAudioEncoder and calls gst_toc_setter_reset() in stop(), which is called when the sink pad is deactivated (so I guess that must be Ready to NULL).

I think we should be more specific about the transition in which the element should call gst_toc_setter_reset(), and I would use Ready to NULL in order to comply with the persistent nature of the user TOC.

If we implement the proposal in (3), we'd probably want to reset the upstream / user TOCs during different transitions. I guess that would be:

- Paused to Ready: reset the upstream TOC.
- Ready to NULL: reset the user TOC.

Is that correct?
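To make the questions in items 2 to 4 easier to discuss, here is a minimal, self-contained C model of the behaviour described above. It is a sketch under stated assumptions, not the attached patch: the Toc, TocSetterModel and Transition types and the toc_setter_model_*() functions are hypothetical, the enum is defined locally with the names used in item 2, and resetting the mode to UPSTREAM together with the user TOC is an assumption taken from the original proposal ("set back to this upon reset").

```c
#include <stddef.h>

/* Hypothetical stand-in for GstToc. */
typedef struct {
  const char *label;
} Toc;

/* Names as used in item 2; defined here for illustration only. */
typedef enum {
  GST_TOC_SELECTION_MODE_UPSTREAM,
  GST_TOC_SELECTION_MODE_USER,
  GST_TOC_SELECTION_MODE_NONE
} GstTocSelectionMode;

/* Hypothetical model of the setter's internal data, with the separate
 * upstream_toc field suggested in item 3. */
typedef struct {
  const Toc *upstream_toc;      /* received from TOC events */
  const Toc *user_toc;          /* defined by the application */
  GstTocSelectionMode mode;
} TocSetterModel;

/* Only the two transitions discussed in item 4. */
typedef enum {
  TRANSITION_PAUSED_TO_READY,
  TRANSITION_READY_TO_NULL
} Transition;

void
toc_setter_model_init (TocSetterModel *s)
{
  s->upstream_toc = NULL;
  s->user_toc = NULL;
  s->mode = GST_TOC_SELECTION_MODE_UPSTREAM;    /* default, item 2 */
}

/* Models the automatic mode switch described in item 2:
 * non-NULL TOC -> USER, NULL TOC -> NONE. */
void
toc_setter_model_set_toc (TocSetterModel *s, const Toc *toc)
{
  s->user_toc = toc;
  s->mode = toc ? GST_TOC_SELECTION_MODE_USER : GST_TOC_SELECTION_MODE_NONE;
}

/* Models the split reset suggested at the end of item 4. */
void
toc_setter_model_change_state (TocSetterModel *s, Transition t)
{
  switch (t) {
    case TRANSITION_PAUSED_TO_READY:
      s->upstream_toc = NULL;   /* the user TOC persists */
      break;
    case TRANSITION_READY_TO_NULL:
      s->user_toc = NULL;
      s->mode = GST_TOC_SELECTION_MODE_UPSTREAM;        /* assumption */
      break;
  }
}
```

With this split, a user TOC set before going to Paused survives a Paused to Ready transition, which is the persistence that the current matroskamux behaviour doesn't provide.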
Hi guys! What do you think about this? :)
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gstreamer/issues/265.