GNOME Bugzilla – Bug 634407
decodebin should expose pads in a deterministic order
Last modified: 2013-01-19 13:06:49 UTC
Right now the order of pads for multi-stream files is pretty random. It depends on which thread manages to push first. One potential solution could be to add the sometimes pads on decodebin2 at the same time we see the sometimes pads on the demuxer. Right now I use gst-discoverer and playbin2 and there is no way to map the discoverer streams to the "current-audio"/"current-video" tracks.
just for the record 13:40 < slomo> ensonic: you can't add pads to decodebin2 as soon as a demuxer adds them... you don't know if you can a) decode that stream and b) if it contains another container format (think of DV inside MXF) 13:40 < slomo> ensonic: but pads are added in a function called something with "expose" and there all pads are added at the same time. you could sort them like in playbin2 before really adding them 13:42 < wtay> you could keep track of the order in which they were added and then sort them when you finally expose the pads 13:42 < wtay> it's a bit tricky with demuxers in demuxers 13:44 < slomo> yes... but it's all possible ;) 13:44 < slomo> you could propagate the order downstream, i.e. demuxer 1 adds stream 5, this gets to demuxer 2 which then adds stream 5.1 and 5.2 13:44 < slomo> and in the end sort this
Created attachment 175025 [details] [review] WIP to make pad order deterministic My current attempt to fix the problem. Seems to not help though. This is the same issue as in #bug 607466.
Comment on attachment 175025 [details] [review] WIP to make pad order deterministic We need a better stream selection system in playbin2 for this IMHO, e.g. something to let the demuxer specify a default stream.
*** Bug 654830 has been marked as a duplicate of this bug. ***
Text from 654830: It is currently quite hard to work with files that have multiple streams of the same type using discoverer and uridecodebin and encodebin, as apart from the technical characteristics there seems to be no way of identified each stream. So especially when two streams are technically identical, ie. mp3 audio. It is a pain to deal with in a 100% safe way. It would be nice if we could come up with a system where discoverer could create some kind of unique identifier for each stream it finds. That way when we am connecting uridecodebin to encodebin or other elements I can look for the pad with that unique identifier and connect that, would probably simplify writing applications to handle this case a lot. If discoverer and uridecodebin where also able to automatically attach any language information to each stream, based on tag data etc. that would also be great.
Had a quick chat with Edward today about this issue. It seems to be a bug that will be trivial to fix in the 0.11/1.0 branch. My guess is with 1.0 looming quite close on the horizon and that fixing this in the 0.10 branch is quite hard, this bug is not likely to be ever fixed for the 0.10 branch.
It think its fixable as we already sort the pads. I'll try to have another look at it.
> Had a quick chat with Edward today about this issue. It seems to be a bug that > will be trivial to fix in the 0.11/1.0 branch. How so? Care to share some more details?
Christian ? I'm curious too :)
<cschalle> bilboed, see bug tpm posted, but in general it refers to when I discovered it wasn't easy to 'track' a stream through discoverer/decodebin etc., you said it was hard to fix in 0.10, but that it would be a lot easier in 0.11/1.0 <bilboed> since in 0.11 we have sticky events and tags are events and you can query what events are stuck on pads ... <cschalle> cymacs, ^^ <bilboed> if we had demuxers/decoders pushing out identifier tags you could then identify streams then you'd be able to identify a pad corresponding to a certain stream even if that pad is several elements down the road
I'm playing around with the following solution now: Defining a new tag GST_TAG_STREAM_ID, of string type and "DECODED" flag (i.e. not meta data, like the DURATION tag for example). The semantics shall be such that the tag's value is modified by any element that splits up a stream into different ones. So demuxers have to care, but not (the usual) decoders. tee doesn't have to care (splits into identical copies), but deinterleave probably should. When a demuxer (or demuxer-like) element forwards the tag to a new branch, it bases the new value for the tag on the upstream value by appending its own specific stream id followed by a colon (:). This rule produces values as such: "1:", "1:2:", "184917:4374391004:1:9:" The trailing colon has the benefit of reminding people that it's a string :) but also allows you to quickly test that "1:2:" is a sub branch of "1:" by comparing the prefix (otherwise, "1" could be confused with "11", so field splitting would be required). Not sure if that could be useful to somebody though. The local id that an element appends per pad must be unique among all the pads of course. It also has to be stable across multiple invocations in the same process (so that you can match ids between runs of discoverer and decodebin). Stability across the process boundary should not be required; the exact values are subject to change between different versions of an element for example. Elements can derive the id in a format specific way. For example in Ogg and Matroska, there are serial numbers/track UIDs. In general, this means the ids' lexical sort order does not necessarily match any kind of stream/track order in the file. The first track could be "5:". Or "28904729847239842398472:". Implementation wise, most of this can be encapsulated in some small helper function(s) if sticky events are used. Some kind of tag_list_add_stream_id_uint (list, sinkpad, my_local_id) perhaps (though should most likely not clutter up the tag list namespace?). That function can query the upstream value through gst_pad_get_sticky_event and do the magic to have the given uint value represented. Other variants for uint64 and string should make sense, or maybe just a single generic string variant where elements do the formatting on their own. This could also be done in 0.10 as well. However this means each demuxer has to catch tag events to extract and store the upstream id. Also, I guess it is harder for applications to extract per-stream tag events without sticky events, so this is a lot less useful (right?). Any opinions on the colon separator notation? One alternative would be to set the tag multiple times, and then gst_tag_merge_strings_with_comma would produce a single value to the application in the end.
Sounds like a good plan but IMHO this could be done in 0.10 too. playbin2 at least is catching tag events for applications, which makes this much easier from the application side.
Is there a reason to mandate numbers? Might be nicer to be able to use other descriptors too (e.g. audio_00 or so).
Ah yeah sure, can be anything (like hex-digit formatted hashes or whatever). It could also contain colons itself (which doesn't violate the prefix rule), for example I'm wondering if matroskademux should actually append two ids, one derived from the file (FILEUID) and one per stream.
And filesrc for example could have the URI of the file in the beginning, sounds great. One thing I don't like though is that you want to allow the separate character inside the strings too. IMHO it shouldn't be allowed there to allow creating a tree of the file structure from the tags. The separator, if used inside the strings, should probably be escaped somehow.
Yes, but you never know for sure which elements support it anyways, so the tree structure could always end up less detailed than it should be.
Hmm, a pad-unique string identifier that is stable across invocations... The pad name should already satisfy that in 99% of all cases, right? :) Now I want a different approach; how about we discover the pipeline's topology, restricted to a subset of elements we are interested in? Probably just looking at all elements with a 1-to-N topology, or looking at the factory classification for "Demuxer". The benefit is that no changes to elements are required at all. Well, except for weird (and rare?) cases where e.g. audio_00 and audio_01 can be confused randomly, which just needs fixing.
The reason I filed this bug was that it ended up being a bit random what audio was on the pads I got in multistream audio files, so the english track was the first pad in some runs and the dutch for instance in the second.
Yes, the order in which the final pads end up being exposed is random. I think Stefan's approach was to make this deterministic, but it seems to be tricky. Edward's idea is to add an identifier in the form of a tag, so that the randomness doesn't matter. I like that idea, but using a tag is probably not the most practical solution. I think a utility function is needed that iterates the pipeline starting at a given src pad, fetching the relevant names along the way to build the id.
I was only looking at the final ghost-pads that are exposed. The pad order inside the pipeline does not really matter (in this regard).
Created attachment 209657 [details] [review] WIP gst_util_make_stream_id function This implements a function like I mentioned in comment 19. It takes a srcpad and an optional element klass string (a single one for now, so no slashes in there). It walks the pipeline graph upstream, crossing bin boundaries and traversing into them. Calling this on an audio pad of decodebin2, this can return for example: "filesrc:src,typefind:src,matroskademux:audio_01,multiqueue:src2,ac3parse:src,capsfilter:src,a52dec:src" So it describes the whole pipeline upstream from that pad (sink pad names could be added too, but are less useful). When you pass "demuxer" as the klass, this is limited to just: "matroskademux:audio_01"
Why is this not implemented as a query?
If this can live in gst_pad_query_default, I would prefer it (simpler code). What I don't want is any code in specific elements. So this isn't a "typical" query in any way, just using the query system to conveniently peek at all the interesting pads. I'll implement it as a default query handler to see how it looks.
IMHO this should be a query and could probably be handled in gst_pad_query_default() too. But I'm sure we also need to change the default handler for some special elements.
I think we need three things in the end: 1) allow demuxers to signal an *order* for their source pads, so that "the first" audio stream will always be first in the list (whichever list that is, playbin's list for example, or the order in which decodebin2/uridecodebin exposes the pads), and "the second" will always be second, etc. 2) *stable unique identifiers* for different streams to match the same stream across multiple runs. This could tie in with the order thing from (1), but doesn't have to. 3) a way for a demuxer to signal upper layers which stream is the *default* one; e.g. a container might have info that audio stream #3 or language XY is supposed to be the default, and we should be able to communicate that somehow. These don't have to be all done at once, but there's some overlap it seems to me, so might not hurt to think if we can come up with something that takes care of all of these things, or at least 1) and 2).
I tried applying these patches against current git master (0.11). They do apply, but compilation of GStreamer core fails with: gstutils.c: In function 'make_stream_id': gstutils.c:3712:3: error: passing argument 2 of 'gst_iterator_next' from incompatible pointer type [-Werror] ../gst/gstiterator.h:254:19: note: expected 'struct GValue *' but argument is of type 'void **' gstutils.c:3729:5: error: implicit declaration of function 'gst_element_factory_get_klass' [-Werror=implicit-function-declaration] gstutils.c:3729:5: error: nested extern declaration of 'gst_element_factory_get_klass' [-Werror=nested-externs] gstutils.c:3729:11: error: assignment makes pointer from integer without a cast [-Werror] cc1: all warnings being treated as errors
Any chance of this issue being moved forward? Would be a great feature for GStreamer 1.0 if the stream selection stuff worked in discoverer and decodebin
I need to try out the query idea still. I'm thinking it should be a generic "GST_QUERY_TOPOLOGY" thing which returns you the pipeline graph that leads up to that point (i.e. where does the data flow come from that leads here). The query might return a GList of pads (where entries alternate between src and sink pads). The stream-id I explained before could be created from that by iterating the list and looking for "Demuxer" classified element factories (pad's parent), then including those element and pad names in the id to make it unique to that path, just as before. Does this make sense to anyone? :)
Yes, and it could use the iterate-internal-pads query to get to the corresponding pad on the other side in the generic handler for this query. A GList of pads is not enough though, it can be a tree that has the current pad at the root (just think of things like adder or interleave).
Err, what? GLists? This sounds a bit complicated to me. Also see my comments on bug #679315 btw. Setting something like "/pipeline0/filesrc0/47184114/matroskademux0/audio_00" as name/nick with the tag event would do the job just fine IMHO (or as ID inside the tag list, or a stream id somewhere else). The "/pipeline0/filesrc0/47184114/" part would be queried from upstream and could reflect the topology (I wouldn't expect anything to parse that ID for any purpose).
Sebastian: Right, the generic query needs to support the N-to-1 case. Can that be represented in a gobject-introspectible way without introducing new API? If handling the query return data is just as complicated as iterating the pipeline by yourself, the thing becomes moot. Tim: Wrong bug number? Sounds like my original approach.
Yes, sorry, that should've been bug #677619. It's very similar to your approach from comment #21, just a bit more concise (IMHO, anyway) - I don't think we need to mention the pads when there's only one source pad, for example. (The number in the example was meant to be a hash of the filesrc URI btw) To go back to my list in comment #25, this scheme would accommodate 1) allow demuxer to signal an order, e.g. via: .../foodemux/001/audio_02 .../foodemux/002/audio_00 2) *stable unique identifiers* - taken care of via topology and URI hashes (if desired) 3) a way for a demuxer to signal upper layers which stream is the *default*: Could let the demuxer do that via (1) as above - just put default first, or add :default suffix or so to ID, or add a boolean tag to the stream's taglist. (I haven't read back the entire comment thread, so apologies if this doesn't address some other comment).
Would it work to use gst_object_get_path_string() to build the id (I know that this is the containment hierarchy and not the dataflow path)?
(In reply to comment #30) > Err, what? GLists? This sounds a bit complicated to me. Instead of GLists maybe use nested GstStructures as is done for the stream-topology message in decodebin2? This way it's easy to handle by bindings and extendable. This is only about the topology query proposed by René, which is a good idea on its own IMHO, and could be used to solve this bug here and also the related ones about marking the default stream and other stream markers.
Created attachment 219835 [details] [review] event: Add new stream-id field to the stream-start event This is supposed to allow uniquely identifying a single stream.
Created attachment 219836 [details] [review] event: Update for stream-start event API changes
Review of attachment 219836 [details] [review]: ::: libs/gst/base/gstbaseparse.c @@ +2852,3 @@ if (G_UNLIKELY (parse->priv->push_stream_start)) { + GstQuery *query; + gchar *uri; maybe call this stream_id as in the random-number case it is not a uri (and we don't really require one anyway). @@ +2863,3 @@ + } + + if (gst_pad_peer_query (parse->sinkpad, query)) { The comment says MD5 and the code SHA256 ::: libs/gst/base/gstbasesrc.c @@ +840,3 @@ + g_checksum_update (cs, (const guchar *) uri, strlen (uri)); + + } else { urgs, copy'n'paste. But probably still too specific to move to a utility function?
Created attachment 219990 [details] [review] event: Add new stream-id field to the stream-start event This is supposed to allow uniquely identifying a single stream.
Created attachment 219991 [details] [review] event: Add new stream-id field to the stream-start event This is supposed to allow uniquely identifying a single stream.
Created attachment 219992 [details] [review] event: Update for stream-start event API changes
Looks much better than the first set IMHO.
Review of attachment 219991 [details] [review]: ::: gst/gstutils.c @@ +3543,3 @@ + * Creates a stream-id for the source #GstPad @pad by combining the + * upstream information with the optional @stream_id of the stream + * of @pad. @pad must have a parent #GstElement and must have zero "parent #GstElement and must have" -> "parent #GstElement which must have" @@ +3592,3 @@ + + /* And then generate an MD5 sum of the URI */ + g_return_val_if_fail (parent->numsinkpads <= 1, NULL); still MD5 <-> SHA256 :/ @@ +3617,3 @@ + g_free (upstream_stream_id); + + gst_event_unref (upstream_event); perhaps: if (stream_id) { gchar *id = g_strconcat (upstream_stream_id, "/", stream_id, NULL); g_free (upstream_stream_id); return id; } else { return upstream_stream_id; }
Created attachment 220064 [details] [review] event: Add new stream-id field to the stream-start event This is supposed to allow uniquely identifying a single stream.
Created attachment 220065 [details] [review] event: Update for stream-start event API changes
Review of attachment 220065 [details] [review]: I was wondering if it would make sense to add API to the base classes that allows the subclasses to set a stream-id, which is then used instead of the URI/random stream-id... something like gst_base_parse_set_stream_id(parse, const gchar *stream_id) Any opinions?
> I was wondering if it would make sense to add API to the base classes that > allows the subclasses to set a stream-id, which is then used instead of the > URI/random stream-id... something like gst_base_parse_set_stream_id(parse, > const gchar *stream_id) > > Any opinions? No opinion either way really? Does it matter? Is it needed for a first implement of things? How/where/when would you use it?
My hope for this functionality is that I can get a name/object for each stream from discoverer and then when I want to plug things to decodebin I can use those names/objects to identify the stream I want, so in the sense that I hope to just use objects and not string recognition I am fine either way.
(In reply to comment #46) > > I was wondering if it would make sense to add API to the base classes that > > allows the subclasses to set a stream-id, which is then used instead of the > > URI/random stream-id... something like gst_base_parse_set_stream_id(parse, > > const gchar *stream_id) > > > > Any opinions? > > No opinion either way really? Does it matter? Is it needed for a first > implement of things? How/where/when would you use it? For the case when there's no URI of the source or when there's another identifier. I guess one could simply check if the srcpad already has a stream-start event in the utility function for that... but can be added later. (In reply to comment #47) > My hope for this functionality is that I can get a name/object for each stream > from discoverer and then when I want to plug things to decodebin I can use > those names/objects to identify the stream I want, so in the sense that I hope > to just use objects and not string recognition I am fine either way. This will only work for the case when the source implements the URI handler interface and has an URI... or the source somehow (see above) provides another deterministic stream-id. The default case would provide a random stream ID. In any case, after these changes and after they are implemented wherever needed, you will be able to match streams by doing a string comparison of the stream-ids.
> (In reply to comment #47) > > My hope for this functionality is that I can get a name/object for each stream > > from discoverer and then when I want to plug things to decodebin I can use > > those names/objects to identify the stream I want, so in the sense that I hope > > to just use objects and not string recognition I am fine either way. > > This will only work for the case when the source implements the URI handler > interface and has an URI... or the source somehow (see above) provides another > deterministic stream-id. The default case would provide a random stream ID. > > In any case, after these changes and after they are implemented wherever > needed, you will be able to match streams by doing a string comparison of the > stream-ids. How would the matching help if the stream-ids are generated with a random-number generator? Wouldn't it make more sense to hash the caps? My use case is pretty much the same as Christians. I am using gst-discoverer to figure out the number and properties of streams and then want to configure playbin to play a certain stream.
These use cases should be covered fine, since they're all URI-based, no?
I am getting confused, but isn't the source in this case uridecodebin? And yes it operates on URIs, but if a given file gives me a random audio stream as stream 1, and provides no other means of me to keep track of the audio stream across multiple sessions it all becomes very random. The other item is that I would ideally want a clear link between stream metadata and the streams, so if the metadata says stream 2 is French language and stream 3 is German, I need to know that I got the correct stream without having to listen to it. So maybe grabbing that information could be part of Sebastians suggestion for manually setting a deterministinc stream id? That the demuxer adds DE or FR to the front of the stream id if such information is available?
(In reply to comment #49) > > (In reply to comment #47) > > > My hope for this functionality is that I can get a name/object for each stream > > > from discoverer and then when I want to plug things to decodebin I can use > > > those names/objects to identify the stream I want, so in the sense that I hope > > > to just use objects and not string recognition I am fine either way. > > > > This will only work for the case when the source implements the URI handler > > interface and has an URI... or the source somehow (see above) provides another > > deterministic stream-id. The default case would provide a random stream ID. > > > > In any case, after these changes and after they are implemented wherever > > needed, you will be able to match streams by doing a string comparison of the > > stream-ids. > > How would the matching help if the stream-ids are generated with a > random-number generator? Wouldn't it make more sense to hash the caps? The PRNG is only used for the case when the URI query returns nothing and there's no upstream stream-id yet. Problem with hashing the caps is that they're not unique, not even in the same pipeline. For the cases when there's no URI for the stream we would need another solution like the one I mentioned above, but that can be added later. Also, with these stream-ids it would now be possible to sort the streams inside decodebin/playbin to have a deterministic order of pads/streams.
(In reply to comment #52) > (In reply to comment #49) > > > > How would the matching help if the stream-ids are generated with a > > random-number generator? Wouldn't it make more sense to hash the caps? > > The PRNG is only used for the case when the URI query returns nothing and > there's no upstream stream-id yet. Problem with hashing the caps is that > they're not unique, not even in the same pipeline. > For the cases when there's no URI for the stream we would need another solution > like the one I mentioned above, but that can be added later. > > Also, with these stream-ids it would now be possible to sort the streams inside > decodebin/playbin to have a deterministic order of pads/streams. Okay, so it is uri + stream_id and who ever sets the stream id, needs to ensure that the stream id is picked in a deterministic manner? I probably have to see it end-to-end, as now I still wonder how this will work. We still need to pick stream_ids in a deterministic way, right?
Created attachment 220358 [details] [review] oggdemux: Add stream-id to the stream-start event
(In reply to comment #54) > Created an attachment (id=220358) [details] [review] > oggdemux: Add stream-id to the stream-start event Take this as an example, the stream-ids of the streams coming out of oggdemux will be "hash_of_uri/serialno_of_stream" for example. I guess it would make sense to add a GST_WARNING() to the random-number-case of gst_pad_create_stream_id() as this is something that ideally should never happen.
Looks like a gst_pad_create_stream_id_printf() might come in handy as well :) I don't think we should worry too much about non-filesrc based cases for now, and even less about non-URI-based cases. Those IDs are not set in stone, and no one should extract information from them directly IMHO.
(In reply to comment #56) > I don't think we should worry too much about non-filesrc based cases for now, > and even less about non-URI-based cases. Those IDs are not set in stone, and no > one should extract information from them directly IMHO. +1
commit 41954ff8cd5ecb258e079078d93d1489eaccdcfd Author: Sebastian Dröge <sebastian.droege@collabora.co.uk> Date: Sat Jul 28 08:37:00 2012 +0200 event: Add new stream-id field to the stream-start event This is supposed to allow uniquely identifying a single stream.
Ok, now that 1.0 is out it would be good to return to this and https://bugzilla.gnome.org/show_bug.cgi?id=654830 to see if we can get all elements using Sebastians new feature. Supporting multitrack files in Transmageddon has been on my todo list for over a year now :)
Christian, I believe everything you need is there now?
*** Bug 607466 has been marked as a duplicate of this bug. ***
commit c0926dc7cc9a0e180621ad454ee6eb54745f5de1 Author: Tim-Philipp Müller <tim@centricular.net> Date: Sat Jan 19 12:51:56 2013 +0000 pad: add gst_pad_get_stream_id() utility function API: gst_pad_get_stream_id() commit 500b86489923b8add387598f367eb8368522c9de Author: Tim-Philipp Müller <tim@centricular.net> Date: Sat Jan 19 13:03:03 2013 +0000 decodebin: try harder to always expose pads in the same order Use stream-id as sort criterion in addition to the media type. https://bugzilla.gnome.org/show_bug.cgi?id=634407