GNOME Bugzilla – Bug 618853
add lazily evaluated caps (e.g. intersection iterators)
Last modified: 2012-06-18 22:46:12 UTC
Quite often caps negotiation code does the following: intersect = gst_caps_intersect (caps1, caps2); gst_caps_truncate (intersect); to get the preferred (first) format that is good for both sides of a link. This is not performing well as we construct full caps with all the gvalues and then immediately after throw away most of it. The attached patch implements a iterator for a caps intersection, that can deliver the structure of the intersections as needed.
Created attachment 161211 [details] [review] caps intersection iterator (WIP)
Created attachment 161212 [details] small test app for the algorithm $ gcc -o intersect2 intersect2.c $ ./intersect2 intersect1(ABCD,CABD)=ACBD intersect1(CABD,ABCD)=ACBD intersect2(ABCD,CABD)=ACBD intersect2(CABD,ABCD)=ACBD
(In reply to comment #0) > This > is not performing well as we construct full caps with all the gvalues and then > immediately after throw away most of it. > Do we have any benchmarks or profiles that show that this is a problem?
(In reply to comment #3) > (In reply to comment #0) > > This > > is not performing well as we construct full caps with all the gvalues and then > > immediately after throw away most of it. > > > Do we have any benchmarks or profiles that show that this is a problem? Any pipeline using a couple of basetransform elements should be pretty obvious (yay, millions of gvalue creations).
Additional ideas from irc discussion, we want GstIterator *gst_caps_get_caps_iterator(GstPad *pad) - for sources, return an iterator wrapping the template caps - for others, return a iterator that iterates the intersection of the peer pad caps and the template caps) - it would be a function on a pad, that can be overridden (like GstPadGetCapsFunction -> GstPadGetCapsIteratorFunction) So the GstCapsIntersectionIterator should be changed to take to GstIterators and we need a iterator that iterates structures in a caps: GstIterator *gst_caps_iterate_structure(GstCaps * caps1)
gst_caps_get_caps_iterator -> gst_pad_get_caps_iterator :)
Don't forget to have support in the iterators for cookies, so that elements that override GstPadGetCapsIteratorFunc can provide a cookie that will be checked (which the element can update if ever the returned caps depend on some local variable that gets changed).
Created attachment 161234 [details] [review] progressive caps negotiation This is an advance version of the patch (still WIP). Some thoughts: - returning caps instead of structure is bad - we need to create intermediate caps objects - we need to copy structures (otherwise we can't add them to the temp caps) - its not solving the problem with any and empty TODO: The caps_iterator would need to return a copy of itself in case of any/empty (it could use a different _next function). The gst_caps_intersection_iterator_next() function needs to run the any/empty checks on every loop.
Review of attachment 161234 [details] [review]: Looks good in general, some comments below. Question is still how we could support ANY/EMPTY caps :) From all ideas that were proposed so far, I like Edwards idea to use singleton iterators for those and have macros for checking against them most. ::: gst/gstcaps.c @@ +1515,3 @@ + /* - we don't need an ItemFunction because we create new items in _next + * - we ref the caps and thus they can't change, we still need a dummy cookie + * - we have no lock It might be a good idea to allow some kind of cookie for the case when element properties change for example. This should then require a resync and getting new caps. Same goes for the lock. @@ +1697,3 @@ + * Since: 0.10.XXX + */ +/* FIXME: should this take ownership of the iterators? */ Yes, gst_iterator_filter() for example does this too @@ +1707,3 @@ + /* - we don't need an ItemFunction because we create new items in _next + * - we ref the caps and thus they can't change, we still need a dummy cookie + * - we have no lock A lock and cookie might be good to allow for the same reasons as above. ::: gst/gstpad.c @@ +2395,3 @@ + } + } + if ((templ = GST_PAD_PAD_TEMPLATE (pad))) { If the pad has no getcapsiterator function you should simply call gst_pad_get_caps()
Oh and in the intersection iterator you should do something more intelligent when RESYNC happens... like resetting and starting from the beginning if nothing else can be done. The DONE case shouldn't be a GST_WARNING() I guess. For the cookie/lock issue. I think this might actually be not necessary. If elements want this they can do that in their getcapsiterator function, where they create a custom iterator that has a intersection iterator pushed for example. And we might want to add substract/union/merge iterators too later ;)
Thanks for the review. I'll post updated patches as I go forward.
(In reply to comment #9) > Review of attachment 161234 [details] [review]: > > Looks good in general, some comments below. Question is still how we could > support ANY/EMPTY caps :) From all ideas that were proposed so far, I like > Edwards idea to use singleton iterators for those and have macros for checking > against them most. I am trying that right now. > > ::: gst/gstcaps.c > @@ +1515,3 @@ > + /* - we don't need an ItemFunction because we create new items in _next > + * - we ref the caps and thus they can't change, we still need a dummy > cookie > + * - we have no lock > > It might be a good idea to allow some kind of cookie for the case when element > properties change for example. This should then require a resync and getting > new caps. Same goes for the lock. Hmm. If you currently run a intersect it would be a sync() call that don't not take any parallel property changes into account. Do you have a use case in mind where the iterator is keept and iterated slowly over time? Basically the caps are reffed and thus immutable during the lifetime of the iterator. The iterator becomes invalid if the caps change, but the iterator does not know where to get the new ones from, so it cannot resync itself in that case. Also what should the lock protect? The caps that are intersected are usuly from two different elements, so we'd need to take two locks if at all. > > @@ +1697,3 @@ > + * Since: 0.10.XXX > + */ > +/* FIXME: should this take ownership of the iterators? */ > > Yes, gst_iterator_filter() for example does this too. Will do that. > > @@ +1707,3 @@ > + /* - we don't need an ItemFunction because we create new items in _next > + * - we ref the caps and thus they can't change, we still need a dummy > cookie > + * - we have no lock > > A lock and cookie might be good to allow for the same reasons as above. > > ::: gst/gstpad.c > @@ +2395,3 @@ > + } > + } > + if ((templ = GST_PAD_PAD_TEMPLATE (pad))) { > > If the pad has no getcapsiterator function you should simply call > gst_pad_get_caps() Good idea, done!
Created attachment 161350 [details] [review] progressive caps negotiation
Review of attachment 161350 [details] [review]: ::: gst/gstcaps.c @@ +1539,3 @@ + /* return special case signleton iterators */ + if (G_UNLIKELY (CAPS_IS_EMPTY (caps))) { + if (!__gst_empty_caps_iterator) { You should use g_once_init_* here or initialize somewhere from gst_init() for thread safety @@ +1705,3 @@ + goto restart; + break; + default: You should handle RESYNC here by resyncing the child iterators and starting again from the beginning. Use case for this would be a custom caps iterator on a pad, which resyncs whenever properties or something change.
Created attachment 161355 [details] [review] progressive caps negotiation Small sumary from IRC discussion. A new idea is to have lazily evaluated caps instead. So we don't add new pad-functions and don't change basesclasses. gst_caps_intersect(caps1,caps2) would basically just return caps that are backed by an iterator. When calling gst_caps_get_structure() (and internal gst_caps_get_structure_unchecked()) we roll the iterator until we have the n-th item. When calling gst_caps_get_size(), we'd need to evaluate it fully.
There is now a benchmark under "tests/benchmarks/capsnego".
Created attachment 161743 [details] [review] add lazy caps and implement it for intersections This is a different approach now. Everything is implemented inside gstcaps. No more patching of gstpad and basetransform.
I'll be away for a week and I'd welcome, if someone would try implementing lazy caps for another operator (caps_from_string could be a good candidate), And now also some meassurements below: = Audio = == before == > libtool --mode=execute ./capsnego -f audio building audio pipeline with depth = 4 and children = 3 built pipeline with 607 elements starting pipeline 0:00:08.151373888 reached paused 0:00:07.750882654 reached paused 0:00:07.730392286 reached paused == after lazy caps == > libtool --mode=execute ./capsnego -f audio building audio pipeline with depth = 4 and children = 3 built pipeline with 607 elements starting pipeline 0:00:07.404945120 reached paused 0:00:07.577315198 reached paused 0:00:07.386936241 reached paused = Video = == before == >libtool --mode=execute ./capsnego -f video building video pipeline with depth = 4 and children = 3 built pipeline with 607 elements starting pipeline 0:00:26.354146476 reached paused 0:00:25.057845497 reached paused 0:00:24.521472345 reached paused == after lazy caps == > libtool --mode=execute ./capsnego -f video building video pipeline with depth = 4 and children = 3 built pipeline with 607 elements starting pipeline 0:00:21.717630276 reached paused 0:00:23.754434052 reached paused 0:00:21.933146549 reached paused
The reason why it's still taking for-ever is because it's doing gazillions getcaps. You might want to try the patches from bug #618644 and link the elements before activating the elements. As far as I understand the code, you're doing an exponential number of expensive getcaps here, whereas with the provided patches you should end up only doing it once.
Edward, its also not yet showing much benefit as the approach needs to be applied elsewhere. E.g. gst_caps_can_intersect() needs to be turned into an iterative way to not call gst_caps_get_size() but rather iterate untile gst_caps_get_structure() returns NULL (to indicate end of caps). I'll try rewriting _can_intersect() todays still (if my time permits). I'll also try you patch.
Created attachment 161833 [details] [review] add lazy caps and implement it for intersections New patch with can_intersect fixed. We need a way to copy iterators. Right now its gst_caps_copy() that is eating the benefits of the lazy caps.
(In reply to comment #21) > We need a way to copy iterators. Right now > its gst_caps_copy() that is eating the benefits of the lazy caps. I think that's not the problem - copying the caps probably happens because they're modified afterwards. Otherwise we could just make gst_caps_copy() lazy, too. (For a proof-of-concept it's rather trivial: GstCaps * gst_caps_copy (GstCaps *original) { return gst_caps_intersect (original, gst_caps_new_any()); }
Created attachment 161846 [details] [review] add lazy caps and implement it for intersections Company, _copy cannot be implemented like that as _intersect calls copy if the other caps are any (you can't iterate any). This new patch has copy implemneted and now its getting faster.
This last patch is eating all of the padding in GstCaps. I was thinking of adding the function pointers as a array and instead of caps->free_func(caps) do #define CAPS_FREE_FUNC(caps) ((GstCapsFreeFunc)(caps->funcs[1]))(caps) That would still leave 2 pointers. Any better idea?
Created attachment 161866 [details] [review] add lazy caps and implement it for intersections Small cleanups, better docs. Copy already evaluated structures in copy_func.
Created attachment 161869 [details] [review] avoid using gst_caps_get_size() in gst_pad_fixate_caps Just truncate and then fixate if something is left. Avoids fully evaluating lazy caps. Remaining offender is gst_base_transform_transform_caps(). Need to implement a lazy caps iterator here too. One more question - would it be okay to change the semantics of gst_caps_get_structure() the return NULL for index>length. Its is doing that right now via: g_return_val_if_fail (index < caps->structs->len, NULL); which I'd like to turn into if (index >= caps->structs->len) return NULL; That'd allow to write: ix = 0; while ( s = gst_caps_get_structure(caps, ix)) { ... ix++; }
(In reply to comment #24) > This last patch is eating all of the padding in GstCaps. I was thinking of > adding the function pointers as a array and instead of > caps->free_func(caps) do > #define CAPS_FREE_FUNC(caps) ((GstCapsFreeFunc)(caps->funcs[1]))(caps) > > That would still leave 2 pointers. Any better idea? My idea was to use only one pointer and have it point to a GstCapsPrivate. The nice thing about that would be that we could use NULL to mean "caps aren't lazy" and wouldn't need to use the flags.
(In reply to comment #26) > One more question - would it be okay to change the semantics of > gst_caps_get_structure() the return NULL for index>length. That was one of the original ideas of my design, so yes, please do that. :)
(In reply to comment #26) > Created an attachment (id=161869) [details] [review] > avoid using gst_caps_get_size() in gst_pad_fixate_caps I think this patch should be applied anyway, it looks much cleaner. The only thing I'm wondering is wether we should g_return_if_fail() instead of just returning if someone tries to fixate ANY or EMPTY, as both of them cannot be fixated.
Benjamin, thanks for the comments. I'll make the respective changes. The priv-ptr and checking for != NULL is good idea indeed - its an easy change as I use a CAPS_IS_LAZY macro already. Also here are the performance figures from last patch: = Audio = 0:00:05.979139740 reached paused 0:00:06.152942043 reached paused 0:00:06.105653567 reached paused = Video = 0:00:18.052856277 reached paused 0:00:18.725586336 reached paused 0:00:18.753294489 reached paused
(In reply to comment #29) > (In reply to comment #26) > > Created an attachment (id=161869) [details] [review] [details] [review] > > avoid using gst_caps_get_size() in gst_pad_fixate_caps > > I think this patch should be applied anyway, it looks much cleaner. The only > thing I'm wondering is wether we should g_return_if_fail() instead of just > returning if someone tries to fixate ANY or EMPTY, as both of them cannot be > fixated. I checked some code that is using it and I think it would make sense. Anyone else?
Created attachment 161927 [details] [review] add lazy caps and implement it for intersections New patch that: - has complete docs - uses private for lazy caps data - does not use a new flag any more - code clean ups - bug fix in copy func
So, I reviewed the code and am going to dump my thoughts about the API here, in no particular order: - The whole implementation is not threadsafe I have no idea how we want to guarantee that. And it definitely has to be threadsafe. I suspect we'll find the first problems the moment we use it for gst_caps_from_string(). Anyway, I don't want to hunt races in my Fedora crash reports, so it must be threadsafe. I looked into lock-free arrays (for the structure array) yesterday and while they do exist it's not clear to me that they provide a noticable benefit to just putting a lock into every (lazy) caps. Locks are cheap these days after all... - Do we consider caps fields private? They're marked as private, but would replacing the structures array with a threadsafe variant be considered an API break? - Why do we need a copy func? It doesn't seem necessary to me as we can just do a copy with gst_caps_new_lazy() that copies structures on-demand. Also, requiring the copy code of _intersect() to compute the intersection again seems like overkill to me. - I don't like the EvaluateStructureFunc prototype What I don't like about it is that it hands ou the caps and tells you to add more structures to it using public API. This fails the moment the caps are not writable anymore (i.e. when somebody reffed them). I had imagined a prototype like: GstStructure * (* GetNextStructureFunc) (gpointer user_data); so that the functions never get handed the caps and the caps take care of adding the structures internally. I'm also not sure if an API like GstCaps * (* GetNextStructuresFunc) (gpointer user_data); would be better. That way the functions would be able to return more than one structure at a time. - The free func should be a GDestroyNotify Same reason as above: Do not give the function access to the caps, but only to the data pointer. - The current intersection implementation uses private API. This makes for a lousy test, as most of the users of the lazy API (think plugins) will not have that luxury. I'd much prefer if the intersection implementation only uses pu blic API. - We might need API for modifying caps lazily The transform-caps functions probably want API that makes it easy to transform caps. When I'm looking at ffmpegcolorspace, what it does is take the templatecaps, restrict width, height, framerate and aspect ratio to the input and return the union of original and restricted template caps. Can we even do this lazily? - How far do we want to take this? Do we want to transition so all caps are lazy? A place where this might be very useful is probing code in sources and sinks which can take a while. If we were able to delay a complete probe of all the weird formats, we might save quite a bit of time on osssinks and v4lsrcs. But that probably requires not closing the device before the caps is fully evaluated...
Patches don't apply cleanly anymore.
After throwing a profiler at the problem I'm not convinced lazy caps are necessary anymore. See bug 619806 for why.
(In reply to comment #33) > - Why do we need a copy func? > It doesn't seem necessary to me as we can just do a copy with > gst_caps_new_lazy() that copies structures on-demand. Also, requiring the copy > code of _intersect() to compute the intersection again seems like overkill to > me. The copy func also copies already evaluated structures.
(In reply to comment #29) > (In reply to comment #26) > > Created an attachment (id=161869) [details] [review] [details] [review] > > avoid using gst_caps_get_size() in gst_pad_fixate_caps > > I think this patch should be applied anyway, it looks much cleaner. The only > thing I'm wondering is wether we should g_return_if_fail() instead of just > returning if someone tries to fixate ANY or EMPTY, as both of them cannot be > fixated. The gst_caps_is_empty() was obsolete as we had g_return_if_fail (!gst_caps_is_empty (caps)); at the begin already. If I also add a g_return_if_fail (!gst_caps_is_any (caps)); I get failing tests though: gstcheck.c:72:F:general:basesrc_eos_events_pull:0: Unexpected critical/warning: gst_pad_fixate_caps: assertion `!gst_caps_is_any (caps)' failed gstcheck.c:72:F:general:basesrc_eos_events_pull_live_op:0: Unexpected critical/warning: gst_pad_fixate_caps: assertion `!gst_caps_is_any (caps)' failed gstcheck.c:72:F:general:basesrc_eos_events_pull_live_eos:0: Unexpected critical/warning: gst_pad_fixate_caps: assertion `!gst_caps_is_any (caps)' failed Now I just return quickly for any-caps and add a fixme-0.11. I'll push this in a sec.
Comment on attachment 161869 [details] [review] avoid using gst_caps_get_size() in gst_pad_fixate_caps commit 21b4ef4d0fe2bf163121b38d717efb688ac5c41c Author: Stefan Kost <ensonic@users.sf.net> Date: Mon May 24 17:25:52 2010 +0300 pads: Improve readability for gst_pad_fixate_caps() Just truncate and then fixate. We check for empty caps in the begin and a fixate-func that empties a caps would be broken. It also helps lazy caps impl. in bug 618853 by avoiding the gst_caps_get_size().
I have pushed my branch (lazycaps) to git://anongit.freedesktop.org/~ensonic/gstreamer
Just updated the branch. While it works for apps quite nicely, there are still bugs: The benchmark for "./capsnego -f audio" (default -d4 -c3) sometimes hangs. I have taken many backtraces and can't see any caps function in there, so this might be a bug elsewhere. I have not seen hangs in the video case. Two unit test fail in -base. This is the bug I am right now looking into. Definitely related to lazy caps. Using GST_DEBUG="GST_CAPS:4" (which print and thus evaluates many caps 'fixes' the tests).
Okay, the bug is in transform_caps part. The current implementation is also questionable as it relies on basetransform subclasses not returning ANY. This is not a big deal as it does not seem to make sense to me (can we declare it as invalid?). This part can also become easier if we decide on adding a new transform_caps_whole() vmethod (see bug #619844).
Bugs fixed and branch updated. I'd appreciate testing and discussion.
Bugs fixed and branch updated. No more hangs in the benchmark.
Branch rebased.
Let's WONTFIX this for the time being, since it doesn't look like we're going to do this even in 0.11, and it's not clear if it will still be needed in 0.11 with the new audio/video caps that explode a little less. And also we can reduce some of the caps explosion by passing filtercaps to get_caps() now. It was a cool idea though (but perhaps a bit too complicated implementation-wise.)