GNOME Bugzilla – Bug 662777
Caps intersect optimisations
Last modified: 2017-07-14 13:52:14 UTC
Created attachment 200032 [details] testcase When starting a video call with farsight using a cmaera with lots of different resolutions caps negotiation starts to take forever. I tracked it down to one particular intersection in basetransform that takes about 10 seconds for some reason. Attached is a small testcase with the caps (a.caps and b.caps) inside that i happen to see on my pipeline startup. In the previous release this intersection would take about 0.5 seconds, with current git it takes 10... To simply test compile the test.c inside the tarball as follows: gcc -Wall -g $(pkg-config --cflags --libs gstreamer-0.10) test.c -o test and try running with both released and git versions of gstreamer :)
Created attachment 200091 [details] [review] gststructure: rejig test ordering for speed
Created attachment 200092 [details] [review] gstvalue: quicker test for substraction emptiness When we do not care about the actual resulting set, but only whether it is empty of not, we can skip a fair bit of GValue juggling. Add a function that does so, since we cannot just pass NULL to the existing API as it may be part of the API contract.
Created attachment 200093 [details] [review] gststructure: early out when we know a value cannot be a subset If two values can be ordered, but are unequal, they are necessarily distinct, thus one cannot be a subset of the other.
Created attachment 200094 [details] [review] gststructure: simplify return statement in gst_structure_can_intersect
Created attachment 200095 [details] [review] gstvalue: quicker version of intersection when we do not need the result
Created attachment 200096 [details] [review] gstcaps: use a simplified copy of caps2 to intersect with It speeds up a lot, and still conserves the ordering guarantee against the first caps, though does not keep the exact same behavior as the existing one, since the returned caps will be possibly smaller (though still equivalent).
Created attachment 200097 [details] [review] gstcaps: remove unneeded use of gint64
Your test case runs in 29 seconds here. First patch makes it 20 seconds. Second patch makes it 12 seconds. Third patch makes it 8 seconds. Fourth patch is cleanup. Fifth patch speeds up intersection in the same way as the second one does for substraction, but is unused by the test case (and it's fast anyway, but it's an easy port from the substraction one, might want to leave it out). Sixth patch makes it less than 1 second, but does change the semantics of the intersection, since the second caps (which do not have a documented ordering guarantee) are simplified, and the resulting set will be smaller (or equal) to the resulting set returned by the current code. Seventh patch is cleanup. Simplyfying both input caps makes the entire thing about 2 seconds without any of those patches (but loses the ordering guarantee). Passes tests, so hopefully correct.
Created attachment 200341 [details] [review] gstcaps: add a routine to simplify caps while preserving order This will obviously simplify less (or equally) that the existing gst_caps_do_simplify version.
Created attachment 200342 [details] [review] gstcaps: use a simplified copy of caps2 to intersect with It speeds up a lot, and still conserves the ordering guarantee against the first caps, though does not keep the exact same behavior as the existing one, since the returned caps will be possibly smaller (though still equivalent).
Created attachment 200343 [details] [review] gstcaps: use a simplified copy of caps2 to intersect with This patch still conserves the ordering guarantee against both caps, though does not keep the exact same behavior as the existing one, since the returned caps will be possibly smaller (though still equivalent).
This version of the simplification patch now keeps the ordering guarantee, though does not speed up that particular test case. It might speed up in other cases though, when the input caps are large and redundant, but not already sorted by increasing generality.
Created attachment 200344 [details] [review] gstcaps: use a simplified copy of input caps for intersection This patch still conserves the ordering guarantee against both caps, though does not keep the exact same behavior as the existing one, since the returned caps will be possibly smaller (though still equivalent).
Just occured to me that since ordering is preserved, this can be done for both input caps, patch updated.
Created attachment 200346 [details] [review] gstcaps: add a routine to simplify caps while preserving order add gst_caps_do_simplify_keep_ordering to docs and win32 build
Comment on attachment 200092 [details] [review] gstvalue: quicker test for substraction emptiness Please change the function name to something like gst_value_can_substract(). Or maybe even make this private (add it to gstprivate.h and call it _priv_gst_value_blabla()).
Comment on attachment 200095 [details] [review] gstvalue: quicker version of intersection when we do not need the result Please rename to _can_intersect() or make it private. Same as for the subtraction function. Also, what's the difference to gst_value_can_intersect()? ;)
(In reply to comment #17) > (From update of attachment 200095 [details] [review]) > Please rename to _can_intersect() or make it private. Same as for the > subtraction function. > > Also, what's the difference to gst_value_can_intersect()? ;) And the Since marker is missing in the docs. Also if you add API, add the "API" keyword to the commit message and put it into the docs (docs/gst/*section.txt) and into win32/common/*.def.
Comment on attachment 200344 [details] [review] gstcaps: use a simplified copy of input caps for intersection Looks good but you still have the old commit message. This version now preserves the order. And it's not only useful for _intersect_first() but also for the other intersect() variant so add it there too. Did you check if this really speeds up something? The caps are simpler but you're creating duplicates of the caps
Comment on attachment 200346 [details] [review] gstcaps: add a routine to simplify caps while preserving order Looks good but also add this to the docs... and API keyword in the commit message
Created attachment 200882 [details] [review] gstvalue: quicker version of intersection when we do not need the result
Created attachment 200883 [details] [review] gstvalue: quicker test for substraction emptiness When we do not care about the actual resulting set, but only whether it is empty of not, we can skip a fair bit of GValue juggling. Add a function that does so, since we cannot just pass NULL to the existing API as it may be part of the API contract.
(In reply to comment #16) > (From update of attachment 200092 [details] [review]) > Please change the function name to something like gst_value_can_substract(). Or > maybe even make this private (add it to gstprivate.h and call it > _priv_gst_value_blabla()). Actually, based on discussion on IRC, I merged it with the non _null version, which now accepts NULL dest.
(In reply to comment #21) > Created an attachment (id=200882) [details] [review] > gstvalue: quicker version of intersection when we do not need the result Same here, _null version merged with non _null version.
(In reply to comment #19) > (From update of attachment 200344 [details] [review]) > Looks good but you still have the old commit message. This version now > preserves the order. And it's not only useful for _intersect_first() but also > for the other intersect() variant so add it there too. > > Did you check if this really speeds up something? The caps are simpler but > you're creating duplicates of the caps It does not speed up anything on the test case I have here, as the caps are already ordered in strict increasing generality order. The simplification itself does slow down a little bit the whole operation (from 8.2 seconds to 8.4 seconds here, though timings have a fair bit of jitter), but since a full simplification makes it less than a second, I expect that lesser simplification would help in some cases. I'd tried a nifty pre-sort and post-merge with bins based on structure name, to reduce the N in O(N^2), but unfortunately this did not help, as the actual looping is very fast compared to the few hits that take all the time, so the number of costly ops is still the same :( Up to you if you want to leave that patch out since I don't have an actual test case where it helps.
Created attachment 200885 [details] [review] [API] gstcaps: add a routine to simplify caps while preserving order This will obviously simplify less (or equally) that the existing gst_caps_do_simplify version.
(In reply to comment #20) > (From update of attachment 200346 [details] [review]) > Looks good but also add this to the docs... and API keyword in the commit > message Keyword added. The docs are already in, unless you mean something else that the addition to the list of routines ?
Created attachment 200887 [details] [review] gstcaps: add a routine to simplify caps while preserving order This will obviously simplify less (or equally) that the existing gst_caps_do_simplify version. API: gst_caps_do_simplify_keep_ordering()
For me things become a bit slower with the remaining two patches (before and after): GST_DEBUG_NO_COLOR=1 GST_DEBUG="bt-core:3" ~/buzztard/bin/buzztard-cmd -c p -i /home/ensonic/Dropbox/buzztard-songs/sunday.bzt 2>&1 | grep on_song_state_changed": playback started" 0:00:00.695180015 28916 0x24990f0 INFO bt-core song.c:538:on_song_state_changed: playback started 0:00:00.690175289 28950 0x228b0f0 INFO bt-core song.c:538:on_song_state_changed: playback started 0:00:00.698121161 29015 0x101c0f0 INFO bt-core song.c:538:on_song_state_changed: playback started 0:00:00.700914970 31222 0xab20f0 INFO bt-core song.c:538:on_song_state_changed: playback started 0:00:00.712962907 31256 0xb260f0 INFO bt-core song.c:538:on_song_state_changed: playback started 0:00:00.704535266 31289 0xc0b0f0 INFO bt-core song.c:538:on_song_state_changed: playback started I think this is because of the overhead if the caps cannot be simplified. Would it make sense to check if the caps are fixed or size=1 before doing the copy and calling gst_caps_do_simplify_keep_ordering() ? I'll retest this tonight on my netbook.
If the caps are fixed or size==1, the copy and simplification will probably be really fast, so I'm doubtful if that would be what you see (I don't know what those timings represent, and what the jitter is). The suggestion does make sense though, so Iĺl post an update to do that.
Created attachment 202612 [details] [review] gstcaps: use a simplified copy of input caps for intersection This patch still conserves the ordering guarantee against both caps, though does not keep the exact same behavior as the existing one, since the returned caps will be possibly smaller (though still equivalent).
I need to investigate. For me this is still a tiny bit slower. Also this patch has drifted and does not apply with git am (patch -F10 worked). gstcaps: add a routine to simplify caps while preserving order
Created attachment 202998 [details] [review] gstcaps: add a routine to simplify caps while preserving order This will obviously simplify less (or equally) that the existing gst_caps_do_simplify version. API: gst_caps_do_simplify_keep_ordering()
Created attachment 202999 [details] [review] gstcaps: use a simplified copy of input caps for intersection This patch still conserves the ordering guarantee against both caps, though does not keep the exact same behavior as the existing one, since the returned caps will be possibly smaller (though still equivalent).
(In reply to comment #32) > I need to investigate. For me this is still a tiny bit slower. Also this patch > has drifted and does not apply with git am (patch -F10 worked). > gstcaps: add a routine to simplify caps while preserving order Patches updated to latest master.
did someone tracked the commit that caused this? Was it: commit f56c6e12255b37d75b1eb949e434fa8e3bb33f51 basetransform: In getcaps() prefer the caps order and caps of downstream if possible ?
I think it was, yes... It was a patch from slomo which would cause caps calls to go through more elements than before IIRC, but memory's now a bit hazy. The discussion was all on IRC though.
I wonder if there is any reason to have a _simplify() function that doesn't preserver order. If not, we should probably replace it with the one that keeps the order (in 0.11)
> I wonder if there is any reason to have a _simplify() function that doesn't > preserver order. If not, we should probably replace it with the one that keeps > the order (in 0.11) Now would be the time to do that in 0.11
Is this still relevant?
Maybe we still want to add a _simplify() that preserves order? Just so it's there to use in 2.0 ;)
Let's close this.