Bug 619806 - [basetransform] caps functions should be faster
Status: RESOLVED OBSOLETE
Product: GStreamer
Classification: Platform
Component: gstreamer (core)
Version: git master
Hardware: Other Linux
Importance: Normal normal
Target Milestone: git master
Assigned To: GStreamer Maintainers
Reported: 2010-05-27 10:54 UTC by Benjamin Otte (Company)
Modified: 2012-06-18 16:40 UTC



Description Benjamin Otte (Company) 2010-05-27 10:54:23 UTC
This is fallout from bug 618853:

When running tests/benchmark/capsnego in a profiler I noticed that a lot of time was spent inside basetransform doing rather weird things with caps. I'll take ffmpegcolorspace as an example of why this is very slow:

- In the getcaps function, peer caps are intersected with its template
This is absolutely not necessary, as the peer's getcaps function will return a subset of the template caps. (With asserts enabled, it even checks this.) If the peer is ffmpegcolorspace you'll mostly intersect the template caps with itself: 32x32 = 1024 structure intersections.

- In the getcaps function, transformed caps are intersected with template caps
This is unnecessary again, as transform_caps must return a subset of the template caps. It might make sense to keep a check with assertions enabled - though that is quite expensive. Again, 1024 structure intersections with ffmpegcolorspace.

- the algorithm for calling the transform function is slow
The current algorithm calls the transform function with every structure separately and then merges the results. This is often very slow. ffmpegcolorspace's transform function will expand the simple caps to its template caps and return them appended to the original, so it'll return 1060 structures across 32 caps, which then get merged into a caps with only 32 structures.

All of this is considerable overhead, costing more than 5 times as much as the actual transform caps function.
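
The counts in the description can be reproduced with a small toy model. This is only a sketch under stated assumptions: caps are modelled as plain Python lists of hashable structures, the 32-structure template is made up, and `transform_one` is a hypothetical stand-in for an element's transform function that returns the original structure followed by the whole template. It does not use the GStreamer API. In this model the per-structure loop produces 32 x 33 = 1056 structures, close to the roughly 1060 quoted above.

```python
from itertools import product


def make_template(n_formats=32):
    """Build a made-up template: one single-field structure per format."""
    return [(("format", f"fmt{i}"),) for i in range(n_formats)]


def count_structure_intersections(caps_a, caps_b):
    """Intersecting two caps pairs every structure of one with every structure of the other."""
    return sum(1 for _ in product(caps_a, caps_b))


def transform_one(structure, template):
    """Hypothetical transform: the original structure, then all template structures."""
    return [structure] + list(template)


def transform_per_structure(caps, template):
    """Call the transform once per structure and merge results, dropping duplicates."""
    produced = 0
    merged = []
    seen = set()
    for structure in caps:
        result = transform_one(structure, template)
        produced += len(result)
        for candidate in result:
            if candidate not in seen:
                seen.add(candidate)
                merged.append(candidate)
    return produced, merged


template = make_template()
intersections = count_structure_intersections(template, template)
produced, merged = transform_per_structure(template, template)
print(intersections, produced, len(merged))
```

The point of the model is the ratio: over a thousand intermediate structures are created and compared to end up with the same 32 structures that went in.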
Comment 1 Benjamin Otte (Company) 2010-05-27 11:18:30 UTC
(In reply to comment #0)
> - In the getcaps function, peer caps are intersected with its template
> This is absolutely not necessary, as the peer's getcaps function will return a
> subset of the template caps. (With asserts enabled, it even checks this.) If
> the peer is ffmpegcolorspace you'll mostly intersect the template caps with
> itself: 32x32 = 1024 structure intersections.
> 
Whoops, this statement is wrong: We intersect with _our_ template caps, not with the peer's. I guess this intersection is necessary then and we cannot get around it?
Comment 2 Sebastian Dröge (slomo) 2010-05-27 14:45:00 UTC
I think you can't change any of these...

1) is necessary to only pass valid caps (for the element) to the transform_caps function

2) is necessary because transform_caps often just removes all fields it doesn't care about but you still want them in the final caps (e.g. ffmpegcolorspace removes all format specific things but keeps width/height/par/fps)

3) Would be nice to have changed but can't be changed now without breaking many elements. Many basetransform elements assume that they get simple caps in transform_caps, some of them even have an assertion for this
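
Point 2) can be illustrated with a toy model of structure intersection. This is a sketch, not GStreamer code: a structure is assumed to be a dict mapping a field name to a frozenset of allowed values, a missing field is treated as unconstrained, and `transform_drop_format` is a hypothetical transform that strips the format field the way the comment describes.

```python
def intersect_structures(a, b):
    """Intersect two toy structures; return None when a shared field has no common value."""
    result = {}
    for field in a.keys() | b.keys():
        if field in a and field in b:
            common = a[field] & b[field]
            if not common:
                return None
            result[field] = common
        else:
            # A field present on one side only is copied over unchanged.
            result[field] = a[field] if field in a else b[field]
    return result


def transform_drop_format(structure):
    """Hypothetical transform: drop the format field, keep everything else."""
    return {k: v for k, v in structure.items() if k != "format"}


def intersect_with_template(structure, template):
    """Intersect one structure with every template structure, keeping non-empty results."""
    results = []
    for template_structure in template:
        merged = intersect_structures(structure, template_structure)
        if merged is not None:
            results.append(merged)
    return results


incoming = {
    "format": frozenset({"RGB"}),
    "width": frozenset({320}),
    "height": frozenset({240}),
}
template = [{"format": frozenset({"RGB"})}, {"format": frozenset({"YUV"})}]

transformed = transform_drop_format(incoming)
final = intersect_with_template(transformed, template)
for structure in final:
    print(sorted(structure.items()))
```

In this model the transform output carries only width and height, and it is the intersection with the template that puts the format constraints back while keeping the size fields.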
Comment 3 Benjamin Otte (Company) 2010-05-27 15:14:56 UTC
2) is a bug in the elements and should be fixed IMO.

3) is something that could be fixed using a flag or something - I don't think fixing it unconditionally is possible unfortunately.

I also forgot to add:
With the 3 changes above hacked in and fixing videoscale to handle non-simple caps, tests/benchmark/capsnego -v from bug 618853 went to 1.2s for me, which is a tremendous improvement.
Comment 4 Benjamin Otte (Company) 2010-05-27 15:22:23 UTC
Of course, without a previous value that number is pretty useless.
Results are as follows for tests/benchmark/capsnego -v (compiled with -O0 -g):
8.2s - stock compilation
6.9s - disable subset check in gst_pad_get_caps_unlocked()
1.2s - also do the 3 things discussed above
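
For reference, the speedup factors implied by these three timings can be worked out as follows. The timings are the ones listed above; the dictionary keys and the `speedup` helper are just illustrative names.

```python
# Wall-clock times in seconds for tests/benchmark/capsnego -v, as reported above.
timings = {
    "stock": 8.2,
    "no_subset_check": 6.9,
    "all_changes": 1.2,
}


def speedup(before, after):
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before / after


subset_check_gain = speedup(timings["stock"], timings["no_subset_check"])
three_changes_gain = speedup(timings["no_subset_check"], timings["all_changes"])
overall_gain = speedup(timings["stock"], timings["all_changes"])

print(f"subset check removed: {subset_check_gain:.2f}x")
print(f"three changes on top: {three_changes_gain:.2f}x")
print(f"overall:              {overall_gain:.2f}x")
```

The second factor, from 6.9s to 1.2s, is the part attributable to the three basetransform changes.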
Comment 5 Vincent Penquerc'h 2011-10-20 15:54:22 UTC
Some caching was added in https://bugzilla.gnome.org/show_bug.cgi?id=662291.
Comment 6 Vincent Penquerc'h 2011-11-01 11:16:29 UTC
And some speedup in caps intersection in https://bugzilla.gnome.org/show_bug.cgi?id=662777
Comment 7 Vincent Penquerc'h 2011-12-09 15:47:45 UTC
Can you check how the speedup with the two above patches compares with your target performance, and if that's enough (if that's possible :P)?
Comment 8 Akhil Laddha 2012-03-06 08:00:14 UTC
Benjamin, did you get a chance to test the above patches?
Comment 9 Akhil Laddha 2012-06-18 16:40:23 UTC
Closing this bug in favor of bug 662291 and bug 662777.