GNOME Bugzilla – Bug 734679
videobox: Much slower than videocrop
Last modified: 2018-11-03 14:53:51 UTC
I have an application that needs to do both video cropping and padding, and that works with 4K video. We chose videobox for the task because it's the only element that does reasonable padding, and the fact that it can also crop was a major bonus. However, in time trials we've found that videobox is much, MUCH slower than videocrop, even when it's just cropping. For example, using GStreamer 1.4.0, this pipeline runs at about 45 fps on a stock Ivy Bridge server driving an SMSC zero-client display:
gst-launch-1.0 -v filesrc location=Sintel-4K.mkv ! decodebin ! videoconvert ! videocrop left=100 ! videoscale ! fpsdisplaysink sync=false video-sink="xvimagesink display=:2"
The same pipeline with videobox in place of videocrop averages about 20 fps, which is a huge drop. Worse, the video in question is 24 fps, so with videobox we cannot play in real time without dropping frames.
That might be because videobox is interpolating the I420 (or any other subsampled YUV format) planes properly while videocrop is just offsetting them.
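To make "just offsetting" concrete, here is a minimal sketch of cropping an I420 frame by moving plane pointers instead of copying pixels. It is not taken from the videocrop source; the `I420View` struct and `i420_crop_view` function are hypothetical names, and the sketch assumes even crop offsets so that the 2x2-subsampled chroma planes stay aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one I420 frame: three planes with strides. */
typedef struct {
    uint8_t *data[3];   /* start of the Y, U and V planes */
    int stride[3];      /* bytes per row in each plane */
    int width, height;  /* visible size in luma pixels */
} I420View;

/* Crop by adjusting pointers and size only; no pixel data is copied.
 * Assumption: left and top are even, so chroma offsets are exact. */
static I420View i420_crop_view(I420View in, int left, int top,
                               int right, int bottom)
{
    assert(left >= 0 && top >= 0 && right >= 0 && bottom >= 0);
    assert(left % 2 == 0 && top % 2 == 0);
    assert(left + right < in.width && top + bottom < in.height);

    I420View out = in;
    /* Luma plane: full resolution. */
    out.data[0] = in.data[0] + (size_t)top * in.stride[0] + left;
    /* Chroma planes: half resolution in both directions. */
    out.data[1] = in.data[1] + (size_t)(top / 2) * in.stride[1] + left / 2;
    out.data[2] = in.data[2] + (size_t)(top / 2) * in.stride[2] + left / 2;
    out.width = in.width - left - right;
    out.height = in.height - top - bottom;
    return out;
}
```

The cost of this is independent of the frame size, which is consistent with the suggestion that an element doing per-pixel work would be much slower on 4K frames.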
Did you already check with perf or callgrind where all the time is spent?
Created attachment 283313: callgrind output for the pipeline mentioned
Generating command:
$ valgrind --tool=callgrind gst-launch-1.0 -v filesrc location=~/Videos/4K/Sintel-4K.mkv ! decodebin ! videoconvert ! videobox left=100 ! videoscale ! fpsdisplaysink sync=false video-sink="xvimagesink display=:2"
I've not used callgrind before, so I'm not entirely sure how to use its output, but the results seem to indicate that the vast majority of the CPU time was spent in a routine called "gst_video_filter_transform <cycle 7>". I've attached the callgrind output in case that helps.
You can open it in e.g. kcachegrind for working visually with it. So in summary:
- videoscale and decoder both take about 5-6 billion instructions
- videobox takes 20 billion instructions in its transform function, all of that in copy_i420_i420
Time to optimize that function :)
It's doing matrix multiplication to convert between SDTV and HDTV YUV, and if no conversion is necessary it multiplies with the identity matrix.
That's something that happens in all the YUV functions btw
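As an illustration of why multiplying by the identity matrix is wasted work, here is a small sketch of a per-pixel colour matrix in fixed point. It is not the videobox code: the 3x4 layout (three coefficients plus an offset per row, scaled by 256) and the names `identity_matrix`, `clamp_u8` and `apply_matrix` are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 3x4 colour matrix, coefficients scaled by 256.
 * Each row: three coefficients followed by an offset in output units. */
static const int identity_matrix[12] = {
    256,   0,   0, 0,
      0, 256,   0, 0,
      0,   0, 256, 0,
};

static uint8_t clamp_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Convert one Y/U/V triple. This costs nine multiplications per pixel,
 * even when the matrix is the identity and out ends up equal to in. */
static void apply_matrix(const int m[12], const uint8_t in[3], uint8_t out[3])
{
    assert(m != 0 && in != 0 && out != 0);
    for (int row = 0; row < 3; row++) {
        int acc = m[row * 4 + 0] * in[0]
                + m[row * 4 + 1] * in[1]
                + m[row * 4 + 2] * in[2];
        /* Scale back from fixed point, then add the row offset. */
        out[row] = clamp_u8(acc / 256 + m[row * 4 + 3]);
    }
}
```

Calling `apply_matrix(identity_matrix, px, out)` leaves `out` equal to `px`, so in the no-conversion case all of the arithmetic could in principle be replaced by a plain copy.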
Created attachment 283371: videobox: Don't do matrix multiplication unless required
I tried the attached approach to get rid of the matrix multiplication, assuming that the compiler would optimize away the branches in the new macros. Benchmarking showed that it makes almost no difference. Any further ideas?
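For comparison, one variant of the same idea is to make the decision once per frame rather than inside per-pixel macros, and to fall back to row-wise memcpy when no conversion is needed. The sketch below is not the attached patch; `matrix_is_identity` and `copy_plane` are hypothetical helpers, and the matrix layout (3x4, scaled by 256) is an assumption. Whether this helps in practice would need to be measured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: 3x4 matrix, coefficients scaled by 256, offsets last. */
static bool matrix_is_identity(const int m[12])
{
    static const int id[12] = {
        256, 0, 0, 0,
        0, 256, 0, 0,
        0, 0, 256, 0,
    };
    return memcmp(m, id, sizeof id) == 0;
}

/* Copy a w x h block of one 8-bit plane, one memcpy per row. */
static void copy_plane(uint8_t *dst, int dst_stride,
                       const uint8_t *src, int src_stride, int w, int h)
{
    assert(w >= 0 && h >= 0 && w <= dst_stride && w <= src_stride);
    for (int y = 0; y < h; y++)
        memcpy(dst + (size_t)y * dst_stride,
               src + (size_t)y * src_stride, (size_t)w);
}
```

A caller would test `matrix_is_identity` once before touching any pixels and use `copy_plane` for each plane if it returns true, keeping the per-pixel matrix path only for real SDTV/HDTV conversions.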
I don't know if it will help, but my boss has authorized me to put a $500 (Canadian) Bug Bounty on this bug, for anyone who can get this working on 4K video (3840x2160) at 30 fps by midnight on August 28, Calgary time (MST=UTC-7). We'll do the testing on our stock Haswell i7 server. If you're interested in attempting the bounty, please contact us first, as we'd like to keep track of who is interested in working on this. Send the emails to my work email: stirling@userful.com Thanks!
If you apply the patch of Bug 737401 it will use video-converter and be slightly faster than the old videobox.
We've abandoned videobox and have written a one-pass video cropper, scaler, rotator and color space converter. We'll probably be releasing it soon, once the bugs are out.
> We have written a one-pass video cropper,
> scaler, rotator and color space converter.
Ah, so basically like the new video converter API in -base (plus rotation though).
(In reply to Tim-Philipp Müller from comment #13)
> > We have written a one-pass video cropper,
> > scaler, rotator and color space converter.
>
> Ah, so basically like the new video converter API in -base (plus rotation
> though).
I guess, although our algorithm is a heck of a lot simpler and involves no matrices. We just have an AVX2-enabled loop in which, for each output frame pixel, we read an input frame pixel, convert it to the output colorspace, and write it out. We have one such loop for each pair of colorspaces we support.
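The loop structure described in this comment can be sketched in scalar C. This is not the commenter's code, which has not been published in this thread: it has no AVX2, it assumes nearest-neighbour sampling, and it picks packed RGB24 to BGRx as a stand-in format pair. The function name and parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pass over the OUTPUT frame: for each output pixel, find the source
 * pixel inside the crop rectangle (nearest neighbour), convert it from
 * packed RGB24 to packed BGRx, and write it out. */
static void crop_scale_rgb_to_bgrx(const uint8_t *src, int src_stride,
                                   int crop_x, int crop_y,
                                   int crop_w, int crop_h,
                                   uint8_t *dst, int dst_stride,
                                   int dst_w, int dst_h)
{
    assert(crop_w > 0 && crop_h > 0 && dst_w > 0 && dst_h > 0);
    for (int y = 0; y < dst_h; y++) {
        /* Map the output row back to a source row inside the crop. */
        int sy = crop_y + (int)((int64_t)y * crop_h / dst_h);
        const uint8_t *srow = src + (size_t)sy * src_stride;
        uint8_t *drow = dst + (size_t)y * dst_stride;
        for (int x = 0; x < dst_w; x++) {
            int sx = crop_x + (int)((int64_t)x * crop_w / dst_w);
            const uint8_t *p = srow + (size_t)sx * 3;
            drow[x * 4 + 0] = p[2];  /* B */
            drow[x * 4 + 1] = p[1];  /* G */
            drow[x * 4 + 2] = p[0];  /* R */
            drow[x * 4 + 3] = 255;   /* padding byte */
        }
    }
}
```

Each input pixel that contributes to the output is read once and each output pixel is written once, with no intermediate frame between the crop, scale and convert steps. A real implementation along the lines described would need one such loop per supported format pair.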
Should we close this bug then?
As far as I know, the bug is still valid, but it no longer matters to me. If you feel it's worth pursuing as a bug, keep it open. Otherwise close it.
(In reply to Stirling Westrup from comment #12)
> We've abandoned video box and have written a one-pass video cropper, scaler,
> rotator and color space converter. We'll probably be releasing it soon, once
> the bugs are out.
Could you please share this code with me? I really need this, and I am facing the same issue.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-good/issues/126.