GNOME Bugzilla – Bug 667653
videoencoder: Autodetect/autoconfigure multicore/multithread use
Last modified: 2018-11-03 11:20:22 UTC
As initially reported in pitivi bug #573395, there are various codecs where the user has to manually set the number of threads to use when encoding. Theora, x264 and VP8 seem to behave differently or in strange ways. I haven't tested other codecs yet. Users should never have to set those things manually. GStreamer encoders should use multiple cores efficiently by default, no matter what the codec.
A quick check to make sure this is still current with the latest gst, GES and pitivi git: vp8, theora, xvid, mpeg2enc, dirac, x264enc... they all fail to fully use my CPU by default.
What's a good default then? Use as many threads as there are cores available? What happens if you encode two streams at the same time? What if you want to keep one core "idle" for other tasks? It's definitely not as easy as setting the number of threads to the number of available cores. Also for some codecs (e.g. VP8) it makes a quality/bitrate difference if you use multiple threads or not.
Maybe a bit complex, but some elements could have a "num-threads" property with limits that match what they can reasonably do, defaulting to 1. This would allow an automated way to allocate more threads in a pipeline: trawl all the elements and bump num-threads on those that have a max > 1, up to the number of available cores. Since this could be automated, you could leave the decision on the bumping to the high level layer. Also, it becomes easy to add that num-threads property to new elements as they become multithread-capable, without impacting the higher level "load balancing" code.
Well, for what it's worth, all the other video editors/encoders I've seen (in the commercial world *and* the open source world) default to using all the cores/threads they can when rendering. Especially with HD footage being common these days, you simply can't expect reasonable render times with a single thread. To give you some background perspective, circa 2004, with a single-threaded processor, when I rendered a 25-minute timeline (not even in HD and not in a heavily compressed format, and that was with a professional commercial video editor), I started the render process and went to sleep, because it would take a couple of hours or the whole night. With those things in mind, as a video editor myself, I still see encoding as an "expensive, ultimate, fire ze missiles" operation where I *expect* my computer to be pegged for a while with fans whirring full blast. I think most video editors expect this kind of behavior too. As for the question of "multiple streams"... correct me if I'm wrong, but video encoding is by far the most expensive operation. Encoding audio/subtitles/etc is negligibly fast compared to crunching multi-million pixels with complex algorithms multiple times per second. If you want to be absolutely safe and ensure that you can still watch 1080p videos on youtube at the same time as you're rendering a 4K video, maybe the default could be "maximum_threads -1" or "number_of_cores +1" or something... but a default of 1 is surely not a good default.
My point was that the high layer (eg, Pitivi or whatever) could decide to allocate threads to whatever (eg, look at the "class string" for each element that can do more than one thread, and start giving more threads to encoders, etc). Defaulting to 1 in each element is best because elements do not know what else is in the pipeline, nor should they. There could certainly be helpers in GstBin or GstPipeline for "distributing N threads among contained elements" so that a program may decide to allocate a thread per core with a single call.
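To make the "distribute N threads among contained elements" idea concrete, here is a minimal sketch in plain Python. It does not use GStreamer at all: the `Element` class, its `max_threads` and `num_threads` fields, and the `distribute_threads` helper are all hypothetical names invented for illustration, and the round-robin policy is just one possible choice, not something this bug settles on.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for a pipeline element (not a GStreamer type)."""
    name: str
    max_threads: int      # assumed upper limit of a "num-threads" property
    num_threads: int = 1  # every element defaults to a single thread


def distribute_threads(elements, budget):
    """Hand out `budget` threads round-robin, respecting each element's max.

    Every element keeps at least one thread; only the threads left over
    after that are shared among elements that can still accept more.
    """
    for element in elements:
        element.num_threads = 1
    spare = max(0, budget - len(elements))
    progressed = True
    while spare > 0 and progressed:
        progressed = False
        for element in elements:
            if spare == 0:
                break
            if element.num_threads < element.max_threads:
                element.num_threads += 1
                spare -= 1
                progressed = True
    return {element.name: element.num_threads for element in elements}
```

With a budget of 8 and three elements whose limits are 64, 1 and 4, this assigns 4, 1 and 3 threads respectively. An application could pass the core count as the budget, or the core count minus one if it wants to leave a core free for other tasks.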
Ah right, I had misread your comment and somehow forgot about the pipeline. So this makes it a standardization issue then... GStreamer should basically expose *one* "num-threads" property for all those encoders and we would do whatever we want with it on the application side (like burn users' CPUs :).
After further investigation with gst-inspect and various discussions:

vp8enc: "threads" property = int range(0, 64), default 1
x264enc: "threads" property = int range(0, 4), default 0 (automanaged)
theoraenc: none (I'm told libtheora will only have threading in 1.3)
xvidenc: none
mpeg2enc: none
diracenc: none
schroenc: none (I am told it is automanaged)
ffmpeg plugins: automanaged, but gst-ffmpeg sets the default to 1*
entropywave plugins: automanaged
etc.

*: http://gstreamer-devel.966125.n4.nabble.com/RELEASE-GStreamer-FFmpeg-Plug-ins-0-10-12-quot-A-year-in-hell-quot-td3680441.html#a3735368

...So now I guess I'm left wondering what the next step is. Just put a "special case" hack in pitivi to force the number of threads in vp8enc to the desired maximum?
For a start, in 0.11 we can still change the interface. Which of the encoders already have a property to set the threads? Which encoders are multithreaded but lack the property? Such a property could be moved to a base encoder class, where 0 means auto (e.g. as many threads as cores) and 1 is the default. Encoders not yet using a base class could still use a property of the same name and with the same semantics. Technically we could introduce an interface, but that seems to be overkill.
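The proposed semantics (0 means auto, 1 is the default) could look roughly like the following plain-Python sketch. The function name `resolve_threads` and its parameters are hypothetical and only illustrate the rule; this is not an existing GStreamer API, and "auto" is assumed here to mean one thread per core as reported by `os.cpu_count()`.

```python
import os


def resolve_threads(requested=1, max_threads=None, cpu_count=None):
    """Turn a "threads" property value into an actual thread count.

    requested == 0 means "auto": use as many threads as there are cores.
    Any positive value is used as-is, then clamped to the encoder's limit.
    """
    if requested < 0:
        raise ValueError("thread count must be >= 0")
    if cpu_count is None:
        # os.cpu_count() may return None, so fall back to a single core.
        cpu_count = os.cpu_count() or 1
    threads = cpu_count if requested == 0 else requested
    if max_threads is not None:
        threads = min(threads, max_threads)
    return threads
```

For example, `resolve_threads(0, max_threads=4, cpu_count=8)` returns 4, while calling it with no arguments keeps the single-thread default.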
Definitely a project for basevideocodec.
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/issues/59.