Bug 764169 - vp9dec: Dogslow VP9 4k playback with libvpx, works fine with avdec_vp9
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
Version: git master
Hardware: Other Linux
Importance: Normal normal
Target Milestone: 1.8.1
Assigned To: GStreamer Maintainers
QA Contact: GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2016-03-24 19:04 UTC by Jean-François Fortin Tam
Modified: 2016-03-25 12:54 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
vpxdec: Use threads on multi-core systems (986 bytes, patch)
2016-03-24 23:27 UTC, Nicolas Dufresne (ndufresne)
Status: none
vpxdec: Use threads on multi-core systems (987 bytes, patch)
2016-03-24 23:28 UTC, Nicolas Dufresne (ndufresne)
Status: committed

Description Jean-François Fortin Tam 2016-03-24 19:04:52 UTC
Totem can't play this at more than ~0.1 fps (or something like that):

http://jeff.ecchi.ca/public/vp9-4k-sample-from-red-dragon-6k.webm

Tested with libvpx 1.4.0. CPU usage suggests it's single-threaded.
VLC plays the file butter-smooth, using as many CPU threads as needed.

Specs are:

- Intel Xeon W3520 @ 2.67GHz × 8
- 24 GB of RAM
- Radeon HD 7770 running the open-source drivers (Gallium 0.4, DRM 2.43.0) on X
Comment 1 Tim-Philipp Müller 2016-03-24 19:50:36 UTC
vlc uses the ffmpeg VP9 decoder, we use the libvpx one.

Commit b848c1b6 (vpxdec: Use threads on multi-core systems) was supposed to configure libvpx to use multiple threads. Either it's not implemented in the VP9 decoder in the version of libvpx that I use (1.5.0) or it doesn't work very well.
Comment 2 Nicolas Dufresne (ndufresne) 2016-03-24 23:14:45 UTC
I have fixed the threading and the frame copy in 1.8, could you retest?
Comment 3 Nicolas Dufresne (ndufresne) 2016-03-24 23:21:02 UTC
Ok, someone managed to remove the code that calls g_get_num_processors() ...
Comment 4 Nicolas Dufresne (ndufresne) 2016-03-24 23:23:27 UTC
Was lost when vp8/vp9 was merged into a common base class.
Comment 5 Nicolas Dufresne (ndufresne) 2016-03-24 23:27:26 UTC
Created attachment 324730
vpxdec: Use threads on multi-core systems

This is a redo of commit b848c1b6ffd1e508228820a013f94fb445e4777f. The
code was lost when the elements were ported to use a base class.
Comment 6 Nicolas Dufresne (ndufresne) 2016-03-24 23:28:26 UTC
Created attachment 324731
vpxdec: Use threads on multi-core systems

This is a redo of commit b848c1b6ffd1e508228820a013f94fb445e4777f. The
code was lost when the elements were ported to use a base class.
Comment 7 Nicolas Dufresne (ndufresne) 2016-03-24 23:33:08 UTC
Comment on attachment 324731
vpxdec: Use threads on multi-core systems

Attachment 324731 pushed as 284d723 - vpxdec: Use threads on multi-core systems
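
For readers who have not seen the patch, the idea behind "use threads on multi-core systems" can be sketched roughly as follows. This is not the committed code: comment 3 says the real element calls g_get_num_processors(), while this sketch swaps in sysconf(_SC_NPROCESSORS_ONLN) (available on Linux and most Unix-like systems) to stay dependency-free. DecoderConfig, choose_decoder_threads() and decoder_config_use_all_cores() are hypothetical names used only for illustration, and the upper bound on threads is an assumption, not something stated in this bug.

```c
#include <unistd.h>

/* Hypothetical stand-in for a decoder configuration; this is NOT the
 * libvpx configuration struct, just an illustration. */
typedef struct
{
  unsigned int threads;
} DecoderConfig;

/* Clamp a detected core count to the range [1, max_threads].
 * A max_threads of 0 is treated as 1 so the result is always usable. */
static unsigned int
choose_decoder_threads (long detected_cores, unsigned int max_threads)
{
  if (max_threads == 0)
    max_threads = 1;
  if (detected_cores < 1)
    return 1;                   /* detection failed, fall back to one thread */
  if ((unsigned long) detected_cores > max_threads)
    return max_threads;
  return (unsigned int) detected_cores;
}

/* Fill the configuration from the number of processors currently online.
 * The cap (max_threads) is an assumption made for this sketch. */
static void
decoder_config_use_all_cores (DecoderConfig * cfg, unsigned int max_threads)
{
  /* sysconf() returns -1 on error; choose_decoder_threads() handles that */
  long cores = sysconf (_SC_NPROCESSORS_ONLN);

  cfg->threads = choose_decoder_threads (cores, max_threads);
}
```

The point of the sketch is only that the thread count has to be set explicitly before the decoder is initialised. If that assignment is dropped, as happened in the base-class port, the decoder would presumably fall back to single-threaded operation, which matches the CPU usage in the original report.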
Comment 8 Nicolas Dufresne (ndufresne) 2016-03-24 23:34:11 UTC
Leaving open as it should have worked in 1.8, so next step is to merge there.
Comment 9 Sebastian Dröge (slomo) 2016-03-25 08:45:03 UTC
Merged into 1.8
Comment 10 Tim-Philipp Müller 2016-03-25 10:33:47 UTC
Does it actually play smoothly for anyone with vp9dec and this fix in git master?
Comment 11 Sebastian Dröge (slomo) 2016-03-25 10:45:57 UTC
It has some stuttering and dropped frames in the first 10 seconds, then it plays smooth for me.
Comment 12 Sebastian Dröge (slomo) 2016-03-25 10:48:37 UTC
vlc plays it smoothly all the time, but it also prints vaapi things on the terminal. When using gstreamer-vaapi it plays smooth after ~20 seconds here.
Comment 13 Nicolas Dufresne (ndufresne) 2016-03-25 12:54:39 UTC
I don't have enough cores to test Jeff's setup, but when I added this code, I was comparing against Chrome's CPU usage at 720p on a quad-core ARM. With the memory stuff, it was making it equivalent.

The 10 seconds of stutter might be something that we can solve with a min threshold in buffering. VP9 decoding cost (as for most codecs) is not uniform in terms of required processing resources. If you have too much variation, or the first part is simply more expensive, you get stutter. Shall we re-open for more investigation?
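
To make the buffering argument concrete, here is a rough sketch of how a minimum start-up delay could be estimated from per-frame decode costs. It is a simplified model, not how GStreamer implements buffering: it assumes frames are decoded back to back from time zero, that decoded frames can queue without limit, and that frame i is due at delay + i * frame_interval. min_startup_delay() is a hypothetical helper and the numbers in the usage note are made up.

```c
#include <stddef.h>

/* Smallest start-up delay (same time unit as the inputs) such that every
 * frame has finished decoding by its presentation time.
 *
 * Model assumptions (for illustration only):
 *  - frames are decoded sequentially, starting at time 0;
 *  - frame i is presented at delay + i * frame_interval;
 *  - decoded frames may wait in an unbounded queue. */
static double
min_startup_delay (const double *decode_cost, size_t n_frames,
    double frame_interval)
{
  double finished = 0.0;        /* time at which the current frame is decoded */
  double delay = 0.0;           /* worst lateness seen so far */
  size_t i;

  for (i = 0; i < n_frames; i++) {
    double lateness;

    finished += decode_cost[i];
    /* how late frame i would be if playback started at time 0 */
    lateness = finished - (double) i * frame_interval;
    if (lateness > delay)
      delay = lateness;
  }
  return delay;
}
```

With a frame interval of 20 ms and decode costs of 30, 30, 10, 10 ms, the model asks for 40 ms of start-up delay even though the average cost (20 ms) just keeps up with real time. That is the "first part is simply more expensive" case: the expensive frames at the start set the threshold.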