After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 770225 - multiqueue: High CPU usage with multiple audio tracks in playbin3
multiqueue: High CPU usage with multiple audio tracks in playbin3
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gstreamer (core)
git master
Other Linux
: Normal blocker
: 1.9.2
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2016-08-22 09:07 UTC by HoonHee Lee
Modified: 2016-08-30 09:30 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
multiqueue: Re-compute the high time outside of loop (1.15 KB, patch)
2016-08-23 06:02 UTC, HoonHee Lee
rejected Details | Review
multiqueue: Fix high_time wakeup logic (7.42 KB, patch)
2016-08-25 10:40 UTC, Edward Hervey
committed Details | Review

Description HoonHee Lee 2016-08-22 09:07:13 UTC
Dear Edward Hervey, Tim-Philipp Müller and Jan Schmidt
 
In playbin3(decodebin3) with GST v1.9.1, CPU usage is more high compared with playbin2(decodebin2) when multiple audio track contents.
I tested 1-video and 8-audio stream by launching gst-launch.
 
Multiqueue is set as below in playbin3.
=> g_object_set (dbin->multiqueue, "sync-by-running-time", TRUE, "use-interleave", TRUE, "max-size-buffers", 0, NULL);
In that case, CPU usage is around 210%. And most of them are related with srcpads of multiqueue.
 
In playbin2(decodebin2), CPU usage is around 15%.
 
And I turned off the 'sync-by-running-time' in MQ and CPU usage is similar with playbin2.
 
I am guessing 'sync-by-running-time' mode consumes much more CPU usage than playbin2.
 
 
1) Do you have similar phenomenon?
2) Do you have a plan to reduce CPU usage when using "sync-by-running-time"?
3) May I get any opinions or guide to reduce CPU usage?
 
Please refer https://harmony.lge.com:8443/issue/browse/CTRGST-288
or http://gstreamer-devel.966125.n4.nabble.com/High-CPU-usage-in-multiple-audio-track-in-playbin3-td4679179.html
 
I am always welcome any questions and opinions.
 
Thanks.
Comment 1 HoonHee Lee 2016-08-23 06:02:39 UTC
Created attachment 333974 [details] [review]
multiqueue: Re-compute the high time outside of loop
Comment 2 HoonHee Lee 2016-08-23 06:07:28 UTC
Dear All
 
I found out that to re-computing the high time after waking up in while loop causes high CPU usage.
  
I confirmed that CPU usage is similar with playbin2(non 'sync-by-running-time' mode) after applying this patch.
 
Please check and review this patch.
 
I am always welcome any questions, opinions and ideas.
 
Thanks.
Comment 3 Edward Hervey 2016-08-23 15:12:07 UTC
Hi,

  While the intention of the patch is correct, it doesn't take into account the "high_time by group" but only the global high_time.
Comment 4 Edward Hervey 2016-08-24 07:19:38 UTC
Note that compute_high_id also does also introduce a noticeable overheader in that loop.

So the idea here should be to:
1) only call compute_high_{time|id} when one of the parameters involved in the calculation changed.
2) And only call that function from wherever one of the values changed (that waiting loop is *not* modifying any of the parameters involved)

Readers that don't modify the values (such as that waiting loop function) would only need to check against the stored value.

Values involved in both compute_high_id and compute_high_time:
* single queues being added/removed
* single queue nextid/oldid being updated
* single queue flow return *changing* (to/from GST_FLOW_NOT_LINKED or GST_FLOW_EOS)
* after pushing a GST_EVENT_EOS
* after flushing

Values involved in compute_high_id:
* single queue nextid/oldid being updated

Values involved in compute_high_time:
* single queue next_time/last_time being updated

The potentially tricky part is regarding the groupid feature for compute_high_time. We might to store the high_time by groupid in addition (maybe with a lookup table)
Comment 5 Edward Hervey 2016-08-25 08:17:19 UTC
After digging deeper, moving the computation out of the loop still results in very high CPU usage (playing a 8 video/8 audio file, results in 212% cpu usage of which only 1% is video decoding).

The culprit seems to be an insane amount of futex usage.
Comment 6 Edward Hervey 2016-08-25 08:36:12 UTC
And after digging even deeper, the real culprit was ... wake_up_next_non_linked doing wrong calculations/checks.

I'll provide a patch soon
Comment 7 HoonHee Lee 2016-08-25 09:26:10 UTC
Hello Edward.
Thanks for your support.
I am looking forward to get the patch.
 
Thanks.
Comment 8 Edward Hervey 2016-08-25 10:40:38 UTC
Created attachment 334122 [details] [review]
multiqueue: Fix high_time wakeup logic

When calculating the high_time, cache the group value in each singlequeue.

This fixes the issue by which wake_up_next_non_linked() would use the global
high-time to decide whether to wake-up a waiting thread, instead of the group
one, resulting in those threads constantly spinning.

Tidy up a bit the waiting logic while we're at it.
Comment 9 Edward Hervey 2016-08-25 10:45:40 UTC
With this patch, we go from 212% playing a 8 audio / 8 video file down to less than 10% (most of it being the video decoding).
Comment 10 HoonHee Lee 2016-08-25 12:23:23 UTC
Hello Edward.
Awesome!!
 
It works fine in desktop version and our hardware target after applying your patch.
And I confirmed that CPU usage much more was reduced.
Also, changing track for audio works well.


 PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+                                                   
 2408 hoonhee+  20   0 1850508 524488  65316 R  12.9  1.6 333:04.86 compiz                                                                                                               
10209 hoonhee+  20   0 2128104 146252  26348 S  10.6  0.4   0:11.80 lt-gst-play-1.0
 
Thanks for your effort.
Comment 11 Sebastian Dröge (slomo) 2016-08-30 09:29:20 UTC
Attachment 334122 [details] pushed as 3117525 - multiqueue: Fix high_time wakeup logic