GNOME Bugzilla – Bug 690937
alsasink: playing video files with gst-launch causes CPU 100%
Last modified: 2013-01-24 14:20:35 UTC
Created attachment 232444 [details] [review] patch for the bug Overview:When playing some video files with gst-lauch, the cpu usage will go up to nearly 100%. Steps to Reproduce: 1)gst-launch playbin2 uri=file:///video/[V_H264][A_AAC_LC].m4v Actual Results:The cpu usage is nearly 100% Expected Results: the cpu usage is about 10% platform: iMx53 board derived from one from FSL The PCM stream output frequency is 48KHz. additional information: gst-launch filesrc location= [V_H264][A_AAC_LC].m4v typefind=true ! aiurdemux name=demux demux. ! queue max-size-buffers=0 max-size-time=0 ! mfw_vpudecoder ! mfw_v4lsink demux. ! queue max-size-buffers=0 max-size-time=0 ! mfw_aacdecoder ! fakesink when playing problematic files with the above pipeline, the cpu usage is normal gst-launch filesrc location= [V_H264][A_AAC_LC].m4v typefind=true ! aiurdemux name=demux demux. ! queue max-size-buffers=0 max-size-time=0 ! mfw_vpudecoder ! mfw_v4lsink demux. ! queue max-size-buffers=0 max-size-time=0 ! mfw_aacdecoder ! alsasink when playing problematic files with the above pipeline, the cpu usage is nearly 100%. The frequency of audio stream on problematic media files is 44.1KHz. We have created an patch for the problem.
We are having this same behaviour on the XOs machines. There is a bug report filled here: * https://dev.laptop.org/ticket/12450 I tested this with gst-launch-0.10 and gst-launch-1.0 with the same 'test.ogv' file. 0.10 works without problem but 1.0 consumes 100% and makes some weird noise at that moment (when it reaches the 100%)
Created attachment 233457 [details] [review] updated patch As Manuel noted above, we have also been hit by this on OLPC XO hardware (some video files unplayable with to excessive CPU usage observed) and it is a clear regression over gstreamer-0.10. Sample video that exhibits this problem: https://dl.dropbox.com/u/2646578/OGV/test.ogv yanghuolin's patch solves the issue: the video is now playable, and CPU usage is back to normal. Here is an updated version fixing some coding style issues and updating against the latest sources.
What (if any) is the rationale for introducing a second lock instead of using GST_ALSA_SINK_LOCK which is already taken for snd_pcm_wait(), snd_pcm_writei() and snd_pcm_prepare/_resume() ?
Created attachment 234034 [details] [review] alternative patch that shares the lock I don't have very extensive knowledge of ALSA, so I'm not immediately sure of the the answer there. But, through some experimentation: without taking yanghuolin's patch, simply taking GST_ALSA_SINK_LOCK while calling snd_pcm_delay() does not solve the issue. Now the audio plays OK but the video plays back with a framerate of about 1 new frame being shown every 10 seconds (and plenty of "A lot of buffers are being dropped" warnings). Through reading the source, I estimate that this is caused because GST_ALSA_SINK_LOCK is taken while we call snd_pcm_wait(), which is a function that actually sleeps until buffer space is available. I guess that prevents the snd_pcm_delay() call from executing within any reasonable timeframe, causing the above effect. Experimenting further, we *can* share the single ALSA lock if we drop the lock while sleeping, as in the attached patch. Now the video and audio plays back fine. However, I'm not sure of the implications of dropping the lock while calling snd_pcm_wait() are - is this 100% safe?
Maybe dropping the lock in that case is not safe, because perhaps gst_alsasink_reset() could then reset the handle in the middle of execution of gst_alsasink_write(), which doesn't sound like a nice situation. If that's true, it suggests that having two locks would be necessary.
I commited the one with two locks. commit db7ff0e290274d75172aef2544df70e653feb9ea Author: yanghuolin <Huolin.Yang@delphi.com> Date: Thu Nov 15 03:31:47 2012 -0500 alsasink: don't use 100% CPU The root cause is that alsa-lib is not thread safe for the same handle. There are two threads in the gstreamer accessing alsa-lib not serilized. The race condition happens when one thread holds the old framebuffer app_ptr position in the kernel, another thread advances the framebuffer app_ptr. when the former thread is scheduled to run again, it overwrites the app_ptr to old value by copying from kernel.Thus,the app_ptr in the upper alsa-lib(pcm_rate) become one period size more advanced than the lower alsa-lib(pcm_hw & kernel). gstreamer uses noblock and poll method to communicate with the alsa-lib. The app_ptr unsync situation as described above makes the poll return immediately because it concludes there is enough space for the ring-buffer via the low-level alsa-lib. The write function returns immediately because it concludes there is not enough space for the ring-buffer from the upper-level alsa-lib. Then the loop of poll and write runs again and again until another period size is available for ring-buffer.This leads to the cpu 100 problem. delay_lock is used to avoid the race condition. Fixes: https://bugzilla.gnome.org/show_bug.cgi?id=690937