GNOME Bugzilla – Bug 605594
relatively high cpu usage compared to other players
Last modified: 2010-06-22 03:19:27 UTC
Playing an audio file (vorbis, mpeg) using gstreamer (via gst-launch, rhythmbox and so on) I have a higher cpu than using other players, e.g. mplayer, mpg123 and so on. The cpu usage is always two to three times higher than any other player I've tested. time mplayer some_file.ogg; time gst-launch-0.10 playbin uri=file:///"`pwd`/somefile.ogg" ========================================================================== Öffne Audiodecoder: [ffmpeg] FFmpeg/libavcodec audio decoders AUDIO: 44100 Hz, 2 ch, s16le, 192.0 kbit/13.61% (ratio: 24000->176400) Ausgewählter Audiocodec: [ffvorbis] afm: ffmpeg (FFmpeg Vorbis) ========================================================================== AO: [alsa] 44100Hz 2ch s16le (2 bytes per sample) Video: kein Video Starte Wiedergabe... A: 331.3 (05:31.2) of 331.4 (05:31.4) 1.0% real 5m31.586s user 0m6.428s sys 0m1.804s (gst-launch-0.10:21852): GLib-WARNING **: g_set_prgname() called multiple times Setting pipeline to PAUSED ... Pipeline is PREROLLING ... Pipeline is PREROLLED ... Setting pipeline to PLAYING ... New clock: GstAudioSinkClock Got EOS from element "playbin0". Execution ended after 331431382411 ns. Setting pipeline to PAUSED ... Setting pipeline to READY ... Setting pipeline to NULL ... Freeing pipeline ... real 5m32.227s user 0m21.937s sys 0m2.136s In this case gstreamer uses 3.4 times more cpu time to play a simple ogg file (using 6.6 % of the cpu power). While 6.6 % is not much, it drains more battery power of a notebook. This is reproducible (at least) using an ATOM processor and gentoo-x86 as system. Everything was compiled using -O2. Processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 28 model name : Intel(R) Atom(TM) CPU N280 @ 1.66GHz stepping : 2 cpu MHz : 1667.000 cache size : 512 KB physical id : 0 siblings : 2 core id : 0 cpu cores : 1 apicid : 1 initial apicid : 1 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx constant_tsc arch_perfmon pebs bts pni dtes64 monitor ds_cpl est tm2 ssse3 xtpr pdcm movbe lahf_lm bogomips : 3324.91 clflush size : 64 power management:
Forgot to mention: gstreamer is 0.10.24 Same measurement for mp3 file: FN=somefile.mp3 time mplayer "$FN"; time gst-launch-0.10 playbin uri=file:///"`pwd`/$FN"; time mpg123 "$FN" MPlayer SVN-r29463-4.3.4 (C) 2000-2009 MPlayer Team [cut] ========================================================================== Öffne Audiodecoder: [mp3lib] MPEG layer-2, layer-3 AUDIO: 44100 Hz, 2 ch, s16le, 192.0 kbit/13.61% (ratio: 24000->176400) Ausgewählter Audiocodec: [mp3] afm: mp3lib (mp3lib MPEG layer-2, layer-3) ========================================================================== [cut] real 7m8.355s user 0m10.949s sys 0m2.488s (gst-launch-0.10:3415): GLib-WARNING **: g_set_prgname() called multiple times Setting pipeline to PAUSED ... Pipeline is PREROLLING ... Pipeline is PREROLLED ... Setting pipeline to PLAYING ... New clock: GstAudioSinkClock Got EOS from element "playbin0". Execution ended after 428212510743 ns. Setting pipeline to PAUSED ... Setting pipeline to READY ... Setting pipeline to NULL ... Freeing pipeline ... real 7m9.236s user 0m25.162s sys 0m2.296s High Performance MPEG 1.0/2.0/2.5 Audio Player for Layers 1, 2 and 3 version 1.9.2; written and copyright by Michael Hipp and others free software (LGPL/GPL) without any warranty but with best wishes [cut] MPEG 1.0 layer III, 192 kbit/s, 44100 Hz joint-stereo Note: Illegal Audio-MPEG-Header 0x41504554 at offset 0x9cd6f1. Note: Trying to resync... Note: Hit end of (available) data during resync. [cut] real 7m8.331s user 0m12.433s sys 0m1.532s In this case I get: app: relative cpu usage: mplayer 1.0 gstreamer 2.3 mpg123 1.1
Please confirm that you use the same audiosinks? What about using -ao null for mplayer and fakesink on gstreamer. this way you could also try unsync decoding (as fast as possible). Just add sync=false for fakesink. That is time gst-launch-0.10 playbin2 uri=file:///"`pwd`/somefile.ogg" audiosink="fakesink sync=false" ALso you use old playbin, try playbin2. Finally how is you gstreamer build. Is it a debug build or are you (or your distro) use e.g. CFLAGS="-DG_DISABLE_ASSERT -DG_DISABLE_CAST_CHECKS" Also ensure you use the matching -mtune -march flags for gcc if you aim for best cpu usage for a certain target platform.
Two more comments. Run the commands always twice and take the second meassuremrnt to lower cache impacts on measurements. For gst-launch, run gst-launch-0.10 (to skip the parallel version switching) and add --gst-disable-registry-update to not meassure that (this is a one time application init overhead).
> Please confirm that you use the same audiosinks? They do, both use the alsa audio sink. > Finally how is you gstreamer build. Is it a debug build or are you (or your distro) use e.g. > CFLAGS="-DG_DISABLE_ASSERT -DG_DISABLE_CAST_CHECKS" Using gentoo the package management compiles everything possible from source. Therefore I used the following CFLAGS: -march=k8 -msse -msse2 -msse3 -mssse3 -mfpmath=sse -ftree-vectorize -O2 -pipe This is a good set of flags for my ATOM netbook. march=k8 is intentional, as it seems to be faster than optimizing for pentium-m. Therefore the tests were sort of a debug build. I rebuilt gstreamer using CFLAGS=-march=k8 -msse -msse2 -msse3 -mssse3 -mfpmath=sse -ftree-vectorize -O2 -pipe -DG_DISABLE_ASSERT -DG_DISABLE_CAST_CHECKS And added --disable-trace to the configure flags. In total they where: --prefix=/usr --build=i686-pc-linux-gnu --host=i686-pc-linux-gnu --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc --localstatedir=/var/lib --disable-static --disable-dependency-tracking --enable-nls --disable-valgrind --disable-examples --disable-trace --disable-tests --with-package-name="GStreamer ebuild for Gentoo" --with-package-origin=http://packages.gentoo.org/package/media-libs/gstreamer ############### Here some benchmarks using this (and the old build): Setting fakesink using playbin2 seems not to work. Therefore I used the following: #### gst (with asserts, async) time gst-launch-0.10 filesrc location=tailspin.ogg ! oggdemux ! vorbisdec! fakesink sync=false [...] real 0m9.168s user 0m9.117s sys 0m0.028s #### gst (without asserts, async) time gst-launch-0.10 filesrc location=tailspin.ogg ! oggdemux ! vorbisdec! fakesink sync=false [...] real 0m8.963s user 0m8.645s sys 0m0.112s ###### As I couldn't find a switch in mplayer to disable sync output, again the same with sync output #### gst (with asserts, sync) time gst-launch-0.10 filesrc location=tailspin.ogg ! oggdemux ! vorbisdec! fakesink sync=true [...] real 5m17.516s user 0m13.305s sys 0m0.912s #### gst (without asserts, sync) time gst-launch-0.10 filesrc location=tailspin.ogg ! oggdemux ! vorbisdec! fakesink sync=true [...] real 5m17.508s user 0m12.185s sys 0m0.768s #### mplayer (null) # And again with mplayer time mplayer -ao null tailspin.ogg [...] real 5m27.494s user 0m5.832s sys 0m1.204s ########## And now with audio output, using playbin* ### gst (with asserts, playbin) time gst-launch-0.10 playbin uri=file:///"`pwd`/tailspin.ogg" [...] real 5m18.044s user 0m20.837s sys 0m1.976s ### gst (with asserts, playbin2) time gst-launch-0.10 playbin2 uri=file:///"`pwd`/tailspin.ogg" [...] real 5m17.955s user 0m20.441s sys 0m2.112s ### gst (without asserts, playbin) time gst-launch-0.10 playbin uri=file:///"`pwd`/tailspin.ogg" [...] real 5m17.885s user 0m19.021s sys 0m1.752s ### gst (without asserts, playbin2) time gst-launch-0.10 playbin2 uri=file:///"`pwd`/tailspin.ogg" [...] real 5m17.821s user 0m21.669s # error of measurement? sys 0m2.044s ### mplayer (alsa) real 5m18.415s user 0m6.788s sys 0m1.812s ### To conclude the tests I rebuilt gstreamer, the vorbis-plugin and libvorbis with -O3 ### same options as above, but -O2 replaced (mplayer seems not to be linked against libvorbis) ### gst (without asserts, playbin, -O3) time gst-launch-0.10 playbin uri=file:///"`pwd`/tailspin.ogg" [...] real 5m18.055s user 0m19.541s sys 0m2.024s ### gst (without asserts, playbin2, -O3) time gst-launch-0.10 playbin2 uri=file:///"`pwd`/tailspin.ogg" [...] real 5m17.800s user 0m18.617s sys 0m1.996s ### To find out if playbin* are slow, I constructed the following pipeline time gst-launch-0.10 filesrc location=tailspin.ogg ! oggdemux ! vorbisdec! audioconvert ! alsasink [...] real 5m17.648s user 0m14.501s sys 0m1.012s # Table of results: [seconds] | w ass. | wo ass. -------------------------------------- fakesink, async | 9.117 | 8.645 fakesink, sync | 13.305 | 12.185 playbin | 20.837 | 19.021 playbin2 | 20.441 | 21.669 playbin -O3 | --- | 19.541 playbin2 -O3 | --- | 18.617 alsasink -O3 | --- | 14.501 mplayer null | 5.832 mplayer alsa | 6.788 [%] cpu time/ runtime *100 | w ass. | wo ass. -------------------------------------- fakesink, async | 99.4 | 96.5 fakesink, sync | 4.2 | 3.8 playbin | 6.6 | 6.0 playbin2 | 6.4 | 6.8 playbin -O3 | --- | 6.1 playbin2 -O3 | --- | 5.9 alsasink -O3 | --- | 4.6 mplayer null | 1.8 mplayer alsa | 2.1 Conclusion: - Using -O3 doesn't bring much more performance - Playbin has overhead! - The asserts have also a noticeable overhead
Some update: Running with --gst-disable-registry-update gives me identical output (difference is smaller than 10^-1 sec.)
(In reply to comment #4) > And added --disable-trace to the configure flags. In total they where: > > --prefix=/usr --build=i686-pc-linux-gnu --host=i686-pc-linux-gnu > --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share > --sysconfdir=/etc --localstatedir=/var/lib --disable-static > --disable-dependency-tracking --enable-nls --disable-valgrind > --disable-examples --disable-trace --disable-tests > --with-package-name="GStreamer ebuild for Gentoo" > --with-package-origin=http://packages.gentoo.org/package/media-libs/gstreamer > If you are fine with english error messages you can add --disabled-nls. That saves a bit of startup time and memory. Again probably not much. > ############### > > Here some benchmarks using this (and the old build): > > Setting fakesink using playbin2 seems not to work. Therefore > I used the following: Which gstreamer version is that? Since ~half a year you can do gst-launch-0.10 playbin2 audio-sink="fakesink" uri=... mplayer has a builtin vorbis decoder (from ffmpeg). Dunno if gstreamer ivorbisdec is any faster for you. For mp3 imho there is a mpg123 based gstreamer decoder in bugzilla. > > # Table of results: > > [seconds] | w ass. | wo ass. > -------------------------------------- > fakesink, async | 9.117 | 8.645 > fakesink, sync | 13.305 | 12.185 > playbin | 20.837 | 19.021 > playbin2 | 20.441 | 21.669 > playbin -O3 | --- | 19.541 > playbin2 -O3 | --- | 18.617 > alsasink -O3 | --- | 14.501 > mplayer null | 5.832 > mplayer alsa | 6.788 > > > [%] cpu time/ > runtime *100 | w ass. | wo ass. > -------------------------------------- > fakesink, async | 99.4 | 96.5 > fakesink, sync | 4.2 | 3.8 > playbin | 6.6 | 6.0 > playbin2 | 6.4 | 6.8 > playbin -O3 | --- | 6.1 > playbin2 -O3 | --- | 5.9 > alsasink -O3 | --- | 4.6 > mplayer null | 1.8 > mplayer alsa | 2.1 > > Conclusion: > > - Using -O3 doesn't bring much more performance > - Playbin has overhead! > - The asserts have also a noticeable overhead Would be nice to make this a benchmark script (where I just put my filenames in) and rerun. Playing a pcm wav file could also help to check how much it is the gstreamer framework overhead or how much it is the codec itself. Thanks for the meassuerements.
> Which gstreamer version is that? > Since ~half a year you can do gst-launch-0.10 playbin2 audio-sink="fakesink" uri=... It's the stable version of my distribution, 0.10.24. Maybe it's not included yet, or my syntax was wrong. I'll check it. Also trying 0.10.25 or git master might lead to some improvement. > Playing a pcm wav file could also help to check how much it is the gstreamer framework overhead or how much it is the codec itself. I'll look into that when I'm at home again. > Thanks for the meassuerements. N.P.
Olaf, did you had time already to test further?
I'm sorry, I'm currently a bit busy. But I managed to run the wav tests :D The machine changed a bit, so the times are a bit different, so I re-run mplayer vs. gstreamer/playbin2 wo asserts for ogg/vorbis too: runtime [sec] | vorbis | wav ------------------------------ playbin2 wo ass | 20.8 | 7.1 mplayer | 8.0 | 5.8 cpu usage [%] | vorbis | wav ------------------------------ playbin2 wo ass | 6.5 | 2.2 mplayer | 2.5 | 1.8 So it seems as if playing a wav file isn't that costly using gstreamer. Most overhead really seems to come from the codecs used. But on the other hand: playing a wav file using gstreamer is nearly as "costly" as playing an vorbis stream using mplayer.
Created attachment 154452 [details] Small script to do benchmarks This is a small and simple script to do some benchmarks. It accepts filenames as arguments and outputs averaged runtime stats.
Finally tried the script: Playing of file: /home/ensonic/Musik/klingon.mp3 Tag cpu [s] cpu [%] mplayer_ 1.2 0.0 playbin2a 2.4 0.0 playbin2b 1.6 0.0 playbin2b is: gst-launch-0.10 -q --gst-disable-registry-update filesrc location=%s ! mpg123 ! alsasink mpg123 is from http://gst.homeunix.net/ I'll try to add the plugin to gst-plugin-bad in the next days if the author agrees.
Forgot to mention, that this was with a full debug gstreamer (nothing disabled).
Created attachment 164248 [details] sysprofil profiling data playing a 4 minutes song gst uses between 2% and 10% of cpu on playing a mp3 file on my system. Possibly the profiling data collected with sysprofil on my system can help you to find where the problem is.
Martin, I know how to use a profiler :) The issue here is not a hotspot issue anymore. Using an efficient decoder helps, the rest is to some extend the overhead from gstreamer modularity. If it taks up to 10% for you, you either have a slow system or have not gone through the steps we described above. Regarding the attachment, I don't feel like downloading it and looking though it. If you think there is something good in there, just attach a screenshot -> that can be viewed more easily.
This is due to the shorter default buffer-time and latency-time of alsasink compared to other players. The effect completely disappears if you use 'alsasink buffer-time=1000000 latency-time=1000000'. Any values for latency-time much larger than 10 ms (the default) starts to cause things like volume sliders and play/pause to have a noticeable delay. Perhaps 30-40ms would be a better tradeoff between cpu usage and latency. IMO, however, it would be better for background audio playing apps (like rhythmbox) to set the latency higher manually rather than changing the default.