Bug 682480 - clockoverlay create huge load
Status: RESOLVED WONTFIX
Product: GStreamer
Classification: Platform
Component: gst-plugins-base
Version: 0.10.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: NONE
Assigned To: GStreamer Maintainers
Reported: 2012-08-22 14:57 UTC by Levente Farkas
Modified: 2012-10-17 15:05 UTC

Description Levente Farkas 2012-08-22 14:57:47 UTC
We use clockoverlay to write the date and time over a video stream. We run 16, 32 or more parallel pipelines at 1-5 fps, and each of them has a clockoverlay. When we tried to find the cause of the high load, it turned out that removing clockoverlay drops the load from 4 to 1.5 on a 2 GHz P4 Celeron. It may not be a big problem on a modern PC, but we would like to run this on older machines, where this element causes really high load, so it would be nice to fix. It probably produces a similarly large load on modern CPUs too; we just have not noticed it yet.

PS: We encode the 16-32 video streams via xvidenc (plus scaling, colorspace conversion, etc.). All of this causes 50% CPU usage and a load of 1.5, while with clockoverlay the load goes up to 4!
Comment 1 Tim-Philipp Müller 2012-08-23 09:33:03 UTC
I'm not really sure what to do with this bug.

What is your pipeline like exactly? If you remove clockoverlay from the pipeline, that might affect other things that may possibly be expensive (e.g. colorspace conversions).

Which version of clockoverlay is this? (gst-inspect-0.10 clockoverlay | grep Version)

Do you have any suggestions on how to improve performance, or do you intend to work on it?

Do you have detailed measurements showing which code paths / functions are expensive?
Comment 2 Levente Farkas 2012-08-23 11:22:43 UTC
0.10.36

Just run this test (e.g. on a 2.8 GHz P4 Celeron):

gst-launch -e videotestsrc is-live=true ! video/x-raw-yuv,framerate=25/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink

gst-launch -e videotestsrc is-live=true ! video/x-raw-yuv,framerate=25/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" ! fakesink

gst-launch -e videotestsrc is-live=true ! video/x-raw-yuv,framerate=25/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="a" ! fakesink

gst-launch -e videotestsrc is-live=true ! video/x-raw-yuv,framerate=25/1,width=720,height=288 ! clockoverlay ! fakesink

gst-launch -e videotestsrc is-live=true ! video/x-raw-yuv,framerate=25/1,width=720,height=288 ! fakesink

CPU usage is 6%, 5%, 2.5%, 3% and 2.5% respectively, and that is with only one pipeline.
The time-format seems to be the most expensive part: the cost depends on the length of the format string.

For a much more obvious example, run these two pipelines; the first uses 28% CPU and the second only 4%:

gst-launch -e videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! 
clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! clockoverlay shaded-background=true halignment=left valignment=top xpad=0 ypad=0 time-format="%Y-%m-%d %H:%M:%S" ! fakesink sync=true silent=true

gst-launch -e videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true videotestsrc is-live=true ! video/x-raw-yuv,framerate=1/1,width=720,height=288 ! fakesink sync=true silent=true
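The two long commands above repeat one branch many times, which makes them hard to read and to vary. As a rough sketch, such a command could be generated for any number of branches; the helper names `build_argv` and `build_command` are made up for this illustration, and only the element names and properties are taken from the commands above.

```python
import shlex

# Pieces of one branch, copied from the commands in this comment.
SOURCE = ["videotestsrc", "is-live=true", "!",
          "video/x-raw-yuv,framerate=1/1,width=720,height=288"]
OVERLAY = ["!", "clockoverlay", "shaded-background=true", "halignment=left",
           "valignment=top", "xpad=0", "ypad=0",
           "time-format=%Y-%m-%d %H:%M:%S"]
SINK = ["!", "fakesink", "sync=true", "silent=true"]


def build_argv(branches, with_overlay):
    """Return the argument list for a gst-launch run with N parallel branches."""
    argv = ["gst-launch", "-e"]
    for _ in range(branches):
        argv += SOURCE + (OVERLAY if with_overlay else []) + SINK
    return argv


def build_command(branches, with_overlay):
    """Return the same command as one shell-quoted string."""
    return " ".join(shlex.quote(arg) for arg in build_argv(branches, with_overlay))


if __name__ == "__main__":
    print(build_command(2, with_overlay=True))
```

Running the overlay and non-overlay variants with the same branch count keeps the comparison fair, since everything except clockoverlay stays identical.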
Comment 3 Levente Farkas 2012-08-24 11:38:26 UTC
In short, IMHO it is not normal for an overlay to cause almost the same CPU load as MPEG-4 encoding.
We simply put the overlay into our source element. It generates all characters at class init time (using the properties set on the element; this could be changed to regenerate them whenever a property changes) and, at run time, simply looks up the characters and copies them into the frame buffer. This causes 0% CPU load (i.e. not noticeable) while the pipeline is running, whereas the current clockoverlay causes really huge load.
IMHO the same approach could be used in clockoverlay too, since in its current form it is not worth using at all.
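The approach described in this comment (render every character once, then only look up and copy at run time) might look roughly like the sketch below. It is not the reporter's code and not GStreamer API: the 3x5 bitmap font, the `GlyphCache` class and the plain 8-bit luma plane are all assumptions made to keep the example self-contained.

```python
import time

# Hypothetical 3x5 bitmap font; a real element would rasterise a proper font once.
FONT_3X5 = {
    "0": ("###", "# #", "# #", "# #", "###"),
    "1": (" # ", "## ", " # ", " # ", "###"),
    "2": ("###", "  #", "###", "#  ", "###"),
    "3": ("###", "  #", "###", "  #", "###"),
    "4": ("# #", "# #", "###", "  #", "  #"),
    "5": ("###", "#  ", "###", "  #", "###"),
    "6": ("###", "#  ", "###", "# #", "###"),
    "7": ("###", "  #", "  #", "  #", "  #"),
    "8": ("###", "# #", "###", "# #", "###"),
    "9": ("###", "# #", "###", "  #", "###"),
    "-": ("   ", "   ", "###", "   ", "   "),
    ":": ("   ", " # ", "   ", " # ", "   "),
    " ": ("   ", "   ", "   ", "   ", "   "),
}


class GlyphCache:
    """Pre-renders every glyph once so that drawing is only lookups and copies."""

    def __init__(self, scale=2, fg=235, bg=16):
        self.glyph_w = 3 * scale
        self.glyph_h = 5 * scale
        self.advance = self.glyph_w + scale  # glyph width plus one scaled column
        self.glyphs = {}
        for char, rows in FONT_3X5.items():
            rendered = []
            for row in rows:
                # Expand each font pixel horizontally, then repeat the row vertically.
                line = bytes(fg if px == "#" else bg
                             for px in row for _ in range(scale))
                rendered.extend([line] * scale)
            self.glyphs[char] = rendered

    def draw(self, luma, stride, x, y, text):
        """Copy cached glyph rows into an 8-bit luma plane stored row by row."""
        height = len(luma) // stride
        if (x < 0 or y < 0 or y + self.glyph_h > height
                or x + len(text) * self.advance > stride):
            raise ValueError("text does not fit into the frame")
        for index, char in enumerate(text):
            left = x + index * self.advance
            for row_index, line in enumerate(self.glyphs[char]):
                start = (y + row_index) * stride + left
                luma[start:start + self.glyph_w] = line


if __name__ == "__main__":
    width, height = 720, 288
    frame = bytearray([128]) * (width * height)  # uniform grey Y plane
    cache = GlyphCache()
    cache.draw(frame, width, 0, 0, time.strftime("%Y-%m-%d %H:%M:%S"))
```

The per-frame work here is a handful of slice copies per character, which is the reason such a special-purpose overlay can be cheap. The trade-off is that it only supports a fixed character set and no text layout, shaping or scaling beyond what is baked in at init time.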
Comment 4 Tim-Philipp Müller 2012-10-17 15:05:04 UTC
Well, I'm sorry this element doesn't suit your use case. I would suggest you don't use it then.

Of course it's possible to create a much more efficient special-purpose clock overlay, but this is not such an element.

I have checked that it doesn't do anything stupid (at least not in 1.0 and not with other default settings), that is it re-creates the overlay only when the text changes (so for %H:%M only once a minute), and not constantly for every input frame.
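The behaviour described above (re-create the overlay only when the formatted text changes) can be illustrated with a small sketch. This is not the clockoverlay source; `CachedClockText` and the `render` callable are hypothetical stand-ins for the element and its text rendering step.

```python
import time


class CachedClockText:
    """Re-renders the overlay only when the formatted time string changes."""

    def __init__(self, time_format, render):
        self.time_format = time_format
        self.render = render      # hypothetical expensive callable: str -> overlay
        self.last_text = None
        self.overlay = None
        self.render_count = 0     # how often the expensive path was taken

    def overlay_for(self, now):
        """Return the overlay for the frame stamped with Unix time `now`."""
        text = time.strftime(self.time_format, time.localtime(now))
        if text != self.last_text:
            self.overlay = self.render(text)
            self.last_text = text
            self.render_count += 1
        return self.overlay


if __name__ == "__main__":
    clock = CachedClockText("%H:%M", render=lambda text: text)
    start = time.time()
    for frame in range(250):            # 10 seconds of video at 25 fps
        clock.overlay_for(start + frame / 25)
    print(clock.render_count)
```

With a format that includes `%S`, the text changes every second, so at the 1 fps used in the reporter's second test every frame would take the expensive path and this kind of caching would not help.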

If you have any suggestions for optimisations based on in-detail profiling, do let us know or make a patch please, thanks!

Closing this bug, since I don't see any point in keeping it open.