GNOME Bugzilla – Bug 748259
New audio/video level element
Last modified: 2015-12-02 12:41:44 UTC
Created attachment 302092 [details] [review] Patch for the new element The alevel element acts like a synchronized audio/video "level" element. For each video frame, it posts a level-style message containing the RMS value of the corresponding audio frames. This element needs both video and audio to pass through it. Furthermore, it needs a queue after its video source.
Created attachment 302093 [details] [review] Patch for the new element
Review of attachment 302093 [details] [review]: ::: gst/alevel/gstalevel.c @@ +6,3 @@ + * Copyright (C) <1999> Erik Walthinsen <omega@cse.ogi.edu> + * Copyright (C) 2000,2001,2002,2003,2005 + * Thomas Vander Stichele <thomas at apestaart dot org> This does not really reflect anymore who owns copyright on the gstlevel.c code :) It was mostly rewritten at least once since 2005 What should we do about that?
What is the use case for it?
We use it for displaying audio levels in our broadcast video application for incoming streams / observing mixer outputs. Our customers would expect that audio level is displayed on a per-video-frame basis. Imagine a test signal that has a single white frame, and exactly in the duration of the frame a test sine (silence otherwise). The user would expect to see an up-rise of the audio levels exactly for the white frame. This couldn't be done with the normal level element, so Vivia wrote the alevel element. Also: While you can set a fixed window with "level", hoping to be close to the frame boundaries, you would have problems with NTSC-like frame rates (e.g. 30000/1001): Here, audio samples per video frame are a repeated pattern of 1602/1601/1602/1601/1602. That's properly handled by alevel.
While new elements start their life in -bad, I wonder if for this we can add this to the existing level plugin and make them share code. Another usecase would be to sync the audio-levels to the UI display refresh rate. But thats even harder with all the GSync and FreeSync stuff available now. @Slomo, year, I've touch this a lot too (since I heavily use it): > git ls-tree -r -z --name-only HEAD -- *.{c,h} | xargs -0 -n1 git blame --line-porcelain HEAD |grep "^author "|sort|uniq -c|sort -nr 185 author Stefan Kost 179 author Andy Wingo 165 author Thomas Vander Stichele 156 author Stefan Sauer 60 author Wim Taymans 43 author Tim-Philipp Müller 26 author Peteris Krisjanis 20 author Michael Smith 14 author Mark Nauwelaerts 13 author Johan Dahlin 12 author David Schleef 8 author Sebastian Dröge 4 author Jan Schmidt 3 author Mike Smith 2 author Sébastien Moutte 2 author Olivier Crête 2 author Havard Graff 2 author Edward Hervey 1 author Benjamin Otte Seems, we should atleast drop erik :)
I wonder if a demuxer could send video frame data as meta on the audio stream. Then we could add an option to "level" like "dynamic-window-from-meta" but with a shorter name :).
@Stefan: It's not easy to make this share code with the existing level element. One reason is that alevel requires both audio and video to pass through it. The other is the completely different timing. Basically, the way they are now they only share part of the calculations, which is two functions IIRC. If they became merged (video pad made request, and requesting it would raise a flag?), looks like the biggest part of the alevel code would become nested under an if{...} ... so I think I wouldn't like that... @slomo what do you think? :) Unless we ... patch all the demuxers?... and make them send metadata, that would solve a lot of tricky timing issues. However, imagine a scenario where your data doesn't come from one single muxer but from several different video sources which then pass through a compositor (which is part of what Heinrich describes). Then we'd also need to patch compositor. And I really don't know what more to patch, looks to me like we'd need to patch all the things [see meme] :)
Pasting from IRC before it's forgotten: 16:55 < slomo> vivia: i agree, yes 16:55 < slomo> it's all bad ;)
@Vivia, I meant sharing as probably having some of the inner code in a separate file and then two elements in one plugin using the same inner loop as a helper. The suggestion with the demuxer is more like a long term thing of course. It is definitely overkill, but I like to think about it and see if we can find more uses for it. I mean now its level, next element we duplicate is spectrum. If we have 10 elements duplicated we need a more generic approach :)
More ideas. 1) What about changing this element into one called 'audiosync' / 'audioadapter' / ... . The element would get video and audio ins and it would push all audio into an GstAdapter and foreach videoframe push the matching audio. On the 'level' and 'spectrum' elements, we can add a flag to send a message-per-buffer, since now the buffers perfectly cover a frame. We should probably add a new property for this (and not use a special value like interval == 0), not sure what others think. 2) Another option could be to make the 'interval' property controlable and to have a 'videorate-controller' element that exports a control-source that emits the video rate. Not sure if we can pigy-back that feature into an existing element that one would use anyway (encoder baseclass?).
1) sounds like a completely different problem, and 2) seems almost impossible to get right while keeping sync properly between the two elements. Sharing the level analysis code would be a good idea though.
1) means that the audio blocks would be matching the frame durations. Together with a flag that tells level to emit a message for each frame it would achieve the desired effect too. What I dislike about the approach after some more thoughts is that the size of audio buffers is not guaranteed to be kept and thus one would need to put such reframing element right before the analyzer. 2) right it is not trivial, the video element that provides the control-source would need to have seen the video-frames before the audio element calls sync(pts). As said, I don't want to block this, just tried to see if there is a more scalable approach. To proceed I see two options: 1) We can start this as usual in bad and when moving to good merge it into the level plugin and try to share some code. 2) Merge it into good right away.
@Stefan: so you think that it could be merged into good already if we could manage to make it share code with the normal level element? Also, could one of you please explain how the code sharing would be? Where would the file with the "shared" code reside? From what I understand, it's too complicated to do if one plugin is in -good and the other in -bad, right?
Just a quick update from irc discussion. Lets merge to bad first after we have a better name (I suggest "vsync-level"). Imho the trouble with the name is already a sign that we should find a better solution for this feature. Another option for not having the separate element would be to add a request sink-pad for video to level, that if used would override the interval property. But I think it is not very discoverable and vivia also mentioned that this is not exactly how this new element works right now.
There seems to be a consensus for "videoframe-audiolevel" as a name on IRC. I'll change it to that if nobody raises any concerns for a few days :)
Review of attachment 302093 [details] [review]: Just cosmetic/formatting remarks, nothing blocking but worth considering ::: gst/alevel/gstalevel.c @@ +331,3 @@ + GstALevel *self = GST_ALEVEL (parent); + GST_LOG_OBJECT (pad, "Got %s event", GST_EVENT_TYPE_NAME (event)); + Maybe drop the braces on the case statements bellow @@ +375,3 @@ + GstALevel *self = GST_ALEVEL (parent); + GST_LOG_OBJECT (pad, "Got %s event", GST_EVENT_TYPE_NAME (event)); + ditto @@ +434,3 @@ + self->total_frames = 0; + if (self->CS) + g_free (self->CS); I also prefer not to use braces on single statement if blocks, maybe you can check the rest of the code and make it consistent? There are multiple times in which you do @@ +536,3 @@ + g_mutex_lock (&self->mutex); + self->vsegment.position = timestamp; + duration = GST_BUFFER_DURATION (inbuf); Like bellow but happens several times @@ +720,3 @@ + available_bytes, bytes); + + if (available_bytes >= bytes) { proly if available_bytes < bytes with a goto done to #746 ::: tests/check/elements/alevel.c @@ +214,3 @@ + } else if (i == 4 && audio_nondiscont) { + timestamp += 30 * GST_MSECOND; + } See previous comment on {} @@ +259,3 @@ + } else if (video_overlaps) { + timestamp -= 10 * GST_MSECOND; + } ditto @@ +272,3 @@ +on_message (GstBus * bus, GstMessage * message, gpointer user_data) +{ + if (message->type == GST_MESSAGE_ELEMENT) { if message->type != GST_MESSAGE_ELEMENT with an immediate return GST_BUS_PASS or alternatively a goto to #312 @@ +282,3 @@ + guint i; + + if (strcmp (name, "alevel") == 0) { Same as above with != 0
Created attachment 306566 [details] [review] Patch for the new element Hi again, Thank you for your comments, and thanks Reynaldo for the review! I am attaching the newest version, let me know what you think now :)
There is a trailing whitespace in the first line of gst/videoframe_audiolevel/gstvideoframe-audiolevel.h It is printing a lot of "rms=(GValueArray)NULL" for me. Interesting new element :)
Thanks for the trailing whitespace, nice catch! As for the NULL, it's expected if you just g_print it. The RMS value is actually a GValueArray which can't be trivially printed as string. You need to unfold the GValueArray like in the unit test: gst/videoframe_audiolevel/gstvideoframe-audiolevel.c
Right! The problem is that gst-launch prints all messages received. It isn't a big problem and it isn't your problem. Your element is intended to be used inside an application that can smartly catch those messages. +1 from me.
Cool, thanks :) So I can submit a new patch just for this trailing whitespace, but I'll wait for approval from others first - if there are more things to fix, I'll just do everything at once.
No worries. Submitter can remove that quickly before sending the current version. Plus, as you mentioned, it is in a header file so gst-indent won't block it. Good work.
Review of attachment 306566 [details] [review]: ::: gst/videoframe_audiolevel/gstvideoframe-audiolevel.c @@ +43,3 @@ +/* FIXME 0.11: suppress warnings for deprecated API such as GValueArray + * with newer GLib versions (>= 2.31.0) */ +#define GLIB_DISABLE_DEPRECATION_WARNINGS Can we use a GPtrArray instead. If this works well, we can also add this code to GstLevel when we do gstreamer-2.x. The FIXME should also target 2.X and not 0.11 :) @@ +236,3 @@ + gst_adapter_clear (self->adapter); + g_queue_foreach (&self->vtimeq, (GFunc) g_free, NULL); + g_queue_clear (&self->vtimeq); g_queue_free_full (&self->vtimeq, (GFunc) g_free); also below. Since elements are fixed size and potentially allocated and freed a lot, you could consider using g_slice here. @@ +680,3 @@ + guint num_frames = + gst_util_uint64_scale (*vt0 - cur_time, rate, GST_SECOND); + bytes = num_frames * GST_AUDIO_INFO_BPF (&self->ainfo); you could move this down where it is used (after the next condition) @@ +689,3 @@ + cur_time = *vt0; + } else { + GST_DEBUG_OBJECT (self, "We flush %ld out of %ld bytes", bytes, nit: I'd prefer just the facts :) "flushed %ld out ..." that is no 'We'. @@ +694,3 @@ + self->total_frames += num_frames; + if (available_bytes <= bytes) { + num_frames = available_bytes / GST_AUDIO_INFO_BPF (&self->ainfo); why is this updated here? @@ +711,3 @@ + } + available_bytes = gst_adapter_available (self->adapter); + GST_DEBUG_OBJECT (self, "We have %ld out of %ld bytes", nit: like above, maybe "Adapter contains %ld out ..."
Sorry it took so long - replying now! (In reply to Stefan Sauer (gstreamer, gtkdoc dev) from comment #23) > Review of attachment 306566 [details] [review] [review]: > > ::: gst/videoframe_audiolevel/gstvideoframe-audiolevel.c > @@ +43,3 @@ > +/* FIXME 0.11: suppress warnings for deprecated API such as GValueArray > + * with newer GLib versions (>= 2.31.0) */ > +#define GLIB_DISABLE_DEPRECATION_WARNINGS > > Can we use a GPtrArray instead. If this works well, we can also add this > code to GstLevel when we do gstreamer-2.x. The FIXME should also target 2.X > and not 0.11 :) Made it FIXME 2.0 for now :) > @@ +236,3 @@ > + gst_adapter_clear (self->adapter); > + g_queue_foreach (&self->vtimeq, (GFunc) g_free, NULL); > + g_queue_clear (&self->vtimeq); > > g_queue_free_full (&self->vtimeq, (GFunc) g_free); Cannot use this because the vtimeq is allocated in self and not created using g_queue_new. > > @@ +680,3 @@ > + guint num_frames = > + gst_util_uint64_scale (*vt0 - cur_time, rate, GST_SECOND); > + bytes = num_frames * GST_AUDIO_INFO_BPF (&self->ainfo); > > you could move this down where it is used (after the next condition) Actually it's needed here to calculate available_bytes, which is needed in the "if" condition. > @@ +689,3 @@ > + cur_time = *vt0; > + } else { > + GST_DEBUG_OBJECT (self, "We flush %ld out of %ld bytes", bytes, > > nit: I'd prefer just the facts :) "flushed %ld out ..." that is no 'We'. OK, fixed. > @@ +694,3 @@ > + self->total_frames += num_frames; > + if (available_bytes <= bytes) { > + num_frames = available_bytes / GST_AUDIO_INFO_BPF (&self->ainfo); > > why is this updated here? Oops. Nice catch, thanks :) > @@ +711,3 @@ > + } > + available_bytes = gst_adapter_available (self->adapter); > + GST_DEBUG_OBJECT (self, "We have %ld out of %ld bytes", > > nit: like above, maybe "Adapter contains %ld out ..." Fixed too, thanks :) Attaching new patch...
Created attachment 308787 [details] [review] Updated patch
Review of attachment 308787 [details] [review]: Regarding the tests, up to you. ::: tests/check/elements/videoframe-audiolevel.c @@ +413,3 @@ +GST_START_TEST (test_videoframe_audiolevel_16chan_1) +{ + channels = 16; If you use a structure for the parameters that you pass to test_videoframe_audiolevel_generic(), you'll get a warning if there is an uninitialized member, right now i'd be worried that in the future another param is added and not all tests are updated to set it.
Created attachment 313745 [details] [review] Updated patch: cleaned up unit tests Thanks for the comments. Added a set_default_params() function to set all defaults for the test parameters. As discussed on IRC, we have many flags, and initializing the struct wouldn't look very readable: TestingParams params = {40, 15, 16, 1, 0.0078125, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE}; The set_default_params() function though makes the code much more readable: we set all parameters to their defaults and initialize only the ones we need different.
Comment on attachment 313745 [details] [review] Updated patch: cleaned up unit tests Thanks for improving the tests. Lets get this in now.
commit 978bcd7181bf0f282cb17d2f5f6f1ac009508f37 Author: Vivia Nikolaidou <vivia@toolsonair.com> Date: Tue Apr 21 21:09:19 2015 +0300 alevel: New audio/video level element The videoframe-audiolevel element acts like a synchronized audio/video "level" element. For each video frame, it posts a level-style message containing the RMS value of the corresponding audio frames. This element needs both video and audio to pass through it. Furthermore, it needs a queue after its video source. https://bugzilla.gnome.org/show_bug.cgi?id=748259
Created attachment 316653 [details] [review] videoframe-audiolevel: Use appropriate printf format for gsize Since commit 978bcd7181bf0f282cb17d2f5f6f1ac009508f37 building plugins bad returns the following errors: /git/gst/videoframe_audiolevel/gstvideoframe-audiolevel.c: In function 'gst_videoframe_audiolevel_asink_chain': /git/gst/videoframe_audiolevel/gstvideoframe-audiolevel.c:691:9: error: format '%ld' expects argument of type 'long int', but argument 8 has type 'gsize' [-Werror=format=] GST_DEBUG_OBJECT (self, "Flushed %ld out of %ld bytes", bytes, ^ /git/gst/videoframe_audiolevel/gstvideoframe-audiolevel.c:691:9: error: format '%ld' expects argument of type 'long int', but argument 9 has type 'gsize' [-Werror=format=] /git/gst/videoframe_audiolevel/gstvideoframe-audiolevel.c:712:5: error: format '%ld' expects argument of type 'long int', but argument 8 has type 'gsize' [-Werror=format=] GST_DEBUG_OBJECT (self, "Adapter contains %ld out of %ld bytes", ^ /git/gst/videoframe_audiolevel/gstvideoframe-audiolevel.c:712:5: error: format '%ld' expects argument of type 'long int', but argument 9 has type 'gsize' [-Werror=format=] Use G_GSIZE_FORMAT format instead of %ld fixes the issue.
Comment on attachment 316653 [details] [review] videoframe-audiolevel: Use appropriate printf format for gsize Thanks for the patch, but I fixed that too already. Didn't see that here was a patch. Additionally I also fixed building of static plugins though: commit 92d926d7331bda02b2087cb1fb07d4e84bb196e3 Author: Sebastian Dröge <sebastian@centricular.com> Date: Wed Dec 2 14:35:22 2015 +0200 videoframe-audiolevel: Fix compilation of static plugin and some compiler warnings Use G_GSIZE_FORMAT for gsize instead of %ld and make sure that the plugin name is a valid C identifier, i.e. contains no spaces or dashes.