Bug 748259 – New audio/video level element

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 748259 - New audio/video level element


Summary:	New audio/video level element


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-bad
Version:	git master
Hardware:	Other Linux

Importance:	Normal enhancement
Target Milestone:	1.7.1
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2015-04-21 18:28 UTC by Vivia Nikolaidou
Modified:	2015-12-02 12:41 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Patch for the new element (53.62 KB, patch) 2015-04-21 18:28 UTC, Vivia Nikolaidou	none	Details \| Review
Patch for the new element (53.81 KB, patch) 2015-04-21 19:08 UTC, Vivia Nikolaidou	none	Details \| Review
Patch for the new element (56.45 KB, patch) 2015-07-01 21:51 UTC, Vivia Nikolaidou	none	Details \| Review
Updated patch (56.38 KB, patch) 2015-08-05 13:15 UTC, Vivia Nikolaidou	none	Details \| Review
Updated patch: cleaned up unit tests (51.88 KB, patch) 2015-10-20 13:44 UTC, Vivia Nikolaidou	committed	Details \| Review
videoframe-audiolevel: Use appropriate printf format for gsize (2.69 KB, patch) 2015-12-02 12:11 UTC, Athanasios Oikonomou	committed	Details \| Review

Description Vivia Nikolaidou 2015-04-21 18:28:56 UTC

Created attachment 302092 [details] [review]
Patch for the new element

The alevel element acts like a synchronized audio/video "level" element.
For each video frame, it posts a level-style message containing the RMS value of the corresponding audio frames.
This element needs both video and audio to pass through it.
Furthermore, it needs a queue after its video source.

Comment 1 Vivia Nikolaidou 2015-04-21 19:08:59 UTC

Created attachment 302093 [details] [review]
Patch for the new element

Comment 2 Sebastian Dröge (slomo) 2015-04-21 19:13:21 UTC

Review of attachment 302093 [details] [review]:

::: gst/alevel/gstalevel.c
@@ +6,3 @@
+ * Copyright (C) <1999> Erik Walthinsen <omega@cse.ogi.edu>
+ * Copyright (C) 2000,2001,2002,2003,2005
+ *           Thomas Vander Stichele <thomas at apestaart dot org>

This does not really reflect anymore who owns copyright on the gstlevel.c code :) It was mostly rewritten at least once since 2005

What should we do about that?

Comment 3 Stefan Sauer (gstreamer, gtkdoc dev) 2015-05-20 18:52:40 UTC

What is the use case for it?

Comment 4 Heinrich Fink 2015-06-12 09:41:19 UTC

We use it for displaying audio levels in our broadcast video application for incoming streams / observing mixer outputs. Our customers would expect that audio level is displayed on a per-video-frame basis. Imagine a test signal that has a single white frame, and exactly in the duration of the frame a test sine (silence otherwise). The user would expect to see an up-rise of the audio levels exactly for the white frame.

This couldn't be done with the normal level element, so Vivia wrote the alevel element.

Also: While you can set a fixed window with "level", hoping to be close to the frame boundaries, you would have problems with NTSC-like frame rates (e.g. 30000/1001): Here, audio samples per video frame are a repeated pattern of 1602/1601/1602/1601/1602. That's properly handled by alevel.

Comment 5 Stefan Sauer (gstreamer, gtkdoc dev) 2015-06-12 13:46:05 UTC

While new elements start their life in -bad, I wonder if for this we can add this to the existing level plugin and make them share code.

Another usecase would be to sync the audio-levels to the UI display refresh rate. But thats even harder with all the GSync and FreeSync stuff available now.

@Slomo, year, I've touch this a lot too (since I heavily use it):

> git ls-tree -r -z --name-only HEAD -- *.{c,h} | xargs -0 -n1 git blame --line-porcelain HEAD |grep  "^author "|sort|uniq -c|sort -nr
    185 author Stefan Kost
    179 author Andy Wingo
    165 author Thomas Vander Stichele
    156 author Stefan Sauer
     60 author Wim Taymans
     43 author Tim-Philipp Müller
     26 author Peteris Krisjanis
     20 author Michael Smith
     14 author Mark Nauwelaerts
     13 author Johan Dahlin
     12 author David Schleef
      8 author Sebastian Dröge
      4 author Jan Schmidt
      3 author Mike Smith
      2 author Sébastien Moutte
      2 author Olivier Crête
      2 author Havard Graff
      2 author Edward Hervey
      1 author Benjamin Otte

Seems, we should atleast drop erik :)

Comment 6 Stefan Sauer (gstreamer, gtkdoc dev) 2015-06-12 13:48:31 UTC

I wonder if a demuxer could send video frame data as meta on the audio stream. Then we could add an option to "level" like "dynamic-window-from-meta" but with a shorter name :).

Comment 7 Vivia Nikolaidou 2015-06-12 14:53:43 UTC

@Stefan: It's not easy to make this share code with the existing level element. One reason is that alevel requires both audio and video to pass through it. The other is the completely different timing. Basically, the way they are now they only share part of the calculations, which is two functions IIRC. If they became merged (video pad made request, and requesting it would raise a flag?), looks like the biggest part of the alevel code would become nested under an if{...} ... so I think I wouldn't like that... @slomo what do you think? :)

Unless we ... patch all the demuxers?... and make them send metadata, that would solve a lot of tricky timing issues. However, imagine a scenario where your data doesn't come from one single muxer but from several different video sources which then pass through a compositor (which is part of what Heinrich describes). Then we'd also need to patch compositor. And I really don't know what more to patch, looks to me like we'd need to patch all the things [see meme] :)

Comment 8 Vivia Nikolaidou 2015-06-12 15:34:28 UTC

Pasting from IRC before it's forgotten:

16:55 < slomo> vivia: i agree, yes
16:55 < slomo> it's all bad ;)

Comment 9 Stefan Sauer (gstreamer, gtkdoc dev) 2015-06-12 16:07:39 UTC

@Vivia, I meant sharing as probably having some of the inner code in a separate file and then two elements in one plugin using the same inner loop as a helper.

The suggestion with the demuxer is more like a long term thing of course. It is definitely overkill, but I like to think about it and see if we can find more uses for it. I mean now its level, next element we duplicate is spectrum. If we have 10 elements duplicated we need a more generic approach :)

Comment 10 Stefan Sauer (gstreamer, gtkdoc dev) 2015-06-19 06:26:04 UTC

More ideas.

1) What about changing this element into one called 'audiosync' / 'audioadapter' / ... . The element would get video and audio ins and it would push all audio into an GstAdapter and foreach videoframe push the matching audio.
On the 'level' and 'spectrum' elements, we can add a flag to send a message-per-buffer, since now the buffers perfectly cover a frame. We should probably add a new property for this (and not use a special value like interval == 0), not sure what others think.

2) Another option could be to make the 'interval' property controlable and to have a 'videorate-controller' element that exports a control-source that emits the video rate. Not sure if we can pigy-back that feature into an existing element that one would use anyway (encoder baseclass?).

Comment 11 Sebastian Dröge (slomo) 2015-06-22 10:36:55 UTC

1) sounds like a completely different problem, and 2) seems almost impossible to get right while keeping sync properly between the two elements.


Sharing the level analysis code would be a good idea though.

Comment 12 Stefan Sauer (gstreamer, gtkdoc dev) 2015-06-22 11:19:28 UTC

1) means that the audio blocks would be matching the frame durations. Together with a flag that tells level to emit a message for each frame it would achieve the desired effect too. What I dislike about the approach after some more thoughts is that the size of audio buffers is not guaranteed to be kept and thus one would need to put such reframing element right before the analyzer.

2) right it is not trivial, the video element that provides the control-source would need to have seen the video-frames before the audio element calls sync(pts).

As said, I don't want to block this, just tried to see if there is a more scalable approach. To proceed I see two options:

1) We can start this as usual in bad and when moving to good merge it into the level plugin and try to share some code.
2) Merge it into good right away.

Comment 13 Vivia Nikolaidou 2015-06-22 19:20:11 UTC

@Stefan: so you think that it could be merged into good already if we could manage to make it share code with the normal level element?

Also, could one of you please explain how the code sharing would be? Where would the file with the "shared" code reside? From what I understand, it's too complicated to do if one plugin is in -good and the other in -bad, right?

Comment 14 Stefan Sauer (gstreamer, gtkdoc dev) 2015-06-22 20:19:18 UTC

Just a quick update from irc discussion. Lets merge to bad first after we have a better name (I suggest "vsync-level"). Imho the trouble with the name is already a sign that we should find a better solution for this feature.

Another option for not having the separate element would be to add a request sink-pad for video to level, that if used would override the interval property. But I think it is not very discoverable and vivia also mentioned that this is not exactly how this new element works right now.

Comment 15 Vivia Nikolaidou 2015-06-24 13:57:53 UTC

There seems to be a consensus for "videoframe-audiolevel" as a name on IRC. I'll change it to that if nobody raises any concerns for a few days :)

Comment 16 Reynaldo H. Verdejo Pinochet 2015-06-24 17:13:48 UTC

Review of attachment 302093 [details] [review]:

Just cosmetic/formatting remarks, nothing blocking but
worth considering

::: gst/alevel/gstalevel.c
@@ +331,3 @@
+  GstALevel *self = GST_ALEVEL (parent);
+  GST_LOG_OBJECT (pad, "Got %s event", GST_EVENT_TYPE_NAME (event));
+

Maybe drop the braces on the case statements bellow

@@ +375,3 @@
+  GstALevel *self = GST_ALEVEL (parent);
+  GST_LOG_OBJECT (pad, "Got %s event", GST_EVENT_TYPE_NAME (event));
+

ditto

@@ +434,3 @@
+      self->total_frames = 0;
+      if (self->CS)
+        g_free (self->CS);

I also prefer not to use braces on single statement
if blocks, maybe you can check the rest of the code
and make it consistent? There are multiple times in
which you do

@@ +536,3 @@
+  g_mutex_lock (&self->mutex);
+  self->vsegment.position = timestamp;
+  duration = GST_BUFFER_DURATION (inbuf);

Like bellow but happens several times

@@ +720,3 @@
+        available_bytes, bytes);
+
+    if (available_bytes >= bytes) {

proly if available_bytes < bytes with a goto done to #746

::: tests/check/elements/alevel.c
@@ +214,3 @@
+    } else if (i == 4 && audio_nondiscont) {
+      timestamp += 30 * GST_MSECOND;
+    }

See previous comment on {}

@@ +259,3 @@
+      } else if (video_overlaps) {
+        timestamp -= 10 * GST_MSECOND;
+      }

ditto

@@ +272,3 @@
+on_message (GstBus * bus, GstMessage * message, gpointer user_data)
+{
+  if (message->type == GST_MESSAGE_ELEMENT) {

if message->type != GST_MESSAGE_ELEMENT with an immediate
return GST_BUS_PASS or alternatively a goto to #312

@@ +282,3 @@
+    guint i;
+
+    if (strcmp (name, "alevel") == 0) {

Same as above with != 0

Comment 17 Vivia Nikolaidou 2015-07-01 21:51:00 UTC

Created attachment 306566 [details] [review]
Patch for the new element

Hi again,

Thank you for your comments, and thanks Reynaldo for the review!

I am attaching the newest version, let me know what you think now :)

Comment 18 Luis de Bethencourt 2015-07-02 10:12:43 UTC

There is a trailing whitespace in the first line of gst/videoframe_audiolevel/gstvideoframe-audiolevel.h

It is printing a lot of "rms=(GValueArray)NULL" for me.

Interesting new element :)

Comment 19 Vivia Nikolaidou 2015-07-02 10:18:31 UTC

Thanks for the trailing whitespace, nice catch!

As for the NULL, it's expected if you just g_print it. The RMS value is actually a GValueArray which can't be trivially printed as string. You need to unfold the GValueArray like in the unit test: gst/videoframe_audiolevel/gstvideoframe-audiolevel.c

Comment 20 Luis de Bethencourt 2015-07-02 10:19:50 UTC

Right! The problem is that gst-launch prints all messages received.

It isn't a big problem and it isn't your problem. Your element is intended to be used inside an application that can smartly catch those messages.

+1 from me.

Comment 21 Vivia Nikolaidou 2015-07-02 10:22:54 UTC

Cool, thanks :)

So I can submit a new patch just for this trailing whitespace, but I'll wait for approval from others first - if there are more things to fix, I'll just do everything at once.

Comment 22 Luis de Bethencourt 2015-07-02 10:26:35 UTC

No worries. Submitter can remove that quickly before sending the current version.

Plus, as you mentioned, it is in a header file so gst-indent won't block it.

Good work.

Comment 23 Stefan Sauer (gstreamer, gtkdoc dev) 2015-07-02 12:32:05 UTC

Review of attachment 306566 [details] [review]:

::: gst/videoframe_audiolevel/gstvideoframe-audiolevel.c
@@ +43,3 @@
+/* FIXME 0.11: suppress warnings for deprecated API such as GValueArray
+ * with newer GLib versions (>= 2.31.0) */
+#define GLIB_DISABLE_DEPRECATION_WARNINGS

Can we use a GPtrArray instead. If this works well, we can also add this code to GstLevel when we do gstreamer-2.x. The FIXME should also target 2.X and not 0.11 :)

@@ +236,3 @@
+      gst_adapter_clear (self->adapter);
+      g_queue_foreach (&self->vtimeq, (GFunc) g_free, NULL);
+      g_queue_clear (&self->vtimeq);

g_queue_free_full (&self->vtimeq, (GFunc) g_free);

also below. Since elements are fixed size and potentially allocated and freed a lot, you could consider using g_slice here.

@@ +680,3 @@
+      guint num_frames =
+          gst_util_uint64_scale (*vt0 - cur_time, rate, GST_SECOND);
+      bytes = num_frames * GST_AUDIO_INFO_BPF (&self->ainfo);

you could move this down where it is used (after the next condition)

@@ +689,3 @@
+        cur_time = *vt0;
+      } else {
+        GST_DEBUG_OBJECT (self, "We flush %ld out of %ld bytes", bytes,

nit: I'd prefer just the facts :) "flushed %ld out ..." that is no 'We'.

@@ +694,3 @@
+        self->total_frames += num_frames;
+        if (available_bytes <= bytes) {
+          num_frames = available_bytes / GST_AUDIO_INFO_BPF (&self->ainfo);

why is this updated here?

@@ +711,3 @@
+    }
+    available_bytes = gst_adapter_available (self->adapter);
+    GST_DEBUG_OBJECT (self, "We have %ld out of %ld bytes",

nit: like above, maybe "Adapter contains %ld out ..."

Comment 24 Vivia Nikolaidou 2015-08-05 13:14:58 UTC

Sorry it took so long - replying now!

(In reply to Stefan Sauer (gstreamer, gtkdoc dev) from comment #23)
> Review of attachment 306566 [details] [review] [review]:
> 
> ::: gst/videoframe_audiolevel/gstvideoframe-audiolevel.c
> @@ +43,3 @@
> +/* FIXME 0.11: suppress warnings for deprecated API such as GValueArray
> + * with newer GLib versions (>= 2.31.0) */
> +#define GLIB_DISABLE_DEPRECATION_WARNINGS
> 
> Can we use a GPtrArray instead. If this works well, we can also add this
> code to GstLevel when we do gstreamer-2.x. The FIXME should also target 2.X
> and not 0.11 :)

Made it FIXME 2.0 for now :)

> @@ +236,3 @@
> +      gst_adapter_clear (self->adapter);
> +      g_queue_foreach (&self->vtimeq, (GFunc) g_free, NULL);
> +      g_queue_clear (&self->vtimeq);
> 
> g_queue_free_full (&self->vtimeq, (GFunc) g_free);

Cannot use this because the vtimeq is allocated in self and not created using g_queue_new.

> 
> @@ +680,3 @@
> +      guint num_frames =
> +          gst_util_uint64_scale (*vt0 - cur_time, rate, GST_SECOND);
> +      bytes = num_frames * GST_AUDIO_INFO_BPF (&self->ainfo);
> 
> you could move this down where it is used (after the next condition)

Actually it's needed here to calculate available_bytes, which is needed in the "if" condition.

> @@ +689,3 @@
> +        cur_time = *vt0;
> +      } else {
> +        GST_DEBUG_OBJECT (self, "We flush %ld out of %ld bytes", bytes,
> 
> nit: I'd prefer just the facts :) "flushed %ld out ..." that is no 'We'.

OK, fixed.

> @@ +694,3 @@
> +        self->total_frames += num_frames;
> +        if (available_bytes <= bytes) {
> +          num_frames = available_bytes / GST_AUDIO_INFO_BPF (&self->ainfo);
> 
> why is this updated here?

Oops. Nice catch, thanks :)
 
> @@ +711,3 @@
> +    }
> +    available_bytes = gst_adapter_available (self->adapter);
> +    GST_DEBUG_OBJECT (self, "We have %ld out of %ld bytes",
> 
> nit: like above, maybe "Adapter contains %ld out ..."

Fixed too, thanks :)

Attaching new patch...

Comment 25 Vivia Nikolaidou 2015-08-05 13:15:36 UTC

Created attachment 308787 [details] [review]
Updated patch

Comment 26 Stefan Sauer (gstreamer, gtkdoc dev) 2015-10-19 13:41:04 UTC

Review of attachment 308787 [details] [review]:

Regarding the tests, up to you.

::: tests/check/elements/videoframe-audiolevel.c
@@ +413,3 @@
+GST_START_TEST (test_videoframe_audiolevel_16chan_1)
+{
+  channels = 16;

If you use a structure for the parameters that you pass to test_videoframe_audiolevel_generic(), you'll get a warning if there is an uninitialized member, right now i'd be  worried that in the future another param is added and not all tests are updated to set it.

Comment 27 Vivia Nikolaidou 2015-10-20 13:44:18 UTC

Created attachment 313745 [details] [review]
Updated patch: cleaned up unit tests

Thanks for the comments. Added a set_default_params() function to set all defaults for the test parameters.

As discussed on IRC, we have many flags, and initializing the struct wouldn't look very readable:
TestingParams params = {40, 15, 16, 1, 0.0078125, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE};

The set_default_params() function though makes the code much more readable: we set all parameters to their defaults and initialize only the ones we need different.

Comment 28 Stefan Sauer (gstreamer, gtkdoc dev) 2015-12-02 10:52:09 UTC

Comment on attachment 313745 [details] [review]
Updated patch: cleaned up unit tests

Thanks for improving the tests. Lets get this in now.

Comment 29 Sebastian Dröge (slomo) 2015-12-02 10:56:35 UTC

commit 978bcd7181bf0f282cb17d2f5f6f1ac009508f37
Author: Vivia Nikolaidou <vivia@toolsonair.com>
Date:   Tue Apr 21 21:09:19 2015 +0300

    alevel: New audio/video level element
    
    The videoframe-audiolevel element acts like a synchronized audio/video "level"
    element. For each video frame, it posts a level-style message containing the
    RMS value of the corresponding audio frames. This element needs both video and
    audio to pass through it. Furthermore, it needs a queue after its video
    source.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=748259

Comment 30 Athanasios Oikonomou 2015-12-02 12:11:59 UTC

Created attachment 316653 [details] [review]
videoframe-audiolevel: Use appropriate printf format for  gsize

Since commit 978bcd7181bf0f282cb17d2f5f6f1ac009508f37 building
plugins bad returns the following errors:

/git/gst/videoframe_audiolevel/gstvideoframe-audiolevel.c: In function 'gst_videoframe_audiolevel_asink_chain':
/git/gst/videoframe_audiolevel/gstvideoframe-audiolevel.c:691:9: error: format '%ld' expects argument of type 'long int', but argument 8 has type 'gsize' [-Werror=format=]
         GST_DEBUG_OBJECT (self, "Flushed %ld out of %ld bytes", bytes,
         ^
/git/gst/videoframe_audiolevel/gstvideoframe-audiolevel.c:691:9: error: format '%ld' expects argument of type 'long int', but argument 9 has type 'gsize' [-Werror=format=]
/git/gst/videoframe_audiolevel/gstvideoframe-audiolevel.c:712:5: error: format '%ld' expects argument of type 'long int', but argument 8 has type 'gsize' [-Werror=format=]
     GST_DEBUG_OBJECT (self, "Adapter contains %ld out of %ld bytes",
     ^
/git/gst/videoframe_audiolevel/gstvideoframe-audiolevel.c:712:5: error: format '%ld' expects argument of type 'long int', but argument 9 has type 'gsize' [-Werror=format=]

Use G_GSIZE_FORMAT format instead of %ld fixes the issue.

Comment 31 Sebastian Dröge (slomo) 2015-12-02 12:41:44 UTC

Comment on attachment 316653 [details] [review]
videoframe-audiolevel: Use appropriate printf format for  gsize

Thanks for the patch, but I fixed that too already. Didn't see that here was a patch.

Additionally I also fixed building of static plugins though:

commit 92d926d7331bda02b2087cb1fb07d4e84bb196e3
Author: Sebastian Dröge <sebastian@centricular.com>
Date:   Wed Dec 2 14:35:22 2015 +0200

    videoframe-audiolevel: Fix compilation of static plugin and some compiler warnings
    
    Use G_GSIZE_FORMAT for gsize instead of %ld and make sure that the plugin name
    is a valid C identifier, i.e. contains no spaces or dashes.