Bug 609801 – [volume] Use sample accurate property values if a controller is used

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 609801 - [volume] Use sample accurate property values if a controller is used


Summary:	[volume] Use sample accurate property values if a controller is used


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-base
Version:	git master
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	0.10.29
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2010-02-13 00:02 UTC by Sebastian Dröge (slomo)
Modified:	2010-03-16 14:28 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
API: Add helper function to split audio buffers into specific intervals (14.42 KB, patch) 2010-02-13 00:02 UTC, Sebastian Dröge (slomo)	none	Details \| Review
volume: Use audio buffer splitting helper functions (4.92 KB, patch) 2010-02-13 00:03 UTC, Sebastian Dröge (slomo)	none	Details \| Review
volume: Use audio buffer splitting helper functions (4.92 KB, patch) 2010-02-13 00:13 UTC, Sebastian Dröge (slomo)	none	Details \| Review
volume: If a controller is used, use sample accurate property values (20.72 KB, patch) 2010-02-17 18:21 UTC, Sebastian Dröge (slomo)	none	Details \| Review
volume: If a controller is used, use sample accurate property values (20.70 KB, patch) 2010-02-17 18:24 UTC, Sebastian Dröge (slomo)	none	Details \| Review
volume: If a controller is used, use sample accurate property values (17.62 KB, patch) 2010-02-17 18:31 UTC, Sebastian Dröge (slomo)	none	Details \| Review
volume: If a controller is used, use sample accurate property values (18.38 KB, patch) 2010-02-17 19:00 UTC, Sebastian Dröge (slomo)	none	Details \| Review
volume: If a controller is used, use sample accurate property values (17.56 KB, patch) 2010-02-18 10:22 UTC, Sebastian Dröge (slomo)	committed	Details \| Review
test.c (3.33 KB, text/plain) 2010-02-21 17:11 UTC, Sebastian Dröge (slomo)		Details
interpolationcontrolsource: Optimize get_value_array() (31.92 KB, patch) 2010-02-22 15:11 UTC, Sebastian Dröge (slomo)	none	Details \| Review
lfocontrolsource: Optimize get_value_array() (16.71 KB, patch) 2010-02-22 19:02 UTC, Sebastian Dröge (slomo)	committed	Details \| Review
interpolationcontrolsource: Optimize get_value_array() (30.80 KB, patch) 2010-02-22 19:02 UTC, Sebastian Dröge (slomo)	committed	Details \| Review

Description Sebastian Dröge (slomo) 2010-02-13 00:02:28 UTC

Hi,
the attached patches add some helper API to libgstaudio for splitting audio buffers into specific intervals. This is usually a good idea if GstController is used to have smoother changes of the controlled properties.

The current implementation of it counts the samples and makes sure that every N samples the properties are synced, while balancing the accumulative rounding errors. Buffer timestamps are then interpolated from the current buffer's start timestamp but are clamped to the current buffer's end timestamp. Future implementations can change this behaviour, for example by making sure that the timestamps are always at x, 2x, 3x, 4x, ...

The second patch uses this in the volume element.

Comment 1 Sebastian Dröge (slomo) 2010-02-13 00:02:50 UTC

Created attachment 153677 [details] [review]
API: Add helper function to split audio buffers into specific intervals

Comment 2 Sebastian Dröge (slomo) 2010-02-13 00:03:06 UTC

Created attachment 153678 [details] [review]
volume: Use audio buffer splitting helper functions

Comment 3 Sebastian Dröge (slomo) 2010-02-13 00:13:11 UTC

Created attachment 153680 [details] [review]
volume: Use audio buffer splitting helper functions

Comment 4 Wim Taymans 2010-02-17 09:26:02 UTC

we really don't want to do that.. If you want sample accurate interpollation we don't want to split the buffer on each sample. We should use the controller to generate an array of values for each sample that we need to modify (with some interpolation) and then combine the arrays (this also allows us to use efficient matrix multiply with sse or something).

Comment 5 Sebastian Dröge (slomo) 2010-02-17 18:21:37 UTC

Created attachment 154061 [details] [review]
volume: If a controller is used, use sample accurate property values

Fixes bug #609801.

Comment 6 Sebastian Dröge (slomo) 2010-02-17 18:22:39 UTC

(In reply to comment #5)
> Created an attachment (id=154061) [details] [review]
> volume: If a controller is used, use sample accurate property values
> 
> Fixes bug #609801.

What about this instead? ;)

Comment 7 Sebastian Dröge (slomo) 2010-02-17 18:24:10 UTC

Created attachment 154062 [details] [review]
volume: If a controller is used, use sample accurate property values

Fixes bug #609801.

Comment 8 Sebastian Dröge (slomo) 2010-02-17 18:31:25 UTC

Created attachment 154063 [details] [review]
volume: If a controller is used, use sample accurate property values

Fixes bug #609801.

Comment 9 Sebastian Dröge (slomo) 2010-02-17 19:00:09 UTC

Created attachment 154066 [details] [review]
volume: If a controller is used, use sample accurate property values

Fixes bug #609801.

Comment 10 Stefan Sauer (gstreamer, gtkdoc dev) 2010-02-17 19:05:07 UTC

(In reply to comment #4)
> we really don't want to do that.. If you want sample accurate interpollation we
> don't want to split the buffer on each sample. We should use the controller to
> generate an array of values for each sample that we need to modify (with some
> interpolation) and then combine the arrays (this also allows us to use
> efficient matrix multiply with sse or something).

We want sample accurate interpollation, but not a new control value per sample. The controller is for control curves and those cause to high overhead if done on sample level. Having a max controlrate of e.g. 50 Hz is covering already exotic use cases. Most use cases would still be fine with 2-10Hz.

Comment 11 Wim Taymans 2010-02-17 20:44:53 UTC

(In reply to comment #10)
> (In reply to comment #4)
> > we really don't want to do that.. If you want sample accurate interpollation we
> > don't want to split the buffer on each sample. We should use the controller to
> > generate an array of values for each sample that we need to modify (with some
> > interpolation) and then combine the arrays (this also allows us to use
> > efficient matrix multiply with sse or something).
> 
> We want sample accurate interpollation, but not a new control value per sample.
> The controller is for control curves and those cause to high overhead if done
> on sample level. Having a max controlrate of e.g. 50 Hz is covering already
> exotic use cases. Most use cases would still be fine with 2-10Hz.

I agree that always doing sample accurate interpollation is a bit over the top for
most use cases. The granularity should somehow be a decision of the application
using the controller. 

At the volume level, however, I believe it should be possible to request
an equal amount of control points as you have samples so that you can later
optimize the multiplication (in volume case) with some straightforward sse/mmx
paralel multiplies (as the latest patch seems to do). I don't think the volume
element should try to split buffers or mess with the controller too much.

How the controller then generates those points would depend on its
configuration (it could duplicate the same value N times or it could do a simple
linear interpollation between the calculated control points or...). 

It can't imagine that rendering a controller curve for N samples should be
such an expensive operation. Even less so when the algorithm is allowed to
take shortcuts based on the configured granularity. I might be wrong, though...

Comment 12 Stefan Sauer (gstreamer, gtkdoc dev) 2010-02-17 21:14:16 UTC

Now more comments on the actual patch - it is nicely done. But honestly I don't think we want to do it like this for audio. For video we can sync on each frame. For audio we want to sync at a similar interval.

process_controlled_*
...
  for (i = 0; i < num_samples; i++) {
    for (j = 0; j < channels; j++) {
      if (*mute) {
        *data++ = 0;
      } else {
        gdouble val = *data * (*volume) + 0.5;

        *data++ = (gint8) CLAMP (val, VOLUME_MIN_INT8, VOLUME_MAX_INT8);
      }
    }
    volume++;
    mute++;
  }

lets at least move the "if (*mute)" check out of the channel loop. Otherwise
this is okay. Compared to the uncontrolled loops we have the overhead for
nested loops and interating over 3 arrays. In addition we have the overhead for
filling the 2 control value arrays all the time. SO the controller overhead is larger than the actual operation.

I see the control-value array functionality more suitable for fetching the
control curse for display purposes in the UI.

in volume_before_transform:

* shouldn't the GstControlSource object be taken in e.g. volume_setup() - this doesn't change while running.

* also filling the self->mutes/volumes with default values needs to be done when the size has changes and not on every buffer.

Comment 13 Stefan Sauer (gstreamer, gtkdoc dev) 2010-02-17 21:24:22 UTC

In reply to comment #11. I agree that if the inner loop is multiplying the samples with a const value or with one vector, then the speed difference might be acceptable. To get there we would need:
* process_controlled() to take only one volume array
  * thus if mute is also controlled, merge both: volume=(1-mute)*volume;
* some means to avoid updating control value array unless there is a change

Comment 14 Sebastian Dröge (slomo) 2010-02-18 08:13:22 UTC

(In reply to comment #12)
> Now more comments on the actual patch - it is nicely done. But honestly I don't
> think we want to do it like this for audio. For video we can sync on each
> frame. For audio we want to sync at a similar interval.
> 
> process_controlled_*
> ...
>   for (i = 0; i < num_samples; i++) {
>     for (j = 0; j < channels; j++) {
>       if (*mute) {
>         *data++ = 0;
>       } else {
>         gdouble val = *data * (*volume) + 0.5;
> 
>         *data++ = (gint8) CLAMP (val, VOLUME_MIN_INT8, VOLUME_MAX_INT8);
>       }
>     }
>     volume++;
>     mute++;
>   }
> 
> lets at least move the "if (*mute)" check out of the channel loop. Otherwise
> this is okay. Compared to the uncontrolled loops we have the overhead for
> nested loops and interating over 3 arrays. In addition we have the overhead for
> filling the 2 control value arrays all the time. SO the controller overhead is
> larger than the actual operation.

Yes, that can be optimized of course. We could even get rid of the mute array by first combining it with the volume array.

> I see the control-value array functionality more suitable for fetching the
> control curse for display purposes in the UI.
> 
> in volume_before_transform:
> 
> * shouldn't the GstControlSource object be taken in e.g. volume_setup() - this
> doesn't change while running.

It doesn't? It can change at every point in time unfortunately, it could even change while we're processing a buffer. Nothing is preventing the application from doing that.

> * also filling the self->mutes/volumes with default values needs to be done
> when the size has changes and not on every buffer.

Because the control source can change for every buffer it is necessary unfortunately. For one buffer both could be there, for the next only the volume csource could be there.

(In reply to comment #13)
> In reply to comment #11. I agree that if the inner loop is multiplying the
> samples with a const value or with one vector, then the speed difference might
> be acceptable. To get there we would need:
> * process_controlled() to take only one volume array
>   * thus if mute is also controlled, merge both: volume=(1-mute)*volume;

Yes, that would probably be a good idea, see above ;) If the generic approach here is accepted I can optimize this kind of things

> * some means to avoid updating control value array unless there is a change

the GstController API definitely needs some improvements and cleanups, yes ;) Most of that can only be done in 0.11 though...


Another way of lowering the overhead would be to generate a property value for every second sample or for every n-samples. That could be combined with the control-rate of the GstController somehow.

But it's definitely better to do this way than to sync the properties (including notify signals and stuff) multiple times for every buffer, that has a bigger overhead ;)

(In reply to comment #11)

> It can't imagine that rendering a controller curve for N samples should be
> such an expensive operation. Even less so when the algorithm is allowed to
> take shortcuts based on the configured granularity. I might be wrong, though...

Well, it's quite expensive. Let's take the GstInterpolationControlSource in linear interpolation mode for example (cubic is even more expensive of course, same goes for the GstLFOControlSource). For every position it currently searches inside a binary tree (GSequence to be exact), gets the current and next value, interpolates, returns it. Doing that for every sample is probably not the best idea :) Yes, for generating a row of values there could be a single binary tree search and incremental searches for the next value but that's still quite expensive.

Comment 15 Sebastian Dröge (slomo) 2010-02-18 10:22:06 UTC

Created attachment 154119 [details] [review]
volume: If a controller is used, use sample accurate property values

Fixes bug #609801.

Comment 16 Sebastian Dröge (slomo) 2010-02-18 10:23:26 UTC

(In reply to comment #15)
> Created an attachment (id=154119) [details] [review]
> volume: If a controller is used, use sample accurate property values
> 
> Fixes bug #609801.

This optimizes a bit already. Still I don't know how this should be properly optimized by MMX/SSE because there's a variable number of channels and double-to-something conversions are needed. Wim, Stefan?

Comment 17 Stefan Sauer (gstreamer, gtkdoc dev) 2010-02-19 19:55:55 UTC

This looks nicer. For the future, I wonder if we can add some shortcut to the controller api where it would tell, that there is no control-change in the requested array (and thus all fields would have the same value).

Comment 18 Sebastian Dröge (slomo) 2010-02-21 08:19:54 UTC

Ok, shall I push this after 0.10.27 release or do you want some things fixed first? Like optimizing the get_value_array() of the interpolation control source?

Comment 19 Stefan Sauer (gstreamer, gtkdoc dev) 2010-02-21 11:27:05 UTC

Sebastian, okay to push from my side. If its not asked too much could you do a before and after performance meassurement, so that we get an idea of huw big the overhead is?

Comment 20 Sebastian Dröge (slomo) 2010-02-21 17:11:24 UTC

Created attachment 154331 [details]
test.c

Ok, here's a small benchmark that processes 400s of audio, runs this 5 times and calculates the average and standard deviation. It's about 15.5 times slower now :)

0.400000 -> 6.198000

Comment 21 Sebastian Dröge (slomo) 2010-02-21 18:12:43 UTC

94% of the time is spent in gst_control_source_get_value_array(), most of them time in g_sequence_search(). That's easy to optimize and I'll do that before pushing this.

Comment 22 Sebastian Dröge (slomo) 2010-02-22 09:26:39 UTC

Ok, I have it down to 1.37 times slower for none interpolation. I guess that's acceptable :)

I'll optimize it everywhere else and do some benchmarks for other interpolation modes.

Comment 23 Sebastian Dröge (slomo) 2010-02-22 15:11:14 UTC

Created attachment 154404 [details] [review]
interpolationcontrolsource: Optimize get_value_array()

This makes it >10x faster if more than a single value is requested
by not searching in the GSequence for every value and converting
the value from GValue to the real value type.

Comment 24 Sebastian Dröge (slomo) 2010-02-22 19:02:30 UTC

Created attachment 154426 [details] [review]
lfocontrolsource: Optimize get_value_array()

Don't convert from GValue to the actual type for every single
value.

Comment 25 Sebastian Dröge (slomo) 2010-02-22 19:02:39 UTC

Created attachment 154427 [details] [review]
interpolationcontrolsource: Optimize get_value_array()

This makes it >10x faster if more than a single value is requested
by not searching in the GSequence for every value and converting
the value from GValue to the real value type.