Bug 404646 - [audiofx] Compressor/Expander element
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
Version: git master
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: 0.10.6
Assigned To: Sebastian Dröge (slomo)
QA Contact: GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2007-02-05 16:27 UTC by Sebastian Dröge (slomo)
Modified: 2007-03-08 10:04 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
audiodynamic.diff (24.66 KB, patch)
2007-03-06 17:25 UTC, Sebastian Dröge (slomo)
committed

Description Sebastian Dröge (slomo) 2007-02-05 16:27:01 UTC
+++ This bug was initially created as a clone of Bug #340362 +++

It would be nice if a compressor element could be added to the audiofx plugin. This was already partly discussed in #340362, but that bug was really about something else.

Below are the essential comments from that bug:



Rene Stadler:

-6 dB hard limiting means applying a transfer function like

   output = tanh ((input - 0.5) / 0.5) * 0.5 + 0.5

to all values with input > 0.5, and a corresponding negative variant to values
< -0.5.

This will smoothly compress values above 0.5 (ca. -6 dB), which has the
following advantages:

  - You can input values >1.0 and still get a good sounding output (unless you
really overdo it).  This is of use for ReplayGain if you disable clipping
prevention (which you don't have to).

  - The dynamic range gets reduced (input 1.0 -> output  ca. 0.88).  This
allows you, for example, to listen to classical music through loudspeakers in
front of a computer (noisy environment), or to play music at a party.  This
is entirely unrelated to ReplayGain.

Stefan Kost:
I can second René's comments. Dynamic compression, expansion and limiting can
also be seen as a lookup-table transformation in->out. A compressor starts to
work on a signal above a threshold and then compresses the signal. A hard
limiter clamps the signal above the threshold to the threshold. A compressor
compresses signals above the threshold, e.g. a compressor with the ratio 2:1
would do

out = thresh + ((in-thresh)/2.0)

That would be a compressor with hard-knee characteristics. The other behaviour
is soft-knee, which would use a smoothly curved transition when going from linear
to compressed.
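
The hard limiter and the hard-knee formula above could be sketched in C as follows. The function names are hypothetical, and the handling of negative samples by mirroring is an assumption. Here `ratio_n` is the N of an "N:1" ratio, so 2.0 corresponds to the 2:1 example.

```c
/* Hard limiter sketch: clamp everything beyond the threshold to it. */
double
hard_limit (double in, double threshold)
{
  if (in > threshold)
    return threshold;
  if (in < -threshold)
    return -threshold;
  return in;
}

/* Hard-knee compressor sketch: the part of the signal that exceeds the
 * threshold is divided by ratio_n, everything else is left untouched.
 * Precondition: ratio_n > 0.0. */
double
hard_knee_compress (double in, double threshold, double ratio_n)
{
  if (in > threshold)
    return threshold + (in - threshold) / ratio_n;
  if (in < -threshold)
    return -threshold + (in + threshold) / ratio_n;
  return in;
}
```

For a threshold of 0.5 and a 2:1 ratio, an input of 1.0 comes out as 0.75, while the limiter returns 0.5 for the same input.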


Rene Stadler:
http://bugzilla.gnome.org/attachment.cgi?id=73746&action=view

Graphs of different transfer functions

A picture says more than a thousand words:

 1. Red graph: Clipping (as originally proposed by Артём)
 2. Blue graph: Linear filter (like Stefan mentioned)
 3. Green graph: Smooth hard limiting (like I mentioned)

All filters have a threshold of 0.5 (ca. -6 dB), that is, they leave values
below 0.5 (above -0.5) intact.  The picture shows clearly why, for clipping, the
waveform gets irreversibly distorted if it exceeds the threshold.  I mention
this because some people seem to have the impression that clipping can somehow
be undone by reducing the volume or something.  It cannot.  Clipping is also
done implicitly by audioconvert when it converts from float to integer, which
is most often done to finally send the data to the sound card.  That's the red
area on the graphs; it can never be reached in integer formats because it lies
beyond what is defined as the maximum.
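
To illustrate why clipping is implicit in a float-to-integer conversion, here is a small C sketch. It is not audioconvert's actual code; the function name and the 32767 scale factor for signed 16-bit samples are assumptions.

```c
#include <stdint.h>

/* Float to signed 16-bit conversion sketch (hypothetical helper).
 * Anything outside [-1.0, 1.0] has no integer representation, so it is
 * clamped first. The information above full scale is lost at this point. */
int16_t
float_to_s16 (float sample)
{
  if (sample > 1.0f)
    sample = 1.0f;
  if (sample < -1.0f)
    sample = -1.0f;
  /* The cast truncates towards zero. */
  return (int16_t) (sample * 32767.0f);
}
```

Inputs of 1.0 and 1.5 both map to 32767, which is why lowering the volume after the conversion cannot bring the original waveform back.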

The second one is a real signal compressor that gives good results instead of
useless distortion.  It's exactly what Stefan said: Basically a selective
volume change that only operates on values exceeding a threshold.  It is very
efficient (just a multiplication in addition to the switching), but that comes
at a price: Trained ears (which I don't claim to have) seem to be able to hear
the point of the threshold because of the rather abrupt change that exceeding
the threshold introduces.  This depends on the input data and especially the
ratio of course.

The third one avoids that by giving smooth values instead.  These come at the
price of higher computational complexity (a tanhf in addition to the switching
plus scaling around a bit).


Stefan Kost:
In pro-audio devices compressors have four parameters:
* attack (let's omit this for now)
* ratio
* threshold (the point where the compression kicks in)
* characteristics = {soft-knee, hard-knee}

I would use threshold (double), characteristics (enum) and ratio (fraction)
as gobject properties. I am not sure whether the ratio has to be a fraction or
whether it can't simply be a double too.
Comment 1 Sebastian Dröge (slomo) 2007-02-05 16:43:22 UTC
Stefan, what would ratio correspond to in the soft-knee case? The limit, i.e. 1.0 in the 0.5 case mentioned above?
Comment 2 Stefan Sauer (gstreamer, gtkdoc dev) 2007-02-10 20:42:48 UTC
"... When the Compressor is set for Hard knee, the compression ratio applies only to signals above the threshold level. If the Compressor is set for Soft knee, the compression ratio gradually increases from 1:1 to the currently selected ratio over a range of approximately 5 dB, so that the transition from uncompressed to compressed is more gradual. The difference between Hard Knee and Soft Knee is more obvious at high compression ratios. Once the input signal crosses the Threshold, the unit will compress the signal at the full ratio level. ..."
-- http://alesis.com/support/notes/Signal_Processing/Compterm.html
Comment 3 Sebastian Dröge (slomo) 2007-03-01 23:18:53 UTC
Ok, roadmap for now is the following:
1) Implement an element with the following properties:
   - characteristics = {hardknee,softknee}
   - mode = {compressor,expander}
   - ratio = [0.0;G_MAXDOUBLE]
   - threshold = [0.0;1.0]

The soft-knee solution will most probably use a bezier curve in the beginning, which will have a slope of 1 at the $threshold and a slope of $ratio at the maximum (i.e. at 0 for the expander, at 1 for the compressor).
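
As an illustration of the two slope conditions for the compressor case, the sketch below uses a plain quadratic rather than an explicit bezier curve; that substitution, the function name, and the fixed maximum of 1.0 are assumptions. Requiring f(t) = t, f'(t) = 1 and f'(1) = r for threshold t and ratio r gives f(x) = x + a (x - t)^2 with a = (r - 1) / (2 (1 - t)). Here the ratio is a slope, so 0.5 halves the slope at full scale.

```c
/* Soft-knee compressor sketch (hypothetical helper, not from the patch).
 * Below the threshold the sample is unchanged. Above it, a quadratic is
 * used whose slope is 1.0 at the threshold and `ratio` at 1.0.
 * Only meaningful for |in| <= 1.0. */
double
soft_knee_compress (double in, double threshold, double ratio)
{
  double a, x, out;

  /* With threshold >= 1.0 nothing can exceed it, and the formula for `a`
   * below would divide by zero. */
  if (threshold >= 1.0)
    return in;

  x = in < 0.0 ? -in : in;      /* work on the magnitude, restore sign later */
  if (x <= threshold)
    return in;

  a = (ratio - 1.0) / (2.0 * (1.0 - threshold));
  out = x + a * (x - threshold) * (x - threshold);
  return in < 0.0 ? -out : out;
}
```

With a threshold of 0.5 and a ratio of 0.5, an input of 1.0 maps to 0.875 and the curve joins the identity line at 0.5 without a kink.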

2) Implement additional properties:
   - attack = [0.0;GST_SECOND]
   - release = [0.0;GST_SECOND]

Attack will look "in the future": if something above $threshold is noticed $attack seconds ahead, the ratio will be raised from 1 to $ratio over this time, reaching full compression at the point where the input is above $threshold.

If the input drops below $threshold again, the compression will be lowered from $ratio to 1 over the following $release seconds. If the input rises above $threshold again within $attack seconds while the compression is being lowered, the effects of $attack and $release are mixed.
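
The ramping of the effective ratio could look roughly like the sketch below. The roadmap does not say what shape the ramp has, so the linear interpolation, the function names and the use of seconds as plain doubles are all assumptions. The mixed attack/release case is not covered.

```c
/* Effective ratio during the attack phase: moves from 1.0 (no compression)
 * to `ratio` as `elapsed` goes from 0 to `attack` seconds. */
double
attack_ratio (double ratio, double elapsed, double attack)
{
  if (attack <= 0.0 || elapsed >= attack)
    return ratio;               /* ramp finished, or no ramp requested */
  if (elapsed <= 0.0)
    return 1.0;
  return 1.0 + (ratio - 1.0) * (elapsed / attack);
}

/* Effective ratio during the release phase: moves from `ratio` back to 1.0
 * as `elapsed` goes from 0 to `release` seconds. */
double
release_ratio (double ratio, double elapsed, double release)
{
  if (release <= 0.0 || elapsed >= release)
    return 1.0;
  if (elapsed <= 0.0)
    return ratio;
  return ratio + (1.0 - ratio) * (elapsed / release);
}
```

For a ratio of 0.5 and an attack of 0.1 seconds, the effective ratio is 1.0 at the start of the ramp, about 0.75 halfway through and 0.5 at the end.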


Any additional ideas on this?
Comment 4 Sebastian Dröge (slomo) 2007-03-06 17:25:45 UTC
Created attachment 84087
audiodynamic.diff

Ok, here's a first version of stage 1). Currently it works as compressor and expander in soft-knee and hard-knee mode without any problems. Unit tests and docs integration will follow once someone has reviewed this.

The only thing I'm not yet completely sure about is that I look at each channel separately. Could this lead to problems? Is it necessary to look at all channels at once and apply the same ratio to them? How exactly should this work?
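
For comparison, one way the "all channels at once" alternative could look is sketched below. This is not what the attached patch does. The function name, the interleaved float layout and the choice of the loudest channel as the reference are assumptions. The ratio is a slope (0.5 halves whatever exceeds the threshold).

```c
#include <stddef.h>

/* Channel-linked hard-knee compression sketch (hypothetical helper).
 * For every frame the loudest channel decides the gain, and that same gain
 * is applied to all channels of the frame. */
void
compress_linked (float *samples, size_t n_frames, unsigned int channels,
    double threshold, double ratio)
{
  size_t f;
  unsigned int c;

  for (f = 0; f < n_frames; f++) {
    float *frame = samples + f * channels;
    double peak = 0.0;

    /* Find the largest magnitude in this frame. */
    for (c = 0; c < channels; c++) {
      double mag = frame[c] < 0.0f ? -frame[c] : frame[c];
      if (mag > peak)
        peak = mag;
    }

    if (peak > threshold) {
      /* Gain that maps the peak onto the hard-knee curve. */
      double gain = (threshold + (peak - threshold) * ratio) / peak;

      for (c = 0; c < channels; c++)
        frame[c] = (float) (frame[c] * gain);
    }
  }
}
```

Because every channel of a frame is scaled by the same factor, the level relationship between the channels is kept, which per-channel processing does not guarantee.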
Comment 5 Sebastian Dröge (slomo) 2007-03-06 17:34:34 UTC
It might also make sense to use gdoubles instead of gfloat for some intermediate variables.
Comment 6 Stefan Sauer (gstreamer, gtkdoc dev) 2007-03-08 07:41:54 UTC
Looks good. Some nitpicks:

* Instead of (1 - filter->ratio) I would write (1.0 - filter->ratio) to show that it is a float operation.

* What is 'zero'? Maybe add a comment in the algorithm.

But basically, it can go in. Thanks for the hard work on this.
Comment 7 Sebastian Dröge (slomo) 2007-03-08 10:04:12 UTC
Ok, thanks. I committed it with some small changes (+ docs integration and unit test):

* Use 1.0 and 0.0 instead of 1 and 0 in the float cases
* Use doubles as intermediate values to prevent rounding errors
* Check at the beginning of each processing function whether we actually have to do anything. This removes some FIXMEs because division by zero can't happen anymore. One is still present, though...
* Improve comments
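
The kind of early check described in the list above might look like the following sketch. It is not the committed code: the function name, the quadratic soft-knee formula and the exact conditions are assumptions. The point is only that returning early when there is nothing to do also keeps the divisor below from becoming zero.

```c
#include <stddef.h>

/* In-place soft-knee compression of a float buffer (hypothetical helper).
 * `ratio` is the slope reached at full scale (1.0). */
void
compress_buffer (float *data, size_t n, double threshold, double ratio)
{
  double a;
  size_t i;

  /* Nothing to do: a slope of 1.0 leaves every sample unchanged, and with
   * the threshold at full scale no sample can exceed it. Returning here
   * also guarantees (1.0 - threshold) > 0.0 below. */
  if (ratio == 1.0 || threshold >= 1.0)
    return;

  a = (ratio - 1.0) / (2.0 * (1.0 - threshold));

  for (i = 0; i < n; i++) {
    double x = data[i] < 0.0f ? -data[i] : data[i];

    if (x > threshold) {
      /* Use doubles for the intermediate value, store back as float. */
      double out = x + a * (x - threshold) * (x - threshold);
      data[i] = (float) (data[i] < 0.0f ? -out : out);
    }
  }
}
```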


2007-03-08  Sebastian Dröge  <slomo@circular-chaos.org>

	reviewed by: Stefan Kost  <ensonic@users.sf.net>

	* gst/audiofx/Makefile.am:
	* gst/audiofx/audiodynamic.c:
	(gst_audio_dynamic_characteristics_get_type),
	(gst_audio_dynamic_mode_get_type),
	(gst_audio_dynamic_set_process_function),
	(gst_audio_dynamic_base_init), (gst_audio_dynamic_class_init),
	(gst_audio_dynamic_init), (gst_audio_dynamic_set_property),
	(gst_audio_dynamic_get_property), (gst_audio_dynamic_setup),
	(gst_audio_dynamic_transform_hard_knee_compressor_int),
	(gst_audio_dynamic_transform_hard_knee_compressor_float),
	(gst_audio_dynamic_transform_soft_knee_compressor_int),
	(gst_audio_dynamic_transform_soft_knee_compressor_float),
	(gst_audio_dynamic_transform_hard_knee_expander_int),
	(gst_audio_dynamic_transform_hard_knee_expander_float),
	(gst_audio_dynamic_transform_soft_knee_expander_int),
	(gst_audio_dynamic_transform_soft_knee_expander_float),
	(gst_audio_dynamic_transform_ip):
	* gst/audiofx/audiodynamic.h:
	* gst/audiofx/audiofx.c: (plugin_init):
	Add new audiodynamic element which can act as a compressor or
	expander. Supported are hard-knee and soft-knee operation modes with
	user-specified ratio and threshold.
	Attack and release parameters are not yet implemented but will follow.
	* docs/plugins/Makefile.am:
	* docs/plugins/gst-plugins-good-plugins-docs.sgml:
	* docs/plugins/gst-plugins-good-plugins-sections.txt:
	* docs/plugins/gst-plugins-good-plugins.args:
	* docs/plugins/inspect/plugin-audiofx.xml:
	Integrate audiodynamic into the docs.
	* tests/check/Makefile.am:
	* tests/check/elements/audiodynamic.c: (setup_dynamic),
	(cleanup_dynamic), (GST_START_TEST), (dynamic_suite), (main):
	Add unit test for audiodynamic.