Bug 767800 – Introduce a WebRTC Audio Processing based echo canceller for GStreamer


Summary:	Introduce a WebRTC Audio Processing based echo canceller for GStreamer


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-bad
Version:	git master
Hardware:	Other Linux

Importance:	Normal enhancement
Target Milestone:	1.9.1
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

Reported:	2016-06-17 21:06 UTC by Nicolas Dufresne (ndufresne)
Modified:	2016-06-23 17:26 UTC

Attachments
webrtcdsp: Add WebRTC Audio Processing support (39.14 KB, patch) 2016-06-17 21:07 UTC, Nicolas Dufresne (ndufresne)	status: none
webrtcdsp: Add WebRTC Audio Processing support (44.69 KB, patch) 2016-06-21 15:41 UTC, Nicolas Dufresne (ndufresne)	status: none
webrtcdsp: Add WebRTC Audio Processing support (44.94 KB, patch) 2016-06-21 17:58 UTC, Nicolas Dufresne (ndufresne)	status: committed

Description Nicolas Dufresne (ndufresne) 2016-06-17 21:06:08 UTC

I have written minimal support for echo cancelling using the WebRTC Audio Processing library (v0.2). The design introduces two new elements, webrtcdsp and webrtcechoprobe.

The webrtcdsp element is so named because it is responsible for noise suppression, echo cancellation, automatic gain control and several other filters. It can also provide the audio level (RMS), voice activity indication and various metrics. This element is generally placed near the capture source.

On the other end, the webrtcechoprobe is a simpler element that keeps a short amount of echo data. The echo data is the audio that you are playing back but want to suppress from your recording. It is generally the audio from the far end, probed right before the audiosink.

As the WebRTC library only accepts 10 ms buffers, both the DSP and the probe use an adapter to accumulate and process data. The probe is configured on the DSP using its object name. A minimalistic global cache has been created to let the DSP find its probe. Once a probe is used by a DSP, it cannot be used by any other DSP.
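
To illustrate the two mechanisms described above (accumulating data into 10 ms frames, and a name-indexed cache where a probe can be claimed by only one DSP), here is a minimal pure-Python sketch. It is not the code from the patch: the class names FrameAdapter and ProbeRegistry, their methods, and the assumption of interleaved 16-bit samples are all hypothetical and only meant to show the idea.

```python
class FrameAdapter:
    """Accumulate interleaved PCM bytes and hand them out in 10 ms frames.

    Hypothetical helper; assumes 16-bit samples unless told otherwise.
    """

    def __init__(self, rate, channels, bytes_per_sample=2):
        # 10 ms of audio is rate / 100 samples per channel.
        self.frame_bytes = (rate // 100) * channels * bytes_per_sample
        self._pending = bytearray()

    def push(self, data):
        """Queue arbitrarily sized input and return every complete frame."""
        self._pending.extend(data)
        frames = []
        while len(self._pending) >= self.frame_bytes:
            frames.append(bytes(self._pending[:self.frame_bytes]))
            del self._pending[:self.frame_bytes]
        return frames

    @property
    def pending_bytes(self):
        """Bytes still waiting for enough data to complete a frame."""
        return len(self._pending)


class ProbeRegistry:
    """Name-indexed cache of probes; each probe has at most one owner."""

    def __init__(self):
        self._owners = {}  # probe name -> owning DSP name, or None

    def register(self, probe_name):
        self._owners.setdefault(probe_name, None)

    def acquire(self, probe_name, dsp_name):
        """Claim a probe for a DSP; False if it is unknown or already taken."""
        if probe_name not in self._owners:
            return False
        if self._owners[probe_name] is not None:
            return False
        self._owners[probe_name] = dsp_name
        return True

    def release(self, probe_name, dsp_name):
        """Give the probe back, but only if this DSP actually owns it."""
        if self._owners.get(probe_name) == dsp_name:
            self._owners[probe_name] = None


if __name__ == "__main__":
    adapter = FrameAdapter(rate=48000, channels=1)
    frames = adapter.push(bytes(2400))  # 25 ms of silence, 48 kHz mono
    print(len(frames), len(frames[0]), adapter.pending_bytes)

    registry = ProbeRegistry()
    registry.register("probe0")
    print(registry.acquire("probe0", "dsp0"), registry.acquire("probe0", "dsp1"))
```

In this sketch the leftover 5 ms stays queued until the next push completes a frame, and a second DSP asking for an already claimed probe is simply refused.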

A symbolic pipeline would be:
  far_end_src ! webrtcechoprobe ! audiosink
  audiosrc ! webrtcdsp ! far_end_sink
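
As a rough sketch of how the symbolic pipeline above could be written as a single launch description with both branches in one pipeline, the Python helper below assembles the string. The helper name echo_cancel_description and its parameters are hypothetical. It also assumes that the DSP property holding the probe's object name is called probe; the description above does not spell out the property name, so check it with gst-inspect-1.0 webrtcdsp before relying on it.

```python
def echo_cancel_description(far_end_src, far_end_sink,
                            capture="autoaudiosrc",
                            playback="autoaudiosink",
                            probe_name="probe0"):
    """Build a two-branch launch description (hypothetical helper).

    far_end_src / far_end_sink stand in for whatever receives and sends
    the remote audio; they are placeholders, not real element names.
    """
    # Playback branch: far-end audio is probed right before the audio sink.
    playback_branch = (
        f"{far_end_src} ! webrtcechoprobe name={probe_name} ! {playback}"
    )
    # Capture branch: the DSP refers to the probe by its object name.
    # The "probe" property name is an assumption, see the note above.
    capture_branch = (
        f"{capture} ! webrtcdsp probe={probe_name} ! {far_end_sink}"
    )
    # Two unlinked chains in one description end up in the same pipeline.
    return f"{playback_branch}  {capture_branch}"


if __name__ == "__main__":
    # audiotestsrc and fakesink are stand-ins for the real far-end elements.
    print(echo_cancel_description("audiotestsrc", "fakesink"))
```

The resulting string could then be handed to gst-launch-1.0 or Gst.parse_launch(), which keeps both elements in the same top-level pipeline.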

One can test a single echo loop-back using the following pipeline. Note that the library is not designed for this kind of usage; you will notice the side effects when sending a monotonic tone through it. It is good enough for local testing, though. Such a pipeline is expected to produce a single echo.

  pulsesrc ! webrtcdsp ! webrtcechoprobe ! pulsesink

In the following implementation, both elements need to reside in the same pipeline (they need to have the same base_time for proper synchronisation). The following patch is also a work in progress, though I'd like to merge it with a minimal set of features. We can then add more features iteratively. Features that won't be implemented initially:

  - Beamforming (to use those stereo microphones!)
  - Level/Voice Activity/Metrics
  - Drift Control
  - Mobile Mode
  - Analog Gain Control (we enable it in software atm)

Comment 1 Nicolas Dufresne (ndufresne) 2016-06-17 21:07:43 UTC

Created attachment 329973
webrtcdsp: Add WebRTC Audio Processing support

This DSP library can be used to enhance the voice signal for real-time
communication calls. It implements multiple filters such as noise reduction,
high-pass filter, echo cancellation, automatic gain control, etc.

The webrtcdsp element can be used alone, or with the help of the
webrtcechoprobe if echo cancellation is enabled. The echo probe should
be placed as close as possible to the audio sink, while the DSP is
generally placed close to the audio capture. For local testing, one can
use an echo loop pipeline like the following:

  autoaudiosrc ! webrtcdsp ! webrtcechoprobe ! autoaudiosink

This pipeline should produce a single echo rather than a repeated echo.
These elements work only if they are placed in the same top-level pipeline.

TODO:
* Document the elements
* Add missing controls:
  - Echo Suppression level
  - Noise Suppression level

Comment 2 Nicolas Dufresne (ndufresne) 2016-06-21 15:41:09 UTC

Created attachment 330144
webrtcdsp: Add WebRTC Audio Processing support

This DSP library can be used to enhance the voice signal for real-time
communication calls. It implements multiple filters such as noise reduction,
high-pass filter, echo cancellation, automatic gain control, etc.

The webrtcdsp element can be used alone, or with the help of the
webrtcechoprobe if echo cancellation is enabled. The echo probe should
be placed as close as possible to the audio sink, while the DSP is
generally placed close to the audio capture. For local testing, one can
use an echo loop pipeline like the following:

  autoaudiosrc ! webrtcdsp ! webrtcechoprobe ! autoaudiosink

This pipeline should produce a single echo rather than a repeated echo.
These elements work only if they are placed in the same top-level pipeline.

Comment 3 Nicolas Dufresne (ndufresne) 2016-06-21 15:42:46 UTC

So I would suggest creating a baseline from that; it's really the strict minimum we need. We can build up from there in parallel, exposing more features of the WebRTC Audio Processing library.

Comment 4 Nicolas Dufresne (ndufresne) 2016-06-21 17:58:27 UTC

Created attachment 330149
webrtcdsp: Add WebRTC Audio Processing support

This DSP library can be used to enhance the voice signal for real-time
communication calls. It implements multiple filters such as noise reduction,
high-pass filter, echo cancellation, automatic gain control, etc.

The webrtcdsp element can be used alone, or with the help of the
webrtcechoprobe if echo cancellation is enabled. The echo probe should
be placed as close as possible to the audio sink, while the DSP is
generally placed close to the audio capture. For local testing, one can
use an echo loop pipeline like the following:

  autoaudiosrc ! webrtcdsp ! webrtcechoprobe ! autoaudiosink

This pipeline should produce a single echo rather than a repeated echo.
These elements work only if they are placed in the same top-level pipeline.

Comment 5 Nicolas Dufresne (ndufresne) 2016-06-21 18:00:10 UTC

Attachment 330149 pushed as 398f705 - webrtcdsp: Add WebRTC Audio Processing support

Comment 6 Nicolas Dufresne (ndufresne) 2016-06-23 17:26:36 UTC

Note, I've pushed a few fixes this morning; it should be slightly more robust now.

commit c551a853b323a6148d26aae3b22081df92d8e281
Author: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Date:   Wed Jun 22 22:28:03 2016 -0400

    webrtcdsp: Offset timestamp with duration

commit 86aa3b5f9c7016e749e4f389adc6c5a29108b6f3
Author: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Date:   Wed Jun 22 21:54:13 2016 -0400

    webrtcdsp: Synchronize with delays

commit fb8662eb5c64c9c25ef670a018174b74616b6a2a
Author: Nicolas Dufresne <nicolas.dufresne@collabora.com>
Date:   Wed Jun 22 21:45:08 2016 -0400

    webrtdsp: Remove restriction on channels number