Bug 725167 – opusdec PLC doesn't seem to work as well as Chrome

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 725167 - opusdec PLC doesn't seem to work as well as Chrome


Summary:	opusdec PLC doesn't seem to work as well as Chrome


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-bad
Version:	1.2.1
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	1.5.90
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2014-02-25 19:32 UTC by tcdgreenwood
Modified:	2015-08-16 13:38 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Patch to gst-plugins-base that updates calculations for timestamps during packet loss to give a better estimate (3.24 KB, patch) 2014-05-20 14:40 UTC, tcdgreenwood	needs-work	Details \| Review
Patch for gst-plugins-good that gives rtptime exact times during packet loss (4.88 KB, patch) 2014-05-20 14:42 UTC, tcdgreenwood	rejected	Details \| Review
Patch for gst-plugins-bad that gives the opus codec the correct timings for PLC (6.39 KB, patch) 2014-05-20 14:46 UTC, tcdgreenwood	needs-work	Details \| Review
alternative patch for frame size determintion on FEC/PLC cases (3.33 KB, patch) 2015-01-14 10:51 UTC, Vincent Penquerc'h	none	Details \| Review
New patch for PLC frame size calculation (4.25 KB, patch) 2015-04-20 13:09 UTC, Carlos Rafael Giani	committed	Details \| Review

Description tcdgreenwood 2014-02-25 19:32:04 UTC

I have noticed that PLC in opusdec seems far worse than in Chrome - it does make some difference but not as much as it could.  I think this is to do with the code in gstopusdec.c line 427 that fixes the samples at 120.  The documentation for the opus codec (see http://www.opus-codec.org/docs/html_api-1.1.0/group__opus__decoder.html#ga7d1111f64c36027ddcb81799df9b3fc9) states:

In the case of PLC (data==NULL) or FEC (decode_fec=1), then frame_size needs to be exactly the duration of audio that is missing, otherwise the decoder will not be in the optimal state to decode the next incoming packet. For the PLC and FEC cases, frame_size must be a multiple of 2.5 ms.

I think gstreamer has this information from the GAP_EVENT that is processed in gstaudiodecoder.c, and is passed to gstopusdec.c in the 0 size buffer that is passed in for PLC.  I think that using the correct number for samples may improve things.

I will try to find a way to get the timings from the GstBuffer and use them to set the correct size of samples in the PLC case (where the size of the buffer is 0) in order to create a patch.

Comment 1 tcdgreenwood 2014-02-26 12:16:34 UTC

Also think I may have spotted another related problem.  The code currently checks if the last_buffer was missing by checking if it's not NULL:

if (dec->last_buffer) {
  /* normal delayed decode */
  ...
} else {
  /* FEC reconstruction decode */
  ...
}

I think that the contract says the buffer will be 0 size for PLC so it should be checking the size of dec->last_buffer as it does earlier when deciding if to decode using buffer or last_buffer.  gst_buffer_get_size seems to return 0 if the buffer is null - so I think it should be safe to change it.

Comment 2 tcdgreenwood 2014-03-03 15:48:44 UTC

Having looked into this further there is an issue with using the gap events in that it seems that opus doesn't like the samples to represent exactly one lost packet (in this case 20ms).  The duration from the gap events is quite variable when using a real rtpbin - sometimes much less than 1 lost packet (in my case it's sometimes less than 1ms).  I will continue to look into it!

Comment 3 Olivier Crête 2014-03-03 18:21:37 UTC

Imho, if the gap is less than one packet, you should just ignore it (and get desynchronized a bit), if it is more,than you can just call into libopus multiple times ?

Comment 4 tcdgreenwood 2014-03-03 23:00:43 UTC

(In reply to comment #3)
> Imho, if the gap is less than one packet, you should just ignore it (and get
> desynchronized a bit), if it is more,than you can just call into libopus
> multiple times ?

If you look at the opus codec API docs:

In the case of PLC (data==NULL) or FEC (decode_fec=1), then frame_size needs to
be exactly the duration of audio that is missing, otherwise the decoder will
not be in the optimal state to decode the next incoming packet.

It seems like it doesn't go badly wrong if you pass in different times, but I am trying to get the PLC/FEC to work as well as it can.  I think if the timing was based on the RTP timestamps I think it could be pretty exact.

From looking at the jitterbuffer code it seems that the DTS is based on the times received so has jitter.  Though I am still trying to work things through.  I was wondering if another alternative was to use PTS as I understand it's got the smoothed out values which might give a value much closer to the difference in RTP timestamps.

Comment 5 Olivier Crête 2014-03-07 17:45:37 UTC

RTP doesn't carry DTS information, so the DTS is meaningless. rtpjitterbuffer should probably set it to the PTS or just set it to GST_CLOCK_TIME_NONE

Comment 6 tcdgreenwood 2014-03-10 11:02:26 UTC

That's helpful as it confirms what I was thinking.  Currently the GstRTPPacketLost event raised in gstrtpjitterbuffer.c takes the timestamp from a timer that I think is always created using expected DTS when the packet would be received.  As far as I can tell the events only contain one timestamp value so there is no PTS available for the packet lost case.  In the opus depayloader the code creates the gap event using this timestamp so it's filling in the timestamp of the gap event with a DTS value.

I can see 2 different ways this could be fixed:
 - Supplying PTS in the packet loss and gap events
 - Attempting to calculate the PTS for the packet within the audio decoder so the audio to cover the gap starts straight after the last decoded sample from the last packet.

Does that make sense or is there a better way to fix this?

I think that fixing the opus decoder to match the API is a good idea as well - which I have hacked something for but it's not good at the moment :(

(In reply to comment #5)
> RTP doesn't carry DTS information, so the DTS is meaningless. rtpjitterbuffer
> should probably set it to the PTS or just set it to GST_CLOCK_TIME_NONE

Comment 7 Olivier Crête 2014-03-10 15:01:30 UTC

GstRTPPacketLost normally contains a PTS, not a DTS (there is no DTS in the RTP case at all). Anyway, for all audio cases, DTS==PTS all the time, so the difference is meaningless there.

Comment 8 tcdgreenwood 2014-03-12 11:28:26 UTC

It seems to me that rtpbin sets the DTS to the receive time with some adjustment, but then calculates the PTS.  I've traced through the code and the packet loss events and they are created with the expected receive time - which is similar to DTS.  Also the code has a comment that states:

gstrtpjitterbuffer.c
/* take the DTS of the buffer. This is the time when the packet was
   * received and is used to calculate jitter and clock skew. We will adjust
   * this DTS with the smoothed value after processing it in the
   * jitterbuffer and assign it as the PTS. */

It seems to me that the main thing that doesn't end up in DTS is the clock-skew calculations.

In the rtpjitterbuffer.c  the PTS is set:
/* do skew calculation by measuring the difference between rtptime and the
   * receive dts, this function will return the skew corrected rtptime. */
  item->pts = calculate_skew (jbuf, rtptime, dts);

I cannot see any code in that would set DTS to PTS after this unless DTS isn't set (which it is almost always for my use case).  If as you assert DTS==PTS then perhaps when we push the buffer DTS should be unset or set to PTS.

Comment 9 tcdgreenwood 2014-03-12 16:19:46 UTC

I made some changes to the logging so I could see the values used in my pipeline:

0:00:22.050750000 ^[[335m40732^[[00m 0x7fab6c037370 ^[[37mDEBUG  ^[[00m ^[[00m     rtpjitterbuffer gstrtpjitterbuffer.c:2349:pop_and_push_next:<rtpjitterbuffer3>^[[00m Pushing buffer 3271, dts 0:00:06.122462478, pts 0:00:05.728122767

You can see the buffer it's pushing has a few 100ms difference between dts and pts.

Comment 10 tcdgreenwood 2014-05-20 14:40:33 UTC

Created attachment 276871 [details] [review]
Patch to gst-plugins-base that updates calculations for timestamps during packet loss to give a better estimate

Comment 11 tcdgreenwood 2014-05-20 14:42:05 UTC

Created attachment 276872 [details] [review]
Patch for gst-plugins-good that gives rtptime exact times during packet loss

Comment 12 tcdgreenwood 2014-05-20 14:46:09 UTC

Created attachment 276873 [details] [review]
Patch for gst-plugins-bad that gives the opus codec the correct timings for PLC

These 3 patches combine to allow exact timings to be provided into the codec when packets are lost.  Without this the codec has no idea of how long to correct for and it turns out that this gives much improved packet loss concealment.  In real life tests with WebRTC we found that opus audio went from unintelligible to pretty good with the same level of packet loss.

Comment 13 tcdgreenwood 2014-05-20 14:54:38 UTC

BTW sorry this took so long I have had the patches for a while but no time to upload them!  Please note that the patches contain one TODO on some code that I didn't know how to deal with and you may want to review which debug statements are required.  I'm happy to make any changes you want.

I am also not sure if the approach is correct, but I couldn't find another way to give exact timings for PLC.

Comment 14 Olivier Crête 2014-06-03 20:00:07 UTC

Review of attachment 276873 [details] [review]:

The depayloader part is not opus specific at all

::: ext/opus/gstrtpopusdepay.c
@@ +94,3 @@
 static gboolean
+gst_rtp_opus_depay_packet_lost (GstRTPBaseDepayload * depayload,
+    GstEvent * event)

This probably all belongs in GstRtpBaseDepayload in gst-plugins-base, it's not Opus specific at all.

Comment 15 Olivier Crête 2014-06-03 20:04:45 UTC

Review of attachment 276871 [details] [review]:

::: gst-libs/gst/audio/gstaudiodecoder.c
@@ +808,2 @@
   /* normalize to bool */
+  dec->priv->agg = !!res;

Please don't make unrelated changes. and run your commits through the commit hook that uses gst-indent. "! !res" is what gst-indent wants.

@@ +1816,3 @@
+           audio ends (though can make a reasonable guess based on the presumption
+           of constant ptime) */
+        if (!GST_CLOCK_TIME_IS_VALID (dec->priv->out_ts)) {

You probably don't meant to use "!" here.

Comment 16 Olivier Crête 2014-06-03 20:17:10 UTC

Review of attachment 276872 [details] [review]:

I'm not convinced that this patch is correct, removing the call to apply_offset() looks quite suspect.

And actually, why do you pass the duration as rtptime and then convert it back to nanoseconds in the depayloader instead of just using the regular duration here?

::: gst/rtpmanager/gstrtpjitterbuffer.c
@@ +1507,3 @@
 apply_offset (GstRtpJitterBuffer * jitterbuffer, GstClockTime timestamp)
 {
+  GstClockTime offsetTime = timestamp;

I'm not sure why you just rename the variable here, this is not useful or needed.

Comment 17 Tommy Falk 2014-08-12 14:20:20 UTC

Just wanted to chime in on this bug report. The problem with the OPUS decoder always producing 120ms concealment frames regardless of the length actual gap event is really limiting the audio quality for low latency applications where late/lost frames are frequent.

BR,
TOmmy

Comment 18 Vincent Penquerc'h 2015-01-13 12:00:38 UTC

Indeed, there seems to be remnants of NULL buffer vs zero sized buffers checks needing fixing. About the timestamps, since opusenc will encode samples in chunks of well known times (multiples of 2.5 ms), you should get gaps that also match these, unless something messes with the timestamps between opusenc and opusdec. Does RTP not send proper timestamps in this case ?

Comment 19 Vincent Penquerc'h 2015-01-14 10:51:33 UTC

Created attachment 294503 [details] [review]
alternative patch for frame size determintion on FEC/PLC cases

Here's another possible patch.
I think (not 100% sure) that the duration should always be taken from the missing data. This patch also keeps track of any leftover samples when the number of missing samples is not a multiple of the minimum frame size. This leftover is carried over non missing frames, which I'm not sure is the best thing to do (as opposed to re-zeroing it when an actual packet is received). Doing it this way will desync less over time I think.

Tested with:

gst-launch-1.0 uridecodebin uri=file://path/to/audio.flac  ! audioconvert ! audioresample !  opusenc frame-size=X inband-fec=1 ! identity drop-probability=0.1 ! opusdec use-inband-fec=1 plc=1 ! pulsesink

X being 2, 5, 10, and using either use-inband-fec=1 and plc=1. This required a change to identity to send gap events when dropping, which I'll commit shortly.

It sounds better than the original but I don't know how it compares to Chrome.

Comment 20 Carlos Rafael Giani 2015-04-20 13:09:30 UTC

Created attachment 301992 [details] [review]
New patch for PLC frame size calculation

I wrote a new patch based on the idea from Vincent Penquerc'h . It does essentially the same as his patch, with the following differences:

(1) The calculations are done with the durations instead of the sample count. The criterion for Opus' PLC to work is that frame lengths must be a multiple of 2.5ms, so it makes sense to do the calculations directly in the time domain.

(2) The patch had a bug that caused decoding errors. Since the plc_samples value got rounded (and not just truncated), the leftover samples count could become negative. Furthermore, PLC intervals which are smaller than 2.5ms were not being handled properly. The new patch fixes these issues.

Comment 21 Carlos Rafael Giani 2015-07-20 16:09:12 UTC

I'd like to push my patch for inclusion in 1.6. We run the Opus PLC all the time in our system, and it works well. We saw no stability issues or timestamp problems with it.

Comment 22 Nicolas Dufresne (ndufresne) 2015-07-20 17:17:25 UTC

Can you cleanup the patchset, anything that is not relevant should be marked as obsolete. In my quick lookup for reviews, when I see patches marked "need-work" that has not be obsoleted by another patch, I skip that bug assuming it's not ready yet.

Comment 23 Carlos Rafael Giani 2015-07-20 17:21:05 UTC

I honestly don't understand enough about what the other patches exactly do or achieve to be able to clean them up. It seems to me as if they are fixing a related, but distinct issue. Perhaps they should be handled in a separate bugzilla entry.

Comment 24 Olivier Crête 2015-07-20 17:26:53 UTC

Comment on attachment 301992 [details] [review]
New patch for PLC frame size calculation

Merged, please re-open if something else is needed!

commit a595874aac5d8b7a5c628ef34d749d558c791d5f
Author: Carlos Rafael Giani <dv@pseudoterminal.org>
Date:   Mon Apr 20 15:04:56 2015 +0200

    opusdec: Fix PLC frame size calculations
    
    Previously, PLC frames always had a length of 120ms, which caused audio
    quality degradation and synchronization errors. Fix this by calculating an
    appropriate length for the PLC frame.
    
    The length must be a multiple of 2.5ms. Calculate a multiple of 2.5ms that
    is nearest to the current PLC length. Any leftover PLC length that didn't
    make it into this frame is accumulated for the next PLC frame.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=725167