Bug 651615 – [vorbisenc] Too small jitter tolerance

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 651615 - [vorbisenc] Too small jitter tolerance


Summary:	[vorbisenc] Too small jitter tolerance


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-base
Version:	0.10.34
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	0.10.36
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Duplicates:	656211 (view as bug list)
Depends on:
Blocks:	650960

Reported:	2011-06-01 08:13 UTC by John A. Barbuto
Modified:	2011-10-19 08:46 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Relax overly-tight jitter tolerances in gstvobisenc (4.05 KB, patch) 2011-08-09 08:26 UTC, xiphmont	committed	Details \| Review
Correct sample number rounding resulting in timestamp jitter (1.31 KB, patch) 2011-08-09 08:27 UTC, xiphmont	committed	Details \| Review

Description John A. Barbuto 2011-06-01 08:13:50 UTC

Hello,

After upgrading from Ubuntu 10.10 (Maverick) to 11.04 (Natty), I'm finding that files transcoded by apps from FLAC to Ogg Vorbis won't play on my Android phone. Both banshee and soundconverter are showing this problem; files I've transcoded by hand with oggenc or gst-launch (with the following command) aren't:

gst-launch-0.10 filesrc location=foo.flac ! flacdec ! audioconvert ! vorbisenc quality=0.6 ! oggmux ! filesink location=foo.ogg

I've created two Vorbis files of Pink Floyd's "Time" for comparison, one that was transcoded with gst-launch and the other with soundconverter:

http://www.barbuto.org/Time.works.ogg
http://www.barbuto.org/Time.doesntwork.ogg

Though they both have the same sampling/bitrate settings, the latter is slightly larger.

I initially submitted a bug report with Android (http://code.google.com/p/android/issues/detail?id=17260), but it's increasingly looking like a gstreamer issue.

I submitted a bug report with Ubuntu as well: https://bugs.launchpad.net/ubuntu/+source/gstreamer0.10/+bug/790522
The Ubuntu changelog can be found at http://packages.ubuntu.com/natty/gstreamer0.10-plugins-base

I upgraded gstreamer to 0.10.34 using the PPA at https://launchpad.net/~gstreamer-developers/+archive/ppa, but it didn't fix the problem.

Comment 1 Sebastian Dröge (slomo) 2011-06-01 08:21:52 UTC

The doesntwork version has discontinuities for some reason:

$ ogginfo Time.doesntwork.ogg
[...]
WARNING: sequence number gap in stream 1. Got page 23 when expecting page 22. Indicates missing data.
WARNING: discontinuity in stream (1)
WARNING: discontinuity in stream (1)
[...]


Could you get a debug log from soundconverter with GST_DEBUG=vorbisenc:5,flacdec:5,ogg*:5 ?

Also, does the same happen if you transcode it with gst-launch and plug an audiorate element before vorbisenc?

Comment 2 John A. Barbuto 2011-06-01 09:01:12 UTC

Plugging in audiorate didn't make a difference, the file played fine.
The debug log from soundconverter is located at http://www.barbuto.org/debug.log.gz

Comment 3 Sebastian Dröge (slomo) 2011-06-02 08:52:19 UTC

There are also many discontinuities detected by vorbisenc, which will result in exactly this behaviour. 


But using audiorate before vorbisenc should fix this then. Could you also create a new debug log with audiorate before vorbisenc and GST_DEBUG=vorbisenc:5,flacdec:5,ogg*:5,audiorate:5

Comment 4 John A. Barbuto 2011-06-02 09:21:50 UTC

Done, it's located at http://www.barbuto.org/gst-launch-audiorate.debug.log.gz

Comment 5 xiphmont 2011-08-09 08:26:19 UTC

Created attachment 193463 [details] [review]
Relax overly-tight jitter tolerances in gstvobisenc

vorbisenc currently reacts in a rather draconian fashion if input
timestamps are more than 1/2 sample off what it considers ideal. If data
is 'too late' it truncates buffers, if it is 'too soon' it completely
shuts down encode and restarts it.  This is causing vorbisenc to produce
corrupt output when encoding data produced by sources with bugs that
produce a smple or two of jitter (eg, flacdec)

Comment 6 xiphmont 2011-08-09 08:27:27 UTC

Created attachment 193464 [details] [review]
Correct sample number rounding resulting in timestamp jitter

flacdec converts the src timestamp to a sample number, uses that internally, then reconverts the sample number to a timestamp for the output buffer.  Unfortunately, sample numbers can't be represented in an integer number of nanoseconds, and the conversion process was truncating rather than rounding, resulting in sample numbers and output timestamps that were often off by a full sample.

This corrects the time->sample convesion

Comment 7 xiphmont 2011-08-09 08:28:44 UTC

This looks like the same bug as Fedora bug #722667 (redhat bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=722667). That bug has a test case and patches. I'm duplicating my comments and patchs here:

It is a combination of a bug in gstflacdec and a bad decision in gstvorbisenc.

gstflacdec is converting input buffer timestamp to sample number, using that
internally, then converting sample number back to timestamp for the output
buffer. Unfortunately, it's using gst_util_uint64_scale_int to convert
timestamp to sample number (which always rounds by truncating) and since sample
numbers aren't usually expressible in an integer number of nanoseconds, the
output buffer timestamp is off by a full sample about half the time.

gstvorbisenc, on the other side, is hardwired to tolerate only a half-sample of
jitter in the input buffer; if it sees more, it treats the input as a
discontinuous stream and resets the encode process. If time 'went backwards'
due to jittering back a sample, it truncates the buffer as well.

The combination of these two behaviors is obviously catastrophic to
FLAC->Vorbis transcodes, and worse, the corruption happens without any warning
whatsoever.

proposed patch 1/2:

Patch to the vorbis encoder element to relax jitter action threshold from .5 samples to 3 samples.

proposed patch 2/2:

corrects the bug in flacdec that was causing a full sample of timestamp jitter

The patches are tested here; either one fixes the specific reported bug. The
flacdec patch is for a real bug obviously, and I think the vorbisenc patch
should be put in place to back away a bit from a potentially bad design
decision. I'm not sure what the vorbisenc code thinks it's doing with respect
to discontinuous streams, but I am sure it should not be mistriggering on
inputs with very minor timestamp jitter.

Comment 8 Sebastian Dröge (slomo) 2011-08-09 13:13:46 UTC

*** Bug 656211 has been marked as a duplicate of this bug. ***

Comment 9 Sebastian Dröge (slomo) 2011-08-11 07:09:22 UTC

Some thoughts about the patches

patch 193463: We use different tolerances everywhere unfortunately. Some encoders use a multiple milliseconds value, some multiple or one sample, some don't allow any tolerance at all, some allow one "frame" of difference. In general your patch is correct though, especially the part with the negative difference. We should reach some consensus here before pushing this though, something like MAX(N milliseconds, M samples) probably makes most sense for some values of N and M ;)

patch 193464: This one is correct too but could you check in the convert_src and convert_sink functions if any of the other conversions should use correct rounding too? From a short look I'd say all of them should be changed.

Comment 10 xiphmont 2011-08-11 07:34:24 UTC

"We should reach some consensus here before pushing this though,
"something like MAX(N milliseconds, M samples) probably makes most sense for
"some values of N and M ;)


Your framework, do as you see fit :-)


"could you check in the convert_src
"and convert_sink functions if any of the other conversions should use correct
"rounding too?

I did; none of the others should be changed.

Comment 11 Sebastian Dröge (slomo) 2011-08-23 08:12:15 UTC

commit 9cbe7c1403de3d774dc25bc41fadcd1391de952e
Author: Monty Montgomery <cmontgom@redhat.com>
Date:   Thu Jul 21 17:16:26 2011 -0400

    vorbisenc: Relax overly-tight jitter tolerances in gstvobisenc
    
    vorbisenc currently reacts in a rater draconian fashion if input
    timestamps are more than 1/2 sample off what it considers ideal. If data
    is 'too late' it truncates buffers, if it is 'too soon' it completely
    shuts down encode and restarts it.  This is causingvorbisenc to produce
    corrupt output when encoding data produced by sources with bugs that
    produce a smple or two of jitter (eg, flacdec)


commit 799c8e3d04456ce0b22c03de66d20d0a1a599643
Author: Monty Montgomery <cmontgom@redhat.com>
Date:   Thu Jul 21 17:23:28 2011 -0400

    flacdec: Correct sample number rounding resulting in timestamp jitter
    
    flacdec converts the src timestamp to a sample number, uses that internally,
    
    This corrects the time->sample convesion



The jitter tolerance will be unified by using the audio encoder base class that is going to be in gst-plugins-base soonish.

Comment 12 Mark Nauwelaerts 2011-10-19 08:46:59 UTC

FWIW, as mentioned above, audioencoder now takes care of such jitter tolerance, and vorbisenc has been ported to it.