Bug 607641 – snapping when converting aac (from flv) to mp3/vorbis




Summary:	snapping when converting aac (from flv) to mp3/vorbis


Status:	RESOLVED NOTABUG

Product:	GStreamer
Classification:	Platform
Component:	dont know
Version:	0.10.25
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	git master
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2010-01-21 09:39 UTC by Gautier Portet
Modified:	2011-11-21 21:58 UTC

See Also:
GNOME target:	---
GNOME version:	---

Description Gautier Portet 2010-01-21 09:39:10 UTC

Download an HD FLV from YouTube and try to convert the sound to MP3 or Vorbis. You can clearly hear snapping. It may be some form of buffer holes or clipping, I can't say, but either way it sounds awful.

Tested on Ubuntu Karmic, with both soundconverter and gst-launch.

gst-launch-0.10 --gst-debug=2 filesrc location=test.flv name=src ! decodebin name=decoder ! audioconvert ! vorbisenc ! oggmux  ! filesink location=test.ogg

The only warning reported by gst-launch:
0:00:34.248300504  9551      0x16f7580 WARN               vorbisenc vorbisenc.c:1197:gst_vorbis_enc_chain:<vorbisenc0> Buffer is older than previous timestamp + duration (0:01:12.446000000< 0:01:12.446219954), cannot handle. Clipping buffer.
0:00:34.250140797  9551      0x16f7580 WARN               vorbisenc vorbisenc.c:1220:gst_vorbis_enc_chain:<vorbisenc0> Buffer is discontinuous, flushing encoder and restarting (Discont from 0:01:12.449553288 to 0:01:12.493000000)

If I insert a "wavenc ! wavparse" in the pipeline, the sound is ok.

Comment 1 Gautier Portet 2010-01-21 09:42:11 UTC

soundconverter bug entry: https://bugs.launchpad.net/ubuntu/+source/soundconverter/+bug/508767

Comment 2 Gautier Portet 2010-01-21 09:48:42 UTC

I can't read the distorted Ogg Vorbis file with Audacity, and sox reports the following warning when converting it to wav: "sox WARN vorbis: Warning: hole in stream; probably harmless"
The file is definitely broken.

Comment 3 Tim-Philipp Müller 2010-01-21 09:55:00 UTC

Tried adding an audiorate in front of vorbisenc?
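
For reference, a minimal sketch of where audiorate would sit in the pipeline from the description. The helper names (insert_before, to_command) are hypothetical and only assemble the command line; the element list is copied from the original gst-launch invocation, and nothing here is run against GStreamer.

```python
# Elements of the original pipeline from the description, in order.
ORIGINAL = [
    "filesrc location=test.flv name=src",
    "decodebin name=decoder",
    "audioconvert",
    "vorbisenc",
    "oggmux",
    "filesink location=test.ogg",
]


def insert_before(elements, target, new_element):
    """Return a copy of `elements` with `new_element` placed just before
    the element whose factory name equals `target`."""
    out = []
    for element in elements:
        if element.split()[0] == target:
            out.append(new_element)
        out.append(element)
    return out


def to_command(elements, launcher="gst-launch-0.10"):
    """Join the elements with ' ! ', the link syntax gst-launch uses."""
    return launcher + " " + " ! ".join(elements)


with_audiorate = insert_before(ORIGINAL, "vorbisenc", "audiorate")
print(to_command(with_audiorate))
```

The printed line can be pasted into a shell to try the suggestion.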

Comment 4 Gautier Portet 2010-01-21 10:02:53 UTC

I tried adding audiorate, queue, and decodebin2; nothing improved the sound.
That's why I'm passing it on to people who know GStreamer better.

Comment 5 Thiago Sousa Santos 2010-01-29 17:46:59 UTC

Piece of fakesink's verbose output for filesrc ! flvdemux ! faad ! fakesink

( 4096 bytes, timestamp: 0:00:00.023000000, duration: 0:00:00.023219955, offset:
( 4096 bytes, timestamp: 0:00:00.046000000, duration: 0:00:00.023219955, offset:
( 4096 bytes, timestamp: 0:00:00.070000000, duration: 0:00:00.023219955, offset:
( 4096 bytes, timestamp: 0:00:00.093000000, duration: 0:00:00.023219955, offset:
( 4096 bytes, timestamp: 0:00:00.116000000, duration: 0:00:00.023219955, offset:

And the fakesink's output for filesrc ! flvdemux ! fakesink

(    9 bytes, timestamp: 0:00:00.023000000, duration: none, offset: 1, offset_en
(    9 bytes, timestamp: 0:00:00.046000000, duration: none, offset: 2, offset_en
(    9 bytes, timestamp: 0:00:00.070000000, duration: none, offset: 3, offset_en
(    9 bytes, timestamp: 0:00:00.093000000, duration: none, offset: 4, offset_en
(    9 bytes, timestamp: 0:00:00.116000000, duration: none, offset: 5, offset_en


Seems like faad is keeping the upstream timestamps and adding its own durations without caring whether they keep the stream continuous.
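
The mismatch can be checked with a bit of arithmetic on the numbers above. This is only a sketch: it assumes 1024 samples per AAC frame at 44100 Hz (which matches the logged duration of 0:00:00.023219955, and 4096 bytes per buffer if the output is 16-bit stereo), and it takes the millisecond-rounded timestamps straight from the fakesink output.

```python
NS_PER_MS = 1_000_000
SAMPLES_PER_FRAME = 1024  # assumed: one AAC frame
SAMPLE_RATE = 44100       # assumed: consistent with the logged duration

# Per-buffer duration in nanoseconds, as reported downstream of faad.
duration_ns = round(SAMPLES_PER_FRAME * 1_000_000_000 / SAMPLE_RATE)

# Upstream timestamps from the fakesink output, in milliseconds.
timestamps_ms = [23, 46, 70, 93, 116]


def drifts(timestamps_ms, duration_ns):
    """For each consecutive pair of buffers, return
    next_timestamp - (timestamp + duration) in nanoseconds.
    Negative means the buffers overlap, positive means a gap."""
    result = []
    for current, following in zip(timestamps_ms, timestamps_ms[1:]):
        expected_next = current * NS_PER_MS + duration_ns
        result.append(following * NS_PER_MS - expected_next)
    return result


print(duration_ns)                         # 23219955
print(drifts(timestamps_ms, duration_ns))  # [-219955, 780045, -219955, -219955]
```

So under these assumptions each buffer either overlaps the next by about 0.22 ms or leaves a gap of about 0.78 ms, which is the same order of magnitude as the overlap in the vorbisenc warning from the description.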

Comment 6 Mark Nauwelaerts 2010-02-01 10:54:52 UTC

IMHO, faad can not really be blamed here:
* it is keeping up with upstream timestamps (good)
* (presumably) outgoing buffers have correct duration (good)
(that is, corresponding to the number of samples for the given samplerate)

So, the "discontinuity" is inherent (upstream), and it would be detected sooner or later downstream.  However, if doing playback, the sink would not be so fanatically picky and allow for some (e.g. 20ms) jitter.  (IIRC) vorbisenc does not even allow for a (single?) sample variation.  That would practically force using audiorate or a wavenc/wavparse trick in a lot of cases (and even then results might not be so good as simply accepting minor jitter that averages out).

As such, I would expect encoder (e.g. vorbisenc) to be consistent with sink (and not to start creating discont or whatever if the sink would not).
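
To illustrate the difference in pickiness, here is a rough sketch comparing a one-sample tolerance with a 20 ms tolerance. It is not the actual vorbisenc or sink code; is_discont is a hypothetical helper, 44100 Hz is an assumed sample rate, and the drift values are the differences between each buffer's timestamp plus duration and the next buffer's timestamp in the faad output from comment 5.

```python
SAMPLE_RATE = 44100  # assumed sample rate

# next_timestamp - (timestamp + duration) for the buffers in comment 5, in ns.
drifts_ns = [-219955, 780045, -219955, -219955]

ONE_SAMPLE_NS = round(1_000_000_000 / SAMPLE_RATE)  # strict, encoder-like
TWENTY_MS_NS = 20_000_000                           # lenient, sink-like


def is_discont(drift_ns, tolerance_ns):
    """Treat a buffer as discontinuous when its timestamp is further
    than `tolerance_ns` from where the previous buffer ended."""
    return abs(drift_ns) > tolerance_ns


strict = [is_discont(d, ONE_SAMPLE_NS) for d in drifts_ns]
lenient = [is_discont(d, TWENTY_MS_NS) for d in drifts_ns]

print(strict)   # [True, True, True, True]
print(lenient)  # [False, False, False, False]
```

With the strict threshold every buffer boundary counts as a discontinuity, while the 20 ms threshold accepts all of them as jitter.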

Comment 7 Gautier Portet 2011-11-21 21:58:20 UTC

OK, I'll hide under a rock for a while. I've retried fixing this, and adding an audiorate did indeed fix the stream. How dumb do I look now?