GNOME Bugzilla – Bug 607641
snapping when converting aac (from flv) to mp3/vorbis
Last modified: 2011-11-21 21:58:20 UTC
Download a HD flv from youtube. Try to convert the sound to mp3 or vorbis. You can clearly hear snapping. Maybe some form of buffer holes or clipping, I can't say, but anyway it sounds awful. Tested on Ubuntu karmic, with both soundconverter and gst-launch:

gst-launch-0.10 --gst-debug=2 filesrc location=test.flv name=src ! decodebin name=decoder ! audioconvert ! vorbisenc ! oggmux ! filesink location=test.ogg

The only warnings reported by gst-launch:

0:00:34.248300504 9551 0x16f7580 WARN vorbisenc vorbisenc.c:1197:gst_vorbis_enc_chain:<vorbisenc0> Buffer is older than previous timestamp + duration (0:01:12.446000000< 0:01:12.446219954), cannot handle. Clipping buffer.
0:00:34.250140797 9551 0x16f7580 WARN vorbisenc vorbisenc.c:1220:gst_vorbis_enc_chain:<vorbisenc0> Buffer is discontinuous, flushing encoder and restarting (Discont from 0:01:12.449553288 to 0:01:12.493000000)

If I insert a "wavenc ! wavparse" in the pipeline, the sound is ok.
soundconverter bug entry: https://bugs.launchpad.net/ubuntu/+source/soundconverter/+bug/508767
I can't read the distorted ogg vorbis file with Audacity, and sox reports the following warning when converting it to wav: "sox WARN vorbis: Warning: hole in stream; probably harmless". The file is definitely broken.
Tried adding an audiorate in front of vorbisenc?
I tried adding audiorate, queue and decodebin2; nothing improved the sound. That's why I'm passing it on to people who know gstreamer better.
Piece of fakesink's verbose output for filesrc ! flvdemux ! faad ! fakesink

( 4096 bytes, timestamp: 0:00:00.023000000, duration: 0:00:00.023219955, offset:
( 4096 bytes, timestamp: 0:00:00.046000000, duration: 0:00:00.023219955, offset:
( 4096 bytes, timestamp: 0:00:00.070000000, duration: 0:00:00.023219955, offset:
( 4096 bytes, timestamp: 0:00:00.093000000, duration: 0:00:00.023219955, offset:
( 4096 bytes, timestamp: 0:00:00.116000000, duration: 0:00:00.023219955, offset:

And the fakesink's output for filesrc ! flvdemux ! fakesink

( 9 bytes, timestamp: 0:00:00.023000000, duration: none, offset: 1, offset_en
( 9 bytes, timestamp: 0:00:00.046000000, duration: none, offset: 2, offset_en
( 9 bytes, timestamp: 0:00:00.070000000, duration: none, offset: 3, offset_en
( 9 bytes, timestamp: 0:00:00.093000000, duration: none, offset: 4, offset_en
( 9 bytes, timestamp: 0:00:00.116000000, duration: none, offset: 5, offset_en

Seems like faad is keeping the upstream timestamps and adding its own durations, without caring whether together they keep the stream continuous.
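To make the mismatch concrete, here is a small sketch of the arithmetic. It assumes 44100 Hz audio with 1024 samples per decoded buffer (consistent with the 0.023219955 s duration above, but not stated anywhere in the output) and assumes the demuxer timestamps are the ideal sample position rounded to the nearest millisecond, which matches the five values shown. The function names are made up for illustration and are not GStreamer API.

```python
# Assumptions (not confirmed by the report): 44100 Hz, 1024 samples per buffer,
# demuxer timestamps rounded to the nearest millisecond.
SAMPLE_RATE = 44100
SAMPLES_PER_BUFFER = 1024
NS_PER_SECOND = 1_000_000_000

# Sample-accurate duration of one buffer in nanoseconds (truncated).
duration_ns = SAMPLES_PER_BUFFER * NS_PER_SECOND // SAMPLE_RATE


def timestamp_ns(n):
    """Timestamp of buffer n, rounded to whole milliseconds (hypothetical model)."""
    ms = (n * SAMPLES_PER_BUFFER * 1000 + SAMPLE_RATE // 2) // SAMPLE_RATE
    return ms * 1_000_000


def gap_ns(n):
    """Timestamp of buffer n minus (timestamp + duration) of buffer n - 1."""
    return timestamp_ns(n) - (timestamp_ns(n - 1) + duration_ns)


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, timestamp_ns(n), gap_ns(n) if n > 1 else None)
    # Buffer around 0:01:12.446, the position named in the vorbisenc warning.
    print(3120, timestamp_ns(3120), gap_ns(3120))
```

Under these assumptions every buffer starts either slightly before or slightly after the end of the previous one; a negative gap is the "Buffer is older than previous timestamp + duration" case.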
IMHO, faad cannot really be blamed here:
* it is keeping up with upstream timestamps (good)
* (presumably) outgoing buffers have correct duration (good) (that is, corresponding to the number of samples for the given samplerate)

So, the "discontinuity" is inherent (upstream), and it would be detected sooner or later downstream. However, if doing playback, the sink would not be so fanatically picky and would allow for some (e.g. 20ms) jitter. (IIRC) vorbisenc does not even allow for a (single?) sample variation. That would practically force using audiorate or a wavenc/wavparse trick in a lot of cases (and even then results might not be as good as simply accepting minor jitter that averages out). As such, I would expect an encoder (e.g. vorbisenc) to be consistent with the sink (and not to start creating disconts or whatever if the sink would not).
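A rough sketch of why the tolerance matters. This is a simplified model, not the actual vorbisenc or sink logic: it assumes 44100 Hz, 1024-sample buffers and millisecond-rounded timestamps, compares each buffer only against the end of the previous one, and counts how many buffers would be flagged for a given tolerance. The function names are hypothetical.

```python
# Simplified model under stated assumptions; not taken from GStreamer sources.
SAMPLE_RATE = 44100
SAMPLES_PER_BUFFER = 1024
NS_PER_SECOND = 1_000_000_000


def ms_rounded_timestamp_ns(n):
    """Ideal position of buffer n, rounded to the nearest millisecond."""
    ms = (n * SAMPLES_PER_BUFFER * 1000 + SAMPLE_RATE // 2) // SAMPLE_RATE
    return ms * 1_000_000


def count_flagged(num_buffers, tolerance_ns):
    """Count buffers whose start differs from the previous end by more than tolerance_ns."""
    duration_ns = SAMPLES_PER_BUFFER * NS_PER_SECOND // SAMPLE_RATE
    flagged = 0
    for n in range(1, num_buffers):
        expected = ms_rounded_timestamp_ns(n - 1) + duration_ns
        if abs(ms_rounded_timestamp_ns(n) - expected) > tolerance_ns:
            flagged += 1
    return flagged


one_sample_ns = NS_PER_SECOND // SAMPLE_RATE  # roughly one sample period
strict = count_flagged(5000, one_sample_ns)   # encoder-like, single-sample tolerance
lenient = count_flagged(5000, 20_000_000)     # sink-like, 20 ms tolerance

if __name__ == "__main__":
    print("flagged with one-sample tolerance:", strict)
    print("flagged with 20 ms tolerance:", lenient)
```

In this model the rounding error never exceeds a millisecond and does not accumulate, so a 20ms tolerance flags nothing, while a single-sample tolerance flags every buffer after the first.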
OK, I'll hide under a rock for a while. I've retried fixing this, and adding an audiorate did indeed fix the stream. How dumb do I look now?