GNOME Bugzilla – Bug 729314
ogg: sample-accurate decoding/encoding is broken
Last modified: 2014-10-30 15:15:32 UTC
Created attachment 275522 [details] 8 second 440 Hz sine wave with 44.1 kHz sample rate, generated by Audacity The simple pipeline hangs forever with the attached ogg file: $ gst-launch-1.0 filesrc location=sine.ogg ! decodebin ! vorbisenc ! oggmux ! filesink location=output.ogg Setting pipeline to PAUSED ... Pipeline is PREROLLING ... Redistribute latency... Pipeline is PREROLLED ... Setting pipeline to PLAYING ... New clock: GstSystemClock The input ogg file is an 8 second 440 Hz sine wave with 44.1 kHz sample rate that was generated by Audacity. slomo and thiblahute helped identify the problem on IRC: < thiblahute> Transcoding stalls at 7.6 secs with that sample and no EOS ever < slomo> yeah, EOS goes out of vorbisenc but not fruther < slomo> oggmux broken < stefanha> The number of samples in that file is 352800 / 44100 < stefanha> = 8s < slomo> but last buffer out of vorbisenc is: /GstPipeline:pipeline0/GstIdentity:identity0: last-message = chain ******* (identity0:sink) (76 bytes, dts: 0:00:08.000725623, pts:0:00:08.000725623, duration: 0:00:00.002902494, offset: 8003628118, offset_end: 352960, flags: 00000000 ) 0x7fa7f0050990 < slomo> last buffer into vorbisenc is: /GstPipeline:pipeline0/GstIdentity:identity0: last-message = chain ******* (identity0:sink) (4096 bytes, dts: none, pts:0:00:07.976780045, duration: 0:00:00.023219955, offset: -1, offset_end: -1, flags: 00000000 ) 0x7f26ec053070 < slomo> so there's your problem < slomo> (maybe) < slomo> vorbis like all the other codecs has a fixed block size, but iirc it has metadata to get rid of the added samples when decoding < slomo> last buffer if decoding directly again: /GstPipeline:pipeline0/GstFakeSink:fakesink0: last-message = chain ******* (fakesink0:sink) (3972 bytes, dts: none, pts: 0:00:07.977505668, duration: 0:00:00.022494332, offset: -1, offset_end: -1, flags: 00004000 tag-memory ) 0x7f4c84052cd0 < slomo> so looking at this, it seems data is actually lost < slomo> 1411840 bytes for just decoding the file, 1411204 bytes for encoding and decoding again < stefanha> If you count the number of samples it should be 352800 < slomo> so two bugs here, oggmux broken and something else ;) can someone put that into bugzilla please? < slomo> 352960 samples for decoding directly, 352801 for reencoding < slomo> (4 bytes per sample in the numbers above) < stefanha> :/ < slomo> so definitely something wrong here :)
For the record, since I started on the oggmux issue, and went on bisecting as it worked for me, the oggmux part of this is fixed in https://bugzilla.gnome.org/show_bug.cgi?id=729315.
The first bug is kind of shared between oggdemux and vorbisdec. The sample's last buffer has 1024 samples, but the 160 last ones are clipped. However, vorbisdec does not know the buffer is the last one, so does not tell libvorbis about it, so libvorbis does not clip them. GStreamer relies on segment clipping to clip these samples. When the segment has an indefinite end, no clipping will occur. When it has, the sampes will still not be clipped because oggdemux calculates the start time of the buffer from the end time minus the duration. Since clipped samples are expressed by the granulepos, this means oggdemux will timestamp the last buffer as starting before its actual start time, creating both a timestamp discontinuity and causing those end samples to be not clipped. The good fix would be to let oggdemux tell vorbisdec a packet is an EOS packet. It has that information, but it is discarded when pushing packets downstream. One could use a custom buffer flag, or a GstMeta for this. These would be Ogg specific I guess, since they are for the e_o_s flag in Ogg packets. Comments on whether this is acceptable, or if there is another preferred way to convey this information ? The second bug I've not looked at yet, but since there is a single sample off, I reckon it's something scaling and not rounding in vorbisenc.
Created attachment 277941 [details] [review] fix last buffer timestamp in oggdemux This patch fixes oggdemux calculating the wrong timestamp on the last buffer when samples are to be clipped. This fixes the extra samples being decoded when the segment has an end. When the segment is open ended, there is still a need to let vorbisdec know about the last buffer being the last. Either with a flag, meta, or by buffering one buffer in vorbisdec (at the cost of extra latency).
Similar issue for the encoding side: libvorbis generates a 1024 sample last block, and reports a lower number of actual samples. oggmux looks at the block and sees 1024 samples duration, and is unaware that libvorbis indicated a lower number, so sets the granpos to +1024, which will lead to those (dummy) samples being part of the stream. OFFSET and OFFSET_END are already used, not sure where I could pass this information to oggmux atm.
(In reply to comment #4) > Similar issue for the encoding side: libvorbis generates a 1024 sample last > block, and reports a lower number of actual samples. oggmux looks at the block > and sees 1024 samples duration, and is unaware that libvorbis indicated a lower > number, so sets the granpos to +1024, which will lead to those (dummy) samples > being part of the stream. OFFSET and OFFSET_END are already used, not sure > where I could pass this information to oggmux atm. vorbisenc should probably update the segment with a clipped value so that oggmux sets the granulepos correctly to the clipped value, similar to the demuxer case.
Yes, that seems good, thanks.
Created attachment 277952 [details] [review] vorbisenc: push updated segment end
Created attachment 277953 [details] [review] oggmux: set correct granpos based on segment stop time
Created attachment 277954 [details] [review] oggmux: set correct grapos on last page
With all those patches, the command line and file in the report produce a 8 second Ogg/Vorbis output.
(In reply to comment #7) > Created an attachment (id=277952) [details] [review] > vorbisenc: push updated segment end Same change is probably needed also in flacenc, speexenc, celtenc, opusenc, theoraenc, vp8enc and all the other Ogg compatible encoders
Created attachment 278188 [details] [review] opusenc: set segment stop time on stream end
It seems we can't know in advance whether the FLAC data we're asked to write is the last data. I don't think this makes sense for video, as we're getting one frame at a time. I'll check Speex next.
Created attachment 278191 [details] [review] speexenc: update segment stop time Only 99% sure speex uses the same semantics for granpos. I asked on IRC to double check.
The CELT elements were removed. Kate has duration in data packets. That leaves FLAC, I'll dig some more into this one, as it's the one where sample accuracy is the most desirable.
Created attachment 278194 [details] [review] flacenc: set segment stop time on last buffer That seems to be right, but I'm not seeing the write callback called after FLAC__stream_encoder_finish is called so not 100% it'd work. It may be that libFLAC always outputs buffers of whatever size was passed in, in which case this patch would be unnecessary, but in case it sometimes does call the write callback, this will ensure the segment is updated.
Pushed. The original command yields a 8.0 seconds file as expected.