Bug 356692 – wavparse drops final sample in most files

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 356692 - wavparse drops final sample in most files


Summary:	wavparse drops final sample in most files


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-good
Version:	git master
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	0.10.6
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2006-09-19 08:43 UTC by Michael Smith
Modified:	2007-03-02 13:31 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Ceiling division in duration and offset calculation (2.05 KB, patch) 2006-10-13 18:18 UTC, René Stadler	none	Details \| Review
Ceiling division in duration and offset calculation (2) (2.43 KB, patch) 2006-11-16 17:37 UTC, René Stadler	none	Details \| Review

Description Michael Smith 2006-09-19 08:43:22 UTC

When reading a standard PCM file, wavparse usually drops the final sample.

It appears to be a rounding error when going from samples -> time -> samples through a convoluted codepath involving segment and seek handling.

Comment 1 René Stadler 2006-09-19 12:32:13 UTC

Apparently rather the last frame (number-of-channels samples).

Comment 2 René Stadler 2006-10-12 18:44:33 UTC

Just identified this as a rounding problem (or rather lack thereof).  More detailed explanation and maybe a solution follows later.

Comment 3 René Stadler 2006-10-13 17:56:38 UTC

In gst_wavparse_stream_headers (line 1245):
   duration = gst_util_uint64_scale_int (wav->datasize, GST_SECOND, wav->bps);
In gst_wavparse_perform_seek (line 856):
   wav->end_offset = gst_util_uint64_scale_int (stop, wav->bps, GST_SECOND);

The first one turns the data size into the duration and the second one is supposed to invert that.  Of course this does not always work, consider a simple test file with 1024*10 samples in 16-bit integer, 44100 Hz mono format.  Then wav->bps will be 88200 and the data size 20480.  The complete file size is 20524 (20480 + 44) bytes.  Reading such a file with wavparse and then reencoding with wavenc turns this into 20522 bytes.  This is the bug, and here is the cause:

gst_util_uint64_scale_int (val, num, denom) is just val * num / denom, optionally performing this in 96 bits if that is needed.  This means that the combination of the two lines of code cited above computes the equivalent of this expression:

     20480 * 10^9 / 88200 * 88200 / 10^9 (which evaluates to 20479!)

     ^^^^^^^^^^^^^^^^^^^^
                        \- duration, passed to line 856 as "stop"

Two lines later (858), 1 is subtracted from that to make sure we point to the start of a frame, leaving us with 20478.  Add 44 to that and you have the broken file size.

Solutions: One way to fix this is to modify the mentioned line 858
  wav->end_offset -= (wav->end_offset % wav->bytes_per_sample);

This currently corrects the offset towards 0 if it does not point to the start of a frame (wav->bytes_per_sample should rather be called wav->bytes_per_frame).  Making this add the number of bytes missing to complete the unfinished frame seems to fix the bug (just after that, in line 872, is a safeguard to prevent exceeding the upstream data size even).

However, I think it makes more sense to tweak the offset calculation instead.  Floor division seems to be inappropriate here, the problem can also be fixed by making either (or both) uses of gst_util_uint64_scale_int a ceil division.  This would need new gst utililty functions to get this right (I remember Michael mentioning the lack of these on IRC once).  For now, I have a little patch for wavparse that uses this (which is obviously not perfect):

#define uint64_scale_int_round_up(val, nom, denom) \
       gst_util_uint64_scale_int (val + (denom - 1) / nom, nom, denom)

Comment 4 René Stadler 2006-10-13 18:18:14 UTC

Created attachment 74648 [details] [review]
Ceiling division in duration and offset calculation

Comment 5 René Stadler 2006-11-16 17:37:52 UTC

Created attachment 76710 [details] [review]
Ceiling division in duration and offset calculation (2)

The first possible solution I mention in comment #3 is wrong, it would not work for 8-bit mono pcm files.

Attached is an improved patch.  It does the same thing as the old one, but is more correct.  Instead of adding denominator - 1 before dividing (which is awkward to do in conjuction with overflow prevention), this uses the modulo operation (in an overflow safe manner) to find out if the division has a remainder.

Comment 6 Stefan Sauer (gstreamer, gtkdoc dev) 2007-03-02 13:31:28 UTC

2007-03-02  Stefan Kost  <ensonic@users.sf.net>

	Patch by: René Stadler <mail@renestadler.de>

	* gst/wavparse/gstwavparse.c: (uint64_ceiling_scale_int),
	(gst_wavparse_perform_seek), (gst_wavparse_stream_headers),
	(gst_wavparse_stream_data):
	  Handle rounding better to not drop last sample frame. Fixes #356692