Bug 755301 – audioconvert: Integer->Float conversion creates values slightly smaller than -1.0

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 755301 - audioconvert: Integer->Float conversion creates values slightly smaller than -1.0


Summary:	audioconvert: Integer->Float conversion creates values slightly smaller than ...


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gst-plugins-base
Version:	unspecified
Hardware:	Other Windows

Importance:	Normal normal
Target Milestone:	1.7.1
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2015-09-20 13:06 UTC by sebastien.fenet
Modified:	2015-11-03 11:14 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Dummy code to illustrate the issue (3.84 KB, text/x-python) 2015-09-20 22:56 UTC, sebastien.fenet		Details
audioconvert: Divide by 2147483648 when converting to floats to stay in [-1.0, 1.0] range (2.87 KB, patch) 2015-09-22 11:35 UTC, Sebastian Dröge (slomo)	none	Details \| Review

Description sebastien.fenet 2015-09-20 13:06:58 UTC

I'm using GStreamer 1.0 under Python (through PyGI).
I have a little issue with audioconvert.

I want to convert some S16LE samples to F64LE.

As far as I know, this should simply be a division by 2**(16-1).

For a strange reason, the factor used in Audioconvert is slightly lower (something like 1+5e-10 times lower).

The result is that the output signal may have samples greater than 1 (I think I'm not the only one to observe this: http://stackoverflow.com/questions/16780851/get-gstreamer-audioconvert-to-output-signed-32-bit-floats-from-1-to-1)

Comment 1 Sebastian Dröge (slomo) 2015-09-20 17:53:32 UTC

Can you provide a testcase that shows this?

Comment 2 sebastien.fenet 2015-09-20 22:56:40 UTC

Created attachment 311725 [details]
Dummy code to illustrate the issue

Comment 3 sebastien.fenet 2015-09-20 22:57:21 UTC

(In reply to Sebastian Dröge (slomo) from comment #1)
> Can you provide a testcase that shows this?

Well, if you open any S16LE wav file whose minimum value reaches MIN_INT16 and converts it with a pipeline like: 
filesrc ! audioconvert ! audio/x-raw, format=F64LE ! appsink
you will see that the data you obtain has a minimum value below -1.

Just to illustrate, I wrote a small snippet of code that. It builds a rectangular signal with MIN_INT16 and MAX_INT16 values and converts it to float.

I attach it here (I hope you can run Python and PyGI).

Comment 4 sebastien.fenet 2015-09-20 22:58:29 UTC

Please keep me posted :-)

Comment 5 Sebastian Dröge (slomo) 2015-09-21 08:27:59 UTC

Thanks, the problem here is that we divide by 2147483647.0. Which is the maximum signed integer value for 32 bits, but -2147483648 is also in range. So -2147483648 would be converted to something slightly below -1.

Want to provide a patch for fixing these conversions? Alternatively one could also clamp the results to the [-1,1] range instead of changing the factor we use for the division.

Comment 6 Stefan Sauer (gstreamer, gtkdoc dev) 2015-09-21 08:48:06 UTC

The simple fix would be to divide by the bigger divisor 2147483648, so that we never get out-of-range.

If we want to handle the assymetry, we can also do
out = (in > 0) ?  in / 2147483647.0 : in / 2147483648.0

Comment 7 Sebastian Dröge (slomo) 2015-09-21 09:13:02 UTC

Let's divide by the bigger number then :)

Comment 8 sebastien.fenet 2015-09-21 09:46:03 UTC

(In reply to Sebastian Dröge (slomo) from comment #7)
> Let's divide by the bigger number then :)

I agree with this. I don't think it would be very nice to have an asymmetric conversion, depending on the sign of the input sample (linear PCM assumes that the space between each quantified value is equal).

If you are interested, there's a listing of various softwares' methods here:
http://blog.bjornroche.com/2009/12/int-float-int-its-jungle-out-there.html

You'll notice that it also raises the question of the conversion back... Should we multiply by the lower number to ensure that we will not overflow INT16? (which means we sacrifice bit-transparency...).

There's no perfect method anyway, since we got from one asymmetric range to a symmetric one.

Comment 9 sebastien.fenet 2015-09-21 11:30:55 UTC

As far as providing the patch is concerned, I have never contributed yet. I have essentially been using Gstreamer in my Python codes. This being said, if you think it's a good idea to step in, I'll be happy to try. I would just need precise guidelines (btw, I'm mostly working with Windows).

Comment 10 Sebastian Dröge (slomo) 2015-09-21 12:08:25 UTC

Building GStreamer on Windows is a bit time consuming, but if you want to try it: http://cgit.freedesktop.org/gstreamer/cerbero/

See the README in there. Otherwise I could also make a patch for that

Comment 11 sebastien.fenet 2015-09-21 21:57:03 UTC

I've started having a look but it's indeed quite challenging to have all this running. I will go on working on it but I don't know how long it will take to come up with something (if I eventually do...). So, if you don't mind, I think the best option for now is that you do the patch.

Comment 12 Sebastian Dröge (slomo) 2015-09-22 11:35:30 UTC

Created attachment 311849 [details] [review]
audioconvert: Divide by 2147483648 when converting to floats to stay in [-1.0, 1.0] range

Breaks unit tests, don't merge yet.

Comment 13 Sebastian Dröge (slomo) 2015-09-22 11:39:18 UTC

(In reply to Sebastian Dröge (slomo) from comment #12)
> Created attachment 311849 [details] [review] [review]
> audioconvert: Divide by 2147483648 when converting to floats to stay in
> [-1.0, 1.0] range
> 
> Breaks unit tests, don't merge yet.

Now the problem here is that 2147483648.0 and 2147483647.0 are the same value for 32 bit floats. Printing that value even gives us 2147483648.0, and we were using that one before.

For doubles the above patch should make a difference (and it does, the unit test becomes unhappy).

Comment 14 Sebastian Dröge (slomo) 2015-10-13 09:43:00 UTC

Any ideas, opinions, comments?

Comment 15 sebastien.fenet 2015-10-13 14:10:08 UTC

Hi Sebastian,

Could you enlighten me about the workflow in Gstreamer?

My initial usecase was to convert a 16 bits Integer raw audio to Float.

From what I understood in your comments (correct me if I'm wrong), Gstreamer converts, at some point, the 16 bits Integer audio to 32 bits integer.

And then, it goes from this 32 bits integer to a 32 bits float by dividing by the factor you said before. 

Is that correct?

If it's indeed how it works, the problem is, in my opinion, this intermediary conversion to 32 bits integer. I actually don't see its practical interest and it will, for sure, pose a problem a resolution when you try to go from integer to float with the same number of bits...

Comment 16 Sebastian Dröge (slomo) 2015-10-13 14:26:12 UTC

Well, 16 bit integer to 32 bit integer is lossless. If you go from 16 bit integer to 32 bit floats, you have exactly the same problems :)

The reason for going via 32 bit integers first is that we reduce the amount of conversion code a lot. We always go via 32 bit integers or 64 bit floats as an intermediate format.

Comment 17 sebastien.fenet 2015-10-13 19:59:27 UTC

Ok, I see. In my opinion, it is better to work with 64 bits float as intermediate format.

Anyway, I agree that 16 bits -> 32 bits is lossless but you don't have the same problem. 

Look what happens if I work directly with Int16... I take my signal and I divide it by the (assymetric) max value of Int16: m1 = 2**(16-1). If I had chosen the other (bad) option, I would have divided by m2 = 2**(16-1)-1. You can easily check that the 32bits-floating-point representation of m1 and m2 is not the same (this is basically because we go from 16 bits resolved integers to 32 bits resolved floating point numbers).

On the other hand, when you do the same with Int32, you find two integers m1 and m2 that are too close to be resolved when switching to float32.

However, there is something that puzzles me in what you described. Basically, you told us that the current dividing factor is somehow: float32(max_Int32-1). Then, you observed that float32(max_Int32-1)=float32(max_Int32). So, if I sum up, GStreamer's current code divides by float32(max_Int32). What I really don't understand is how your Int32 encoded signal divided by this float32(max_Int32) can be greater than 1... Any clue on this or did I misunderstand something?

Comment 18 sebastien.fenet 2015-10-13 20:36:49 UTC

Sorry for multi-posting but I found the answer to my question by myself...

The initial test I made was Int16 -> Float64.

I've just tested Int16 -> Float32 and it works as expected!! That is because of Sebastian's remark... The dividing factor was wrong but as float32(wrong_factor)=float32(good_factor), the result is still OK.

So, Sebastian, I would say that your fix in the code is OK. I was not able to find the unit test in log in your previous posts... What is wrong with the tests?

Comment 19 Sebastian Dröge (slomo) 2015-10-13 21:22:46 UTC

(In reply to sebastien.fenet from comment #17)
> Ok, I see. In my opinion, it is better to work with 64 bits float as
> intermediate format.

64 bit floats are much slower than 32 bit integers, especially on embedded platforms :)

> Anyway, I agree that 16 bits -> 32 bits is lossless but you don't have the
> same problem. 
> [...]

Indeed, good point!

(In reply to sebastien.fenet from comment #18)
> Sorry for multi-posting but I found the answer to my question by myself...
> 
> The initial test I made was Int16 -> Float64.
> 
> I've just tested Int16 -> Float32 and it works as expected!! That is because
> of Sebastian's remark... The dividing factor was wrong but as
> float32(wrong_factor)=float32(good_factor), the result is still OK.

Yes, I guess it's good then. For some reason I thought you observed the problem with 32 bit floats

> So, Sebastian, I would say that your fix in the code is OK. I was not able
> to find the unit test in log in your previous posts... What is wrong with
> the tests?

Run make elements/audioconvert.check in gst-plugins-base/tests/check. There are some tests that are expecting the current behaviour, they should just be updated :)

Comment 20 sebastien.fenet 2015-10-14 18:44:39 UTC

Right for the Int32 format... I tend to forget that we're not all using Gstreamer on a 64bits computer with plenty of RAM and CPU ;-)

Comment 21 Sebastian Dröge (slomo) 2015-10-19 12:29:59 UTC

It's also a matter of speed ;) Anyway, someone will need to fix the unit tests before this can be merged. Do you want? :)

Comment 22 sebastien.fenet 2015-10-21 20:28:36 UTC

...Would have loved to, but I'm still struggling to have GStreamer running on my Windows (it's not exactly what I would call "plug&play"  ^o^ ). 

Since I'm into some other project deadlines, I do not have an incredible amount of time to spend on this setup and I'm advancing little by little (just discovered that the "wget" part of the bootstrap script does not seem to run with "mingw"). So, I hope I'll succeed at some point but I'm not exactly sure when :-)

If there's any rush in modifying the unit tests, you might want to ask someone else.

Comment 23 Wim Taymans 2015-11-02 16:28:12 UTC

I will have a look.

Comment 24 Wim Taymans 2015-11-03 11:14:08 UTC

commit 9e15c895644c0c15513fec0623416286700da184
Author: Wim Taymans <wtaymans@redhat.com>
Date:   Tue Nov 3 11:44:54 2015 +0100

    audioconvert: change multiplier for int<->float conversion
    
    Use (1 << 31) as the multiplier for int<->float conversions. This makes
    sure that int->float conversions always end up with floats between
    [-1.0, 1.0].
    For the conversion from float to int, this multiplier will give the complete
    int range after we perform clipping.
    Change the unit test to take this into consideration.
    
    Fixes https://bugzilla.gnome.org/show_bug.cgi?id=755301