GNOME Bugzilla – Bug 755301
audioconvert: Integer->Float conversion creates values slightly smaller than -1.0
Last modified: 2015-11-03 11:14:08 UTC
I'm using GStreamer 1.0 under Python (through PyGI). I have a little issue with audioconvert. I want to convert some S16LE samples to F64LE. As far as I know, this should simply be a division by 2**(16-1). For a strange reason, the factor used in Audioconvert is slightly lower (something like 1+5e-10 times lower). The result is that the output signal may have samples greater than 1 (I think I'm not the only one to observe this: http://stackoverflow.com/questions/16780851/get-gstreamer-audioconvert-to-output-signed-32-bit-floats-from-1-to-1)
Can you provide a testcase that shows this?
Created attachment 311725 [details] Dummy code to illustrate the issue
(In reply to Sebastian Dröge (slomo) from comment #1) > Can you provide a testcase that shows this? Well, if you open any S16LE wav file whose minimum value reaches MIN_INT16 and converts it with a pipeline like: filesrc ! audioconvert ! audio/x-raw, format=F64LE ! appsink you will see that the data you obtain has a minimum value below -1. Just to illustrate, I wrote a small snippet of code that. It builds a rectangular signal with MIN_INT16 and MAX_INT16 values and converts it to float. I attach it here (I hope you can run Python and PyGI).
Please keep me posted :-)
Thanks, the problem here is that we divide by 2147483647.0. Which is the maximum signed integer value for 32 bits, but -2147483648 is also in range. So -2147483648 would be converted to something slightly below -1. Want to provide a patch for fixing these conversions? Alternatively one could also clamp the results to the [-1,1] range instead of changing the factor we use for the division.
The simple fix would be to divide by the bigger divisor 2147483648, so that we never get out-of-range. If we want to handle the assymetry, we can also do out = (in > 0) ? in / 2147483647.0 : in / 2147483648.0
Let's divide by the bigger number then :)
(In reply to Sebastian Dröge (slomo) from comment #7) > Let's divide by the bigger number then :) I agree with this. I don't think it would be very nice to have an asymmetric conversion, depending on the sign of the input sample (linear PCM assumes that the space between each quantified value is equal). If you are interested, there's a listing of various softwares' methods here: http://blog.bjornroche.com/2009/12/int-float-int-its-jungle-out-there.html You'll notice that it also raises the question of the conversion back... Should we multiply by the lower number to ensure that we will not overflow INT16? (which means we sacrifice bit-transparency...). There's no perfect method anyway, since we got from one asymmetric range to a symmetric one.
As far as providing the patch is concerned, I have never contributed yet. I have essentially been using Gstreamer in my Python codes. This being said, if you think it's a good idea to step in, I'll be happy to try. I would just need precise guidelines (btw, I'm mostly working with Windows).
Building GStreamer on Windows is a bit time consuming, but if you want to try it: http://cgit.freedesktop.org/gstreamer/cerbero/ See the README in there. Otherwise I could also make a patch for that
I've started having a look but it's indeed quite challenging to have all this running. I will go on working on it but I don't know how long it will take to come up with something (if I eventually do...). So, if you don't mind, I think the best option for now is that you do the patch.
Created attachment 311849 [details] [review] audioconvert: Divide by 2147483648 when converting to floats to stay in [-1.0, 1.0] range Breaks unit tests, don't merge yet.
(In reply to Sebastian Dröge (slomo) from comment #12) > Created attachment 311849 [details] [review] [review] > audioconvert: Divide by 2147483648 when converting to floats to stay in > [-1.0, 1.0] range > > Breaks unit tests, don't merge yet. Now the problem here is that 2147483648.0 and 2147483647.0 are the same value for 32 bit floats. Printing that value even gives us 2147483648.0, and we were using that one before. For doubles the above patch should make a difference (and it does, the unit test becomes unhappy).
Any ideas, opinions, comments?
Hi Sebastian, Could you enlighten me about the workflow in Gstreamer? My initial usecase was to convert a 16 bits Integer raw audio to Float. From what I understood in your comments (correct me if I'm wrong), Gstreamer converts, at some point, the 16 bits Integer audio to 32 bits integer. And then, it goes from this 32 bits integer to a 32 bits float by dividing by the factor you said before. Is that correct? If it's indeed how it works, the problem is, in my opinion, this intermediary conversion to 32 bits integer. I actually don't see its practical interest and it will, for sure, pose a problem a resolution when you try to go from integer to float with the same number of bits...
Well, 16 bit integer to 32 bit integer is lossless. If you go from 16 bit integer to 32 bit floats, you have exactly the same problems :) The reason for going via 32 bit integers first is that we reduce the amount of conversion code a lot. We always go via 32 bit integers or 64 bit floats as an intermediate format.
Ok, I see. In my opinion, it is better to work with 64 bits float as intermediate format. Anyway, I agree that 16 bits -> 32 bits is lossless but you don't have the same problem. Look what happens if I work directly with Int16... I take my signal and I divide it by the (assymetric) max value of Int16: m1 = 2**(16-1). If I had chosen the other (bad) option, I would have divided by m2 = 2**(16-1)-1. You can easily check that the 32bits-floating-point representation of m1 and m2 is not the same (this is basically because we go from 16 bits resolved integers to 32 bits resolved floating point numbers). On the other hand, when you do the same with Int32, you find two integers m1 and m2 that are too close to be resolved when switching to float32. However, there is something that puzzles me in what you described. Basically, you told us that the current dividing factor is somehow: float32(max_Int32-1). Then, you observed that float32(max_Int32-1)=float32(max_Int32). So, if I sum up, GStreamer's current code divides by float32(max_Int32). What I really don't understand is how your Int32 encoded signal divided by this float32(max_Int32) can be greater than 1... Any clue on this or did I misunderstand something?
Sorry for multi-posting but I found the answer to my question by myself... The initial test I made was Int16 -> Float64. I've just tested Int16 -> Float32 and it works as expected!! That is because of Sebastian's remark... The dividing factor was wrong but as float32(wrong_factor)=float32(good_factor), the result is still OK. So, Sebastian, I would say that your fix in the code is OK. I was not able to find the unit test in log in your previous posts... What is wrong with the tests?
(In reply to sebastien.fenet from comment #17) > Ok, I see. In my opinion, it is better to work with 64 bits float as > intermediate format. 64 bit floats are much slower than 32 bit integers, especially on embedded platforms :) > Anyway, I agree that 16 bits -> 32 bits is lossless but you don't have the > same problem. > [...] Indeed, good point! (In reply to sebastien.fenet from comment #18) > Sorry for multi-posting but I found the answer to my question by myself... > > The initial test I made was Int16 -> Float64. > > I've just tested Int16 -> Float32 and it works as expected!! That is because > of Sebastian's remark... The dividing factor was wrong but as > float32(wrong_factor)=float32(good_factor), the result is still OK. Yes, I guess it's good then. For some reason I thought you observed the problem with 32 bit floats > So, Sebastian, I would say that your fix in the code is OK. I was not able > to find the unit test in log in your previous posts... What is wrong with > the tests? Run make elements/audioconvert.check in gst-plugins-base/tests/check. There are some tests that are expecting the current behaviour, they should just be updated :)
Right for the Int32 format... I tend to forget that we're not all using Gstreamer on a 64bits computer with plenty of RAM and CPU ;-)
It's also a matter of speed ;) Anyway, someone will need to fix the unit tests before this can be merged. Do you want? :)
...Would have loved to, but I'm still struggling to have GStreamer running on my Windows (it's not exactly what I would call "plug&play" ^o^ ). Since I'm into some other project deadlines, I do not have an incredible amount of time to spend on this setup and I'm advancing little by little (just discovered that the "wget" part of the bootstrap script does not seem to run with "mingw"). So, I hope I'll succeed at some point but I'm not exactly sure when :-) If there's any rush in modifying the unit tests, you might want to ask someone else.
I will have a look.
commit 9e15c895644c0c15513fec0623416286700da184 Author: Wim Taymans <wtaymans@redhat.com> Date: Tue Nov 3 11:44:54 2015 +0100 audioconvert: change multiplier for int<->float conversion Use (1 << 31) as the multiplier for int<->float conversions. This makes sure that int->float conversions always end up with floats between [-1.0, 1.0]. For the conversion from float to int, this multiplier will give the complete int range after we perform clipping. Change the unit test to take this into consideration. Fixes https://bugzilla.gnome.org/show_bug.cgi?id=755301