GNOME Bugzilla – Bug 697715
floating point precision problem in check test gst/gstvalue
Last modified: 2017-11-16 11:08:45 UTC
The tests fail for us when building for 32 bit. I tracked the problem down to this: we do gst_date_time_new(....., 10.000001), which calls g_date_time_new internally with the same seconds value. Problem is that g_date_time_new rounds the seconds value _down_ to the closest usec.

It's easy to think that we are in the clear here, but 10.000001 in IEEE 754 representation is not quite 10.000001, more like 10.0000009999999999. Rounding this value down to the closest usec gives 10.000000, i.e. 0 usec instead of 1. It's impossible to represent 10.000001 exactly using IEEE 754, and we should never rely on the given value being exactly what it seems; it could just as well be slightly higher or lower. Play around with this one if you'd like to convince yourself that I am right ;) http://www.h-schmidt.net/FloatConverter/

I think the reason it doesn't happen on 64 bit builds is extended precision kicking in for some of the calculations. I haven't really investigated this further though...

Question is: should we 'fix' the tests that obviously rely on a broken assumption, namely that what you put in is what you get out? (I managed to get them working by adding another half usec to the values, hopefully pushing the representation a little over the usec boundary.) Or should we just skip these tests, or introduce some level of variance in the accepted results?
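To illustrate, here is a minimal standalone C sketch of the effect (not GLib's actual code, just the truncation behaviour described above):

#include <math.h>
#include <stdio.h>

int
main (void)
{
  /* 10.000001 cannot be represented exactly; the nearest double is
   * slightly below it (roughly 10.0000009999999999, as noted above). */
  double seconds = 10.000001;

  /* Truncate to whole microseconds, the way the round-down described
   * above effectively behaves. */
  int usec = (int) floor ((seconds - floor (seconds)) * 1000000.0);

  printf ("usec = %d\n", usec);  /* prints "usec = 0", not the expected 1 */

  return 0;
}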
What do you mean by "extended precision kicking in for some of the calculations"? 64-bit CPUs don't have any additional floating point precision.
Your analysis about the floating point value is of course correct.

The relevant code should instead be doing something like this:

{
  double x;
  double xi;
  int usec;

  x = 10.000001;

  xi = floor (x);
  usec = floor ((x - xi) * 1000000);
  if (xi + (usec + 1) * 0.000001 <= x) {
    usec++;
  }

  printf ("usec %d\n", usec);

  return 0;
}
(In reply to comment #1)
> What do you mean by "extended precision kicking in for some of the
> calculations"? 64-bit CPUs don't have any additional floating point precision.

I meant, though I realize I might be wrong, that the calculations might be handled in 80-bit extended FP registers on x86. But I think that is true for 32 bit as well... Any number of things could be different between 64 bit and 32 bit, so it was just speculation.
(In reply to comment #2)
> Your analysis about the floating point value is of course correct.
>
> The relevant code should instead be doing something like this:
>
> {
>   double x;
>   double xi;
>   int usec;
>
>   x = 10.000001;
>
>   xi = floor (x);
>   usec = floor ((x - xi) * 1000000);
>   if (xi + (usec + 1) * 0.000001 <= x) {
>     usec++;
>   }
>
>   printf ("usec %d\n", usec);
>
>   return 0;
> }

Problem is that we use g_date_time_new on a gdouble. In the case where you provide the gdouble yourself you have some control over it, but when it is derived from a parsed caps string we have less control. In my mind the problem is that we use date_time_new at all and expect usec precision.
As I see it, one of three things has a bug:

1. The documentation + tests: it could be that the user shouldn't expect usec precision from caps string parsing or gst_date_time_new.

2. The string parsing: it should not use gst_date_time_new but another, more accurate way of building the datetime (this doesn't help users who don't realize that when using gst_date_time_new directly they need to be careful with their values).

3. gst_date_time_new: it should somehow be able to handle the case 10.00000099999... But in that case we introduce a semantic difference compared to g_date_time_new that might be confusing.
The answer is 3. It should make the adjustment as in comment #2, because that's the correct way to write floating point code. There is no change in semantics. Reassigning to glib, as that's where the bug is.
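Roughly, applying that adjustment when splitting a seconds value could look like this (only a sketch of the approach from comment #2, not the eventual patch; split_seconds is a made-up helper name):

#include <math.h>
#include <stdio.h>

static void
split_seconds (double seconds, int *secs, int *usecs)
{
  double whole = floor (seconds);
  int usec = (int) floor ((seconds - whole) * 1000000.0);

  /* If the next usec boundary, as computed in doubles, is still not
   * above the input, the input was really meant to be that boundary
   * but lost a little precision; round it back up. */
  if (whole + (usec + 1) * 0.000001 <= seconds)
    usec++;

  *secs = (int) whole;
  *usecs = usec;
}

int
main (void)
{
  int secs, usecs;

  split_seconds (10.000001, &secs, &usecs);
  printf ("%d s %d us\n", secs, usecs);   /* 10 s 1 us, not 0 us */

  split_seconds (10.0000005, &secs, &usecs);
  printf ("%d s %d us\n", secs, usecs);   /* 10 s 0 us, still rounded down */

  return 0;
}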
> {
>   double x;
>   double xi;
>   int usec;
>
>   x = 10.000001;
>
>   xi = floor (x);
>   usec = floor ((x - xi) * 1000000);
>   if (xi + (usec + 1) * 0.000001 <= x) {
>     usec++;
>   }
>
>   printf ("usec %d\n", usec);
>
>   return 0;
> }

I understand what this does: it's basically testing whether the provided number could be a badly approximated usec value. Is this an OK fix in this case? g_date_time_new claims to round down to the closest usec, and with the above we break that behaviour for values very close to the usec boundary. But is that a problem in real life?
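For what it's worth, the changed behaviour only affects values within a few ulps of a usec boundary. A quick sketch (hypothetical input, not from the patch) with a value genuinely below the boundary:

#include <math.h>
#include <stdio.h>

int
main (void)
{
  /* A value meant to be just below the usec boundary. */
  double x = 10.00000099;
  double xi = floor (x);
  int usec = (int) floor ((x - xi) * 1000000.0);

  /* The proposed check does not fire here: 10 + 0.000001 evaluates to
   * a double slightly below 10.000001, which is still greater than x. */
  if (xi + (usec + 1) * 0.000001 <= x)
    usec++;

  printf ("usec %d\n", usec);  /* still 0: the documented round-down holds */

  return 0;
}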
Created attachment 241465: Fix for g_date_time_new(), plus test

Patchy-patch for gdatetime.c in glib.
Review of attachment 241465:

I still think it is in principle wrong to rely on this type of behaviour. I do agree that this is correct in practice though...

Just a (somewhat reaching) example: seconds = 1.00000099999999999 would give an incorrect 1 second and 1 usec, even though the function claims to round down... However, the benefit of being able to write 1.000001 and getting 1 usec instead of 0 far outweighs this unlikely example...

Best case would have been to separate seconds and usecs in the API, but that's a little late now I guess :)
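As a small aside on why that example is so unlikely: the two literals can't even be told apart as doubles, since they differ by far less than one ulp at this magnitude.

#include <stdio.h>

int
main (void)
{
  /* Both literals round to the exact same IEEE 754 double. */
  printf ("%d\n", 1.000001 == 1.00000099999999999);  /* prints 1 */

  return 0;
}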
Any news on this? Is the supplied patch ok?
Review of attachment 241465:

Looks OK to me. Thanks for the patch, and sorry for the delay in review. I'll push it with a couple of bug references added to the comments/test to direct people here if they have questions.
Comment on attachment 241465: Fix for g_date_time_new(), plus test

commit 3cfac71d09dded67485f153fa76d6f063dbb6a76 (HEAD -> master, origin/master, origin/HEAD)
Author: David Schleef <ds@schleef.org>
Date:   Sat Apr 13 11:26:34 2013 -0700

    gdatetime: fix floating-point conversion

    Conversion from floating point to integer requires special care.

    https://bugzilla.gnome.org/show_bug.cgi?id=697715

 glib/gdatetime.c       | 13 ++++++++++++-
 glib/tests/gdatetime.c | 16 ++++++++++++++++
 2 files changed, 28 insertions(+), 1 deletion(-)