Bug 697715 - floating point precision problem in check test gst/gstvalue
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: general
Version: unspecified
Hardware/OS: Other / Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: David Schleef
QA Contact: gtkdev
Depends on:
Blocks:
Reported: 2013-04-10 13:49 UTC by Alexander Schrab
Modified: 2017-11-16 11:08 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments:
Fix for g_date_time_new(), plus test (2.51 KB, patch), 2013-04-13 18:40 UTC, David Schleef, committed

Description Alexander Schrab 2013-04-10 13:49:16 UTC
The tests fail for us when building for 32-bit. I tracked the problem down to this:

We call gst_date_time_new(....., 10.000001), which internally calls g_date_time_new with the same seconds value. The problem is that g_date_time_new rounds the seconds value _down_ to the closest usec. It's easy to think we are in the clear here, but 10.000001 in IEEE 754 representation is not quite 10.000001; it is more like 10.0000009999999999, and rounding that value down gives 10.0. It's impossible to represent 10.000001 exactly in IEEE 754, and we should never rely on the given value being exactly what it seems; it could just as well be slightly higher or lower. Play around with this one if you'd like to convince yourself that I am right ;)

http://www.h-schmidt.net/FloatConverter/

I think the reason it doesn't happen on 64-bit builds is extended precision kicking in for some of the calculations. I haven't really investigated this further though...

The question is: should we 'fix' the tests that obviously rely on a broken assumption, namely that what you put in is what you get out? (I managed to get them passing by adding another half usec to the values, hopefully pushing the representation slightly above the usec boundary.) Or should we just skip these tests, or introduce some tolerance in the accepted results?
Comment 1 David Schleef 2013-04-10 23:25:19 UTC
What do you mean by "extended precision kicking in for some of the calculations"?  64-bit CPUs don't have any additional floating point precision.
Comment 2 David Schleef 2013-04-10 23:59:51 UTC
Your analysis about the floating point value is of course correct.

The relevant code should instead be doing something like this:

#include <stdio.h>
#include <math.h>

int main (void)
{
  double x;
  double xi;
  int usec;
  x = 10.000001;

  xi = floor (x);
  usec = floor ((x - xi) * 1000000);
  if (xi + (usec + 1) * 0.000001 <= x) {
    usec++;
  }

  printf ("usec %d\n", usec);

  return 0;
}
Comment 3 Alexander Schrab 2013-04-11 07:50:35 UTC
(In reply to comment #1)
> What do you mean by "extended precision kicking in for some of the
> calculations"?  64-bit CPUs don't have any additional floating point precision.

I meant, though I realize I might be wrong, that the calculations might be handled in 80-bit extended-precision x87 registers. But I think that is true for 32-bit as well... Any number of things could differ between 64-bit and 32-bit builds, so it was just speculation.
Comment 4 Alexander Schrab 2013-04-11 07:53:34 UTC
(In reply to comment #2)
> Your analysis about the floating point value is of course correct.
> 
> The relevant code should instead be doing something like this:
> 
> {
>   double x;
>   double xi;
>   int usec;
>   x = 10.000001;
> 
>   xi = floor(x);
>   usec = floor ((x-xi)*1000000);
>   if (xi + (usec + 1) * 0.000001 <= x) {
>     usec++;
>   }
> 
>   printf("usec %d\n", usec);
> 
>   return 0;
> }

The problem is that we call g_date_time_new on a gdouble. When you provide the gdouble yourself you have some control over it, but when it is derived from a parsed caps string we have less control. The problem, to my mind, is that we use date_time_new at all and expect usec precision.
Comment 5 Alexander Schrab 2013-04-11 07:59:18 UTC
As I see it, one of three things has a bug:

1. The documentation + tests: it could be that the user shouldn't expect usec precision from caps string parsing or gst_date_time_new.
2. The string parsing: it should not use gst_date_time_new but another, more accurate way of building the datetime (this doesn't help users who don't realize that, when using gst_date_time_new directly, they need to be careful with their values).
3. gst_date_time_new: it should somehow handle the 10.00000099999... case. But then we introduce a semantic difference compared to g_date_time_new that might be confusing.
Comment 6 David Schleef 2013-04-11 20:19:18 UTC
The answer is 3.  It should make the adjustment as in #2, because that's the correct way to do floating point code.  There is no change in semantics.

Reassigning to glib, as that's where the bug is.
Comment 7 Alexander Schrab 2013-04-12 06:04:19 UTC
>{
>   double x;
>   double xi;
>   int usec;
>   x = 10.000001;
> 
>   xi = floor(x);
>   usec = floor ((x-xi)*1000000);
>   if (xi + (usec + 1) * 0.000001 <= x) {
>     usec++;
>   }
> 
>   printf("usec %d\n", usec);
> 
>   return 0;
> }

I understand what this does: it's basically testing whether the provided number could be a badly approximated usec value. Is this an OK fix in this case? g_date_time_new claims to round down to the closest usec, and with the above we break that behaviour for values very close to the usec boundary. But is that a problem in real life?
Comment 8 David Schleef 2013-04-13 18:40:18 UTC
Created attachment 241465
Fix for g_date_time_new(), plus test

Patchy-patch for gdatetime.c in glib.
Comment 9 Alexander Schrab 2013-04-15 07:21:41 UTC
Review of attachment 241465:

I still think it is in principle wrong to rely on this type of behaviour, though I agree that it is correct in practice. Just a (somewhat reaching) example:

seconds = 1.00000099999999999

would give an incorrect 1 second and 1 usec, even though the function claims to round down. However, the benefit of being able to write 1.000001 and get 1 usec instead of 0 far outweighs this unlikely example. The best option would have been to separate seconds and usecs in the API, but that's a little late now I guess :)
Comment 10 Alexander Schrab 2013-06-11 11:55:25 UTC
Any news on this? Is the supplied patch ok?
Comment 11 Philip Withnall 2017-11-16 11:08:08 UTC
Review of attachment 241465:

Looks OK to me. Thanks for the patch, and sorry for the delay in review.

I’ll push it with a couple of bug references added to the comments/test to direct people here if they have questions.
Comment 12 Philip Withnall 2017-11-16 11:08:37 UTC
Comment on attachment 241465
Fix for g_date_time_new(), plus test

commit 3cfac71d09dded67485f153fa76d6f063dbb6a76 (HEAD -> master, origin/master, origin/HEAD)
Author: David Schleef <ds@schleef.org>
Date:   Sat Apr 13 11:26:34 2013 -0700

    gdatetime: fix floating-point conversion
    
    Conversion from floating point to integer requires special care.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=697715

 glib/gdatetime.c       | 13 ++++++++++++-
 glib/tests/gdatetime.c | 16 ++++++++++++++++
 2 files changed, 28 insertions(+), 1 deletion(-)