GNOME Bugzilla – Bug 515297
Rounding errors
Last modified: 2008-04-07 15:30:20 UTC
To produce bug: 1. start Gnumeric 2. Type in any cell "1.535" 3. Click with right mouse button on cell, choose "Cell format" 4. Set format to "Numeric", select to digits to display. 5. You will see 1.53 instead of 1.54. This bug occurs also with 1.535, 1.545, 1.555 etc. It happens when the last significant digit is surrounded by to 5's (it's true at least for hundredths).
A computer does not store 1.535. The closest base two representation for this number is slightly smaller than 1.535 so the correct rounding of the number stored is 1.53. So this is not a bug.
*** This bug has been marked as a duplicate of 509965 ***
But it's completely nonsense from mathematical point of view. May be you should fix your string-to-float convertion subroutine?
No, you need to listen a whole lot more carefully. But first a demonstration. This is what Perl does, which as it turns out is precisely what the C library does: # perl -e 'printf "%.2f\n", "1.535";' 1.53 And here is why, in too much detail: Gnumeric works using the C double type, which comes down to the IEEE-754 standard on any machine you care about. It's a base-2 number system. Any number entered or computed will be rounded to the nearest representable number available. For the string "1.535" that comes down to r=1.5349999999999999200639422269887290894985198974609375. As you can see, that is a little bit less than what you wanted. For reference, the next representable number available is... s=1.5350000000000003641531520770513452589511871337890625. Given r, the correct round-to-2-decimals string is "1.53". The math is sound. It is just not math on the field of real numbers, but on the field of IEEE-754 numbers. All modern CPUs use that and all spreadsheets that I know of do too when they deal with floating point. Where spreadsheets differ is that some deliberately round certain numbers the wrong way for display. That pleases [blinds] people who have just entered a base-10 number, but it corrupts computed results that just happen to be one of the unlucky numbers.
I would really love to hear your reasoning supporting "But it's completely nonsense from mathematical point of view."
In some cases evene one hundredth make is significant. In that cases we should use fixed-point numbers, I think. So there wouldn't be such stupid errors related to existing standards for floating point numbers.
What we "should" or "should not" do really doesn't matter. Computers can only represent finitely many numbers. There are infinitely many "fixed-point numbers" (whatever they really are. SO you have to choose which ones you can represent and which ones you can't. Of course as a user of a spreadsheet you are always free to write the spread sheet to use exact values only as long as you know what they are. How you would do that? Well, representable integers are exact, so if you need exact precision to 1/1000 just write your formulas in 1/1000. If you enter 1535 it will stay that. (Of course it seems you are running into the problem when you set your format to ignore digits so I doubt that you really need 3 digit precision. And still, how is it "completely nonsense from mathematical point of view."
Ok, I'll find another way to make my calculations. "Completely nonsense ..." - it is about rounding rules. It seems to me as rounding to closest integer of 153.5 giving result 153. Of course it is not true. And that I meant. Double precision floating point number has 8 bytes of data (long double - 10). If I used all this bytes to represent fixed point number with hundreds I would have numbers from -2^64/100 to 2^63/100. It would be enough for me. I would like to have "fixed point" data type in GNumeric. I think it is not a problem to add this type of cell content to program. The problem is in reasoning.
Envel wrote: "It seems to me as rounding to closest integer of 153.5 giving result 153. Of course it is not true. And that I meant." I am not quite sure what you meant with that. Virtually everybody would want to round 153.5 to 154 and this is what Gnumeric in fact does. The number you mention here is exactly representable. You were previously writing about 1.535. That number would also in your "exact hundreds" version you are now talking about be not exactly representable.
Please don't test my english knowledge. The problem is not actual now. Thanks for the audience. I won't use automatic rounding. This would solve my problem.
*** Bug 526406 has been marked as a duplicate of this bug. ***
And just to throw even more chaos into the mix, the correct rounding of 1.545 is 1.54 according to ASTM E-29 (and I think an ISO standard as well, but I can't find it). For rounding from 5 (with no trailing non-zero digits), you always round to an even number, ensuring an even distribution of round-ups and round-downs. Because: 1 2 3 4 round down. 6 7 8 9 round up. So we split rounding from 5 and say always round to an even number, and statistically it will work out about half and half. There is allowance for this in IEEE 745 with the RoundTiestoEven rounding scheme. I have tested this with 1545 and round(A1, -1) and I get 1550, which IS incorrect. And if I followed this discussion properly, 1545 is an exactly representable number. Have I missed something here?
If 1545 is stored as a floating point number (ie, as mantissa 545 with exponent 3 then this does not look like an exact representable number). The integer 1545 would be exactly representable as a base 2 integer, but for obvious reasons a single type of representation is used for all numbers. Moreover, ASTM E-29 has nothing to do with how the ROUND function works. FOr that you may want to check OOXML which has recently bbecome an ISO format covering certain spreadsheet functions and/or the OpenFormula draft specification for its ROUND function. Finally, I really don't see how "ASTM E 29 REV B Standard Practice for Using Significant Digits in Test Data to Determine Conformance with Specifications" has anything at all to do with spreadsheet functions.
Well, let me tell you that we certainly don't record test data with a pencil and paper any more. That standard will govern reporting of numbers for anythinghaving to with ASTM anywhere, ever. Which is no small thing. I can completely see your point how that standard has little to do with spreadsheet functions, but I couldn't find anything more applicable really quickly, like ISO. I did note that IEEE745 does accommodate a RoundTiesToEven rounding scheme, and I think that _does_ have something to do with spreadsheet functions. I've checked what I could find at the OpenFormula specification, and the round function appears to be round up, which I guess is why Gnumeric rounds up. Sure shows me; I should have just checked there first. Will you at least grant me that round to even schemes are more accurate, whether or not they ever get incorporated in Gnumeric?
I don't know what "more accurate" means in this context. As far as I am concerned something is either accurate or not. In my opinion, you do make a good case that there should be a roundtiestoeven function. So I would suggest that you open an enhancement request asking for such a function (and mention the standards you have refered to in this report.)
Thanks, I will open an enhancement request. As to "more accurate" consider: 3.5 is _exactly_ halfway between 3.0 and 4.0. Always choosing to round this value up to 4.0 biases the results to be higher, and rounding errors accumulate. For large numerical simulations and other algorithms like numerical solutions to differential equations and statistical calculations, this error can be significant. By rounding to even, roughly half these values will be rounded up, and the other half rounded down (assuming your data follow a normal or any other symmetric distribution), thereby reducing the accumulated round off error and making your results more accurate.
Well, if you data truly follows a normal distribution, the probability to have exactly ...5 is 0 so it doesn't really matter what you do with it. If we were to round exact 1.5 (which is representable) to 2 we would probably see a large number of bug reports and only very few people (I guess you included) would applaud.