GNOME Bugzilla – Bug 170000
Value formatting when excessive precision is requested
Last modified: 2010-07-28 19:58:59 UTC
Version details: 1.4.2 Distribution/Version: Fedora Core 3 I have a spreadsheet wherein some values are systematically being screwed up. I suspect that the following Excel-style formatting is somehow involved: _(* #,##0.00000000000000_);_(* \(#,##0.00\);_(* "-"??_);_(@_) Although I didn't specify any numbers with more than 2 decimals, they are treated wrongly internally because the following values are shown in the spreadsheet: A B 1 500.00000000000010 2 (10.00) 490.00000000000010 3 15.00000000000000 505.00000000000010 4 (150.55) 354.45000000000010 5 (5.45) 349.00000000000010 6 (37.20) 311.80000000000010 7 (25.00) 286.80000000000010 8 5,000.00000000000100 5,286.80000000000200 9 (35.00) 5,251.80000000000200 10 (185.95) 5,065.85000000000200 E.g. cell B1 was entered as "500", and so was "5000" in cell A8. I strongly believe that this is a critical bug, as we're talking data loss here. Obviously, I normally don't display 14 decimals. However, as these small bits add up, you can eventually see that something is wrong because the value displayed in the "information area" reveils those bits as well. I will attached an example spreadsheet, although you can easily reproduce yourself (just use "=$B1+$A2" as the formula in B2 and autofill to B10, then change columns A and B to the quoted formatting).
Created attachment 38571 [details] Example spreadsheet
The real question is _what is in the cell_, independent of how it is being displayed. If you have 500 in there, there is no data loss. Therefore... Please try entering =A8-5000 in a new cell and format it as general. Do you get zero?
Doing "=A8-5000" gives zero, also does "=B10-5065.85". However, from a user-perspective (read "bug severity"), this is unimportant. Consider the following simple scenario: Gnumeric: 1. Select cell B10. 2. Press [Ctrl]+[C]. Gedit: 3. Press [Ctrl]+[V]. Results: The transferred value is "5,065.85000000000200" instead of "5065.85", so this really is a data corruption problem, albeit indirectly. Same story when writing the value on a piece of paper. BTW, it also looks like the small bits are propagating to some extent through the formula, although it doesn't look like an exact sum. Quite strange.
This is simply a data-formatting issue. The number of decimals requested combined with the integral value's digits exceeds the resolution of "double". For the integers (which have an exact representation as "double") the right thing to do is certainly displaying zeroes. I'll have a look. For the non-integers I doubt zeroes would be an entirely correct rendering of the actual value represented. (The number system used does not have the exact value 5065.85.) Nevertheless, I'll see if I can fake it.
Fixed in cvs HEAD.
Backported to 1.4 series.
Appears to be improved for most cases in 1.4.3, though I still see the behaviour in some places, e.g. instead of "955.22" it shows "955.220000000003" (which is also displayed in the information area with a "normal" cell precision of only 2 decimals). Unfortunately, it also seems to be related to the context and cell location as I can't easily reproduce the problem in a lightweight spreadsheet.
You are really doing something wrong here. There is no "955.22". There are instead numbers like "955.2200000001xyz" and "955.219999999999xyz". This is a fundamental consequence of using binary floating point numbers. If you ask for enough decimals, Gnumeric is fully correct in not giving you "955.22" so you should stop asking for more decimals than your numbers are defined with. At that level Gnumeric is simply showing you the number that you have (which may very well differ from the one you thought you had).
Please don't start insulting people. I didn't request 12 decimals just for fun obviously, that was to troubleshoot the problem. One can simply see these bits in the information area.
Created attachment 38642 [details] Information area
Seeing "955.220000000003" is not a bug, but a faithful rendering of the number. You have got to stop believing that you have the mathematical 955.22
Well, just entered the values and formatting in Microsoft Excel, and it seems like they don't have any issues at all, even with a zillion decimals. Admittedly, they don't have such a fancy About box, and that's is clearly very important. Anyway, this "stupid" user will migrate to OOo ... Good afternoon, good evening and goodnight!
Nobody is call you stupid. Actually, something is wrong (but not in the display which comes down to a simple printf with "%.15g" format): x = 955.22 = f6 28 5c 8f c2 d9 8d 40 x = 955.220000000003 = 10 29 5c 8f c2 d9 8d 40 Somewhere along the line you have lost a handful of significant bits. I don't see why you wouldn't see the very same result in both Excel and OO and anything else with the same "double" type. Ok, they might print fewer decimals, i.e., hide the problem. I am not sure what that would solve. And don't worry about us wasting resources on the about box -- it doubles as a test framework for the graphs. At least it used to.
evert : no one is calling you stupid. The XLS file format has a fixed point type that stores val*100 which would give exact representation of values with 2 decimal digits. OOo does not have that, nor does gnumeric. morten : could the discrepency be the add_eps ?
jody: no, for two reasons. 1. add_epsilon is not in this code path (which is really just value_get_as_gstring). 2. it's 10 times to big. It's fairly simple to get this effect: A1=20955.22 A2=20000 A3=A1-A2 The value in A3 shown as 955.22 but we have Sum=955.220000000001. (If you resize the A column you get ...00001 there too.) Unsurprisingly, OO gets the same result. However, it does a very good job of hiding it. For one thing, "sum" steals the format of the cell. Note, that if add A4=A3-955.22 we get zero exactly -- they appear to have some suck-to-zero effect in their subtraction! A5=A3-954.22 gives a little more than one, so only zero is special. (It really is a suck-to-zero effect: if you fiddle a bit with the initial numbers you can get 955.219999... too. Subtracting 955.22 still yields zero. Let's hope no-one decides to divide by it, :-)
If I understand this correctly then we suspect that: The underlying problem is simply that we are working in a discrete number system in which decimal fractions can only be approximated. Playing with a bunch of numbers this doesn't seem to be correct: if it is just a approximation issue, then some of these values should be a touch to small, but they all seem to be about the same value too large!
There is some really wrong though: 0.75 when shown with lots of decimals yields: 0.750000000000000100000000000000 but 0.75 has an exact binary representation since it is 0.11 in binary which can be represented very nicely! Running the code: #include <stdio.h> int main (void) { double val = 0.75; printf ("%.25f", val); return 0; } gives: 0.7500000000000000000000000 shows that we should not get something weird like: 0.7500000000000001000000000
I may be mistaken, but it looks to me as if we are adding epsilon in render_number in format.c and that that code is indeed in the code path taking for rendering.
Andreas: there are two different code paths: 1. Formatting in a cell. Yes, we add epsilon and I don't see why you would get that "1" unless you goffice is old. 2. the little "Sum" button at the bottom of the screen. No add there.
My observations where in 1.4.2, which means an older goffice. That's the same version as the bug report. I will not be able to check out the cvs version until Tuesday. (Our requirements are changing so fast that it is impossible for my dial-up system to stay current.) I gather we are now showing 0,75000000000000000000000000 for 0,75?
*** Bug 608889 has been marked as a duplicate of this bug. ***
Let's combine all these problems that have essentially to do with number representation in a single bug # 509965. *** This bug has been marked as a duplicate of bug 509965 ***