GNOME Bugzilla – Bug 665707
Gnc-numeric Overflows because of incorrect denominator handling
Last modified: 2018-06-29 23:03:31 UTC
The underlying problem is that denominators are used for decimal precision, and as precision increases, the number of digits in both the denominator and numerator increases. In order to add numbers of different precision, the least common denominator must be found and both numerators multiplied by this value. A gint64 can represent at most 9.2E18 -- a large enough number on its own. But if we have 9 significant digits to the right of the decimal, then $1000 is represented as 10E9 with a 1E9 denominator. If we add that to $1000 with 8 digits of precision, the LCD is 1E8, so we multiply both numerators by that, resulting in 10E19 and 10E18 respectively. The first one overflows, ruining the calculation.
That's a very simplistic way to compute the LCD. It's really only necessary as a last resort, and is only useful for testing other methods (during which time we should use the mul128 code to do the verification). If we're combining a gnc-numeric with 10E-9 and one with 10E-8, the LCD is 10E-9.
Sorry, yes, sort of. It's 10E9, not 10E-9 (it's an int). The LCD itself is calculated by lcm128, as you say, as is the multiplication of the operands by lcd/x.denom -- but the return values of those conversions are checked for Perhaps I didn't dig deeply enough. (I intended the bug report as a reminder to me for later, and assigned it to myself -- I'm working on cleaning up the test code right now.) The problem arose because in app-utils/test-print-parse-amount the rounding in xaccPrintAmount (app-utils/gnc-ui-utils.c, around line 1180) is overflowing the addition when the number being rounded has 10E9 precision (the rounding is done by adding .5 to the value, expressed at 1 digit higher precision than the input value, or 5E9/1E10, and then truncating). True, the values overflowing are on the order of 5E18/1E9 , something that's unlikely to bother most Americans or Europeans, but a mere (dollar) millionaire in Indonesia [1] could have trouble -- and we had someone complaining a few months ago that he needed more precision for some mutual fund (IIRC the GUI only allows 1E6). Perhaps all that's needed is better error reporting, documentation, and some internal limits in gnc_numeric to keep the precision under control. [1] http://www.x-rates.com/d/IDR/USD/graph120.html
Did I understand correctly: There is a gnc-numeric overflow caused by app-utils/test-print-parse-amount that occurs with a 9 digit number? This is indeed a problem, as the code should be able to handle that. Maybe the "rounding" should be made a separate gnc_numeric operation. Or the gnc_numeric flags are not set correctly. [rant]Nevertheless the whole concept of gnc_numeric as rational numbers as opposed to fix-point decimal numbers IMHO is somewhat flawed, or at least unnecessarily complex. At the time, the team should have recognized that gnucash's requirements are sufficiently met by using fix-point decimal numbers. They can have a gint64 mantissa and some (gint32) exponent to the basis of 10. The implementation would still require similar rounding operations like the ones in this bug, though.[/rant]
(In reply to comment #3) > [rant]Nevertheless the whole concept of gnc_numeric as rational numbers as > opposed to fix-point decimal numbers IMHO is somewhat flawed, or at least > unnecessarily complex. At the time, the team should have recognized that > gnucash's requirements are sufficiently met by using fix-point decimal numbers. > They can have a gint64 mantissa and some (gint32) exponent to the basis of 10. > The implementation would still require similar rounding operations like the > ones in this bug, though.[/rant] gnc-numeric was designed when stock shares and prices were distributed as fractions (rationals) and not decimals. So it actually made some amount of sense at the time to store data that way. I asked this same question a decade ago. I don't know how hard it would be to change it now.
Gnc_numeric is reasonably well-written and self-contained (though the print function in app-utils should be moved into gnc-numeric.c; serializing is pretty clearly a class responsibility, not something to be handled by an external utility function), so replacing it with a rewrite wouldn't be too hard.
It'll be harder than you think, because a lot of other code assumes gnc-numeric is a rational number. Even the XML I/O stores it as a rational, and the SQL code stores it as two integers. Changing the structure of gnc-numeric would be more than just rewriting the internals of the class. HOWEVER, fixing this bug (stupid LCD choice) should be relatively straightforward.
Bug 662015 may be about the same issue. I haven't marked it as a duplicate because I'm not sure.
I added a test to check_add_subtract() in test-numeric.c which uses different random denominators--the existing test uses the same denominator for both numbers, which bypasses the interesting part of gnc_numeric_add()--and adjusted the trims on the numerator and denominator, e.g.: gint64 deno_a = get_random_gint32 () % (1UL << 16); gint64 na = get_random_gint64 () % (1ULL << 48); until I got it to pass NREPS value pairs. 16 and 48 bits fails, by the way: In order to pass, the total number of bits in the numerator of one value and the denominator of the other must be less than 62. That's not necessarily very useful, so I retested with decimal clamps and found that with a denominator clamp of 6 digits, the max number of numerator digits that doesn't overflow is 12, well short of the 18 one might expect. I wonder if this is a reasonable concern, though. Nowadays, denominators are all decimal, which means that lcd/denom is generally a small number; even before decimalization, the largest non-decimal denominator was 1/960 (a farthing, or quarter of an old English penny). More later, gotta go now.
Some more testing, looking at only powers-of-two and decimal denominators, with the former clamped at 10 bits and the latter at 10^6. That gains only a digit of precision, 13 instead of 12, but all of the failures are cases with wildly different denominators, e.g., 1000000 and 8. If look at only 10^n denomintors and clamp the numerators to 10^12 * denom and the denom clamped at 10^6, everything passes. Increasing the allowed denominator to 10^8 reduces the denominators by 2 digits, to 10^10 * denom. Mixing Summary: Arbitrary denominators < 1000: 15 digits Power-of-two denominators < 1024: 18 digits Power-of-ten denominators <= 10^6: 10 digits + denom digits; i.e., if the denominator has 6 digits, then the number can have a total of 16 digits. Mixed arbitrary < 1000 and Power-of-10 <= 10^6: 12 digits (1,000,000.000000) Mixed arbitrary < 1000 and Power-of-10 <= 10^4: 14 digits. (10,000,000,000.0000) Mixed binary < 1024 and Power-of-10 <= 10^6: 14 digits. The purely powers-of-10 case is quite satisfactory for anyone likely to be using Gnucash. AFAIK we've never had an overflow in the wild. I'm going to stop worrying about this and consider it a testing issue. I'll commit my testing changes so that anyone who wants to play around with it some more can do so.
GnuCash bug tracking has moved to a new Bugzilla host. This bug has been copied to https://bugs.gnucash.org/show_bug.cgi?id=665707. Please update any external references or bookmarks.