Bug 131588 – SUM algorithm properties

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 131588 - SUM algorithm properties


Summary:	SUM algorithm properties


Status:	RESOLVED FIXED

Product:	Gnumeric
Classification:	Applications
Component:	Analytics
Version:	git master
Hardware:	Other All

Importance:	Normal major
Target Milestone:	---
Assigned To:	Jody Goldberg
QA Contact:	Jody Goldberg

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2004-01-15 18:33 UTC by Alan Horkan
Modified:	2011-11-23 20:57 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Excel Spreadsheet toy - Job predictor (202.00 KB, application/octet-stream) 2004-01-15 18:41 UTC, Alan Horkan		Details
Compile with gcc -O0 vs -O2 and watch the answers change (1009 bytes, text/plain) 2004-01-20 17:09 UTC, Jody Goldberg		Details
K-I-S-S (1.89 KB, patch) 2005-01-04 18:55 UTC, Morten Welinder	none	Details \| Review

Description Alan Horkan 2004-01-15 18:33:29 UTC

attachment to follow

type a name or word in C10 and result appears in H10
eg typing 'George W Bush' will return 'Human Shield'
bob == hypnotist
monkey == mad scientist
bill clinton == dodgem driver
Jody Goldberg == Teaching Pensioners to Drive

I always wondered if these stupid junkmail spreadsheets would work properly
in Gnumeric becuase they typically use things in very unusual ways so I
really hope that it is of some use for testing purposes.

Comment 1 Alan Horkan 2004-01-15 18:41:08 UTC

Created attachment 23394 [details]
Excel Spreadsheet toy - Job predictor

Comment 2 Jody Goldberg 2004-01-20 17:08:54 UTC

yummy a compiler issue.
Compile the attachment with gcc x86 -O0 an get different answers than
with -O2.  We'll play with a simple patch to see if it has any
unwanted effects.

Comment 3 Jody Goldberg 2004-01-20 17:09:42 UTC

Created attachment 23564 [details]
Compile with gcc -O0 vs -O2 and watch the answers change

Comment 4 Morten Welinder 2004-01-23 14:31:54 UTC

Our current summation technique has some pretty bad properties,
for example, that we end up with non-integer sum of integers due
to rounding issues.

Do we have good examples where the two-pass method improves matters?

Comment 5 Andreas J. Guelzow 2004-01-23 15:09:14 UTC

The obvious candidate is a sum of 3 numbers x, y, -x where y is very
close to 0 and x is very large, but it thee numbers are very different
in size, I don't think it will improve the calculation a lot.

Sorting the numbers by absolute value would help much more.

Comment 6 Jody Goldberg 2004-02-18 21:04:58 UTC

This is kludged in 1.2.6 to use long double
Do we want to just drop the whole idea ?
Sorting isn't really feasible.  That is just too expensive for large sets.

Comment 7 Morten Welinder 2004-02-20 15:49:59 UTC

I'm inclined to just go with the trivial implementation -- that will
give us 80-bit intermediate results on i86.

If we want to get fancy, we could detect that we have a problematic
data set and use a sorting-based method when that happens. For example:

1.  Sort positives and negatives in different buckets in order of
    increasing magnitude.
2.  Set running sum to 0.
3.  While we have both positives and negatives:
3a.   Add a positive number if running sum <= 0.
3b.   Add a negative number if running sum > 0.
4.  Add whatever is left.

That leaves the question of how to determine whether to use the fancy
stuff.  I'd say...

1. |Plain sum| << max |x_i|
2. max |x_i| / min |x_i| > 2^20     (ignoring zeros)

Comment 8 Jody Goldberg 2004-02-29 04:10:37 UTC

Adopting the long double approach for now.

Comment 9 Morten Welinder 2005-01-04 18:55:05 UTC

This is still not good.  A sum of ones and zeroes does not yield an integer.
Suggested patch coming up.

Comment 10 Morten Welinder 2005-01-04 18:55:53 UTC

Created attachment 35442 [details] [review]
K-I-S-S

Comment 11 Andreas J. Guelzow 2005-01-04 19:25:17 UTC

Hey, you can't just add numbers by adding them!  :-)

Comment 12 Morten Welinder 2005-01-05 16:49:29 UTC

Oh yes, I can!

Comment 13 Morten Welinder 2011-11-23 20:57:41 UTC

See also 664656