Bug 635064 - Regression tool should calculate residuals
Status: RESOLVED FIXED
Product: Gnumeric
Classification: Applications
Component: Statistics Tools
Version: 1.10.x
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Andreas J. Guelzow
Reported: 2010-11-17 10:57 UTC by Jose Gonzalez
Modified: 2010-11-26 02:06 UTC
Description Jose Gonzalez 2010-11-17 10:57:33 UTC
Hi,

What do you think about calculating residuals (residuals, residual plots, standardized residuals and line fit plots) as options in the regression tool? It should be quite easy (a table plus a couple of arithmetic operations that Gnumeric already performs) and it would close the gap with M$ Office.
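
For reference, the quantities requested above can be sketched in a few lines. This is a minimal illustration in plain Python for the one-predictor case, not Gnumeric's implementation; the function names and sample data are made up for the example, and the standardization shown (dividing by the residual standard error with n - 2 degrees of freedom) is only one common convention, so other packages may scale differently.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def residuals(xs, ys):
    """Observed values minus fitted values."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def standardized_residuals(xs, ys):
    """Residuals divided by the residual standard error (assumed convention)."""
    res = residuals(xs, ys)
    n = len(res)
    s = math.sqrt(sum(e * e for e in res) / (n - 2))
    return [e / s for e in res]

# Hypothetical sample data, roughly following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
for x, e, z in zip(xs, residuals(xs, ys), standardized_residuals(xs, ys)):
    print(f"x={x:.1f}  residual={e:+.3f}  standardized={z:+.3f}")
```

A residual plot is then just these residuals plotted against x (or against the fitted values).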
Comment 1 Jean Bréfort 2010-11-17 13:09:53 UTC
Looks like a good idea.
Comment 2 Andreas J. Guelzow 2010-11-17 14:35:16 UTC
Line fit plots can already be created. I fail to see the usefulness of the other items, and residuals are trivially calculated by those who really need them.
Comment 3 Jose Gonzalez 2010-11-17 17:31:31 UTC
Andreas, I agree. But why should we force the user to do it when other software (e.g. Excel) does it automatically and when it's a pretty common thing to calculate?
Comment 4 Andreas J. Guelzow 2010-11-17 18:58:34 UTC
Jose, is it really a "pretty common thing to calculate"? The only time I have seen residuals calculated is when Statistics is being taught. And the purpose there was to understand regression. In that case it would be better to calculate it directly rather than have a tool do it for you.
Comment 5 Jose Gonzalez 2010-11-17 19:24:40 UTC
Well... if you are using the Regression tool in the Statistics package of a particular piece of software... yes, it's quite common. You have to take the context into account :) Obviously, if you only multiply and use the charting package, it's not common.
Comment 6 Andreas J. Guelzow 2010-11-17 19:33:48 UTC
Jose, perhaps you could elaborate on _why_ somebody would want to calculate the residuals. (They are nothing else but a normally distributed random number with mean 0, so assuming that calculating a regression makes sense in the first place, they are pretty meaningless.)

(I understand from your comment that they are frequently provided by software packages, but that alone may not mean much more than that the developers found it a simple feature to add. It does not say that there is any statistical usefulness to those residuals.)
Comment 7 Jose Gonzalez 2010-11-17 19:47:39 UTC
"They are nothing else but a normally distributed random number with mean 0," Sorry, but you are wrong. That's the whole point of doing the residual analysis: to find out whether they show any kind of pattern (i.e. whether they are not random) and so whether your regression model is correct.
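
To make the point about patterns concrete, here is a small sketch, assuming made-up data where the true relation is quadratic but a straight line is fitted anyway. The helper name and the data are hypothetical. The signs of the residuals are clearly not random: positive at both ends and negative in the middle.

```python
def line_residuals(xs, ys):
    """Fit y = intercept + slope*x by least squares and return the residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

xs = [float(i) for i in range(1, 10)]
curved = [x * x for x in xs]  # the true relation is quadratic, not linear

res = line_residuals(xs, curved)
signs = "".join("+" if e > 0 else "-" for e in res)
print(signs)  # prints "++-----++"
```

A U-shaped run of signs like this is the kind of pattern that suggests the linear model is missing a term.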
Comment 8 Andreas J. Guelzow 2010-11-17 20:00:20 UTC
Well, no. 

When you are performing a regression analysis you should already have assumed (or "know") that the model is appropriate. 

Apparently you are talking about something different, namely a data analysis that should happen prior to the regression analysis (with its own data). In that case I can see that you calculate a best fit line and have a look at the residuals. But in this case you surely don't just look at the residuals but perform some kind of tests on the residuals to see whether they are consistent with what you should expect if the assumptions of a regression analysis were true.

What kind of tests do you normally perform on those residuals?
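
One example of such a test, offered only as an illustration and not named by anyone in this thread, is the Durbin-Watson statistic, which checks residuals for serial correlation. It is the sum of squared differences between consecutive residuals divided by the sum of squared residuals; values near 2 are consistent with uncorrelated residuals, values near 0 suggest positive autocorrelation and values near 4 negative autocorrelation. The residual series below are invented for the sketch.

```python
def durbin_watson(res):
    """Durbin-Watson statistic for a sequence of residuals in observation order."""
    num = sum((res[i] - res[i - 1]) ** 2 for i in range(1, len(res)))
    den = sum(e * e for e in res)
    return num / den

# Hypothetical residual series.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # sign flips every step
drifting = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]     # slow trend through zero

print(round(durbin_watson(alternating), 3))  # well above 2
print(round(durbin_watson(drifting), 3))     # well below 2
```

Deciding whether a given value is significant requires the tabulated Durbin-Watson bounds for the sample size and number of predictors, which this sketch does not include.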
Comment 9 Jose Gonzalez 2010-11-17 20:16:29 UTC
Andreas, I don't know about you, but when I design a model, I am not always right, so I test it. In other words, I do not assume that my model is correct (or even appropriate).

In any case, this is only an enhancement request. It's not important; it would just be quicker for me/us (users). I can keep doing my analysis manually. I just thought that it could be a quick and nice feature to implement and that it would bring Gnumeric up to Excel's level for regression.
Comment 10 Andreas J. Guelzow 2010-11-26 02:06:02 UTC
I assume that the residuals are desired in the "multiple linear regression" case but not the "multiple 2-variable regressions".

This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.

This missed release 1.10.12 by a hair so will be in 1.10.13.