GNOME Bugzilla – Bug 635064
Regression tool should calculate residuals
Last modified: 2010-11-26 02:06:02 UTC
Hi, what do you think about adding residual output (residuals, residual plots, standardized residuals and line fit plots) as options in the regression tool? It should be quite easy (a table plus a couple of arithmetic operations that Gnumeric already performs) and it would close the gap with M$ Office.
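For reference, the arithmetic behind this request can be sketched for the single-predictor case. This is an illustrative Python sketch, not Gnumeric code: the function name `fit_line_with_residuals` and the sample data are hypothetical, and the standardized residuals are assumed here to be each residual divided by the residual standard error sqrt(SSE/(n-2)). Other packages may use a different scaling.

```python
import math

def fit_line_with_residuals(xs, ys):
    """Least-squares line y = intercept + slope*x, plus residuals.

    Hypothetical helper for illustration only.
    """
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual = observed value minus fitted value.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    # Assumed convention: scale by the residual standard error.
    sse = sum(r * r for r in residuals)
    std_err = math.sqrt(sse / (n - 2))
    standardized = [r / std_err if std_err > 0 else 0.0 for r in residuals]
    return slope, intercept, residuals, standardized

# Made-up sample data.
slope, intercept, residuals, standardized = fit_line_with_residuals(
    [1, 2, 3, 4], [2, 4, 5, 8]
)
```

With an intercept in the model, the residuals sum to zero up to rounding, which is a quick sanity check on the table.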
Looks like a good idea.
Line fit plots can already be created. I fail to see the usefulness of the other items, and residuals are trivially calculated by those who really need them.
Andreas, I agree. But why should we force the user to do it when other software (e.g. Excel) does it automatically and when it's a pretty common thing to calculate?
Jose, is it really a "pretty common thing to calculate"? The only time I have seen residuals calculated is when Statistics is being taught. And the purpose there was to understand regression. In that case it would be better to calculate it directly rather than have a tool do it for you.
Well... if you are using the Regression tool in the Statistics package of a particular piece of software... yes, it's quite common. You have to take into account the context :) Obviously, if you only multiply and use the charting package, it's not common.
Jose, perhaps you could elaborate _why_ somebody would want to calculate the residuals. (They are nothing else but a normally distributed random number with mean 0, so assuming that calculating a regression makes sense in the first place, they are pretty meaningless.) (I understand from your comment that they are frequently provided by software packages, but that alone may not mean much more than that the developers found it a simple feature to add. It does not say that there is any statistical usefulness to those residuals.)
"They are nothing else but a normally distributed random number with mean 0," sorry, but you are wrong. That's the whole point of doing the residual analysis: to know whether they have any kind of pattern (that is, whether they are not random) and to know whether your regression model is correct.
Well, no. When you are performing a regression analysis you should already have assumed (or "know") that the model is appropriate. Apparently you are talking about something different, namely a data analysis that should happen prior to the regression analysis (with its own data). In that case I can see that you calculate a best fit line and have a look at the residuals. But in this case you surely don't just look at the residuals but perform some kind of tests on the residuals to see whether they are consistent with what you should expect if the assumptions of a regression analysis were true. What kind of tests do you normally perform on those residuals?
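The kind of check being debated here, whether the residuals of a fitted line show a pattern, can be sketched as follows. This is a hedged Python illustration, not something either commenter describes using: the helper names `line_fit_residuals` and `count_sign_runs` are hypothetical, the data are made up (an exact quadratic fitted with a straight line), and counting sign runs is only one informal check. A formal test would compare the run count against what is expected for random signs.

```python
def line_fit_residuals(xs, ys):
    """Residuals of a least-squares straight-line fit (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def count_sign_runs(residuals):
    """Count maximal runs of same-signed residuals, ignoring exact zeros."""
    signs = [r > 0 for r in residuals if r != 0]
    if not signs:
        return 0
    runs = 1
    for previous, current in zip(signs, signs[1:]):
        if current != previous:
            runs += 1
    return runs

# Made-up data: the true relationship is quadratic, but a line is fitted.
xs = list(range(11))
ys = [x * x for x in xs]
res = line_fit_residuals(xs, ys)
runs = count_sign_runs(res)
```

In this example the residuals are positive at both ends and negative in the middle, so the signs fall into very few runs for 11 points. That U shape is the sort of pattern that suggests a straight-line model is missing a curved term.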
Andreas, I don't know about you, but when I design a model, I am not always right, so I test it. In other words, I do not assume that my model is correct (or even appropriate). In any case, this is only an enhancement request. It's not important; it would just be quicker for me/us (users). I can keep doing my analysis manually. I just thought that it could be a quick/nice feature to implement and would bring gnumeric up to the level of Excel's regression tool.
I assume that the residuals are desired in the "multiple linear regression" case but not in the "multiple 2-variable regressions" case. This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report. This missed release 1.10.12 by a hair, so it will be in 1.10.13.