GNOME Bugzilla – Bug 742996
Missing data in histogram
Last modified: 2015-01-16 17:51:22 UTC
Hi Jean, Thanks for maintaining gnumeric. Here's the reminder you asked for. I happened to notice that the otherwise fine statistical histogram chart can sometimes omit data. A screen shot is attached. I created it by doing ... chart-icon > statistics > histogram. I expect you can diagnose the cause better than I, but I wonder if its automatic calculation of either a.) the chart's bins or b.) X axis' limits might be improved. At one point, you seemed to think the chart's bins should probably start at −0,1. I hope that helps, Kingsley
Nothing appears to be attached. (Bugzilla sometimes drops initial attachments.) Also, a sample file might be useful.
Created attachment 294630 [details] screenshot
Created attachment 294631 [details] sample file
This is squarely an issue of the chosen bins. With the appropriate data values, the chosen bins can be such that there are data values equal to the lower limit of the smallest bin. Since the bins contain the data values at the upper limit but not at the lower limit, some values may not fall into any bin. Please note that the histogram tool allows for many more options than just the histogram chart.
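To illustrate the mechanism described in this comment, here is a minimal Python sketch. It is not gnumeric's code: the function name `count_half_open`, the sample values, and the bin edges are all made up for illustration. It only assumes what the comment states, namely that each bin includes its upper limit but not its lower limit.

```python
def count_half_open(values, edges):
    """Count values per bin, where bin i covers (edges[i], edges[i+1]].

    Hypothetical helper for illustration only; not gnumeric's implementation.
    Returns the per-bin counts and the list of values that fell into no bin.
    """
    counts = [0] * (len(edges) - 1)
    dropped = []
    for v in values:
        for i in range(len(edges) - 1):
            # Lower limit excluded, upper limit included.
            if edges[i] < v <= edges[i + 1]:
                counts[i] += 1
                break
        else:
            # No bin matched this value.
            dropped.append(v)
    return counts, dropped


# Made-up data: the smallest bin's lower limit equals the data minimum.
values = [0, 0, 1, 2, 3, 4]
edges = [0, 2, 4]

counts, dropped = count_half_open(values, edges)
print(counts)   # counts for (0, 2] and (2, 4]
print(dropped)  # values equal to the lowest edge
```

With these assumed inputs, the two values equal to the lowest edge land in no bin, which matches the kind of omission reported here.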
The bins are automatic in this case, so all data should be included, imho.
I mostly agree with Jean here: automatic bins should not exclude this much data. I can imagine ignoring a few (~1-2%) outliers, but this hides a quarter of the data.
My comment #4 was intended to describe what is happening (so it can be fixed). It is probably reasonable that automatic bins don't lose any data points. (This is quite separate from the fact that I think that using automatically created bins is usually a bad idea.)
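One possible way to keep automatic bins from losing data points is to treat the smallest bin as closed at its lower limit. The sketch below is only an assumption about how that could look, not the fix that was applied in gnumeric; `count_with_closed_first_bin` and the sample data are hypothetical.

```python
def count_with_closed_first_bin(values, edges):
    """Count values per bin; the first bin is [edges[0], edges[1]],
    every later bin i is (edges[i], edges[i+1]].

    Hypothetical helper for illustration only; not gnumeric's implementation.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(edges) - 1):
            # Only the first bin includes its lower limit.
            lower_ok = edges[i] <= v if i == 0 else edges[i] < v
            if lower_ok and v <= edges[i + 1]:
                counts[i] += 1
                break
    return counts


# Made-up data: the lowest edge equals the data minimum.
values = [0, 0, 1, 2, 3, 4]
edges = [0, 2, 4]

counts = count_with_closed_first_bin(values, edges)
print(counts)
```

An alternative with the same effect would be to choose the lowest automatic edge strictly below the data minimum, so that every bin can keep the same interval convention.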