GNOME Bugzilla – Bug 744200
axis of XYZ plot goes crazy for no apparent reason
Last modified: 2015-02-14 15:25:28 UTC
Created attachment 296422 [details] non-crazy XYZ plot

Observed behavior:

By way of background, see the attached screenshot non-crazy-axis.png. This XYZ plot seems OK. Then compare with crazy-axis.png. The Y-axis has gone crazy.

The *only* thing I changed to get to this situation was to type an x in cell B38. That changed one type of invalid data to another. I cannot begin to imagine how this could affect the plot in any way.

For good measure, also attached is crazy-axis-axis.png, which is the chart properties dialog, showing that Ymax is fixed at 8. The spreadsheet I used for this is also attached.

Remarks:

This is 100% reproducible chez moi. I tried running it under valgrind with various malloc-fill settings, and still got all the same observations.
Created attachment 296423 [details] axis gone crazy
Created attachment 296424 [details] chart properties dialog; Ymax is fixed at 8
Created attachment 296425 [details] spreadsheet for crazy axis on XYZ plot
OK, I am able to reproduce.
Found the issue; this is clearly nontrivial. The test for uniform variation did not properly take infinite and NaN values into account. It succeeds before you change B38 and fails after, so the axis becomes discrete and you see the first eight rows (because the axis is clipped at 8). Anyway, I also need to fix #744202 because of the clipping issue.
Created attachment 296611 [details] [review] Proposed patch
Review of attachment 296611 [details] [review]: should use isnan instead of go_finite, since infinity can be sorted.
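The review point can be illustrated with a minimal sketch. This is not the goffice code: `increasing_ignoring_nan` is a hypothetical name, and the choice of strict (rather than non-strict) ordering is an assumption. It only shows why skipping NaN while keeping infinities is enough for an ordering test, since infinities compare normally with `<` and `>`.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, NOT the goffice implementation.
 * Returns true when the non-NaN entries of vals[0..n-1] are strictly
 * increasing. NaN entries (invalid cells) are skipped; +/-infinity is
 * kept because it still has a well-defined order. */
static bool increasing_ignoring_nan(const double *vals, size_t n)
{
    assert(vals != NULL || n == 0);

    bool have_prev = false;   /* becomes true after the first valid value */
    double prev = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (isnan(vals[i]))
            continue;                     /* invalid cell: ignore it */
        if (have_prev && !(vals[i] > prev))
            return false;                 /* order broken */
        prev = vals[i];
        have_prev = true;
    }
    return true;
}
```

With a test based on finiteness instead of `isnan`, an infinite value would be discarded or rejected even though it can be ordered like any other number.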
This problem has been fixed in our software repository. The fix will go into the next software release. Once that release is available, you may want to check for a software upgrade provided by your Linux distribution. After your distribution has provided you with the updated package - and if you have some time - please feel encouraged to verify the fix by changing the status of this bug report to VERIFIED. If the updated package does not fix the reported issue, please reopen this bug report.
Things are better but not fixed.

a) Changing cell B38 to "x" is now irrelevant, as it should be, so that's good progress.
b) Also, fixing the logic within the go_range_increasing() routine is a good thing.

HOWEVER: The patch makes the program LESS USEFUL to me for the following reasons:

1) There seems to be a heuristic that says if there are any string-type cells in the axis data array, the axis switches to being a "category" type axis. Rather than having major ticks and minor ticks, it has categories between ticks. I understand that having categories is useful. However, using invalid data to produce a disconnected plot is also useful. The observed mode-switching behavior on XYZ plots is undocumented, unexpected, and inconsistent with the behavior of plain old XY plots. I consider it a bug. Numerical axis behavior is necessary for producing disconnected plots.

Desired/expected behavior: There should be some documented, systematic way of controlling whether we get numerical-axis behavior versus category-axis behavior. A checkbox on the GUI would be reasonable.

2) At the very least, non-alpha data such as #DIV/0! or #NAME! should not provoke a switch to category-axis behavior. Such values are not useful as labels, and they would be useful for producing a disconnected plot. This would undoubtedly be confusing to some users, but it would be a relatively quick way to provide disconnected-plot capability without having to modify the GUI.

3) When the Y axis max is set to "auto", it calculates a crazy value for the max number of categories. I observe a max value that is not equal to the numerical maximum of the Y axis data, not equal to the number of cells specified in the Y axis data, not equal to the number of valid cells, or anything else that I can think of that would make sense.
The issue does not come from the strings but from the unordered y values. We don't support surfaces with unordered values for now (because we have not implemented any code for surface crossings). Your y values grow from 0 to 7.75 and the next valid value is 6, which is lower than 7.75. This is the issue. We have an enhancement request (#591478) about supporting several series in surface plots which, if implemented, would allow unordered x or y values, but nobody has volunteered to work on that.
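To locate this kind of break in a data column, a small diagnostic along the following lines could help. It is a sketch only: `first_out_of_order` is a hypothetical name, it is not part of goffice, and it assumes invalid cells arrive as NaN.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical diagnostic, NOT part of goffice.
 * Returns the index of the first valid (non-NaN) value that is not
 * greater than the previous valid value, or -1 if the valid values
 * are strictly increasing. */
static long first_out_of_order(const double *vals, size_t n)
{
    assert(vals != NULL || n == 0);

    int have_prev = 0;
    double prev = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (isnan(vals[i]))
            continue;                 /* invalid cell: skip */
        if (have_prev && vals[i] <= prev)
            return (long)i;           /* ordering breaks here */
        prev = vals[i];
        have_prev = 1;
    }
    return -1;
}
```

For a made-up column such as 0, 0.25, 7.5, 7.75, (invalid), 6, the function reports the position of the 6, which is the value described above as lower than 7.75.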
It used to work! The recent patch makes things less useful. Using invalid data to produce disconnected plots is a very powerful technique. It is used routinely in plain old XY plots. A lot of things that "could" be done using N plots are done verrrry much more easily using one plot with N disconnected pieces. See e.g. https://www.av8n.com/physics/spreadsheet-tips.htm#sec-field which is from https://www.av8n.com/physics/symplectic-integrator.htm#fig-symplectic-harm and see also https://www.av8n.com/physics/spreadsheet-tips.htm#fig-dice-bars-horiz-250 Before the recent patch, this worked for XYZ plots. See the already-attached screen shots and the gnumeric file that produced them. Could we please have a way to get the thing to just do what the user specifies, using the numerical values as is, without converting them to categories, whether or not they are monotonic? It used to work!
Disconnected plots still work. You just need to have ordered x and y values. It worked with non-monotonic values because the ordering check was buggy. It should not have worked. I intend to add an option one day for ordering the values in xy and xyz plots, but as already said I'm very busy (and somewhat old).
I understand about busy. But I don't understand about "should not have worked". Arranging the data to be monotonic is not an option when the desired result is a layer cake. The code to do this is already there. If the code is allowed to run, it does the right thing AFAICT. Could we please just let the code run, even when the data is non-monotonic? If there are multiple disconnected pieces, I can sorta understand that the data should be /piecewise/ monotonic. I could even live with a restriction that says all pieces have to have the same direction (all increasing or all decreasing), if that simplifies the code. (I suspect the existing code might implicitly require this.) It may be that the general case is hard, but code -- existing code -- that does /some/ useful things is better than code that doesn't.
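The relaxed restriction proposed here (piecewise monotonic, all pieces in the same direction) could be checked roughly as follows. This is a sketch under stated assumptions, not existing goffice code: `piecewise_monotonic` is a hypothetical name, invalid cells are assumed to appear as NaN, and each NaN is assumed to end one piece and start the next.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check, NOT part of goffice.
 * NaN entries are treated as separators between pieces (an assumption).
 * Returns true when every piece is strictly monotonic and all pieces
 * share one direction (all increasing or all decreasing). */
static bool piecewise_monotonic(const double *vals, size_t n)
{
    assert(vals != NULL || n == 0);

    int dir = 0;              /* 0 = not yet known, +1 = up, -1 = down */
    bool have_prev = false;   /* false at the start of each piece */
    double prev = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (isnan(vals[i])) {
            have_prev = false;            /* gap: next value starts a new piece */
            continue;
        }
        if (have_prev) {
            int step = (vals[i] > prev) - (vals[i] < prev);
            if (step == 0)
                return false;             /* repeated value inside a piece */
            if (dir == 0)
                dir = step;               /* first step fixes the direction */
            else if (step != dir)
                return false;             /* direction changed */
        }
        prev = vals[i];
        have_prev = true;
    }
    return true;
}
```

A layer-cake column such as 0, 1, 2, (invalid), 0, 1, 2 passes this check, while the same data would fail a test that requires the whole column to be increasing.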
The immediate solution for your own use is to build goffice from source and patch the surface code (remove the monotonic data test). This is in plugins/plot_surface/gog-xyz.c, line 198 for the x values and line 221 for the y values.
Created attachment 296826 [details] xyz plot, not monotonic, not single-valued ... split cylinder

As suggested, I extirpated the tests for monotonic axis data.

a) The results are beautiful, as you can see from the attached cylinder.png. The existing graphics routines seem to handle non-monotonic data just fine.

b) The results conform to the documentation: https://help.gnome.org/users/gnumeric/stable/gnumeric.html#sect-graphs-overview-types-suface

AFAICT the documentation does not say that the axis data is required to be monotonic. AFAICT the documentation does not mention categories in this context.

Assuming that non-monotonic data should be category data is an incorrect assumption. It is not helpful, not necessary, not expected, not consistent with the documentation, and not consistent with the way plain old XY plots work.

Expected/desired behavior: Just plot the data that the user provides, whether or not it is monotonic.
This is a simple shape; more complicated samples would fail. Please file a specific enhancement request.