GNOME Bugzilla – Bug 750627
[ods] ~ operator
Last modified: 2015-06-19 01:07:51 UTC
The file... http://cgit.freedesktop.org/libreoffice/core/plain/sc/qa/unit/data/ods/functions.ods ...shows that we import certain functions that came from xl 2003+ as unknown. Statistics!A4 has =COM.MICROSOFT.EXPON.DIST(3,0.5,1), for example, and we evaluate that as #NAME?
Considering that with respect to EXPON.DIST: "The function is new in Excel 2010, so is not available in earlier versions of Excel. However, the Expon.Dist function is simply a renamed version of the Expondist function, that is available in earlier versions of Excel." it is strange that any ODF file would contain COM.MICROSOFT.EXPON.DIST when it should have just been the ODF function EXPONDIST. I will add the appropriate mapping later today/
Hmm... My copy of localc also gets #NAME? when reading this file. Excel seems to read it, complain over it, and discard all expressions in favour of the stored values. Ugh! So where does that leave us? I am not sure. The file at http://cgit.freedesktop.org/libreoffice/core/plain/sc/qa/unit/data/contentCSV/statistical-functions.csv suggests that some version of LO can read this.
Those functions that were just new names for existing old functions are now handled. There are still some new functions left to map correctly.
Definitely improved. The failure in B2 puzzles me. The cell imports as "=rsq([]range1,[]range2)" which evaluated to #NAME?, presumably due to the ranges. But Statistics has ranges called "range1" and "range2".
The ODF file contains table:formula="of:=RSQ(range1;range2)" I don't know (yet) where the [] come from. Note that =rsq(range1,range2) would evaluate just fine.
WE are now left with the [] problem and a bunch of functions we do not seem to have equivalent versions of.
The function issues have now been all resolved. B2 (and B3 and B8) still have a problem. For B2 we are parsing table:formula="of:=RSQ(range1;range2)" getting an expression, setting cell B2 to that expression and then see =rsq([]range1,[]range2) I am not even sure what expression that really is!
See parse-util.c function std_expr_name_handler near line 1292. There are, somehow, two different names defined with the name "range1". One is a placeholder, one is not. (gdb) p thename->pos $22 = {eval = {col = 1, row = 1}, sheet = 0x0, wb = 0x7f1690} (gdb) p thename $23 = (const GnmNamedExpr *) 0xa758a0 (gdb) p expr_name_lookup (out->pp, expr_name_name (thename)) $24 = (GnmNamedExpr *) 0x124c4a0 (gdb) p *$24 $25 = {ref_count = 1, name = 0x4eaf480, pos = {eval = {col = 14, row = 5}, sheet = 0x9199c0, wb = 0x7f1690}, dependents = 0x0, texpr = 0x1a5af60, is_placeholder = 0, is_hidden = 0, is_permanent = 0, is_editable = 1, scope = 0x983920} (gdb) p *$24->name $26 = {str = 0xb8e0d0 "range1"} (gdb) p *thename $27 = {ref_count = 6, name = 0x4eaf480, pos = {eval = {col = 1, row = 1}, sheet = 0x0, wb = 0x7f1690}, Python Exception <class 'gdb.error'> There is no member named keys.: dependents = 0x9ec860, texpr = 0x15bc020, is_placeholder = 1, is_hidden = 0, is_permanent = 0, is_editable = 1, scope = 0x8d5140} (gdb) p *thename->name $28 = {str = 0xb8e0d0 "range1"}
In fact this is not just the same name, but the pointer to the name string appears to be the same !? When we parse <table:named-range table:name="range1" table:base-cell-address="$Statistical.$O$6" table:cell-range-address="$Statistical.$O$1:.$O$7"/> We call expr_name_add (&pp, name, texpr, NULL, TRUE, NULL); In expr_name_add this should replace the existing place holder. Since the names are defined at the end of an ODF file, we create place holders as the formulas are parsed.
The name is a GOString so the pointers are the same simply because the names are. The whole named expression placeholder concept is ripe for a rethink.
The placeholder we create when parsing the formula ends up in workbook scope. The named expression we create later is in sheet scope. My first reaction would be to shift the handling of named expressions into the pre-parsing of the file, but at that time we cannot parse the formula yet since we do not know the sizes of the sheets.
There is a bunch of things wrong with how we handle the named expressions in ODF import/export. I will dig into that.
For .gnumeric we do... 1. Create the names as soon as we can with an empty definition. 2. Collect the expressions (as strings) as set them at the end.
Already fixed in my tree. I will commit later.
This problem has been fixed in the unstable development version. The fix will be available in the next major software release. You may need to upgrade your Linux distribution to obtain that newer version.
I still get an error on load: Spreadsheet!A12 Unable to parse 'AREAS(([.A1:.B3]~[.F2]~[.G1]))' ('Invalid expression')
Does Gnumeric have the ~ operator? That is the "reference union operator">
I guess in Gnumeric that would be =areas((A1:B3,F2,G1))
Created attachment 305509 [details] [review] sample patch While the attached patch works in this specific case it is clearly not the correct solution. We could easily encounter formulas that contain the tilde and the regular argument seperator. Unfortunately we seem to be able to only specify a single argument separator. Note that the ODF standard even says: "Reference union. Displayed as the function parameter separator in some implementations." Replacing all tildes with semicolons of course causes many other issues. Perhaps we need to change how convs->arg_sep is used by adding a second convs->arg_sep_alt.
Created attachment 305511 [details] [review] proposed patch Since this patch touches the parser I would like somebody else to review it. Note that nobody seems to do "reference union" correctly. In my version of LO =SUM(A5:C5~B4:B6) with every cell of interest containing a 1 evaluates to 6 (and I would think that 5 would be more sensible if it is truly a union. So having ~ really act like an argument separator looks appropriate to me.
Review of attachment 305511 [details] [review]: committed