After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 642850 - Crash on resaving a LibreOffice created xlsx file
Crash on resaving a LibreOffice created xlsx file
Status: RESOLVED FIXED
Product: Gnumeric
Classification: Applications
Component: import/export MS Excel (tm)
1.10.x
Other Linux
: Normal critical
: ---
Assigned To: Jody Goldberg
Jody Goldberg
Depends on:
Blocks:
 
 
Reported: 2011-02-21 03:57 UTC by Jean-François Fortin Tam
Modified: 2011-02-22 05:21 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
test case (39.99 KB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)
2011-02-21 03:57 UTC, Jean-François Fortin Tam
Details

Description Jean-François Fortin Tam 2011-02-21 03:57:01 UTC
Created attachment 181434 [details]
test case

The attached file seems to cause a weird problem in gnumeric. Go to the third sheet ("4.3") and look at the formula of cell E26. It will appear as:

=$D$7*(D26-F14)+(1-$D$7)*E25

However, if you look at it in Excel 2007 or in LibreOffice calc, it is:
=$E$7*(D26/F14)+(1-$E$7)*E25

There are other weird things that seem to be going on with this file (the solver not working in most cases; try minimizing the yellow cell while varying the blue/cyan cells).
Comment 1 Andreas J. Guelzow 2011-02-21 05:02:26 UTC
Ok in the file proper we have:
<c r="E26" s="6"><f t="shared" ref="E26:E37" si="2">$E$7*(D26/F14)+(1-$E$7)*E25</f><v>305.90903532301792</v></c>
Comment 2 Andreas J. Guelzow 2011-02-21 16:57:52 UTC
Trying to reswave the unchanged file in teh same format (with a differnet name) yields:

Program received signal SIGSEGV, Segmentation fault.
0x01018733 in g_str_hash (v=0x0) at gstring.c:142
142	  for (p = v; *p != '\0'; p++)

back trace: 

  • #0 g_str_hash
    at gstring.c line 142
  • #1 g_hash_table_lookup_node
    at ghash.c line 312
  • #2 g_hash_table_lookup_extended
    at ghash.c line 935
  • #3 xlsx_write_comments
    at xlsx-write.c line 1355
  • #4 xlsx_write_sheet
    at xlsx-write.c line 1413
  • #5 xlsx_write_workbook
    at xlsx-write.c line 1566
  • #6 xlsx_file_save
    at xlsx-write.c line 1666

Comment 3 Andreas J. Guelzow 2011-02-21 17:15:18 UTC
On sheet "4.4", cell F15 we see a complete different issue.
OOo shows the formula:  =C15-AVERAGE($C$15:$C$26)
and Gnumeric:           =average($C$13:$C$24)

the file appears to contain:
<c r="F15" s="26"><f t="shared" ref="F15:F26" si="0">C15-AVERAGE($C$15:$C$26)</f><v>373.25</v></c>



Note also that if I resave this file in OOo (in the same format) the file becomes completely useless for Gnumeric (brief warning about number ids, then the file seems to contain charts and borders only).
Comment 4 Andreas J. Guelzow 2011-02-21 17:17:48 UTC
Further to comment 3, the text is still there white on white but the column and row headers are gone... (I didn't know we could/can do that.)
Comment 5 Andreas J. Guelzow 2011-02-21 20:34:06 UTC
I have identified the issue with these formulas:

We are looking at shared formulas. In this file the IDs for the shared formulas are not unique, They appear to be unique within a sheet but the ids on 4.3 are already used on 4.2. As a consequence we are not reading those new expressions but reuse the expressions from the previous sheet.

I will need to check whether the specs allows shared expressions across sheets or only within sheets.

Note that the file at hand was not created by Excel (but by LibreOffice) so its existence by itself does not guarantee that expressions are only shared within a sheet. The fact that it appears to be correctly read by Excel could just mean that Excel reads all expressions and happens to overwrite existing ones.
Comment 6 Andreas J. Guelzow 2011-02-21 21:01:47 UTC
If I read the ECMA spec correctly: 

"Just as strings in cells can be extremely pervasive and redundant in a sheet (and therefore must be optimized), formulas are also extremely pervasive in a sheet, and often can be optimized. Consider the
table in the above example, where column H contains a formula that sums the numbers in columns E through G, for each row. The only difference between the formulas in H6:H12 is that the reference increases by 1 row from one row to the next. Therefore, an optimization is created where only the formula in H6 needs to be written out, with some additional information indicating how far to propagate the formula once loaded. This enables the loading application to load and parse only the first of the shared formulas, and then more quickly apply the necessary transforms to produce the additional related formulas in subsequent cells. Note that while formulas can be shared, it is desirable to enable easy access to the contents of a cell. Therefore, it is allowed that all formulas can be written out, but only the primary formula in a shared formula need be loaded and parsed."

then the sharing is just restricted to the "ref" region given: 
<f t="shared" ref="E26:E37" si="2">
ie. si="2" would only be valid within ref="E26:E37" ?!
Comment 7 Andreas J. Guelzow 2011-02-21 21:14:03 UTC
The above quote come from the informative part of the specs. In the normative part it just says:
-----------------
Shared formula. If a cell contains the same formula as another cell, the “shared” value can be used for the t attribute and the si attribute can be used to refer to the cell containing the formula. Two formulas are considered to be the same when their respective representations in R1C1-reference notation, are the same."
-----------------

but we also have on "ref": 
-----------------
Range of cells which the formula applies to. Only required for shared formula, array formula or data table. Only written on the master formula, not subsequent formulas belonging to the same shared group, array, or data table.
-----------------
Comment 8 Andreas J. Guelzow 2011-02-22 04:27:11 UTC
The description of "si" seems to be the most clear explanation: 
---------------------------------------------------------------
When a formula is a shared formula (t value is shared) then this value indicates the group to which this particular cell's formula belongs. The first formula in a group of shared formulas is saved in the f element. This is considered the 'master' formula cell.
Subsequent cells sharing this formula need not have the formula written in their f element. Instead, the attribute si value for a particular cell is used to figure what the formula expression should be based on the cell's relative location to the master formula cell.
A cell is shared only when si is used and t is shared. The formula expression for a cell that is specified to be part of a shared formula (and is not the master) shall be ignored, and the master formula shall override.
If a master cell of a shared formula range specifies that a particular cell is part of the shared formula range, and that particular cell does not use the si and t attributes to indicate that it is shared, then the particular cell's formula shall override the shared master formula. If this cell occurs in the middle of a range of shared formula cells, the earlier and later formulas shall continue sharing the master formula, and the cell in question shall not share the formula of the master cell formula.
Loading and handling of a cell and formula using an si attribute and whose t value is shared, located outside the range specified in the master cell associated with the si group, is implementation defined.
Master cell references on the same sheet shall not overlap with each other.

---------------------------------------------------------------
Comment 9 Andreas J. Guelzow 2011-02-22 04:42:36 UTC
This problem of incorrect formulas has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.

The crash described in comment #2 is still outstanding.
Comment 10 Andreas J. Guelzow 2011-02-22 05:07:25 UTC
Since this crash is really in glib (and the documentation clearly indicates that passing NULL should be fine) I have filed the glib bug #642944.
Comment 11 Andreas J. Guelzow 2011-02-22 05:21:18 UTC
This problem (crash described in comment #2) has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.