Bug 685530 - Excessive save time for xlsx
Status: RESOLVED FIXED
Product: Gnumeric
Classification: Applications
Component: import/export MS Excel (tm)
Version: unspecified
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Jody Goldberg
Duplicates: 662058
Reported: 2012-10-04 21:42 UTC by Andreas J. Guelzow
Modified: 2012-10-07 23:31 UTC


Attachments
sample file (13.60 KB, application/x-gnumeric)
2012-10-04 21:42 UTC, Andreas J. Guelzow
Partial patch (7.65 KB, patch)
2012-10-07 12:47 UTC, Morten Welinder
Updated patch (8.65 KB, patch)
2012-10-07 20:02 UTC, Morten Welinder

Description Andreas J. Guelzow 2012-10-04 21:42:30 UTC
Created attachment 225851: sample file

It takes a really long time to save some pretty small files to xlsx (ECMA version 1).

A sample file is attached.

Note that this file (<14KB) becomes 33MB when saved as xlsx.
Comment 1 Morten Welinder 2012-10-06 14:19:06 UTC
The file size (and probably the save time) comes from a crazy number of cells like these:

 <sheetData>
...
   <row r="51" spans="1:256" ht="19.85" customHeight="1">
      <c r="A51" s="6" t="str">
        <v>2.2.2.  initiative</v>
      </c>
      <c r="B51" s="6"/>
      <c r="C51" s="6"/>
      <c r="D51" s="6"/>
      <c r="E51" s="6"/>
      <c r="F51" s="6"/>
      <c r="G51" s="6"/>
      <c r="H51" s="1"/>
      <c r="I51" s="1"/>
      <c r="J51" s="1"/>
      <c r="K51" s="1"/>
      <c r="L51" s="1"/>
...
Comment 2 Morten Welinder 2012-10-06 14:28:07 UTC
Specifically:

# egrep -c '<c r="[A-Z]+[0-9]+" s="1"/>' /tmp/ddd/xl/worksheets/sheet1.xml 
16768348

That's a lot of empty cells with style #1!
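
For readers without egrep at hand, the same count can be sketched in Python. This is only an illustration, not part of the thread: the function name `count_empty_cells_by_style` and the sample string are made up here, and the regular expression is the one from the egrep command with a capture group added so the tally is split by style index.

```python
import re
from collections import Counter

# Matches self-closing cells such as <c r="B51" s="6"/>: a style, but no value.
EMPTY_STYLED_CELL = re.compile(r'<c r="[A-Z]+[0-9]+" s="([0-9]+)"/>')


def count_empty_cells_by_style(sheet_xml: str) -> Counter:
    """Tally empty cells per style index (hypothetical helper)."""
    return Counter(m.group(1) for m in EMPTY_STYLED_CELL.finditer(sheet_xml))


# A shortened stand-in for the <row> quoted in comment 1.
sample = """
<row r="51" spans="1:256" ht="19.85" customHeight="1">
  <c r="A51" s="6" t="str"><v>2.2.2.  initiative</v></c>
  <c r="B51" s="6"/>
  <c r="C51" s="6"/>
  <c r="H51" s="1"/>
  <c r="I51" s="1"/>
  <c r="J51" s="1"/>
</row>
"""

counts = count_empty_cells_by_style(sample)
print(counts.most_common())
```

Note that `egrep -c` counts matching lines while this counts matches; the two agree when each cell sits on its own line, as in the output quoted above. To run it on a real file, the worksheet could be read straight from the archive with `zipfile.ZipFile(path).read("xl/worksheets/sheet1.xml").decode()`.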
Comment 3 Morten Welinder 2012-10-06 14:44:38 UTC
Plan of action:

1. Teach xlsx_write_col and xlsx_write_cols to write the most common
   style as the "style" attribute.

2. Teach xlsx_write_cells to only write existing cells.

3. Make new xlsx_write_empty_cells_styles and have it create records only
   for non-existing cells with styles that are not the col's most common.

I think we have all the pieces to make this happen.
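
The three steps above can be sketched roughly as follows. This is not the attached patch and not Gnumeric code (which is C); it is a minimal Python model under stated assumptions: the sheet is represented by two hypothetical dictionaries, `values` for existing cells and `styles` for per-cell style ids, and style 0 stands in for "no explicit style".

```python
from collections import Counter


def most_common_style_per_column(styles, n_rows, n_cols):
    """Step 1: pick one default style per column (hypothetical helper)."""
    defaults = []
    for col in range(n_cols):
        tally = Counter(styles.get((row, col), 0) for row in range(n_rows))
        defaults.append(tally.most_common(1)[0][0] if tally else 0)
    return defaults


def cells_to_write(values, styles, n_rows, n_cols):
    """Steps 2 and 3: decide which <c> records are worth emitting."""
    defaults = most_common_style_per_column(styles, n_rows, n_cols)
    records = []
    for row in range(n_rows):
        for col in range(n_cols):
            style = styles.get((row, col), 0)
            if (row, col) in values:
                # Step 2: existing cells are always written.
                records.append((row, col, style, values[(row, col)]))
            elif style != defaults[col]:
                # Step 3: empty cells only when the style is unusual.
                records.append((row, col, style, None))
    return defaults, records


# 4 rows x 3 columns: one real cell, one empty cell with an odd style.
values = {(0, 0): "2.2.2.  initiative"}
styles = {(r, 0): 6 for r in range(4)}
styles.update({(r, c): 1 for r in range(4) for c in (1, 2)})
styles[(2, 1)] = 6

defaults, records = cells_to_write(values, styles, 4, 3)
print(defaults)
print(records)
```

In this toy sheet, 12 styled cells collapse to three column defaults plus two cell records, which is the kind of saving the plan is after.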
Comment 4 Morten Welinder 2012-10-06 14:55:47 UTC
This will also fix bug 662058.
Comment 5 Morten Welinder 2012-10-07 12:47:21 UTC
Created attachment 225986: Partial patch

This reduces file size by three orders of magnitude and speeds things up a
good deal.
Comment 6 Morten Welinder 2012-10-07 20:02:06 UTC
Created attachment 225997: Updated patch

This also skips boring rows fairly fast.
Comment 7 Morten Welinder 2012-10-07 20:04:08 UTC
> fairly fast

Fast enough to make the file from bug 662058 save instantly.
Comment 8 Morten Welinder 2012-10-07 20:04:28 UTC
*** Bug 662058 has been marked as a duplicate of this bug. ***
Comment 9 Morten Welinder 2012-10-07 23:31:43 UTC
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.