GNOME Bugzilla – Bug 685530
Excessive save time for xlsx
Last modified: 2012-10-07 23:31:43 UTC
Created attachment 225851
sample file

It takes a really long time to save some pretty small files to xlsx (ECMA version 1). A sample file is attached. Note that this file (<14KB) becomes 33MB when saved as xlsx.
The file size (and probably the save time) comes from a crazy number of these:

<sheetData>
...
<row r="51" spans="1:256" ht="19.85" customHeight="1">
  <c r="A51" s="6" t="str"> <v>2.2.2. initiative</v> </c>
  <c r="B51" s="6"/>
  <c r="C51" s="6"/>
  <c r="D51" s="6"/>
  <c r="E51" s="6"/>
  <c r="F51" s="6"/>
  <c r="G51" s="6"/>
  <c r="H51" s="1"/>
  <c r="I51" s="1"/>
  <c r="J51" s="1"/>
  <c r="K51" s="1"/>
  <c r="L51" s="1"/>
  ...
Specifically:

# egrep -c '<c r="[A-Z]+[0-9]+" s="1"/>' /tmp/ddd/xl/worksheets/sheet1.xml
16768348

That's a lot of empty cells with style #1!
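For anyone without egrep at hand, roughly the same count can be reproduced with a short script. This is only a sketch: the function name count_empty_styled_cells and the sample string are made up for illustration, and it counts matches rather than matching lines as egrep -c does (the two agree when every cell sits on its own line).

```python
import re

# Same pattern as the egrep command: an empty cell element carrying only
# a cell reference and style 1, e.g. <c r="H51" s="1"/>.
EMPTY_STYLE1_CELL = re.compile(r'<c r="[A-Z]+[0-9]+" s="1"/>')


def count_empty_styled_cells(xml_text):
    """Count empty style-1 cells in the text of a worksheet XML part."""
    return len(EMPTY_STYLE1_CELL.findall(xml_text))


# Hypothetical excerpt modelled on the row shown in this report.
sample = (
    '<row r="51" spans="1:256">'
    '<c r="A51" s="6" t="str"><v>2.2.2. initiative</v></c>'
    '<c r="B51" s="6"/><c r="H51" s="1"/><c r="I51" s="1"/>'
    '</row>'
)
print(count_empty_styled_cells(sample))
```

For scale: a full 65536 x 256 grid holds 16,777,216 cells, so if the sheet has those dimensions (an assumption based on spans="1:256"), nearly every cell in it was written out.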
Plan of action:

1. Teach xlsx_write_col and xlsx_write_cols to write the most common style as the "style" attribute.
2. Teach xlsx_write_cells to only write existing sheets.
3. Make new xlsx_write_empty_cells_styles and have it create records only for non-existing cells with styles that are not the col's most common.

I think we have all the pieces to make this happen.
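The idea behind the plan can be sketched outside the exporter. The following is a toy model, not the Gnumeric code: column_default_styles, cells_to_write, style_of and values are hypothetical names, the sheet is modelled as a dict of existing cells plus a style lookup function, and step 2 is read here as writing only cells that actually exist.

```python
from collections import Counter


def column_default_styles(style_of, n_rows, n_cols):
    """Step 1: pick the most common style in each column."""
    defaults = []
    for col in range(n_cols):
        counts = Counter(style_of(row, col) for row in range(n_rows))
        defaults.append(counts.most_common(1)[0][0])
    return defaults


def cells_to_write(values, style_of, n_rows, n_cols):
    """Steps 2 and 3: yield (row, col, style, value) records worth writing.

    Existing cells are always written. An empty cell is written only when
    its style differs from the default style of its column.
    """
    defaults = column_default_styles(style_of, n_rows, n_cols)
    for row in range(n_rows):
        for col in range(n_cols):
            style = style_of(row, col)
            if (row, col) in values:
                yield row, col, style, values[(row, col)]
            elif style != defaults[col]:
                yield row, col, style, None


# Toy sheet (assumed data): 100 x 12, style 1 everywhere except
# columns A..G of one row, which use style 6. One cell has content.
def style_of(row, col):
    return 6 if row == 50 and col < 7 else 1


values = {(50, 0): "2.2.2. initiative"}
records = list(cells_to_write(values, style_of, 100, 12))
print(len(records), "records instead of", 100 * 12)
```

Note that this sketch still visits every cell, so it only illustrates the reduction in output size, not any speedup.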
This will also fix bug 662058.
Created attachment 225986
Partial patch

This reduces file size by three orders of magnitude and speeds things up a good deal.

Created attachment 225997
Updated patch

This also skips boring rows fairly fast.
> fairly fast

Fast enough to make the file from bug 662058 save instantly.
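One way skipping rows could work, sketched under assumptions: the thread does not define "boring", so here a row counts as boring when it has no existing cells and no empty cell whose style differs from its column default. The names interesting_rows, style_overrides and col_defaults are hypothetical, and the sparse dict layout is chosen for illustration only; it is not how the patch stores styles.

```python
def interesting_rows(values, style_overrides, col_defaults):
    """Return the sorted row indexes that need any output at all.

    values:          {(row, col): cell value} for existing cells
    style_overrides: {(row, col): style id} for explicitly styled cells
    col_defaults:    list of the default style id per column
    """
    rows = {row for (row, _col) in values}
    rows.update(
        row
        for (row, col), style in style_overrides.items()
        if style != col_defaults[col]
    )
    return sorted(rows)


# Assumed toy data: the override at (70, 3) matches the column default,
# so row 70 is boring and never gets visited.
values = {(50, 0): "2.2.2. initiative"}
style_overrides = {(50, 1): 6, (70, 3): 1, (80, 2): 6}
col_defaults = [1] * 12
print(interesting_rows(values, style_overrides, col_defaults))
```

The cost then depends on the number of populated or specially styled cells rather than on the full row count of the sheet.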
*** Bug 662058 has been marked as a duplicate of this bug. ***
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.