GNOME Bugzilla – Bug 68723
Sample .gnumeric file that takes a long time to load.
Last modified: 2004-12-22 21:47:04 UTC
Loading a native gnumeric file which contains a block of 256x256 cells, each with a single if statement, takes a very long time (2-5 minutes depending on the system). See the file I attached. I have a longer description in the OpenOffice bug tracking database (OOo takes a long time to load this file as well), see bug #2800. Interesting to note, once the file is loaded, it is quite responsive. But that load time! Also, the memory used by gnumeric is approx 64MB after loading this file, about 1/3 the memory used by OOo!
Created attachment 6410 [details] The gnumeric native file of the 256x256 if statement block.
The file itself looks straightforward. There are 2 things you can do to improve load times right now. 1) Try using File Import with the xml sax importer. It reads exactly the same format using a different mechanism and should cut down on memory thrash on startup. 2) How was this file generated? There are 256x256 copies of the same expression! If you were to go into gnumeric, copy the expression from 1 cell, and paste it into the entire region, all of those copies would be merged. File size and memory usage would also fall. We have long-term plans to implement an expression garbage collector that would catch this sort of usage, but it is not high priority. I'll try to come back to this once the 1.1 tree has been opened.
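To illustrate the point about merged copies, here is a minimal Python sketch of expression sharing. It is not gnumeric's code: the names Expr, ExprPool and load_block are hypothetical, and it assumes that identical expressions can be recognised by their text alone.

```python
class Expr:
    """Hypothetical stand-in for a parsed expression tree."""

    def __init__(self, text):
        self.text = text


class ExprPool:
    """Hands out one shared Expr object per distinct formula text."""

    def __init__(self):
        self._by_text = {}

    def get(self, text):
        expr = self._by_text.get(text)
        if expr is None:
            # First time this formula is seen: build it once and remember it.
            expr = Expr(text)
            self._by_text[text] = expr
        return expr

    def __len__(self):
        return len(self._by_text)


def load_block(rows, cols, formula, pool):
    """Fill a rows x cols block where every cell refers to the pooled expression."""
    return [[pool.get(formula) for _ in range(cols)] for _ in range(rows)]


pool = ExprPool()
grid = load_block(256, 256, "=if(A1>0,1,0)", pool)
print(len(pool))  # number of distinct expression objects actually stored
```

With sharing, the block holds 65536 references to a single expression object instead of 65536 separate copies, which is where the file size and memory savings would come from.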
Ok, it's your fault 1.0.2 is delayed :-) I see the main bottleneck. All of the cells depend on 1 of them. Each time we add a new dependency of this type we check to see if it is already there. I'll think about adding more bucketing at this level.
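The cost of that "check to see if it is already there" step can be sketched in Python. This is a hypothetical model, not gnumeric's dependency code: it assumes the existing check is a linear scan over the dependents of a single cell, and it uses a hash-based set as a stand-in for the bucketing mentioned above.

```python
def add_dependents_list(n):
    """Register n dependents of one cell, scanning a plain list for duplicates."""
    dependents = []
    comparisons = 0
    for cell in range(n):
        found = False
        # Linear scan: compare against every dependent recorded so far.
        for existing in dependents:
            comparisons += 1
            if existing == cell:
                found = True
                break
        if not found:
            dependents.append(cell)
    return dependents, comparisons


def add_dependents_hashed(n):
    """Same job with a set, where each membership check is O(1) on average."""
    dependents = set()
    checks = 0
    for cell in range(n):
        checks += 1
        if cell not in dependents:
            dependents.add(cell)
    return dependents, checks


if __name__ == "__main__":
    n = 2000  # kept small so the quadratic version finishes quickly
    print(add_dependents_list(n)[1], add_dependents_hashed(n)[1])
```

Under this model the list version performs n(n-1)/2 comparisons. For a 256x256 block, n = 65536, which gives 65536 * 65535 / 2 = 2,147,450,880 comparisons, against 65536 membership checks for the hashed version.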
Jody, concerning your comment that, since all the cells have the same formula, I could just use the copy-to-region feature: in my real-world file I have a huge (actually larger than the example I gave) block of *nested* if statements per cell, in which many of the formulas are different. The example I attached is the simplest case I could build that still demonstrated the problem. I explain the history of this a bit more on the OOo bug page, but basically I am converting an Excel spreadsheet our financial people use. It contains a row for every day of the year, and a column for every cost center in the company. Ugly? Yes. But, it loads in Excel in about 2 seconds, in OOo in 1 minute, and gnumeric in 5 minutes. Thanks for the help!
This problem was too much fun to ignore. I've worked up a patch that should solve things. It was tested importing your original xls file.
Existing code
640.56user 0.85system 10:55.40elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (2326major+6021minor)pagefaults 0swaps
New code
3.57user 0.25system 0:08.03elapsed 47%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (2309major+6218minor)pagefaults 0swaps
.gnumeric import should also benefit, but you'll want to use the sax importer to get the full impact. Sadly xml export is still dom based, so unless you've got a fair amount of memory that will be slower than necessary. The patch will go into the 1.1 branch for testing. If it works out, I'll back port to 1.0.
Uhh, wow. A slight improvement, 640s to 4s! I had assumed that OOo would be so much better than gnumeric... but now I am beginning to think not. Especially when it comes to responsive bug fixing! Thanks Jody. PS: I assume you mean "xml import" not "xml export" in your comment?
Actually I do mean export. DOM is memory intensive. It requires approx 4xfile_size memory to store the tree. Note that this is the uncompressed file size. For large files that is significant. We are testing a sax based importer that will become the main importer for 1.2, but the exporter is still DOM based until we find time to replace it.
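The difference between the two approaches can be sketched with Python's standard xml.sax module. This is only an illustration: the element names in SAMPLE are a simplified, hypothetical stand-in for the real .gnumeric schema, and CellCounter and count_cells are made-up names. A SAX handler sees one element at a time and keeps only what it chooses to keep, whereas a DOM parser builds the whole tree in memory first.

```python
import xml.sax


class CellCounter(xml.sax.ContentHandler):
    """Counts Cell elements as they stream past, without building a tree."""

    def __init__(self):
        super().__init__()
        self.cells = 0

    def startElement(self, name, attrs):
        # Called once per opening tag; nothing about the element is retained.
        if name == "Cell":
            self.cells += 1


def count_cells(xml_bytes):
    handler = CellCounter()
    xml.sax.parseString(xml_bytes, handler)
    return handler.cells


# Simplified, hypothetical document; the real format differs.
SAMPLE = (
    b"<Sheet><Cells>"
    b"<Cell Row='0' Col='0'>=if(A1&gt;0,1,0)</Cell>"
    b"<Cell Row='0' Col='1'>=if(A1&gt;0,1,0)</Cell>"
    b"</Cells></Sheet>"
)

print(count_cells(SAMPLE))
```

Memory use in the handler stays roughly constant however many cells the file contains, while a DOM tree grows with the uncompressed file size as described above.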
Jody, should this get marked "Fixed"? Other question, do you think this could be enabled by default in the 1.0.x branch (I saw your message about the backport on the mailing list)? I ask the second question b/c if the answer is no, then does that mean the first time it is available by default will be in the gnome2 port of gnumeric? (Which will be, presumably, a very long time off in terms of making it out to the general public).
I'll leave the bug open until it is enabled in the 1.0 branch. However, I will not enable it for a while. There needs to be at least a release or two of the unstable branch to get this tested. I take 'stable' seriously, and don't want to change something this central without lots of testing. I'm confident that it is correct, but I'm not certain. You can easily enable it yourself in gnumeric/src/evalc.c. Just look for the #undef
I am not going to enable this in the stable tree. It seems to work nicely in 1.1, but it is too significant a change to put into stable.
The URL field has been removed from bugzilla.gnome.org. This URL was in the old URL field, and is being added as a comment so that the data is not lost. Please email bugmaster@gnome.org if you have any questions. URL: http://www.openoffice.org/issues/show_bug.cgi?id=2800