GNOME Bugzilla – Bug 445617
First-time selection of largish area is slow
Last modified: 2007-06-15 14:33:04 UTC
Steps to reproduce:
1. Load a largish CSV, many thousands of rows
2. Click on any column header
3. Application freezes

Stack trace:

Other information:
Even smaller CSV files will cause a several-second hang when you click on a column header for the first time. There is no CPU load, so I don't think it is a huge loop. I am running the amd64 version of Ubuntu 7.04 and compiled from SVN source, as well as trying the 1.7.10 downloaded source.
Which column header are you talking about? In the configurable text import dialog or in the normal spreadsheet window?
The normal spreadsheet window. I'm able to autofilter and scroll around a large spreadsheet just fine, but as soon as I click on a column header the program stops responding.
This is strange. There is not much happening inside Gnumeric when you (left) click on a column header. Could you please try two separate things:
1) At the bottom of a spreadsheet window, there is an item that reads Sum=0. Please right click on it and select "Count". The item will change to Count=0. Does that change anything?
2) If you select a different theme, do you still have this effect?
Thanks
And try doing
export G_SLICE=debug-blocks
gnumeric
if you are running a fairly recent glib. (See http://bugzilla.gnome.org/show_bug.cgi?id=438456)
Tried setting it to "Count" rather than "Sum" with no effect. Also tried 3-4 different themes that ship by default with Ubuntu 7.04, with no effect. export G_SLICE=debug-blocks didn't produce any extra output to the console. About the only thing unusual about my set-up is probably running the 64-bit Ubuntu. Everything else should be pretty typical. I think I've narrowed it down to a performance issue. With a smaller CSV I still see a delay (that looks like a freeze, no progress bar) the first time I click on a column header, but it does eventually come back. After that everything is pretty snappy, so it must do some indexing the first time you select a column?
I typically work on a 64 bit machine so there is nothing unusual. I am of course compiling gnumeric myself and the system is Debian AMD64 so that is different from yours. I also don't see anything special happening the first time I select a column...
I have the same problem on an older G5 Mac using MacPorts, Gnumeric 1.7.9. Immediate hang with about 90% CPU load on selecting a column or pressing CTRL-A to select everything. On a newer Intel Mac doing this the first time takes about 3 seconds, then it's instantaneous. The sheet in question is a CSV file (auto-detected, extension is .xls) with about 6000 rows, some 8 MB of data. Opening the same file as .gnumeric hangs on load (tested on the Intel Mac); saving it as Excel 97/2000/XP shows the same symptoms as the CSV file. How do I start debugging this? I'm willing to do an 'insert debug statements, recompile, reproduce' cycle, but I don't even know where to start reading the code :).
We are rendering all cells in the selected range. With 64k cells, for example, that will take a while. We might be able to speed up things, at least if there are no merged cells nearby.
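To illustrate why eager rendering scales with the size of the selection rather than with what is on screen, here is a minimal Python sketch. It is not Gnumeric code: `render_cell`, `select_column_eager`, and `select_column_deferred` are hypothetical names, the sheet is modelled as a plain dictionary, and "rendering" is reduced to formatting a string.

```python
from typing import Dict, Tuple

Cell = Tuple[int, int]  # (row, col)


def render_cell(value: float) -> str:
    """Stand-in for the expensive per-cell layout step (hypothetical)."""
    return f"{value:.2f}"


def select_column_eager(sheet: Dict[Cell, float], col: int, n_rows: int) -> Dict[Cell, str]:
    """Render every cell in the selected column, visible or not."""
    rendered = {}
    for row in range(n_rows):
        if (row, col) in sheet:
            rendered[(row, col)] = render_cell(sheet[(row, col)])
    return rendered


def select_column_deferred(sheet: Dict[Cell, float], col: int,
                           first_visible: int, last_visible: int) -> Dict[Cell, str]:
    """Render only the cells of the column that are currently on screen."""
    rendered = {}
    for row in range(first_visible, last_visible + 1):
        if (row, col) in sheet:
            rendered[(row, col)] = render_cell(sheet[(row, col)])
    return rendered


# A single column with 64k populated rows, and a viewport showing rows 0..39.
sheet = {(r, 0): float(r) for r in range(65536)}
eager = select_column_eager(sheet, 0, 65536)
deferred = select_column_deferred(sheet, 0, 0, 39)
print(len(eager), len(deferred))  # 65536 40
```

The eager variant does work proportional to the number of rows in the selection, while the deferred variant is bounded by the viewport height.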
Knowing nothing about the Gnumeric internals, rendering all cells in the selected range seems to be deferrable to when they're actually displayed? Anyhow, I have two more comments:
- The delay happens on the first 'select a column', but not afterwards, even if you select a different column. This makes it seem unlikely to me that 'rendering the selected range' is actually the problem.
- While this is apparently a smaller problem on a modern system, it does make Gnumeric completely unusable on said older G5 even for moderate spreadsheet sizes (6k rows, i.e. a 'selected range' 10x smaller than 64k).
Generally we do defer rendering until things are displayed. Certain operations (like fit-column) cause all cells in a column to be rendered. I am not 100% sure why rendering all cells is being triggered here. At least in the common case of every cell being non-merged and non-spanned, there does not seem to be an obvious need. For me, with 64k rows, it takes just 2-3 seconds. Enough to notice, but not enough to be a problem. I don't understand why you are seeing such a big problem with 6k rows, even if your CPU is (say) 10 times slower. I wonder if your system is starting to swap. (We are keeping way too much data around and might swap in some circumstances.)
Morten: There has to be something else going on. Apparently they only see the slowdown on the first column they select. If they select another (previously never selected) column there is no slowdown. The rendering issue would affect both equally.
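The idea that a full render is only needed when merged or spanning cells are involved could be sketched as below. This is an illustration in Python, not the Gnumeric implementation: `CellInfo` and `selection_needs_full_render` are hypothetical names, and the sketch assumes the merged/spanning flags are already known without rendering, which may not hold in Gnumeric itself.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CellInfo:
    merged: bool = False  # cell belongs to a merged region (assumed known)
    spans: bool = False   # cell content spills into neighbours (assumed known)


def selection_needs_full_render(cells: Iterable[CellInfo]) -> bool:
    """Return True only if some cell could affect the extent of the redraw."""
    return any(c.merged or c.spans for c in cells)


# Common case: 6000 plain cells, nothing merged or spanning.
plain = [CellInfo() for _ in range(6000)]
# Same column, but the last cell spans into its neighbour.
with_span = plain[:-1] + [CellInfo(spans=True)]

print(selection_needs_full_render(plain))      # False
print(selection_needs_full_render(with_span))  # True
```

In the common case the check returns early with no rendering at all, and the expensive path is kept for selections that actually contain merged or spanning cells.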
I don't think this is just an issue with old machines, it's a general issue with large spreadsheets with more than a couple thousand rows. I've got a modern Dell 2x Core Duo running 64 bit Ubuntu with 4 gigs of RAM and the slowdown is very significant. I can certainly add debug statements somewhere if you need more information.
Created attachment 89970
Hackish patch

This patch seems to solve the issue. It is not production quality though: some of the calculation needs to be virtualized into the SheetControl class.
Andreas: I believe we render whole rows only due to spanning.
Morten's patch resolves my speed issue, it is almost instantaneous now, even with 100k rows. Thanks a ton!
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report. Note: anyone who runs a clipboard daemon which requests the rendered region from Gnumeric every time you select a new area is not going to see the benefits of any such patch.