Bug 381011 - Cell copy animation slows down the GUI and eats the CPU
Status: RESOLVED FIXED
Product: Gnumeric
Classification: Applications
Component: GUI
Version: git master
Hardware: Other All
Importance: Normal major
Target Milestone: ---
Assigned To: Jody Goldberg
QA Contact: Jody Goldberg
Duplicates: 377399 611405
Depends on:
Blocks:
Reported: 2006-11-30 19:09 UTC by Chris Haidinyak
Modified: 2015-02-08 07:38 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Business spreadsheet (129.34 KB, application/vnd.oasis.opendocument.spreadsheet), 2010-11-29 23:02 UTC, Frank
profiling output from sysprof (93.81 KB, application/gzip), 2014-01-12 18:18 UTC, Jean-François Fortin Tam
cairo-trace output (423.59 KB, application/x-bzip), 2014-01-14 02:16 UTC, Jean-François Fortin Tam
cairo-trace output (493.33 KB, application/x-bzip), 2014-01-14 23:40 UTC, Jean-François Fortin Tam
Proposed patch (851 bytes, patch), 2014-01-21 13:22 UTC, Jean Bréfort

Description Chris Haidinyak 2006-11-30 19:09:38 UTC
Please describe the problem:
> I am using gnumeric a lot for large spreadsheets and have found that its
> performance suffers drastically when I copy large areas (20 cols x 1000 rows)
> to different areas of a spreadsheet. This degradation of performance continues
> and/or worsens unless I reset the copy area to a single cell. Is this easy to
> fix? Anyway, thanks for a great product!

UPDATE: see below

Steps to reproduce:
1. create a large spreadsheet (100 cols by 5000 rows)
2. copy a large section of the spreadsheet (25x2000)
3. notice the slowdown
4. copy a 1x1 area and the slowdown stops
5. wonder whether it is an xorg + gnumeric interaction with the dashed outline



Actual results:
desktop slows to a crawl

Expected results:
normal desktop speed

Does this happen every time?
yes

Other information:
Update: When running top, xorg and gnumeric both show a high CPU percentage while a CTRL+C copy of a large area is active. Once a 1x1 CTRL+C replaces the contents of the clipboard, the load reported by top drops to 2%.
Comment 1 Morten Welinder 2006-11-30 20:47:56 UTC
Your CapsLock key is broken or stuck.

For the record, I see something like this also on cvs HEAD.  It persists
after I press Escape to clear the "ants", so I doubt the X server is
involved.
Comment 2 Andreas J. Guelzow 2006-11-30 21:37:33 UTC
For me it does not persist when I press Escape. I also note that Gnumeric uses a huge amount of memory while the slowdown is in progress. (Well, I don't see much of a slowdown; I just observed that gnumeric suddenly uses a huge amount of memory. The machine I am working on is pretty speedy.)
Comment 3 Andreas J. Guelzow 2006-11-30 21:39:20 UTC
*** Bug 377399 has been marked as a duplicate of this bug. ***
Comment 4 Andreas J. Guelzow 2010-05-05 08:09:37 UTC
If I copy a 5000 row by 25 column area, Xorg suddenly uses 25% of the CPU. No klipper or similar is involved.
Comment 5 Andreas J. Guelzow 2010-05-05 08:10:25 UTC
*** Bug 611405 has been marked as a duplicate of this bug. ***
Comment 6 Frank 2010-11-29 20:23:18 UTC
Similar problem. I use a 7 sheet spreadsheet with links between sheets. Sheets range from 170 rows by 19 columns. When saving the file or making multiple cell changes the screen turns gray and will not respond for up to 3-4 minutes. Sometimes the entire system is affected and at times only Gnumeric suffers.
System is AMD dual core with 6GiB memory. Running in Ubuntu 10.04. Adjusting settings in preferences has not helped.

Any additional info needed?
Comment 7 Morten Welinder 2010-11-29 20:56:35 UTC
Your description over at 
http://ubuntu-ky.ubuntuforums.org/showthread.php?t=1633699 sounds different.


My guess for your problem is that something takes way too much time
during save to odf.  We recently fixed one such problem, but there
might be more.  Please attach the file in question.
Comment 8 Andreas J. Guelzow 2010-11-29 21:45:35 UTC
Morten, slowdown when "making multiple cell changes" or "make format changes" can have nothing to do with saving as ODF. (Of course there is the possibility that when reading the ODF file the style structure created may be suboptimal.)
Comment 9 Frank 2010-11-29 23:02:04 UTC
Created attachment 175504
Business spreadsheet

For testing the slowdown and unresponsiveness when saving.
Comment 10 Frank 2010-11-29 23:02:46 UTC
I am sending the file that I am having problems with. 

When imported from the ODS format the date columns lose the mm/dd/yy setup. It has been bogging down when changing the entire column from the generic date numbers, ie ... 39904 which = date of 04/01/09. Even when entering a new row of data the bog down also occurs.


It still bogs down when saving as an ODS file or to the gnumeric format. This time it has taken more than 5 minutes. The program becomes unresponsive and I have to force shut down of Gnumeric.
Comment 11 Morten Welinder 2010-11-30 14:23:49 UTC
Frank: your problem is unrelated to the problem this bug was opened for.
Your dates-load-as-numbers problem is now bug 636131.

I do not see your slowdown-saving-to-ods problem anymore, and I believe it
has been fixed recently.
Comment 12 Jean-François Fortin Tam 2014-01-12 18:12:07 UTC
Hi Morten, I see this with 1.12.8. I'm not sure why the problem didn't hit me years earlier, but in the past few releases of gnumeric in the GNOME Shell era, I've felt significantly more affected by this.

It is caused by the animated cell dot/dash borders, not the contents being copied. I'm not sure if it is accentuated at zoom levels higher than 100%, but the base problem is very easy to reproduce:

1. Drag around to change the cell selections very quickly.
   In this case, notice it is fluid and responsive.
2. Select a single cell, ctrl+C
3. GOTO 1, notice the reduced responsiveness
4. Select a big rectangle of cells, ctrl+C
5. GOTO 1, notice that X is using 20-25% of the CPU (even on a Core2 Quad),
   that the entire UI is laggy, you can't play with cell selections fluidly.

This is demonstrated here: http://jeff.ecchi.ca/public/gnumeric-381011.webm

Interestingly enough... LibreOffice Calc (version 4.1) also has the same problem, though much less severe.
Comment 13 Jean-François Fortin Tam 2014-01-12 18:18:15 UTC
Created attachment 266076
profiling output from sysprof

If that can help...
Comment 14 Morten Welinder 2014-01-12 19:50:11 UTC
Thanks.  That is certainly progress.

That doesn't really look like it's our fault, even if eventually we might have
to be the ones working around it.

Is there a way to run without Gnome Shell for you?  If that is possible and
it does not show any slowdown, then a bug report for Gnome Shell would be in
order.
Comment 15 Morten Welinder 2014-01-12 20:13:24 UTC
Btw., it's highly likely that the problem described in comment 12 is
unrelated to the initial problem.
Comment 16 Jean-François Fortin Tam 2014-01-12 21:47:04 UTC
Alright, I compared my two Fedora 20 x64 computers and got some puzzling results.

The test: selecting B2 to R27 and pressing ctrl+C, then looking at CPU usage in htop

Laptop (Intel graphics at 1366x768):
- XFCE with no compositing: 14% cpu usage by gnumeric, 12% usage by Xorg
- XFCE + mutter: same thing
- GNOME Shell: same thing. 15% by gnumeric, 15% by Xorg.

When you press Escape, CPU usage falls to 0.

As for my desktop computer, I only tested one configuration: GNOME Shell. It runs the open-source "radeonsi" drivers on a Radeon HD 7770 (by no means crappy hardware :) at 1920x1080. Doing the same test on it yields 70-80% CPU usage from Xorg, but strangely enough, 0% for gnumeric.

What we can conclude from this so far is that it hammers the graphics stack much harder on the radeon. The side-effects on the laptop are much less obvious than on my desktop, though still a bit perceptible.

*However*, as you can see, the laptop with the "flawless drivers" still managed to eat 15% of the CPU doing arguably "nothing", with the same numbers showing up regardless of if the test was done in a GNOME3 environment or not, so I think we can discard the gnome-shell/mutter hypothesis.

All in all, it seems to me like the basic problem remains: the selection-copy animation is hammering the graphics stack (which may or may not react in a catastrophic way) and it would be interesting to see if we could avoid doing "too much work" on that front or use a different approach entirely (why does it need animation to begin with? have you considered a static border, or making the contents temporarily semi-transparent?)
Comment 17 Morten Welinder 2014-01-13 02:44:05 UTC
For reference, on my old clunker of a laptop I see 6% and 3% Xorg (which
includes handling "top" updates and other stuff going on).  So nothing,
really.

> All in all, it seems to me like the basic problem remains: the selection-copy
> animation is hammering the graphics stack (which may or may not react in a
> catastrophic way) and it would be interesting to see if we could avoid doing
> "too much work" on that front or use a different approach entirely (why does it
> need animation to begin with? have you considered a static border, or making
> the contents temporarily semi-transparent?)

We do it this way because we did it that way 15 years ago.  It didn't put
a dent in the cpu budget with computers back then.  I would guess the
behaviour was chosen to look like what Excel does.

I looked into what we actually do.  It's really simple, but let me spell
it out in great detail:

1. We set a two-item dashed line style using cairo_set_dash.
2. We set a line width using cairo_set_line_width
3. We set one of the two ant colours.
4. We set a rectangle path using cairo_rectangle
5. We draw a line along that path using cairo_stroke_preserve.
   [That draws half the ant pattern, say the black bits.]
6. We set the other ant colour.
7. We set the stipple with an offset of one "ant" length
8. We draw a line along that path using cairo_stroke.
   [That draws half the ant pattern, say the white bits.]

Repeat that every 150ms.

The short, simplified version: we draw 50 axis-parallel lines per second.

That is not by any measure hammering the graphics stack.  It wasn't 15 years
ago and it certainly isn't today.  We could do something else, but without
understanding why what we do is causing trouble there is little point.  The
monster is just going to pop up elsewhere.
Comment 18 Morten Welinder 2014-01-13 03:20:37 UTC
I have reached out to the cairo people in the hope this rings a bell with them.

https://bugs.freedesktop.org/show_bug.cgi?id=73531
Comment 19 Jean-François Fortin Tam 2014-01-13 03:42:03 UTC
As I hinted I'm quite puzzled about this myself... and I find your argument of "this used to work fine for over a decade" quite striking/compelling indeed. But then my current guess is the only one I have so far, the symptom really makes it look like something induced by a graphics operation around the "walking ants" (thanks for reminding me of that term :)

I'm a bit at a loss as to what else to investigate or conjecture at this point. Let me know if there's some other troubleshooting information I can provide.
Comment 20 Morten Welinder 2014-01-14 01:35:37 UTC
Please try

    cairo-trace gnumeric

then start the anting and let it sit for a minute or two, then exit.
That should generate a trace file of all cairo activity.

(I actually had to compile my own copy of cairo to make that really work,
but your mileage may vary.)
Comment 21 Jean-François Fortin Tam 2014-01-14 02:16:08 UTC
Created attachment 266223
cairo-trace output
Comment 22 Morten Welinder 2014-01-14 18:44:02 UTC
Unfortunately that trace shows the same problem I saw: cairo-perf-trace
complains about errors in it.  I had to recompile from git to get a working
trace.
Comment 23 Jean-François Fortin Tam 2014-01-14 23:40:52 UTC
Created attachment 266311
cairo-trace output

I git cloned cairo, ./autogen.sh && make, and then I did:

CAIRO_TRACE_SO=pathtocairogitcheckout/util/cairo-trace/.libs/libcairo-trace.so sh util/cairo-trace/cairo-trace gnumeric

Not sure if that's the right way. Result attached. Tried running ./perf/cairo-perf-trace on it and it segfaulted...
Comment 24 Morten Welinder 2014-01-15 20:21:03 UTC
Ok, clearly not working.  I don't get a segfault, but complaints over
some parameter.

Realistically this will have to wait until I can reproduce.
Comment 25 Morten Welinder 2014-01-20 03:31:06 UTC
Ok, this is our fault and I can reproduce.

It's canvas related, so adding Jean Brefort.

Reproduce:
1. Start Gnumeric.
2. Zoom to 25%
3. Select large area.
4. Ctrl-C.
--> cpu usage goes way up

(Lots of cells, lots of pixels.)

All we _want_ to do is draw the ant-cursor repeatedly.  What we actually end
up doing is invalidating the whole rectangle bounding box and thus redrawing
that.

The code in question is cb_item_cursor_animation in item-cursor.c, where we
invalidate the item.  I tried changing it to this, but I still see very high
load.

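		GocCanvas *canvas = item->canvas;
		double x0, y0, x1, y1;
		double scale = canvas->pixels_per_unit;
		/* Strip thickness: 4 pixels expressed in canvas units. */
		double th = 4 / scale;
		goc_item_get_bounds (item, &x0, &y0, &x1, &y1);
		/* Invalidate only the four edge strips, not the whole box. */
		goc_canvas_invalidate (canvas, x0, y0, x1, y0 + th);	/* edge at y0 */
		goc_canvas_invalidate (canvas, x0, y1 - th, x1, y1);	/* edge at y1 */
		goc_canvas_invalidate (canvas, x0, y0, x0 + th, y1);	/* edge at x0 */
		goc_canvas_invalidate (canvas, x1 - th, y0, x1, y1);	/* edge at x1 */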
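
To give a feel for why invalidating only the edges looks attractive, here is a small stand-alone sketch comparing the area of the full bounding box with the area of four edge strips. The function names and the example numbers are assumptions for illustration. As the following comments report, the saving did not materialise in practice, since the cells inside the copied area were apparently still redrawn.

```c
#include <assert.h>

/* Area of the axis-aligned box (x0,y0)-(x1,y1). */
static double
box_area (double x0, double y0, double x1, double y1)
{
    assert (x1 >= x0 && y1 >= y0);
    return (x1 - x0) * (y1 - y0);
}

/* Area covered by four edge strips of thickness th along the box border.
 * The corners are counted once (union of the strips, not their sum). */
static double
border_area (double x0, double y0, double x1, double y1, double th)
{
    double w = x1 - x0, h = y1 - y0;

    assert (th >= 0);
    if (2 * th >= w || 2 * th >= h)
        return box_area (x0, y0, x1, y1);   /* strips cover the whole box */
    return w * h - (w - 2 * th) * (h - 2 * th);
}
```

For a 1000 x 2000 pixel selection with 4-pixel strips, `box_area` gives 2,000,000 and `border_area` gives 23,936, so the strips cover about 1.2% of the pixels of the full box.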
Comment 26 Jean-François Fortin Tam 2014-01-20 18:26:54 UTC
Interesting. FWIW, I filed a bug report on LibreOffice calc too - if the cause of the problem on their end is similar, maybe a common solution can arise, who knows. https://bugs.freedesktop.org/show_bug.cgi?id=73841
Comment 27 Jean Bréfort 2014-01-20 21:22:53 UTC
Actually the issue seems to date from before the cairo era. We probably redraw all the cells inside the copied area even after the changes proposed by Morten. We probably should just draw the cursor without relying on draw events. I'll try something related based on how things work in abiword.
Comment 28 Morten Welinder 2014-01-20 22:46:20 UTC
I think you're right that we redraw everything even with my changes, but
only as long as the big area is selected.  After clicking another cell, the
still-ongoing ant animation affects the cpu much less.

But, sure -- if we can avoid the pointless invalidation altogether I am
all for it!
Comment 29 Jean Bréfort 2014-01-21 13:22:50 UTC
Created attachment 266865
Proposed patch

Looks like it is easier than I thought because we fortunately align the cursor on pixel borders whatever the zoom level. Note that if we want to support some transparency, we would need something a bit more complex (like keeping a copy of the underlying pixels).
Please test.
Comment 30 Morten Welinder 2014-01-21 14:49:23 UTC
I really like the idea, but I cannot get it to work.

The code in question is being reached a handful of times, but the actual
drawing is not visible on the screen.
Comment 31 Jean Bréfort 2014-01-21 14:59:44 UTC
Weird, which gtk+ version are you using? Mine is 3.8.6 and I see the ants.
Comment 32 Morten Welinder 2014-01-21 23:48:08 UTC
This problem has been fixed in our software repository. The fix will go into the next software release. Thank you for your bug report.
Comment 33 Morten Welinder 2014-01-29 04:26:17 UTC
Jean: both Andreas and I see the non-working ants when
setting->themes->other settings->Controls is set to "Mint-X" whereas the ants
are working when Controls is set to Adwaita.

The difference is that the surface created by gdk_cairo_create with Mint-X
has an (0,0,0,0) clip region according to cairo_clip_extents.

Calling cairo_reset_clip works, but is explicitly not allowed by the docs for
gdk_cairo_create.

Ideas?
Comment 34 Jean Bréfort 2014-01-29 07:32:48 UTC
I see several possibilities:

1. consider it is a theme or Gtk+ bug
2. if clipped, go back to the old and slow solution
3. use Qt instead

1 and 2 are not incompatible.
Comment 35 Morten Welinder 2014-01-29 18:48:39 UTC
Probably fixed by gtk+ commit 5773cf237c7bf08f03a308f7952b6cf66840dbc4
which went into gtk+ 3.9.2

Current Gnumeric detects the situation and uses the old method.