Bug 161342 – Vte slow with mc and vim

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 161342 - Vte slow with mc and vim


Summary:	Vte slow with mc and vim


Status:	RESOLVED FIXED

Product:	vte
Classification:	Core
Component:	general
Version:	0.11.x
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Behdad Esfahbod
QA Contact:	VTE Maintainers

URL:
Whiteboard:

Duplicates:	96879 (view as bug list)
Depends on:
Blocks:

Reported:	2004-12-15 09:14 UTC by Tomas Mraz
Modified:	2007-01-18 16:20 UTC

See Also:
GNOME target:	---
GNOME version:	2.9/2.10

Attachments
Invalidate only the cells. (120.52 KB, patch) 2007-01-10 14:02 UTC, Chris Wilson	none	Details \| Review
Invalidate only the cells (and nothing else!) (12.27 KB, patch) 2007-01-15 13:18 UTC, Chris Wilson	none	Details \| Review

Description Tomas Mraz 2004-12-15 09:14:42 UTC

Expand gnome-terminal to fullscreen, run mc.

Change to a directory with so much files to fill the mc panel. Try to list
through the files in the panel using arrow keys.

See how the scrolling of the selected line is slow compared to other terminals
(konsole, xterm).

Comment 1 Kjartan Maraas 2004-12-16 14:04:47 UTC

What version of vte/gnome-terminal are you using? I don't see any difference
when scrolling in xterm. I used /usr/lib containing 2400 files and it was just
about the same.

Comment 2 Tomas Mraz 2004-12-17 12:25:27 UTC

I use gnome-terminal-2.8.0-2/vte-0.11.11-15 from Fedora Core development.

The effect was much worse with the gnome-terminal/vte form Fedora Core 3 (almost
unpatched vte-0.11.11) as in the current development packages there are some
added perf patches which make the scrolling faster. However the main difference
against konsole/xterm is that the selected file line moves without skips on them
but in the vte/gnome-terminal it starts without skipping and then it starts to
skip over files as the redraw is not fast enough. It seems to me that the
version without the patches prefered redraws against speed so actualy the
movement was slow and it stopped a long time after releasing the cursor key.
This effect is pronounced if you use a really high resolution+small font.

Comment 3 Dennis Krul (dweazle) 2005-08-18 14:39:04 UTC

gnome-terminal is indeed a bit slower than xterm and (for example) aterm. I
agree that it could be better.

I can kind of reproduce the 'mc file skipping' behaviour the submitter mentioned
using gnome-terminal-2.10.0 / vte-0.11.13 (debian unstable) in fullscreen mode
(1280x1024, nvidia gf2 with opensource nv driver, athlon 2200). It's not a very
smooth experience, but it's usable nevertheless. 

The only difference with the submitter is that if I stop pressing the cursor key
the scrolling stops immediately (as expected). 

Maybe the submitter could upgrade (or already has?) to a more recent version of
gnome-terminal and vte to see if there are any improvements. Any information
regarding hardware setup might be useful too.

Comment 4 Tomas Mraz 2005-08-18 15:43:49 UTC

I'm using now the vte-0.11.13 with basically the same result as in comment 2.
The slow response to keyup event is now not too pronounced but sometimes it
happens - HW is Dell Latitude D800 (Pentium M, Nvidia GeForce, 1680x1050).
Konsole or xterm work fine.

Comment 5 Kjartan Maraas 2005-08-29 09:03:47 UTC

Some of our scrolling issues were fixed in 0.11.14 by shorcircuiting an
optimization for scrolling the whole terminal content. Could you see if this helps?

Comment 6 Tomas Mraz 2005-08-29 20:10:47 UTC

It is better and smoother than in 0.11.13 but still not as smooth as konsole in
KDE 3.4.2.

Maybe interesting observation: the selected line scrolls smoother on the bottom
of the terminal window (almost fine) than on the top of it.

I've also tried simple test with timing cat with a large file on the terminal
and with the konsole it takes about 40% shorter time than with gnome-terminal. 

Xterm is of course even faster but this is expected as it doesn't use the
freetype fonts and so on.

Comment 7 Guilherme de Siqueira Pastore 2006-02-28 09:49:35 UTC

Could you please try it now, with vte 0.11.20? I'm quite sure that this could be closed by now.

Comment 8 Tomas Mraz 2006-02-28 14:39:37 UTC

Actually it the selected bar movement is still not as smooth as in konsole. The bar moves fast but it skips lines. It was smoother (also probably slower but I am not sure with that) in vte-0.11.18.

Comment 9 Behdad Esfahbod 2006-02-28 20:09:58 UTC

I'm still working on this.  I've found the cause, working on a patch.

Comment 10 Behdad Esfahbod 2006-03-06 00:40:18 UTC

*** Bug 96879 has been marked as a duplicate of this bug. ***

Comment 11 Behdad Esfahbod 2006-03-06 00:42:27 UTC

What's hapenning is that currently vte keeps a bounding box of all updates, and invalidates (aka updates) that bounding box.  This means, when you move your cursor in vim, the rectangle including the cursor and the part of the status bar that shows cursor location is updated.  That's about half of the screen on average.

Comment 12 Whitney Jackson 2006-04-17 01:16:19 UTC

Just wondering if any progress has been made on this.  I'm still seeing it using vte-0.12.0-1 on FC5.  It doesn't even need to be vim.  I have very slow jerky scrolling using less and more as well.  By the way I started noticing this problem much more when I started using a dual monitor setup.  I'm not sure why.  Also, while scrolling up and down through a file I see the Xorg process use up an entire CPU.

Comment 13 Chris Wilson 2007-01-10 14:02:17 UTC

Created attachment 79945 [details] [review]
Invalidate only the cells.

Attached is a patch that only invalidates the cells and not the bbox. Unfortunately, it also contains many [small] speedups for hotspots whilst callgrinding - although on my system the normal process is display limited.

Comment 14 Behdad Esfahbod 2007-01-10 17:02:13 UTC

Wow, your kicking some serious a$$ here!

I love these stuff, however, there's no way I can review and commit this as one patch.  Please break it up into logical units.  I suggest you take a look at git-svn and stacked-git.  They make the life way easier to manage patchsets.  And probably open a separate bug for the other bits.

Just a few comments:

  * In your configure.in changes, the "[default=...]" is wrong, as square brackets are quotation chars in configure.in.  You need to do "@<:@default=...@:>@" for example.

  * If you add new asserts, you should be really really really sure that if it hits we better crash.  I personally hate asserts and prefer to do a warning of some kind and proceed.  And g_error may be preferrable to assert, but if looking at all the callers, a condition really should be true, assert is fine.

  * As for the main point of this bug, I have a patch locally that uses a GdkRegion to invalidate the cells only, and not the bbox, but the overhead of updating the region made it slower for most purposes, although the case of vim and mc was improved.  Your approach may be different, but you may want to measure speed difference of just that part of the patch.   I use "ls -R /usr/lib; time ls -R /usr/lib" as one measure.


Do you have interesting before/after numbers?

Comment 15 Chris Wilson 2007-01-11 21:47:56 UTC

I'll be back in a bit... Still learning git (having recently discovered git-svn), and about to throw st-git into the mix as well.

Off the top of my head, I recall that the repeated invalidates did have a significant impact on the timings, but that most of the invalidates were for 0x0 cells.

Lately, all my timings have been within measurement error, except when callgrinding ie not display limited.

Comment 16 Chris Wilson 2007-01-15 13:18:21 UTC

Created attachment 80315 [details] [review]
Invalidate only the cells (and nothing else!)

Clean version. And now for some timing info...
time hexdump 16M.wmv

vanilla:
local    0m27.228s
remote   3m9.866s

vim-speedup:
local    0m27.357s
remote   3m2.644s

all (more performance tweaking):
local    0m24.302s
remote    2m31.238s

ie we introduce negligble overhead.

Comment 17 Behdad Esfahbod 2007-01-15 18:47:51 UTC

Nice.  Lets get this in for next week's release.

A couple of points:

- Now that you are keeping a list yourself, you can switch from GdkRegion to GdkRectangle.  A GdkRegion is a list of (nonoverlapping) GdkRectangles.

- I just skimmed the patch.   You are generating one rect per consecutive cells, right?

- Where are the other speedups?  I like to get them in before next week and spend the rest of the cycle bugfixing them.

Comment 18 Chris Wilson 2007-01-15 20:09:47 UTC

(In reply to comment #17)
> - Now that you are keeping a list yourself, you can switch from GdkRegion to
> GdkRectangle.  A GdkRegion is a list of (nonoverlapping) GdkRectangles.
I'd considered whether GdkRegion was needless overhead, but then I'd still need to g_slice_alloc(GdkRectangle) [4 ints for the rectangle vs 6 ints+pointer for the region] and then transform them into GdkRegions during the expose processing. Or I may have missed something.

> - I just skimmed the patch.   You are generating one rect per consecutive
> cells, right?
Yes. At first I invalidated each individual cell and noticed a performance hit (obviously!). I then went off to recoup that hit from other hotspots in the code. As I cleaned up the patch, I realised how stupid that was and needed to only reset the bbox if we moved the cursor during an escape sequence... Actually that could be if we moved the cursor outside of our current bbox.

> - Where are the other speedups?  I like to get them in before next week and
> spend the rest of the cycle bugfixing them.
Many and varied, but none major. I think from the performance stand-point, the biggest wins are by not using zero-terminated GArrays (which of course we are adding to one unichar at a time!). I am working up a series of patches for review.

Comment 19 Behdad Esfahbod 2007-01-15 20:34:10 UTC

(In reply to comment #18)
> (In reply to comment #17)
> > - Now that you are keeping a list yourself, you can switch from GdkRegion to
> > GdkRectangle.  A GdkRegion is a list of (nonoverlapping) GdkRectangles.
> I'd considered whether GdkRegion was needless overhead, but then I'd still need
> to g_slice_alloc(GdkRectangle) [4 ints for the rectangle vs 6 ints+pointer for
> the region] and then transform them into GdkRegions during the expose
> processing. Or I may have missed something.

gdk_window_invalidate_rect()


> > - I just skimmed the patch.   You are generating one rect per consecutive
> > cells, right?
> Yes. At first I invalidated each individual cell and noticed a performance hit
> (obviously!). I then went off to recoup that hit from other hotspots in the
> code. As I cleaned up the patch, I realised how stupid that was and needed to
> only reset the bbox if we moved the cursor during an escape sequence...
> Actually that could be if we moved the cursor outside of our current bbox.

Yeah, I did one per cell and got the overhead too, so it sat there in my checkout to do the grouping.  Thanks for doing this.
 
> > - Where are the other speedups?  I like to get them in before next week and
> > spend the rest of the cycle bugfixing them.
> Many and varied, but none major. I think from the performance stand-point, the
> biggest wins are by not using zero-terminated GArrays (which of course we are
> adding to one unichar at a time!). I am working up a series of patches for
> review.

Yeah, that avoids memory fragmentation that is a major problem with vte right now.  We kinda need our own memory manager for the rows, one that allocates large chunks of memory (multiples of megs) per vte widget, and allocates rows from this only.  We can then walk it from time to time and pack the rows, reclaiming any holes in the memory, like garbage collectors do.

Comment 20 Chris Wilson 2007-01-15 21:32:46 UTC

(In reply to comment #19)
> (In reply to comment #18)
> > (In reply to comment #17)
> > > - Now that you are keeping a list yourself, you can switch from GdkRegion to
> > > GdkRectangle.  A GdkRegion is a list of (nonoverlapping) GdkRectangles.
> > I'd considered whether GdkRegion was needless overhead, but then I'd still need
> > to g_slice_alloc(GdkRectangle) [4 ints for the rectangle vs 6 ints+pointer for
> > the region] and then transform them into GdkRegions during the expose
> > processing. Or I may have missed something.
> 
> gdk_window_invalidate_rect()

Internally it does gdk_region_rectangle()...gdk_region_destroy(). So we gain by fitting into the smaller GSlice magazine, but lose on the extra g_slice_alloc(). 

 
> > > - Where are the other speedups?  I like to get them in before next week and
> > > spend the rest of the cycle bugfixing them.
> > Many and varied, but none major. I think from the performance stand-point, the
> > biggest wins are by not using zero-terminated GArrays (which of course we are
> > adding to one unichar at a time!). I am working up a series of patches for
> > review.
> 
> Yeah, that avoids memory fragmentation that is a major problem with vte right
> now.  We kinda need our own memory manager for the rows, one that allocates
> large chunks of memory (multiples of megs) per vte widget, and allocates rows
> from this only.  We can then walk it from time to time and pack the rows,
> reclaiming any holes in the memory, like garbage collectors do.

Now that's more work than I've done here!  I only meant using g_array_new(FALSE, FALSE, ...) instead of g_array_new(TRUE, TRUE, ...). I have however seen that bug report [bug #342338] and have myself cursed the lack of reuse of the RowData etc - most noticable is the amount of time we spend in g_realloc due to vte_g_array_fill.

Comment 21 Chris Wilson 2007-01-17 17:37:36 UTC

r1429: 2007-01-17  Chris Wilson  <chris@chris-wilson.co.uk>

        Only invalidate groups of inserted chars and not the whole bbox.
        This speeds up applications like vim which maintain a status line
        at the bottom of the screen. Fixes bug 161342.

        * src/vte-private.h:
        * src/vte.c: (update_regions), (_vte_invalidate_cells),
        (_vte_invalidate_all), (_vte_terminal_scroll_region),
        (vte_terminal_process_incoming), (reset_update_regions),
        (remove_update_timeout), (update_repeat_timeout), (update_timeout):

Comment 22 Behdad Esfahbod 2007-01-17 18:35:05 UTC

Thanks.

Comment 23 Behdad Esfahbod 2007-01-17 22:16:38 UTC

Humm, still seeing it.  configure vte with --enable-debugging and run with VTE_DEBUG_FLAGS=updates ./vte, then vim and move cursor.

Comment 24 Chris Wilson 2007-01-18 02:52:58 UTC

When adding the group bbox, I reset it to the current cursor (cut'n'paste) which was not desired.

Please check r1451: 2007-01-18  Chris Wilson  <chris@chris-wilson.co.uk>

	Bug 161342 revisited. Reset the box to +-inf and not the current
	cursor, otherwise you just reproduce the slow behaviour (an overly
	large invalidation area).

	* src/vte.c: (vte_terminal_process_incoming):
		Reset bbox to +-inf.

[I'm going to get some sleep before checking through the possible accessibility regression.]

Comment 25 Behdad Esfahbod 2007-01-18 06:42:44 UTC

Works like a charm!

Thanks.  I see some very broken behavior on x86_64.  Will go through the compiler warnings to see if they help.

Comment 26 Chris Wilson 2007-01-18 09:15:14 UTC

Typical, the machine that's off the network. I didn't think I introduced any 64bitisms, but...

Comment 27 Behdad Esfahbod 2007-01-18 16:20:39 UTC

My fault.  Working after "make install".