GNOME Bugzilla – Bug 122656
Text selection consumes 100% CPU and is very slow
Last modified: 2007-02-07 14:17:05 UTC
When I select some text in a gnome-terminal window, it can sometimes take up to 3 seconds after the mouse click for the highlighted text to appear. During this time the CPU usage is ~100%, almost equally shared by gnome-terminal and the X server. GDK_USE_XFT is enabled. No other application has problems selecting text or scrolling fast with the scrollbar using the same TT fonts. There was a bug open for the performance of gnome-terminal and the text highlight issue, but I can't find it anymore. You can close this one if you reference that bug here. I would really like this issue to be taken seriously, because gnome-terminal is so slow compared to other terminals. More than anything, I want to understand this issue and understand why nobody cares if gnome-terminal has problems on slower CPUs/cards.
People care, you do for example. The question is perhaps why they aren't doing anything to fix it - but we could ask you that question too. ;-) Query for all the bugs in here, start helping out... you'll soon learn why.
OK, HP, give me a hint here. What is it about vte that it uses same libraries as let's say galeon for text rendering, but is much slower when highlighting and scrolling? I might be mistaken and totally not getting it here. But let's list some of the differences.
It's an entirely different codebase, for starters. Thousands of lines of distinct code involved. Listing isn't going to help, you need to run a profile with tools such as valgrind, speedprof, oprofile, etc.
Soeren Sandmann had a good presentation at GUADEC about how to do profiling and some of the tools, that might get someone started.
HP: I didn't mean a diff. I meant at the top level, in a line or two, like: vte does "this" while galeon doesn't need to do "this" for rendering text. To me, both are rendering text. This box that I am typing in shows characters with the same font as vte, using the same xfs and the same X, as I type. They appear on screen much faster than in vte. If I use the scrollbar and go wild at it, it doesn't show just the top line for a second and then refresh the whole page; the refresh is not even noticeable. If I select, the highlight is fast. After some reading on other bugs here, I feel that URL and regex handling are the main reasons for the highlight slowdown on Solaris. Is it possible to make these compile-time options? I think gnome-terminal is great even w/o these! Nalin, you seem to be totally uninterested when people thrash vte performance. I haven't seen a single comment from you on this anywhere. Why is that? Do you believe the problem is not real?
I found something weird by accident today. Couldn't explain. Can you? In one gnome-terminal, I did a rlogin to same m/c and account. I ran my usual test of "cd /usr/lib;time /bin/ls -R". It took 0.2 seconds. Ctrl-d and again "cd /usr/lib;time /bin/ls -R". It took 13 seconds. Repeat as many times and similar results. What does this mean?
Created attachment 20120 diff -up terminal-screen.c.ORG terminal-screen.c
Patches for gnome-terminal and vte. The first patch (terminal-screen.c) is a performance bug fix, and the second patch just allows highlighting of patterns to be turned off with a compile-time macro. I think I can spot patterns better than Solaris's regex library...:)) No more 100% CPU usage, no more mess with "what did I just paste...". Scrolling is a different story!!
Created attachment 20121 diff -up vte.c.ORG vte.c
It turns out that it's not Nalin's problem after all. I did some digging on the scrolling part. I found the problem but don't know how to solve it. A little vte tune-up (see the third patch file) speeds up scrolling by a factor of 3 (see the end of this note).

I enabled VTE_DEBUG and noticed that it tries to read 4096 bytes from the tty FD but gets only around 500 bytes each time. After that, it renders those and goes on to read more. If I rlogin into the same m/c with the same login, somehow the same read on the same FD returns the maximum possible and it renders all of it. In fact it renders around 4000 chars each time, compared to 500 each time without rlogin.

To my mind, there are two possibilities (since I am really as dumb as a pile of rocks). Either rlogin's sockets somehow return data in a way that speeds up vte's rendering, and rlogin can bombard it with huge amounts of data while the original tty FD couldn't, because vte was still busy rendering the complex data it returned. Or it's not AA/XFT or vte that is slowing things down: the limit on the read/write buffer of the tty is the bottleneck. But somehow, when I rlogin, this same FD has a higher buffer size (4096). The only change is that this FD is duped onto the newly opened sockets for rlogin.

I notice that there is an ioctl to include the "ldterm" module (pty.c), which has a flow control parameter LDCHUNK=512 bytes. I don't know if it's related. Some streams guru can probably comment on it. Linux doesn't have such an "ldterm" module (or maybe I couldn't find it), and that's probably why I don't see any slowdown on Linux. I couldn't find any way to set the buffer size limits on the terminal stream. On Solaris it seems they are not settable (man streamio).

The attached patch for vte.c sets the priority for input to LOW, so we don't keep going back to reading when we know the read is slow, and hence don't sit on the CPU. This definitely brings the CPU usage down. The patch also increases the wait-for-next-processing timeout (VTE_COALESCE_TIMEOUT) from 2 msecs to 50 msecs. This makes sure that we process bigger chunks of information with an optimal delay (instead of too small a delay, wasting CPU cycles running around). The speedup after doing this definitely proves that it's not the rendering which is slow. Nalin can probably comment on the side effects of increasing this timeout.

So, "time /bin/ls -R /usr/lib" now returns in 4 secs. Not long ago, it used to be 23 seconds. I am happy with gnome-terminal now. It doesn't kill my CPU anymore. It scrolls fast enough for my eyes. The selection returns quickly and doesn't kill the CPU. And ctrl-click still opens up epiphany on a URL, if I can spot them, i.e....:))

If anybody wants to test it out, here is roughly how:

cd vte-0.11.10/src
Apply the pty.c patch (4th attachment). This removes the middleman gnome-pty-helper on Solaris. If you see any weird terminal behaviour, roll it back.
Apply the second vte.c patch (3rd attachment).
make CFLAGS+="-DVTE_DONT_HIGHLIGHT_PATTERNS"
make install
cd gnome-terminal/src
Apply the terminal-screen.c patch (1st attachment).
make install
Kill all gnome-terminals and start a new gnome-terminal.
Increase the size to 80x50 and do a "cd /usr/lib; time /bin/ls -R".

If you don't see an improvement, remember the statement I made about rocks....:))
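To make the chunk-size argument concrete, here is a toy model of the read-then-render loop described above. It is only a sketch and not vte's actual code: the function names and the 2 ms per-pass overhead are assumptions chosen for illustration, and the only figures taken from this comment are the roughly 500-byte and 4096-byte read sizes.

```python
import math


def passes_needed(total_bytes: int, chunk_bytes: int) -> int:
    """Number of read-then-render passes needed to drain total_bytes
    when each read() returns at most chunk_bytes."""
    return math.ceil(total_bytes / chunk_bytes)


def per_pass_cost_ms(total_bytes: int, chunk_bytes: int,
                     overhead_ms: float) -> float:
    """Total fixed overhead paid across all passes.

    overhead_ms is a hypothetical per-pass cost (wake-up, timeout wait,
    repaint setup); it is not a measured vte number.
    """
    return passes_needed(total_bytes, chunk_bytes) * overhead_ms


if __name__ == "__main__":
    output_size = 40960  # bytes of terminal output, arbitrary example size
    for chunk in (512, 4096):
        print(chunk, passes_needed(output_size, chunk),
              per_pass_cost_ms(output_size, chunk, overhead_ms=2.0))
```

In this model the same amount of output needs eight times as many passes with 512-byte reads as with 4096-byte reads, so any fixed cost per pass is paid eight times as often. That is the effect the larger coalescing timeout is meant to reduce.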
Created attachment 20160 diff -up vte.c.ORG vte.c
Created attachment 20161 diff -up pty.c.ORG pty.c
OK, pty.c patch should be ignored. Doesn't make much difference, but screws up some standalone vte control characters. gnome-terminal doesn't have any problems though.
Finally, had the time to confirm that on linux it reads and processes 4K chunks. Now, some streams guru needs to pitch in!!
I found the solution to the smaller read chunk problem on solaris. On solaris 2.6 streams, the high water mark for read queue is 1024 by default. I wrote a small streams module which sets the water marks to 5120 on both master and slave, and inserted it into pty.c. Now, streams can read/write 4096 byte chunks.
so much for caring...:(
I don't think I understand the problem. I'll focus on the selection part, leaving the scrolling for later. What exactly is it that is slow? Please note I am referring to your initial report. Is it slow when you select text, or when you paste text using the middle button click? Btw, what is `m/c'?
Removing the FIXED from the summary, as this has not yet been fixed...
It is fixed on my machine (m/c)...:)

Selection was slow because every click of the button was matching patterns, when we needed to match only if CTRL or the 3rd mouse button was pressed. So you would see a pause in selection with 100% CPU used. The terminal-screen.c patch fixes that. The vte.c patch also moves vte pattern matching and highlighting behind a compile-time switch. That works around the problem that when you randomly move the mouse in a window with a lot of text, the CPU use goes up. Again, it is searching for patterns to highlight, and the Solaris regex library can't keep up.

With output scrolling, the issue is well defined for Solaris 2.6. The stream queues have a high water mark of 1024 by default, so vte processes smaller chunks of data, which takes longer. It consumes more CPU because it doesn't wait long enough for I/O to complete (2 ms) before going back to read. I haven't included the streams patch because it requires the addition of a streams module (two files), which will need to be separately compiled and put in the /kernel/strmod dir on Solaris. And of course, it's Solaris 2.6 specific.

The results of these patches have been astonishing on my 2.6. CPU hogging is down, and output scrolling is faster than xterm. g-t is now as usable on my Solaris box as it is on my Linux box.
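The idea behind the terminal-screen.c change can be sketched as follows. This is an illustration only, not the patch itself: the class, the method names and the URL pattern are all hypothetical, and the real code is C inside gnome-terminal's button-press handler. The point it shows is that the matcher is consulted only on the click types that can use a match.

```python
import re

# Illustrative pattern only; not the regexes gnome-terminal registers.
URL_PATTERN = re.compile(r"https?://\S+")


class ClickHandler:
    """Hypothetical stand-in for a terminal's button-press handler."""

    def __init__(self, line_text: str):
        self.line_text = line_text
        self.match_calls = 0  # counts how often the matcher is run

    def _find_match(self, column: int):
        """Return the URL under the given column, or None."""
        self.match_calls += 1
        for m in URL_PATTERN.finditer(self.line_text):
            if m.start() <= column < m.end():
                return m.group(0)
        return None

    def on_button_press(self, button: int, ctrl_held: bool, column: int):
        # Ctrl + left click: the match is needed to open the URL.
        if button == 1 and ctrl_held:
            match = self._find_match(column)
            if match is not None:
                return ("open", match)
        # Right click: the match is needed to build the popup menu.
        elif button == 3:
            return ("menu", self._find_match(column))
        # Plain clicks start a selection and never touch the matcher.
        return ("select", None)
```

With this arrangement a plain left click, which is what starts a selection, costs no regex work at all:

```python
h = ClickHandler("see http://example.org for details")
h.on_button_press(1, False, 6)   # plain click, matcher not called
h.on_button_press(1, True, 6)    # ctrl-click, matcher called once
```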
Many proposed patches in this bug, but no one seems to look at them. Why?
Can SOMEBODY please apply the terminal-screen.c patch? It's still not there in the latest gnome-terminal-2.5.4. It's a clean patch which WILL improve text-selection CPU usage in gnome-terminal. It is so damn obvious that you don't need to match a string against regular expressions until the time you need to, which is inside the "if (CTRL pressed)" statements. Havoc??
This sounds like a duplicate of bug #93775.
I have to test this again. In Gnome-2.4.1, the patches (vte, terminal) worked great and Terminal was fast as it could be on Sun Rays. I have tried Gnome-2.5.92, original (clean) gnome-terminal is slowing down just like before in the old days without patches. I've tried to use the same patches as in G2.4. The speed is better, but I'm afraid, it's still not as good as it was in G2.4. I will try to confirm/check this soon, now I'm not on Solaris.
So I have double-checked it. I have patched vte and gnome-terminal. It worked in gnome-terminal-2.4.1 and 2.4.2, but does not work in gnome-terminal 2.5.92 :( Seems the slowdown does not occur always, but it's still there :(. Tried on Solaris 9/SPARC.
The vte.c patch can help fast scroll, but have you ever tried to see how it feels when you type fast? Hold the 'a' key down and you will see it skip frames, which makes it feel slow.

-#define VTE_COALESCE_TIMEOUT 2
+#define VTE_COALESCE_TIMEOUT 50

This is what caused it, so I changed 50 to 25 or 15, and now it feels smoother, with no skipped frames when you type fast. It doesn't change anything for fast scroll.

Benchmark:
====================================
$ time ls -R /usr/ports

w/out patch:
21.89 real 0.28 user 1.12 sys
hold 'a' key and no skip frame; it feels smoother and faster.

W/ patch:
13.91 real 0.24 user 1.13 sys
hold 'a' key and skip frame, like at every third 'a'; it feels slower.

W/ patch and 50 -> 25:
13.91 real 0.33 user 1.04 sys
hold 'a' key and no skip frame; it feels smoother and faster.

W/ patch and 50 -> 15:
14.08 real 0.28 user 1.08 sys
hold 'a' key and no skip frame; it feels smoother and faster.
====================================

I personally have chosen 15; it's still fast, feels smoother, and has no skips.

BTW: Tested on FreeBSD 5.2-CURRENT w/ AthlonXP 2000+ and 512 MB RAM.
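For reference, the wall-clock ("real") figures reported above work out to roughly the same speedup for all three timeout values. The small script below just does that arithmetic; the labels and the helper name are mine, and the numbers are copied from the benchmark in this comment.

```python
# "real" seconds for `time ls -R /usr/ports`, as reported in this comment.
REAL_SECONDS = {
    "unpatched": 21.89,
    "patched, timeout 50": 13.91,
    "patched, timeout 25": 13.91,
    "patched, timeout 15": 14.08,
}


def speedup(baseline_s: float, measured_s: float) -> float:
    """How many times faster measured_s is than baseline_s."""
    return baseline_s / measured_s


if __name__ == "__main__":
    base = REAL_SECONDS["unpatched"]
    for label, seconds in REAL_SECONDS.items():
        print(f"{label}: {seconds:.2f}s, {speedup(base, seconds):.2f}x")
```

Dropping the timeout from 50 to 15 costs about 0.17 s on this run (13.91 s versus 14.08 s) while removing the skipped frames when typing, which is the trade-off argued for here.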
Created attachment 28510 Patch for g-t selection issue for 2.6.1
Sunil, does the latest patch need any of the previous patches for vte or GT? I've done some experiments today (before this patch was submitted) and I've had some good performance. Almost as good as it should be. I will definitely try the latest patch.
No. It's just that the latest patch applies against 2.6.1, while the earlier ones applied against 2.4.
Well, I have tried it now. It did not fix my issues, or at least not very well :( Selecting text is still slow. The delay is up to 2 seconds. I'm trying this on a Sun Ray, which is connected to another server via dtlogin. Don't know if this could be the cause... No patches other than this one were applied to GT, and vte was patched only to add -lglib-2.0, which is a missing dependency when compiling with Forte on Solaris. Any hints on what to try next?
Now I've had a little more luck. I have used your patch together with the attached patch for vte. The patch is a collection of various patches, maybe even yours, but all are from bugzilla entries. I recompiled vte and gt, and it looks good. Please take a look at that.
Created attachment 28543 Patch for vte to be used together with Sunil's gt patch SEEMS to work for me on Forte compiler and Sol 9/SPARC. Please try.
I have applied both the "Patch for g-t selection issue for 2.6.1" and the "Patch for vte to be used together with Sunil's gt patch" to vte-0.11.11 and gnome-terminal-2.7.3. While I had to massage Sunil's patch a bit to duplicate the changes in the new release, I now notice no slowness in the terminal at all in scrolling or highlighting. The underlining of URLs on mouseover is now also instantaneous, and right-clicking them brings up the correct menu options. This is on a Solaris 2.8 platform. I would recommend these two patches be incorporated into the next release.
On solaris 8, I experience the same highlight slowness problem with gnome-terminal 2.4.0.1, but not for vte, even when there's a lot of text in the buffer. Any idea on how to track this down further?
I can confirm that vte even without any patches performs ok, but I still DO have performance problems in gnome-terminal 2.6.2. Please, dear maintainer, could you look for a problem there? Thanks.
I'm just doing a little cross-referencing work on vte performance bugs, since someone posted a patch in bug 137864 that did part of what a patch in bug 143914 does, and it would have helped to avoid this duplication of work... (In other words, the comments below aren't strictly related to this patch.) Note that there is also a performance patch in bug 143914 (it appears unrelated to the ones here, though I didn't look that closely). As noted above, the one that was just added to bug 137864 is a strict subset of the one in bug 143914. Comments 29 and 30 of bug 137864 also look very interesting: they are from the maintainer of rxvt-unicode, who compares the algorithms and methods used by many different terminal emulation programs and widgets and describes his experience.
Can we get this stuff reviewed and resolved?
Can anyone interested in this bug check the latest vte release and tell me which of the problems (and patches) in here are still relevant?
Let's see.. I'll just focus on the selection issues for now.. I personally think that the vte patch adding VTE_DONT_HIGHLIGHT_PATTERNS doesn't make too much sense, as I'd like to fix all the selection speed issues we encounter instead of adding yet another codepath where we simply don't display them. The other patch, for gnome-terminal, does actually make sense from a quick look at it. I'll attach an updated version that applies to HEAD. This is obviously post gnome-2.12 material as we are pretty much frozen, but I think something along the lines of Sunil's terminal-screen patch should actually go in.
Created attachment 49959 Sunil's terminal-screen patch updated to cvs HEAD
A note to Ivan's patch in comment 31: You replace the run-time configurable regexps with hard-wired strings. As a consequence, vte_terminal_match_add() and similar calls no longer work as expected. I don't think it's a nice move. Basically I think this patch should not be added (at least not in its current form). Or, if someone would really want to apply it mainstream, at least these functions should be marked as deprecated and the docs should be updated stating that they do nothing. I don't know how much this patch speeds things up, but it removes a nice feature... :-(
Hi Egmont, I don't try to build GNOME environment on Solaris anymore, so I can't test, if performance of gnome-terminal and/or vte has changed a lot :(
A lot of performance tuning has since been made all over vte, especially IO, rendering and regex matching. Unfortunately, the terminal-screen patch cannot be used as the current match could be accessed from the popup-menu via the keyboard and so we cannot rely on right-click/ctrl-click behaviour. The replacement of the regex matching with a simple url finder is more interesting from a size/dependency/speed POV but having eliminated the needless calls into the matcher it may be wasted effort (or rather effort may be more profitably spent elsewhere). Please reopen if vte is still too slow and you have profiles showing why. ;-) Thanks.