GNOME Bugzilla – Bug 129473
Slow text rendering in SWT
Last modified: 2006-10-02 16:24:45 UTC
Please refer to https://bugs.eclipse.org/bugs/show_bug.cgi?id=37683#c18. The relevant figures are:

                           pango 1.2.1       pango 1.2.5
                         ---------------   ---------------
                  Motif     GTK     GTK       GTK     GTK     Fox  Windows
                             AA  not AA        AA  not AA
                  -----  ------  ------    ------  ------  ------  -------
gc.drawText       18854  117403   64994     80481   60995    2415    13290
gc.drawText 2     23772  118667   65188     81844   61091    2285    13180
gc.stringExtent   55718  651667  466598    337582  456776   16252    18070
gc.textExtent    132206  650287  466764    342217  458026   16472   103420

We seem to be significantly slower than Windows, Fox and Motif for the text operations. There are some sample programs in the link above which can be used to pinpoint things further, but the problem definitely seems to be in Pango.
SWT/Java benchmarks are not useful for us, unfortunately. Pango drawing speed is *expected* to be slower than GTK+-1.2 or Motif, because it is doing a lot more.

Benchmarking we've done elsewhere indicates that on a 1 GHz machine, Pango can:

- Lay out about 27000 8-digit numbers a second
- Draw about 9000 8-digit numbers a second

The main bottleneck for drawing is the performance of the RENDER extension at drawing anti-aliased text. But even at this speed, I can't see how this would be a major problem for, say, a menu. A menu with a hundred strings would take between 1 and 2 milliseconds to draw.

We've also spent a lot of time on Pango performance, and I don't think there is much low-hanging fruit. But if you do want us to try and help you with your problem, then what you need to do is to create a small C program that models the operations that SWT is doing when drawing the slow text. I'll attach a case it may be useful to start from.
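As a rough cross-check of the throughput figures quoted above, the per-menu cost can be estimated with a few lines of C. This is only a back-of-the-envelope sketch: the function name text_cost_ms is made up for illustration, and it assumes every string is both laid out and drawn once, with no caching.

```c
#include <assert.h>

/* Estimated time in milliseconds to lay out and draw n_strings once each,
 * given throughput figures in strings per second.
 * Hypothetical helper for illustration; not part of Pango or GTK+. */
double text_cost_ms(int n_strings, double layouts_per_sec, double draws_per_sec)
{
    assert(n_strings >= 0 && layouts_per_sec > 0.0 && draws_per_sec > 0.0);
    double layout_s = n_strings / layouts_per_sec;  /* seconds spent in layout */
    double draw_s = n_strings / draws_per_sec;      /* seconds spent drawing */
    return 1000.0 * (layout_s + draw_s);
}
```

With the quoted rates, text_cost_ms(100, 27000.0, 9000.0) comes to roughly 15 ms (about 11 ms of it drawing), so taken at face value the rates put a hundred-string menu closer to 10-20 ms than to the 1-2 ms mentioned above. That is still a small cost for a one-off menu.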
Created attachment 22485 [details] [review] Example performance benchmark
Please reopen from NEEDINFO if you add more information.
Putting link of the dup bug here so as not to lose the discussion there: http://bugzilla.gnome.org/show_bug.cgi?id=129473
I don't mean to pry, but after reading this comment:

----
SWT/Java benchmarks are not useful for us, unfortunately. Pango drawing speed is *expected* to be slower than GTK+-1.2 or Motif, because it is doing a lot more.
----

and seeing the values from the table above for Fox and Windows (forget Motif, the older Pango and anti-aliasing), I see Pango ~26x slower than Fox and ~4.6x slower than Windows in the drawText methods, and ~24-27x slower than Fox or Windows in the stringExtent/textExtent methods.

I don't know if this is just a case of SWT misusing GTK, but it seems odd that, out of all the toolkit comparisons, GTK still shows such a large speed gap. There were more suggestions and comparisons posted to the original Eclipse bug here: https://bugs.eclipse.org/bugs/show_bug.cgi?id=37683 This may help in the investigation.

Owen, I get the impression from you that as long as the toolkit is "fast enough" not to cause flickering from repaints, then that's good enough and we should focus efforts elsewhere. I might agree if GTK weren't such a high-profile library, but it is. A Gnome 2.4 desktop is marginally slower on my desktop than a KDE 3.2 desktop. Why is that? I don't know; probably because GTK is 'fast enough' and Qt took the time and resources to go the extra mile to optimize. It's hard to read your posts about how Pango should be slower because it is doing a lot more. I don't doubt it's doing A LOT, but so are Qt, Fox, wxWindows, Quartz and Windows' font rendering systems, and they seem to manage quite well at maintaining excellent quality with excellent speed, so that reasoning isn't really a convincing argument.

On a side note, I see your name around the web a lot in connection with some high-profile projects, so it seems to me that you are probably incredibly swamped with work right now. Is it possible to put some resources on investigating this so the burden doesn't fall squarely on your shoulders (and subsequently onto a todo list that is, I'm sure, a mile long already)?
A) I have no clue what the SWT benchmark is doing (and please don't point me at the source code for it; it's layered too deeply to be useful in any case). So I really can't comment on anything here.

B) It's not clear if Fox is doing AA text or not, and it's not clear what windowing system "Fox" is running on.

C) Windows separates text going through the Uniscribe engine from "easy" text, and I don't want to do this. I want Pango performance to be good enough for everybody.

D) Yes, of course I'm swamped. Which is why I don't want bugs that just say "Pango is slow"; I want bugs that say "This C benchmark is slow because this function is being called too much, here's a patch".
Owen,

Interesting finding today: someone suggested that we use gedit as a baseline, but found that even gedit runs terribly slowly as well. In this comment: https://bugs.eclipse.org/bugs/show_bug.cgi?id=37683#c152 the gentleman clocked a full find/replace of the char 'a' with 'a1' at 44 seconds in a giant Java source file (StyledText, from Eclipse), against only 0.2 seconds in KWrite.

In comment 151, it seems we have narrowed down Eclipse's performance problem on GTK to the simple fact that GTK/Pango paints are so expensive. There are proposals to make the paint requests in Eclipse more intelligent, so that it simply doesn't call out to GTK when not necessary. Maybe GTK itself would benefit from similar strategies, so all GTK-based apps would see the improvements?

Anyway, I'm hoping that gedit is a small enough code base that it can be used as the "C program that exhibits the behavior" that you were looking for. I imagine that Bluefish and Anjuta can't be much different in this regard...
Find and replace performance in gedit has basically nothing to do with Pango text rendering performance. The example attached above shows what I mean by a small C program.
Owen,

Good news, I got you your C test. It's the Benchmark2.java program rewritten in C to test against GTK only. Back in the original Eclipse bug report (down at the bottom) there are some preliminary system performance numbers. I hope this helps.
Created attachment 24788 [details] Rewritten GTK benchmark written in C (From Java)
Hmm, I guess I attached that wrong. Just rename its ending to .tar.gz and unzip.
Note that I really don't care ANYTHING about gdk_draw_text(); it is simply not relevant.

- It is drawing with a different drawing subsystem, which uses different libraries, different parts of the X server, and different parts of the hardware.
- It is doing utterly simplistic text layout.

It doesn't matter. Please forget it.

Leaving that aside, this benchmark is still not useful because it doesn't have any correlation with what SWT is doing in real life. Please look at my detailed program for what needs to be done in bug 135017.

Note that your benchmark is fairly unrealistic because it is rendering directly to the screen, which stresses different X server and hardware paths than rendering to offscreen pixmaps, which GTK+ does almost always and I expect SWT does as well.

That may be partly why, when I run your benchmark, top shows 68% of the CPU going to X and 32% going to the benchmark. That means that if I could find a way of making Pango 100 times faster, you'd get less than a third improvement in text drawing speed.
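The "less than a third" estimate above follows from the 68%/32% CPU split by the usual Amdahl's-law arithmetic. A minimal sketch in C, assuming (generously) that all of the benchmark's 32% is Pango and that the X server's share is unaffected; the function name is made up for illustration:

```c
#include <assert.h>

/* Fraction of the original running time that remains when only the share
 * `client_fraction` of the work is made `speedup` times faster and the
 * rest (here: time spent in the X server) is left unchanged.
 * Hypothetical helper for illustration only. */
double remaining_time_fraction(double client_fraction, double speedup)
{
    assert(client_fraction >= 0.0 && client_fraction <= 1.0 && speedup > 0.0);
    return (1.0 - client_fraction) + client_fraction / speedup;
}
```

remaining_time_fraction(0.32, 100.0) is 0.68 + 0.0032 = 0.6832, i.e. the run would still take about 68% of its original time, a saving of just under 32%.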
Sorry Owen it seems I've wasted too much of your time already. I'll leave this to the Eclipse team to discuss with you if they find a need to. Thanks for being so patient with this.
Hi Owen, I'm the original author of the benchmark. The gdk_draw_text() is only there to have a comparison between X and MS Windows, that is to give an idea how fast the text rendering can be - no matter what we do, it can't get faster than that. MS Windows renders fonts faster to begin with but there is not much that can be done about that. As for "rendering to offscreen pixmaps", I doubt that SWT does that: When I run the Java benchmark, I can see it render exactly the same stuff as the C version and the Java benchmark is simple enough to see that there is no second thread, that no offscreen pixmap is created and that there is no place where that pixmap is updated to the screen while the benchmark runs. Just one question: You refer to a "program for what needs to be done in bug 135017". Do you mean the steps detailed in the last comment?
Something that is being discussed in the Eclipse bugzilla is that what Eclipse tries to do when about to draw a lot of stuff is:

disableScreenUpdates();
drawLotsOfStuffToScreen();
enableScreenUpdates();

It is claimed by SWT developers (?) that GTK+ doesn't have calls for enabling / disabling on-screen rendering. Thus, when Eclipse draws a lot to the screen in GTK+, the user gets to see *all* the updates as they happen, instead of just getting the end result presented to them as happens on (for example) Windows.

Since I am just watching this from the sidelines and am not too familiar with either GTK+ or SWT, I can't say whether the GTK+ API has the corresponding functions / functionality. If it does, a pointer to some tutorial / API reference about what those functions are / how the behaviour can be implemented might solve a lot of problems. I have tried scanning the GTK+ and GDK APIs for relevant functions without finding anything, but I'm well aware that doesn't necessarily mean they aren't there.
AFAIR, some GTK components (like table and clist) had the ability to "freeze" and "thaw", effectively stopping all rendering of themselves while a lot of items were added. On "thaw" they re-rendered themselves completely, but these components are now deprecated.
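For readers unfamiliar with the freeze/thaw idea, here is a minimal sketch of the pattern in plain C. It is not the GTK+ API: the Canvas type and the canvas_* functions are hypothetical names, and the "redraw" is just a counter standing in for real drawing. The point is only that redraw requests made while frozen collapse into a single redraw at thaw time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical widget state; not a GTK+ type. */
typedef struct {
    int freeze_count;     /* outstanding (possibly nested) freeze calls */
    bool redraw_pending;  /* a redraw was requested while frozen */
    int redraws_done;     /* number of real redraws performed so far */
} Canvas;

/* Stand-in for the expensive drawing work. */
static void canvas_redraw_now(Canvas *c)
{
    c->redraws_done++;
    c->redraw_pending = false;
}

void canvas_freeze(Canvas *c)
{
    c->freeze_count++;
}

/* Redraw immediately unless frozen; while frozen, only remember the request. */
void canvas_request_redraw(Canvas *c)
{
    if (c->freeze_count > 0)
        c->redraw_pending = true;
    else
        canvas_redraw_now(c);
}

/* Undo one freeze; the last thaw performs at most one deferred redraw. */
void canvas_thaw(Canvas *c)
{
    assert(c->freeze_count > 0);
    if (--c->freeze_count == 0 && c->redraw_pending)
        canvas_redraw_now(c);
}
```

Wrapping a thousand canvas_request_redraw() calls between canvas_freeze() and canvas_thaw() results in one redraw rather than a thousand, which is the effect the disableScreenUpdates()/enableScreenUpdates() pair discussed above is after.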
Owen,

I've extended the benchmark and uploaded it to https://bugs.eclipse.org/bugs/show_bug.cgi?id=51693 as V3 of the benchmark. It now also calls stringExtent(), which will eventually call pango_layout_get_size(). The benchmark numbers are now:

gc.gdk_draw_text 79923
gc.drawText       1407
gc.stringExtent   8600

So rendering the string with Pango is slowest, getting the extents is about six times faster, and rendering directly is about 10 times faster than getting the extents.

Which makes me wonder: why does it take such a long time to find out the extents for a simple string? Should copying tons of pixels around (including clipping/coloring them) not take longer than adding up the widths of the characters, especially when only a simple western layout (left-to-right, no ligatures, etc.) is involved? Maybe adding a shortcut to Pango (if (simple-layout) return quickExtents()) would make performance for 90% of the standard cases much better.
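To make the proposed shortcut concrete, the following is a sketch of what such a fast path could look like. It is not Pango code: quick_extent_width and the advances table are hypothetical, and the sketch assumes a font where each printable ASCII character has a fixed advance and no kerning or ligatures apply. Anything outside printable ASCII is rejected so the caller can fall back to full layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fast path: returns 1 and stores the summed advance in *width
 * if every byte of `text` is printable ASCII; returns 0 (leaving *width
 * untouched) so the caller can fall back to the full layout engine.
 * `advances` maps each ASCII code to its advance in arbitrary units. */
int quick_extent_width(const char *text, const int advances[128], int *width)
{
    assert(text != NULL && advances != NULL && width != NULL);
    int total = 0;
    for (const unsigned char *p = (const unsigned char *)text; *p != '\0'; p++) {
        if (*p < 0x20 || *p > 0x7e)
            return 0;  /* not "simple": needs real shaping */
        total += advances[*p];
    }
    *width = total;
    return 1;
}
```

The loop is a table lookup and an add per character, which is why this kind of special case looks attractive in a benchmark. The cost is that text containing a single non-ASCII character falls off the fast path entirely.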
Latest update: We removed all calls to Pango except the ones to gdk_draw_layout_with_colors() and pango_layout_get_size(), and the benchmark times barely changed. About 95% of the time is spent in these calls.

Questions:

1. We are currently rendering directly to the window. Is there a faster way? Owen, you mentioned something about offscreen pixmaps. Would that help?
2. If there is nothing which can be done about these calls, are there other options? Could we cache something?
- By "program", yes, I mean "plan of action", not "code". I'd really like to see you attack this in that way, because while the special case may be made faster, there is not a lot that can be done to make the general case faster; so someone needs to figure out exactly what SWT is doing and how it can be made to do less of it. (Text drawing is another matter. I know plenty of things that could be done to make AA text drawing 10 times or more faster, but X development is a huge mess at the moment.)

- Special casing ASCII is not interesting to me, because as soon as I do that, application authors lose touch with how their app will perform for many of their users.

- Modern processors do well in tight loops. This is why it's easy to make blitting a few bitmap characters faster than analyzing the same characters and laying them out. Plus XDrawText tends to be HW accelerated (though it's plenty fast enough in software).

- Your timings above indicate that in your benchmark, the real bottleneck is Xft drawing, not Pango layout. By doing measurements of what Eclipse does in real applications, you'd be able to figure out if that is the case in real life as well.

- I took a short look at the C benchmark yesterday, and other than setting the background in draw_layout_line_with_colors() it's not doing anything obviously stupid that would cause noticeable performance problems.

- Applications that want to look reasonable usually draw to a backing pixmap; this prevents the user seeing clear/redraw flashing. I assume that SWT does this in some fashion, but then again, maybe not... GDK provides a nice convenient interface to this with gdk_window_begin_paint_rect/region(), though apps can also do it manually if they want.

- I'm not sure what you are looking for with disableScreenUpdates()/enableScreenUpdates(); if it's redirection to an offscreen pixmap, see above. If it's simply throwing out the drawing operations, well, I don't see the point, but it should be easy to implement inside of SWT.

- Whether drawing to an offscreen pixmap is faster or not than drawing to the screen depends a lot on particular system details. Because of deficiencies in XFree86, it is occasionally *much* faster, but it's not reliably faster.
Owen,

Thanks a lot for your comments. I've posted them along with some explanation in the Eclipse bug. It seems pretty clear that most of the performance issue must be outside of Pango (i.e. either XFree or SWT). I've written another test case and we'll see if that turns something up which can be optimized.

While running my test on Windows, I noticed that non-AA font rendering is used there. So, to be able to compare the tests better (same test on the same machine with Windows and Linux): is there a way to disable AA in Pango? Googling didn't turn up anything useful.

Thanks.
Does anybody still face this issue nowadays, or can this be closed as obsolete? I'd be glad if someone would either update or close this bug report. Thanks in advance.