GNOME Bugzilla – Bug 98982
Optimization of Win32 pango rendering
Last modified: 2005-11-08 22:03:39 UTC
In the following attachment, I include a patch for pangowin32.c (from pango 1.0.5) which optimizes rendering by avoiding extra creation/destruction of objects during the critical period, and by using caches. Arno
Created attachment 12395: Optimization of pango rendering
Hmm. Nothing but pango_win32_render() is ever used by GTK+, so I don't think that most of this patch is going to have any effect on GTK+ performance. The stat_glyph_indices stuff probably is OK, though I wouldn't expect it to have a huge effect either ... two malloc/free per line shouldn't be that expensive. (Do you have benchmarks? There are other places in Pango where temporary buffers like this are allocated which are a lot more performance critical than this... if it is worthwhile here, it probably is worthwhile in those places too.) (However, I think it's better to use stack-allocated buffers rather than static buffers... every static variable adds a little bit to startup time and total memory footprint.)
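The stack-allocated alternative suggested above can be sketched roughly as follows. This is a minimal illustration, not code from the patch or from Pango: the function name `sum_glyph_indices`, the `STACK_GLYPHS` threshold, and the `used_heap` out-parameter are all hypothetical, and summing the indices merely stands in for the real per-line work so that the result can be checked.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical threshold: lines with at most this many glyphs need no malloc. */
#define STACK_GLYPHS 256

/* Copies glyph indices into a temporary 16-bit buffer and returns their sum.
 * The buffer lives on the stack for short lines and on the heap otherwise.
 * *used_heap is set to 1 if malloc was needed, 0 if not. */
unsigned long sum_glyph_indices(const uint32_t *glyphs, size_t n_glyphs,
                                int *used_heap)
{
    uint16_t stack_buf[STACK_GLYPHS];
    uint16_t *buf = stack_buf;
    unsigned long sum = 0;
    size_t i;

    assert(glyphs != NULL || n_glyphs == 0);
    *used_heap = 0;

    if (n_glyphs > STACK_GLYPHS) {
        /* Fall back to the heap only for unusually long lines. */
        buf = malloc(n_glyphs * sizeof *buf);
        if (buf == NULL)
            return 0;
        *used_heap = 1;
    }

    for (i = 0; i < n_glyphs; i++)
        buf[i] = (uint16_t) glyphs[i];   /* truncates to 16 bits */

    /* Stand-in for the real work that would consume the buffer. */
    for (i = 0; i < n_glyphs; i++)
        sum += buf[i];

    if (buf != stack_buf)
        free(buf);
    return sum;
}
```

Unlike a static buffer, this adds nothing to the process's permanent memory footprint and stays reentrant, at the cost of some stack space per call.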
What is expensive is not the memory allocation, it is the calls to SelectObject. Also, I am surprised that you are only interested in changes that improve performance for GTK+. Aren't you also interested in improving the performance of Pango per se, when possible? Arno
Are you actually using Pango on win32 outside of the context of GTK+? If you are using Pango inside GTK+, then most of your patch was both untested and unbenchmarked, so I'm a little suspicious of it :-) Using Pango on Win32 without GTK+ doesn't make a whole lot of sense to me, but if people are interested in it, then it might make sense to do something like this, though all the pango_*_render_layout[_line] functions really should be rewritten to use PangoLayoutIter (see recent mails on gtk-i18n-list about pango_ft2_render_*). If SelectObject is a big problem, then we really should be looking at things we can do to improve performance of rendering _from_ GTK+, not just on the infrequently used direct-use-of-Pango code path.
The gtk-1-3-win32-production branch has code to cache Windows DCs, for performance reasons. I have never been quite satisfied with that code, however; it is a rather ugly addition added as an afterthought. I would prefer not to add much code complexity for optimisation purposes until the most glaring bugs have been fixed and needed missing functionality has been added. Is SelectObject() of fonts significantly more expensive than other SelectObject() calls? (This is getting away from the topic of this bug, but...) What about DIBsections (device-independent bitmaps, in the GDK address space) vs. DDBs (device-dependent bitmaps, in kernel space, or perhaps even on the graphics card, I think)? GDK currently uses DIBsections all the time. Would it perhaps be a good idea to use DDBs in some cases?
I'm investigating and asking our Windows experts about the various questions you (Tor) asked. I'll get back when I have more information. Thanks to both of you for your feedback so far, it's very valuable. Arno
Some answers to your questions: DDBs are always quicker. SelectObject is not generally costly; creating the objects is more costly. So it certainly sounds like using DDBs would be a big win. I haven't had time to measure the actual impact of my suggested patch. Arno
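If object creation is the expensive step, the kind of cache discussed in this bug could look roughly like the sketch below. It is an illustration only and uses no Win32 calls: `ObjectCache`, `cache_get`, and `create_object` are hypothetical names, the integer key stands in for something like a hash of a font description, and the integer handle stands in for a GDI object such as a font. The round-robin eviction policy is likewise an assumption chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 4   /* hypothetical, deliberately tiny */

typedef struct {
    int key;      /* stand-in for e.g. a hash of the font description */
    int handle;   /* stand-in for a GDI handle */
    int valid;    /* nonzero once the slot holds an object */
} CacheEntry;

typedef struct {
    CacheEntry slots[CACHE_SLOTS];
    size_t next_victim;   /* next slot to overwrite (round-robin) */
    int create_calls;     /* counts how often the expensive path ran */
} ObjectCache;

/* Stand-in for the expensive creation call; only counts invocations. */
static int create_object(ObjectCache *cache, int key)
{
    cache->create_calls++;
    return key * 10 + 1;
}

/* Returns the cached handle for key, creating it only on a miss. */
int cache_get(ObjectCache *cache, int key)
{
    CacheEntry *entry;
    size_t i;

    assert(cache != NULL);

    for (i = 0; i < CACHE_SLOTS; i++)
        if (cache->slots[i].valid && cache->slots[i].key == key)
            return cache->slots[i].handle;   /* hit: nothing is created */

    /* Miss: overwrite the next slot. A real cache would first destroy
     * the object being evicted. */
    entry = &cache->slots[cache->next_victim];
    cache->next_victim = (cache->next_victim + 1) % CACHE_SLOTS;
    entry->key = key;
    entry->handle = create_object(cache, key);
    entry->valid = 1;
    return entry->handle;
}
```

A zero-initialized `ObjectCache` is a valid empty cache. Repeated lookups of the same key then run the creation path once, which is the saving the thread is after; whether that matters in practice would still need the benchmarks asked for earlier.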
Doesn't look like a 1.2.0 issue in any case.
Maybe this should just be RESOLVED/INCOMPLETE? There doesn't really seem to be any real direction here for future changes.
Well, everyone seems to agree that we need to put some cache in place to improve performance, so I'd rather leave this PR open so that this improvement happens. Arno
PangoRenderer probably provides a good framework for any caching that is necessary, though this bug still seems a bit vague to me.
Tor, is this bug still relevant, or can it be closed?
This can certainly be closed in my opinion.