GNOME Bugzilla – Bug 162681
Font substitution isn't functioning on Windows
Last modified: 2018-05-22 12:07:57 UTC
Type or paste in characters spanning more than one font in a GTK+ program on Windows. Chinese characters in X-Chat using a non-Chinese font is a good example. The characters are replaced by middle dots, even though acceptable glyphs are present in at least one other font present on the system. On Linux, and in non-GTK+ Windows programs, the appropriate glyph is displayed using another font.
What Pango version? Sounds very much like a duplicate of bug #152997. Or, a broken pango.aliases file. Distributing (or setting up at install-time, even) a working pango.aliases file is something that people who distribute the binaries, or write installers are supposed to do.
And BTW, what you describe works fine for me in gtk-demo, with GTK+ 2.4.14 and Pango 1.4.1, or HEAD GTK+ and Pango 1.8.0. Bug #152997 was in Pango 1.6.0, which I have never distributed binaries for, but others have.
The Pango version seems to be 1.4.1. I wasn't aware one could get Pango 1.6.0 for Windows? The pango.aliases file seems to be in working order.
Can you reproduce the problem with gtk-demo? (Try the "entry completion" widget.)
I don't seem to have gtk-demo, so that would be a no. Could you tell me where I can find a Windows version of gtk-demo?
Should be distributed with Tor's win32 builds. I think the relevant information is where you got your Pango from; it could, I think, be a problem in how the library was built or packaged.
Well, it was either from Alex Shaduri ( http://members.lycos.co.uk/alexv6/ ) or Jernej Simončič ( http://sourceforge.net/projects/gimp-win/ ). These are the places one would get Pango from if one didn't wish to compile it oneself, or so I thought... However I do see the huge gap in version numbers between Windows and Linux. Perhaps font substitution was even added after 1.4.1. On another note, I'm afraid I've no idea where to find either Tor's win32 builds or gtk-demo...
gtk-demo is included in the GTK+ developer package from www.gimp.org/win32/downloads.html . Despite the largish version number gap between 1.4.1 and 1.6.0, the actual difference in functionality is not that huge, at least not for Windows, I think. (And 1.6.0 contains bug #152997.) If by "font substitution" you mean the feature that lets you mix Latin, Chinese, Hebrew, and whatnot scripts in the same Pango layout, surely that has been working for a long time, also on Windows (except when bug #152997 was introduced in the HEAD branch but nobody noticed for a long time). That is one of the main points of Pango.
That's why I think it ought to be working. Ask anyone using X-Chat with the aforementioned GTK distributions (dynamically linked or static), and you'll find that they don't even know that it should be working. The general consensus is that "you need to be using a Unicode font", meaning Bitstream Cyberbit or Arial Unicode. This bug is not present in X-Chat on Unix platforms, however. I'll get gtk-demo from the package you mentioned. Thanks for the link.
Created attachment 35456 [details] A few sample sentences that work in Firefox, but not in gtk-demo or X-Chat. Languages that are included in my default system font, work.
I've posted an image to demonstrate the error. I honestly can't remember which of the two GTK distributions I mentioned I'm using, but from talking to a lot of X-Chat users using Windows, I believe both have this bug. As I said, they don't even know it SHOULD be working, so it probably hasn't been for a while. As you can see from the image, the languages listed (which aren't included in the system font, but are in other fonts) work in Mozilla, but not in GTK+.
And just for reference, I'm using a Japanese Windows system, so some Chinese, Cyrillic and Greek chars do display.
Well, we clearly need somebody who can reproduce this problem and can build Pango himself to debug this. Multiple scripts in the same gtk entry etc works fine for me in apps like gtk-demo and GIMP. I don't use X-Chat, and have no interest in installing it, so there is little I can do. BTW, have you tried the Pango 1.4.1 from the www.gimp.org/win32 site? Or are the files in it the same as from Jernej's installer?
No, if I remember correctly, the one I'm using now is actually the one that's bundled with Gimp on Sourceforge. Though as I mentioned earlier, it seems to affect both of these prepackaged versions. Perhaps they're actually repackaged from the same files? The person using these packages in X-Chat seemed pretty sure it was a Pango error; however, I'll think about taking it up with the person who compiled them instead, if it doesn't affect you at all.
One (probably unrelated) note is that E. Asian versions of Windows have a "font linking" mechanism that we possibly should be using to drive font fallbacks, see: https://bugs.eclipse.org/bugs/show_bug.cgi?id=63571 Which links to: http://msdn.microsoft.com/workshop/misc/mlang/mlang.asp The Eclipse code there shows dynamically checking for the necessary DLL. Using a PangoFontSet subclass would allow doing this lazily, only when we try to find a glyph that isn't covered, though I'm not sure that's necessary. [ http://www.microsoft.com/globaldev/getwr/steps/wrg_font.mspx describes a different font fallback mechanism that occurs internal to Uniscribe, but I can't figure out how it would work... since ScriptShape() only gives glyph indices as codepoints. I wonder if it actually only applies to the internal usage of Uniscribe that ExtTextOutW will do in some circumstances and isn't available to people using Uniscribe directly? ] With the current code, aliases are only going to be used if the specified font is one of the generic aliases - see the code in pango_font_map_real_load_fontset() - right? So if for some reason David is using an explicitly specified font, then I don't think he'll get font fallback at all. And doesn't GTK+ pick up the system font through gdk/win32/gdkproperty-win32.c:gdk_screen_get_setting() at this point? fontconfig implicitly adds "Sans" to every requested font when you do FcFontSort() to get a list of possible fonts. Maybe we should do the same here.
gdk_screen_get_setting should pick up the "system font", which is selectable by the user in the Properties of Display. There is in fact code to do so, but it is disabled because the default font on some win9x (even NT 4?) is a bitmap font, which is not handled by current Pango/win32. This is a workaround for bug #112401, and it does not work if one uses wimp ...
The data for the font fallback mechanism in the wrg_font.mspx link above seems to be rather simply structured, and easy to get at in the Registry, so maybe Pango should read those font names directly from the Registry, and use them appropriately?
Is there a reason to do that rather than using the MLang API? The way I read it, the registry information is there iff MLang is available.
I guess I was thinking of avoiding using COM from C.
There is a bit of additional information in: http://lists.freedesktop.org/archives/cairo/2005-January/002893.html
Owen, re: your comment #16 about fontconfig implicitly adding "sans" to every requested font, I can't see any such code in fontconfig? But I guess the "implicitly" means that it's hard to see how that happens without knowing the code closely ;-) Anyway, I guess the problem in pangowin32 is that if one chooses some specific font like "arial" (and not the generic "sans", "serif" or "monospace" names from pango.aliases), one won't have coverage for anything except those codepoints actually covered by that font? Would a solution be to add the alias list from "sans" to any specific sans-serif font, the alias list for "serif" to any serifed font, and the alias list from "monospace" to any monospace font? Is that what you meant?
It's all done in fonts.conf -

    <alias>
      <family>Bitstream Vera Serif</family>
      [...]
      <default><family>serif</family></default>
    </alias>
    <alias>
      <family>Bitstream Vera Sans</family>
      <family>Helvetica</family>
      [...]
      <default><family>sans-serif</family></default>
    </alias>
    <alias>
      <family>Bitstream Vera Sans Mono</family>
      <family>Courier</family>
      [...]
      <default><family>monospace</family></default>
    </alias>

    <!-- If the font still has no generic name, add sans-serif -->
    <match target="pattern">
      <test qual="all" name="family" compare="not_eq">
        <string>sans-serif</string>
      </test>
      <test qual="all" name="family" compare="not_eq">
        <string>serif</string>
      </test>
      <test qual="all" name="family" compare="not_eq">
        <string>monospace</string>
      </test>
      <edit name="family" mode="append_last">
        <string>sans-serif</string>
      </edit>
    </match>
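For readers who don't read fonts.conf fluently, here is a rough sketch of what those rules amount to. It is a simplification for illustration, not fontconfig's actual matching code: the function name is made up, and only the families visible in the excerpt above are listed (the elided `[...]` entries are left out).

```python
# Generic default per family, taken from the <alias> blocks quoted above.
ALIAS_DEFAULTS = {
    "bitstream vera serif": "serif",
    "bitstream vera sans": "sans-serif",
    "helvetica": "sans-serif",
    "bitstream vera sans mono": "monospace",
    "courier": "monospace",
}
GENERICS = ("sans-serif", "serif", "monospace")


def expand_families(requested):
    """Return the family list a request ends up with (simplified model)."""
    families = list(requested)
    # <alias>/<default>: known families get their generic name appended.
    for name in requested:
        generic = ALIAS_DEFAULTS.get(name.lower())
        if generic and generic not in families:
            families.append(generic)
    # <match>: if no generic name is present at all, append sans-serif last.
    if all(f.lower() not in GENERICS for f in families):
        families.append("sans-serif")
    return families
```

So a request for a family that none of the alias blocks know about still ends with "sans-serif" in its list, which is where the "implicit" fallback comes from.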
Created attachment 49347 [details] [review] Possible enhancement So how does this patch look? I add an implementation of load_fontset to PangoWin32FontMap. If the PangoFontDescription's family refers to just one font (does not contain any comma), and if it refers to a "real" font (is found in the PangoWin32FontMap's families member), I append ",serif", ",monospace" or ",sans" to the family depending on whether it is a seriffed, monospace or sans-serif font, and then call the parent class's load_fontset.
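A minimal sketch of the logic described in that comment, in Python rather than the C of the actual patch. The classification table and function name are hypothetical; how the real patch decides whether a font is serif, sans or monospace is not shown in this thread.

```python
# Hypothetical stand-in for the font map's list of "real" families,
# mapped to the generic class each one belongs to.
FAMILY_CLASS = {
    "arial": "sans",
    "times new roman": "serif",
    "courier new": "monospace",
}


def with_generic_fallback(family, known_families=FAMILY_CLASS):
    """Append a generic alias to a single, real family name."""
    if "," in family:
        # Already a list of families: leave it alone.
        return family
    generic = known_families.get(family.strip().lower())
    if generic is None:
        # Not a real font (e.g. already an alias such as "monospace").
        return family
    return family + "," + generic
```

Because the table here only contains real fonts, an alias such as "monospace" passes through unchanged; whether the patch itself behaves that way depends on what its families lookup contains.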
Here's the effect after your patch: http://213.197.30.23/img/w32fontset.png Very nice, it substituted in lines 1 and 4 (previously it only printed a Unicode box). Primary font is "courier new" on Win '98. Is it just me, or is it using a sans substitute for a serif'ed font?
For the record, the patch above was discussed a little in the GTK+ developer meeting on 2005-07-19, see http://www.gtk.org/plan/meetings/20050719.txt . This issue is also somewhat related to bug #310700.
Why is pango_font_map_real_load_fontset called so often? Just moving the mouse about and it's called a few hundred times. Sure, it's all in hash tables, but still a little frightening. Also, I think the patch adds aliases to itself, i.e. "monospace" becomes "monospace,monospace". Probably harmless. A different but related issue: If you select "monospace" from a font dialog, you may not get the first alias; it's fairly random, as all the aliases are listed in the "Style" list, e.g. monospace = lucida console,courier new,....etc When I select monospace from a font dialog, I actually get "courier new", unless I select the bottom "Normal" from the style list.
> Why is pango_font_map_real_load_fontset called so often
That is probably something to be discussed in another bug report, for instance bug #104683.
> Also, I think the patch adds aliases to itself, i.e. "monospace" becomes
> "monospace,monospace".
Ah, yes, will have to take care of that.
> If you select "monospace" from a font dialog
Yes, that really sucks, the behaviour of the font selector for these aliases is very confusing. Sigh. (It's not cross-platform, on X11 it behaves very cleanly.)
real_load_fontset() is *supposed* to be called a lot. (1000 times when moving the mouse sounds a little excessive.) But basically, any time Pango lays out text, it needs to call that function. So it needs to be a really fast function, using caching if necessary.
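To illustrate the "using caching if necessary" point, a small sketch of a memoized fontset loader. The class, the loader callback and the (family, language) cache key are all hypothetical simplifications and not Pango's actual data structures.

```python
class FontsetCache:
    """Memoize an expensive fontset lookup behind a cheap dictionary hit."""

    def __init__(self, loader):
        self._loader = loader  # the slow function that builds a fontset
        self._cache = {}
        self.misses = 0        # how often the slow path actually ran

    def load_fontset(self, family, language):
        # Assumed key: family names compared case-insensitively.
        key = (family.lower(), language)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._loader(family, language)
        return self._cache[key]
```

With something along these lines, a few hundred calls for the same description while the mouse moves cost one real load plus dictionary lookups.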
*** Bug 311651 has been marked as a duplicate of this bug. ***
*** Bug 307469 has been marked as a duplicate of this bug. ***
Marking patches...
*** Bug 161027 has been marked as a duplicate of this bug. ***
Created attachment 234520 [details] [review] Tor's "possible enhancement" patch that applies against pango 1.32.6 (and still works)
Kris, if you tested it, push to master!
Yes I tested it. I will push it to master as soon as I get a chance.
To implement font fallbacks properly, I think that looking at just MLang is not enough. From what I have understood, MLang just deals with font linking. Uniscribe deals with both font linking and font fallback. Apparently, the exact fallback list used is not publicly accessible (I lost the reference for this statement). So, in essence, we have the same problem as in the Core Text backend (where we could get around the problem by using a private function for obtaining the fallback list). Either this problem is solved by generating our own fallback lists (using fontconfig?), which to me sounds messy and not the right solution, or we have to somehow make it possible to work together with a platform API that can perform font fallback but hides the data needed to do this. What I mean by the latter is that both CoreText and Uniscribe provide functions that take a string and font name as arguments and figure out which glyphs from which fonts to render. This is, in a nutshell, also what Pango does. Would it be possible (and does it make sense) to make some kind of bridge from the Pango API to the platform API, circumventing the problems with trying to get PangoFontSets set up, etc.?
Kris, The other day I found where Windows keeps its fallback list. In the Windows/Fonts directory there are these XML files: GlobalMonospace.CompositeFont GlobalSansSerif.CompositeFont GlobalSerif.CompositeFont GlobalUserInterface.CompositeFont Wouldn't be impossible to read and deal with them, if someone is determined enough. Alternatively, I hear that DirectWrite API should facilitate this, though I have not checked.
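A rough sketch of what reading such a file could look like. This assumes the files follow the WPF composite-font XAML layout, with FontFamilyMap elements carrying Unicode (hex ranges) and Target (font list) attributes; the thread does not show the file contents, and the sample below is made up for illustration, not copied from a real Windows installation.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the assumed composite-font layout.
SAMPLE = """<FontFamily xmlns="http://schemas.microsoft.com/winfx/2006/xaml/composite-font">
  <FontFamily.FamilyMaps>
    <FontFamilyMap Unicode="0000-052F" Target="Segoe UI" />
    <FontFamilyMap Unicode="3040-30FF, 4E00-9FFF" Target="MS Gothic, Meiryo" />
  </FontFamily.FamilyMaps>
</FontFamily>"""


def parse_family_maps(xml_text):
    """Return a list of (first_codepoint, last_codepoint, [target fonts])."""
    maps = []
    for elem in ET.fromstring(xml_text).iter():
        # Strip the XML namespace and keep only FontFamilyMap elements.
        if elem.tag.rsplit("}", 1)[-1] != "FontFamilyMap":
            continue
        targets = [t.strip() for t in elem.get("Target", "").split(",") if t.strip()]
        for part in elem.get("Unicode", "").split(","):
            part = part.strip()
            if not part:
                continue
            lo, _, hi = part.partition("-")
            # A bare codepoint without "-" is treated as a one-element range.
            maps.append((int(lo, 16), int(hi or lo, 16), targets))
    return maps


def fallback_fonts(ch, maps):
    """Return the target fonts of the first map covering character ch."""
    cp = ord(ch)
    for lo, hi, targets in maps:
        if lo <= cp <= hi:
            return targets
    return []
```

The resulting per-range font lists could then feed whatever fallback table pangowin32 ends up using.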
Thanks for the pointers. I did see DirectWrite, but did not look into it further because it requires Windows 7 or higher. We might still want to support GTK+ on Windows XP.
This is still an issue on Windows, though I did notice that if you specify a list of family names in the PangoFontDescription, it does select successive fonts when the first few don't have the right glyphs. My 2c as a Pango outsider: Getting a fallback list the way pangocoretext does would be great, but pangowin32 already ships with aliases hard-coded, so would it be that bad of a solution to hard-code a few fallback fonts?
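The behaviour described there, walking the family list until some font covers the character, can be sketched roughly as follows. The coverage table is hypothetical sample data (real coverage comes from the fonts themselves), and the function names are made up for this illustration.

```python
# Hypothetical coverage data: family -> list of (first, last) codepoint ranges.
COVERAGE = {
    "Arial": [(0x0020, 0x024F)],
    "MS Gothic": [(0x0020, 0x007E), (0x3040, 0x30FF), (0x4E00, 0x9FFF)],
}


def pick_font(ch, families, coverage=COVERAGE):
    """Return the first family in the list that covers ch, or None."""
    cp = ord(ch)
    for family in families:
        for lo, hi in coverage.get(family, ()):
            if lo <= cp <= hi:
                return family
    return None


def itemize(text, families):
    """Split text into (font, substring) runs using per-character fallback."""
    runs = []
    for ch in text:
        font = pick_font(ch, families)
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs
```

Hard-coding a few extra fallback families would amount to appending them to the `families` list before this walk, so that fewer characters end up with no font at all.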
Hardcoding more is fine. Suggest a patch (add Segoe families?).
I would be happy to make a patch, as soon as I can get it to compile in Windows. I haven't had any success yet so I might have to badger someone with tips
Most badger-friendly people live on #gtk+@irc.gimp.org and #msys2@irc.oftc.net
(In reply to Caleb Hearon from comment #42)
> I would be happy to make a patch, as soon as I can get it to compile in
> Windows. I haven't had any success yet so I might have to badger someone
> with tips
Here is an example of building the stack on Windows with MSVC if that is helpful: https://github.com/hexchat/gtk-win32
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/pango/issues/20.