GNOME Bugzilla – Bug 353326
Improve CJK font selection
Last modified: 2007-06-11 03:05:03 UTC
This is a follow-up to Owen's comment here:
http://blogs.gnome.org/view/ryanl/2006/08/25/0#comments

I actually didn't know that we are returning "" for HAN in pango_script_get_sample_language(). This is definitely suboptimal. First, we can at least return Chinese. That definitely works better than the current situation. Even better, we can return (probably in a new API) a list of languages, "zh,ja,ko" for example. Fontconfig already knows about this kind of usage:

[behdad@home examples]$ fc-match --sort sans:lang=ja | head -n 3
sazanami-gothic.ttf: "Sazanami Gothic" "Gothic-Regular"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
[behdad@home examples]$ fc-match --sort sans:lang=ko | head -n 3
Gulim.ttf: "Gulim" "Regular"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
[behdad@home examples]$ fc-match --sort sans:lang=ja,ko | head -n 3
sazanami-gothic.ttf: "Sazanami Gothic" "Gothic-Regular"
Gulim.ttf: "Gulim" "Regular"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
It's a bad mistake to try to guess which one of:

  Traditional Chinese
  Simplified Chinese
  Japanese

the user would prefer. There's a lot of sticky politics there on all sides.

I don't see a big advantage from having a language tag either ... without it, the effect will be to just use the first Han font with that character in the current configuration, which is about as good a guess as you can make.

The *only* way to do better would be to do frequency analysis on larger segments of text than character-by-character, but that would be better left to the app, I think, since Pango is going to be unable to do it at more than a run or maybe paragraph level, which could still leave a document displaying very weirdly (isolated Han characters might come out in a different font than characters in paragraphs...)

I think this is NOTABUG.
Ok, closing as $PANGO_LANGUAGE provides a generic solution to this problem.
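
For readers unfamiliar with the mechanism: as far as the Pango documentation describes it, $PANGO_LANGUAGE holds a list of language tags, and pango_script_get_sample_language() returns the first tag in that list that may be written in the requested script. The following is only a rough, self-contained model of that selection rule, not Pango's code. The helper names lang_uses_han() and first_han_language() are hypothetical, the "zh, ja, ko" table is taken from the list suggested in the first comment, and only colon separators are handled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pango_language_includes_script(): reports
 * whether a language tag belongs to a language assumed to use Han
 * characters. Only the primary subtag (before '-' or '_') is compared. */
static int lang_uses_han(const char *tag, size_t len)
{
    static const char *const han_langs[] = { "zh", "ja", "ko" };
    size_t primary = 0;

    while (primary < len && tag[primary] != '-' && tag[primary] != '_')
        primary++;

    for (size_t i = 0; i < sizeof han_langs / sizeof han_langs[0]; i++) {
        if (strlen(han_langs[i]) == primary &&
            strncmp(tag, han_langs[i], primary) == 0)
            return 1;
    }
    return 0;
}

/* Scan a colon-separated preference list (the shape assumed here for
 * $PANGO_LANGUAGE) and copy the first Han-capable tag into out.
 * Returns 1 on a match, 0 if no tag in the list qualifies. */
static int first_han_language(const char *prefs, char *out, size_t out_size)
{
    if (prefs == NULL || out == NULL || out_size == 0)
        return 0;

    const char *p = prefs;
    while (*p != '\0') {
        size_t len = strcspn(p, ":");   /* length of the current tag */
        if (len > 0 && len < out_size && lang_uses_han(p, len)) {
            memcpy(out, p, len);
            out[len] = '\0';
            return 1;
        }
        p += len;
        if (*p == ':')
            p++;                        /* skip the separator */
    }
    out[0] = '\0';                      /* nothing matched */
    return 0;
}
```

With a preference list of "en:ja:ko", first_han_language() selects "ja", so the user's own ordering decides which Han font is tried first and Pango never has to guess between Chinese, Japanese and Korean. A list with no Han-capable tag, such as "en:fr", yields no match, which corresponds to the existing behaviour of leaving the choice to the font configuration.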