Bug 353326 - Improve CJK font selection
Status: RESOLVED FIXED
Product: pango
Classification: Platform
Component: general
Version: unspecified
Hardware/OS: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: pango-maint
QA Contact: pango-maint
Depends on:
Blocks:
 
 
Reported: 2006-08-29 01:29 UTC by Behdad Esfahbod
Modified: 2007-06-11 03:05 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Behdad Esfahbod 2006-08-29 01:29:53 UTC
This is a followup to Owen's comment here:

  http://blogs.gnome.org/view/ryanl/2006/08/25/0#comments

I actually didn't know that we are returning "" for HAN in pango_script_get_sample_language().  This is definitely suboptimal.  First, we can at least return Chinese.  That definitely works better than the current situation.

Even better, we can return (probably in a new API) a list of languages.  "zh,ja,ko" for example.  Fontconfig already knows about this kind of usage:

[behdad@home examples]$ fc-match --sort sans:lang=ja  | head -n 3
sazanami-gothic.ttf: "Sazanami Gothic" "Gothic-Regular"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
[behdad@home examples]$ fc-match --sort sans:lang=ko  | head -n 3
Gulim.ttf: "Gulim" "Regular"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
[behdad@home examples]$ fc-match --sort sans:lang=ja,ko  | head -n 3
sazanami-gothic.ttf: "Sazanami Gothic" "Gothic-Regular"
Gulim.ttf: "Gulim" "Regular"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
[behdad@home examples]$
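
The list-of-languages idea above could be sketched roughly as follows. This is only an illustration in Python, not Pango code: the table contents, their ordering, and the helper names `sample_languages` and `fontconfig_pattern` are all hypothetical. The only thing taken from the transcript above is the `family:lang=a,b` pattern form that fc-match accepted.

```python
# Hypothetical table: which languages to try, in order, for a script.
# The ordering is illustrative only and is not a recommendation.
SAMPLE_LANGUAGES = {
    "Han": ["zh", "ja", "ko"],
    "Hangul": ["ko"],
    "Hiragana": ["ja"],
}


def sample_languages(script):
    """Return all candidate languages for a script (empty list if unknown)."""
    return list(SAMPLE_LANGUAGES.get(script, []))


def fontconfig_pattern(family, script):
    """Build a pattern such as 'sans:lang=zh,ja,ko' to hand to fc-match."""
    langs = sample_languages(script)
    if not langs:
        # No language hint available: fall back to the bare family name.
        return family
    return f"{family}:lang={','.join(langs)}"


print(fontconfig_pattern("sans", "Han"))    # sans:lang=zh,ja,ko
print(fontconfig_pattern("sans", "Latin"))  # sans
```

Returning a list instead of a single tag leaves the choice of ordering to configuration rather than hard-coding one language for Han.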
Comment 1 Owen Taylor 2006-08-29 14:20:51 UTC
It's a bad mistake to try to guess which one of:

 Traditional Chinese
 Simplified Chinese
 Japanese

the user would prefer. There's a lot of sticky politics there
on all sides. I don't see a big advantage from having a language
tag either ... without it, the effect will be to just use the
first Han font with that character in the current configuration,
which is about as good a guess as you can make.

The *only* way to do better would be to do frequency analysis
on larger segments of text than character-by-character, but
that would be better left to the app, I think, since Pango
is going to be unable to do it at more than a run or maybe
paragraph level, which could still leave a document displaying
very weirdly (isolated Han characters might come out in a
different font than characters in paragraphs...)

I think this is NOTABUG.
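
For illustration, a very crude app-level version of the segment analysis mentioned above might look like the sketch below. It is an assumption-laden heuristic, not anything Pango does: it only counts kana and Hangul in a segment, the function name `guess_language` is made up, and it deliberately refuses to guess for Han-only text, since that is exactly the case where no evidence is available.

```python
def guess_language(text):
    """Guess 'ja' or 'ko' from the kana/Hangul found in a text segment.

    Returns None when the segment holds no kana or Hangul (for example,
    isolated Han characters), because then there is nothing to go on.
    """
    kana = 0
    hangul = 0
    for ch in text:
        cp = ord(ch)
        if 0x3040 <= cp <= 0x30FF:    # Hiragana and Katakana blocks
            kana += 1
        elif 0xAC00 <= cp <= 0xD7A3:  # precomposed Hangul syllables
            hangul += 1
    if kana == 0 and hangul == 0:
        return None
    return "ja" if kana >= hangul else "ko"


print(guess_language("日本語のテキスト"))  # ja
print(guess_language("한국어"))            # ko
print(guess_language("漢字"))              # None
```

The last call shows the problem described above: a short Han-only run gets no answer, so it could end up in a different font than the same characters inside a longer paragraph.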
Comment 2 Behdad Esfahbod 2007-06-11 03:05:03 UTC
Ok, closing as $PANGO_LANGUAGE provides a generic solution to this problem.
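
As a rough sketch of how such a setting resolves the ambiguity: the user lists preferred languages in $PANGO_LANGUAGE, and the first listed language that can be written in the script is chosen. The Python below only models that selection rule. It assumes a colon-separated list, and the `LANGUAGE_SCRIPTS` table and `sample_language` helper are hypothetical stand-ins, not Pango's actual data or API.

```python
import os

# Hypothetical, tiny language -> scripts table, for illustration only.
LANGUAGE_SCRIPTS = {
    "en": {"Latin"},
    "ja": {"Han", "Hiragana", "Katakana"},
    "ko": {"Han", "Hangul"},
    "zh-cn": {"Han"},
    "zh-tw": {"Han"},
}


def sample_language(script, env=None):
    """Return the first language in PANGO_LANGUAGE usable for `script`.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    Returns None when the variable is unset or no listed language matches.
    """
    env = os.environ if env is None else env
    raw = env.get("PANGO_LANGUAGE", "")
    for tag in raw.split(":"):
        # Normalize e.g. "zh_CN" to "zh-cn" before the table lookup.
        tag = tag.strip().lower().replace("_", "-")
        if script in LANGUAGE_SCRIPTS.get(tag, ()):
            return tag
    return None


print(sample_language("Han", {"PANGO_LANGUAGE": "en:ja:zh_CN"}))  # ja
print(sample_language("Han", {"PANGO_LANGUAGE": "en"}))           # None
```

With this rule, a user who wants Japanese glyph forms for Han text lists `ja` ahead of the Chinese tags, and no guess has to be built into the library.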