Bug 353326 - Improve CJK font selection
Status: RESOLVED FIXED
Product: pango
Classification: Platform
Component: general
Version: unspecified
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: pango-maint
Reported: 2006-08-29 01:29 UTC by Behdad Esfahbod
Modified: 2007-06-11 03:05 UTC (History)

Description Behdad Esfahbod 2006-08-29 01:29:53 UTC
This is a followup to Owen's comment here:

  http://blogs.gnome.org/view/ryanl/2006/08/25/0#comments

I actually didn't know that we are returning "" for HAN in pango_script_get_sample_language().  This is definitely suboptimal.  First, we can at least return Chinese.  That definitely works better than the current situation.

Even better, we can return (probably in a new API) a list of languages.  "zh,ja,ko" for example.  Fontconfig already knows about this kind of usage:

[behdad@home examples]$ fc-match --sort sans:lang=ja  | head -n 3
sazanami-gothic.ttf: "Sazanami Gothic" "Gothic-Regular"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
[behdad@home examples]$ fc-match --sort sans:lang=ko  | head -n 3
Gulim.ttf: "Gulim" "Regular"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
[behdad@home examples]$ fc-match --sort sans:lang=ja,ko  | head -n 3
sazanami-gothic.ttf: "Sazanami Gothic" "Gothic-Regular"
Gulim.ttf: "Gulim" "Regular"
DejaVuLGCSans.ttf: "DejaVu LGC Sans" "Book"
[behdad@home examples]$
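
The ordering in the last fc-match call above can be modelled roughly as follows. This is a toy Python sketch, not fontconfig's actual matching algorithm: the FONTS table and the sort_fonts helper are hypothetical, and the language coverage assigned to each font is assumed purely for illustration.

```python
# Toy model of sorting fonts against an ordered language list.
# FONTS and sort_fonts are hypothetical; real fontconfig scoring is more involved.

# Assumed language coverage, for illustration only.
FONTS = [
    ("DejaVu LGC Sans", {"en", "ru", "el"}),
    ("Gulim", {"ko"}),
    ("Sazanami Gothic", {"ja"}),
]


def sort_fonts(langs, fonts=FONTS):
    """Order fonts by the earliest requested language each one covers.

    Fonts covering none of the requested languages sort last; ties keep
    their original relative order because sorted() is stable.
    """
    def rank(entry):
        _name, covered = entry
        for position, lang in enumerate(langs):
            if lang in covered:
                return position
        return len(langs)

    return [name for name, _ in sorted(fonts, key=rank)]


if __name__ == "__main__":
    for query in (["ja"], ["ko"], ["ja", "ko"]):
        print(",".join(query), "->", sort_fonts(query))
```

With a list such as "ja,ko", the Japanese font ranks ahead of the Korean one and both rank ahead of fonts covering neither language, which is the behaviour a multi-language return value would rely on.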
Comment 1 Owen Taylor 2006-08-29 14:20:51 UTC
It's a bad mistake to try to guess which one of:

 Traditional Chinese
 Simplified Chinese
 Japanese

the user would prefer. There's a lot of sticky politics there
on all sides. I don't see a big advantage from having a language
tag either ... without it, the effect will be to just use the
first Han font with that character in the current configuration,
which is about as good a guess as you can make.

The *only* way to do better would be to do frequency analysis
on larger segments of text than character-by-character, but
that would be better left to the app, I think, since Pango
is going to be unable to do it at more than a run or maybe
paragraph level, which could still leave a document displaying
very weirdly (isolated Han characters might come out in a
different font than characters in paragraphs...)

I think this is NOTABUG.
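
To make the run-level versus document-level problem concrete, here is a minimal Python sketch of the kind of app-side analysis described above. It is a deliberately crude stand-in for real frequency analysis: it only counts neighbouring kana and Hangul, and it makes no attempt to tell Traditional from Simplified Chinese. The classify and guess_language helpers are hypothetical and not part of Pango.

```python
# Hypothetical app-level heuristic: guess a language tag for Han text from
# the other scripts that surround it. Not part of Pango.
from collections import Counter


def classify(ch):
    """Map a character to a coarse script bucket using Unicode block ranges."""
    cp = ord(ch)
    if 0x3040 <= cp <= 0x30FF:  # Hiragana and Katakana blocks
        return "kana"
    if 0xAC00 <= cp <= 0xD7A3 or 0x1100 <= cp <= 0x11FF:  # Hangul syllables, Jamo
        return "hangul"
    if 0x4E00 <= cp <= 0x9FFF:  # CJK Unified Ideographs (basic block only)
        return "han"
    return "other"


def guess_language(text):
    """Return 'ja' or 'ko' when kana or Hangul is present, else None.

    Text containing only Han characters is treated as ambiguous, so the
    caller gets None rather than a guess.
    """
    counts = Counter(classify(ch) for ch in text)
    if counts["kana"] == 0 and counts["hangul"] == 0:
        return None
    return "ja" if counts["kana"] >= counts["hangul"] else "ko"
```

Applied to a whole sentence such as "日本語のテキスト" this returns "ja", but applied to the isolated run "日本語" it returns None, so the same characters could end up in different fonts depending on how much context the analysis sees.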
Comment 2 Behdad Esfahbod 2007-06-11 03:05:03 UTC
Ok, closing as $PANGO_LANGUAGE provides a generic solution to this problem.
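
For reference, PANGO_LANGUAGE is an environment variable holding an ordered list of language tags. As far as the Pango documentation describes it, pango_script_get_sample_language() consults that list and returns the first language that may be written in the requested script. The Python sketch below models only that lookup. The LANG_SCRIPTS table and the sample_language helper are hypothetical stand-ins, splitting on colons alone is a simplifying assumption, and Pango's fallbacks for when the variable is unset are omitted.

```python
import os

# Hypothetical table: which scripts (ISO 15924 codes) each language tag uses.
LANG_SCRIPTS = {
    "en": {"Latn"},
    "ja": {"Hani", "Hira", "Kana"},
    "ko": {"Hang", "Hani"},
    "zh-cn": {"Hani"},
    "zh-tw": {"Hani"},
}


def sample_language(script, environ=None):
    """Return the first language in PANGO_LANGUAGE that uses `script`, else None."""
    environ = os.environ if environ is None else environ
    value = environ.get("PANGO_LANGUAGE", "")
    for tag in value.split(":"):
        # Normalise "zh_CN" style tags to the "zh-cn" form used in the table.
        tag = tag.strip().lower().replace("_", "-")
        if script in LANG_SCRIPTS.get(tag, ()):
            return tag
    return None
```

Under this model a user who sets PANGO_LANGUAGE to "en:ja:zh_CN" gets "ja" for Han text, so the choice between Chinese and Japanese fonts is made by the user rather than guessed by the library.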
