Bug 442786 - PANGO_LANGUAGE doesn't affect in some case
Status: RESOLVED OBSOLETE
Product: pango
Classification: Platform
Component: general
Version: 1.17.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: pango-maint
QA Contact: pango-maint
Depends on:
Blocks:
Reported: 2007-06-01 05:29 UTC by Akira TAGOH
Modified: 2018-05-22 12:29 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Screenshot (33.89 KB, image/png), 2007-06-01 05:33 UTC, Akira TAGOH
Screenshot (35.10 KB, image/png), 2007-06-01 05:35 UTC, Akira TAGOH
Screenshot2 (35.31 KB, image/png), 2007-06-01 05:38 UTC, Akira TAGOH
Screenshot2 (33.88 KB, image/png), 2007-06-01 05:39 UTC, Akira TAGOH
Screenshot for Romanian trouble with PANGO_LANGUAGE (46.50 KB, image/png), 2008-07-19 09:21 UTC, Vasile Gaburici

Description Akira TAGOH 2007-06-01 05:29:38 UTC
I have been playing with the recent PANGO_LANGUAGE/LANGUAGE changes to see how the language priorities affect font selection.

It works fine when I run pango-view with PANGO_LANGUAGE=C:zh LANG=en_US.UTF-8: Chinese fonts are used for Chinese characters, even in the en_US locale and with Japanese fonts installed. Thanks for that.

I expected it to work without exception. However, when I run pango-view with PANGO_LANGUAGE=C:zh LANG=ja_JP.UTF-8 on test-mixed.txt, Chinese characters are still rendered with a mix of Japanese and Chinese fonts, even though they could all be displayed with Chinese fonts alone. Similarly, it should use Japanese fonts as much as possible when started with, say, PANGO_LANGUAGE=C:ja LANG=zh_CN.UTF-8.
Comment 1 Akira TAGOH 2007-06-01 05:33:53 UTC
Created attachment 89157 [details]
Screenshot

This screenshot shows the actual result when running with LANG=ja_JP.UTF-8 PANGO_LANGUAGE=C:zh
Comment 2 Akira TAGOH 2007-06-01 05:35:33 UTC
Created attachment 89158 [details]
Screenshot

This screenshot shows the expected result when running with LANG=ja_JP.UTF-8 PANGO_LANGUAGE=C:zh
Comment 3 Akira TAGOH 2007-06-01 05:38:12 UTC
Created attachment 89159 [details]
Screenshot2

This screenshot shows the actual result when running with LANG=zh_CN.UTF-8 PANGO_LANGUAGE=C:ja. Apparently Japanese fonts aren't used for the Chinese/Kanji characters.
Comment 4 Akira TAGOH 2007-06-01 05:39:55 UTC
Created attachment 89160 [details]
Screenshot2

This screenshot shows the expected result when running with LANG=zh_CN.UTF-8 PANGO_LANGUAGE=C:ja
Comment 5 Behdad Esfahbod 2007-06-01 18:37:15 UTC
Akira,

First, with PANGO_LANGUAGE, you can drop the "C" part.  That trick is only useful when using $LANGUAGE.

About your real problem here: one CANNOT automatically detect both Chinese and Japanese correctly.  They use the same Unicode character codes.  If one wants both to work correctly side by side, one has to tag them with PangoAttrLanguage.  See test-mixed.markup for an example.  Try it with pango-view --markup test-mixed.markup and it always works correctly, no matter what $LANG or $LANGUAGE is.
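
As a rough illustration of what tagging means in practice, the sketch below builds Pango markup with explicit lang attributes from the shell. span_lang is a hypothetical helper (not part of Pango or pango-view), it assumes the text contains no markup-special characters such as & or <, and the sample character is only an example of one whose glyph shape commonly differs between Japanese and Chinese fonts.

```shell
#!/bin/sh
# Hypothetical helper: wrap text $2 in a Pango markup span tagged with language $1.
# Assumes $2 contains no markup-special characters (&, <, >).
span_lang() {
    printf '<span lang="%s">%s</span>' "$1" "$2"
}

# The same code point, tagged once as Japanese and once as Chinese.
markup="$(span_lang ja '直') $(span_lang zh-cn '直')"
printf '%s\n' "$markup"

# To render it (needs pango-view and a display):
#   pango-view --markup --text "$markup"
```

Because each run carries its own language, the result should not depend on $LANG or $LANGUAGE, which is the point made above.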
Comment 6 Akira TAGOH 2007-06-02 15:06:48 UTC
Ideally you're correct, but that's not what I meant. What I'm saying is that the behavior introduced by PANGO_LANGUAGE/LANGUAGE isn't consistent across LANG values: it seems to work only in en_US or similar locales, which really looks like a bug to me. I don't expect both Japanese and Chinese to be rendered with the proper fonts in mixed text. But PANGO_LANGUAGE/LANGUAGE doesn't behave the way it does when I run something in, say, the en_US locale.
Comment 7 Akira TAGOH 2007-06-02 15:11:30 UTC
Let me rephrase. What I actually expect is that Pango should consult PANGO_LANGUAGE -> LANGUAGE -> (LC_CTYPE? ->) LANG to select the fonts, and if no glyphs are available for one, fall back to the next.
Comment 8 Behdad Esfahbod 2007-06-02 18:51:02 UTC
No.  Pango will always first try to use the language set on the context via pango_context_set_language(), which gtk+ calls with the value of $LANG.  Only if that is not useful does Pango use $PANGO_LANGUAGE/$LANGUAGE.

You don't want a desktop where LANG=ja_JP doesn't make it use Japanese fonts, do you?  In fact, here is the exact sequence of what happens:

"""So, to summarize, during itemization, an item's language is assigned in the
following order:

  - If a language attribute is present, use it, otherwise
  - If a context language is present, use it, otherwise
  - Use pango_language_get_default()

  - Then, if the chosen language is not compatible with the script of the item,
use the first language in $PANGO_LANGUAGE or $LANGUAGE that is compatible with
the script.  If none matches, use pango_language_from_string("xx")."""

I still don't understand what you want to do that you can't now.
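
For readers following along, the order quoted above can be modelled roughly as a small shell function. This is only an illustrative sketch, not Pango's code: lang_matches_script and its tiny table are hypothetical stand-ins for Pango's internal language/script data, item_language is a made-up name, and reading $PANGO_LANGUAGE first and $LANGUAGE only when it is unset is an assumption about how the two variables combine.

```shell
#!/bin/sh
# Rough model of the itemization order quoted above. NOT Pango's implementation.

# Hypothetical compatibility table: does language $1 use script $2?
lang_matches_script() {
    # Strip territory/encoding suffixes such as _JP or .UTF-8 before matching.
    case "$2:${1%%[_.-]*}" in
        Han:zh|Han:ja|Han:ko)    return 0 ;;
        Arabic:ar|Arabic:fa)     return 0 ;;
        Latin:en|Latin:ro)       return 0 ;;
        Cyrillic:ru|Cyrillic:sr) return 0 ;;
    esac
    return 1
}

# item_language SCRIPT ATTR_LANG CONTEXT_LANG DEFAULT_LANG
item_language() {
    # Language attribute, else context language, else the default language.
    chosen=${2:-${3:-$4}}
    if lang_matches_script "$chosen" "$1"; then
        echo "$chosen"
        return 0
    fi
    # The environment list is consulted only when the chosen language
    # does not fit the script of the item.
    old_ifs=$IFS
    IFS=:
    for cand in ${PANGO_LANGUAGE:-${LANGUAGE:-}}; do
        if lang_matches_script "$cand" "$1"; then
            IFS=$old_ifs
            echo "$cand"
            return 0
        fi
    done
    IFS=$old_ifs
    echo xx
}

PANGO_LANGUAGE=C:zh
item_language Han "" en_US en_US   # prints zh
item_language Han "" ja_JP ja_JP   # prints ja_JP
```

In this model the reported behavior falls out directly: with a ja_JP context language the chosen language already fits Han text, so $PANGO_LANGUAGE is never consulted, whereas with en_US it is.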
Comment 9 Behdad Esfahbod 2007-06-11 03:08:02 UTC
Awaiting Akira's response.
Comment 10 Akira TAGOH 2007-06-12 13:38:31 UTC
Thank you for the reminder. I was about to reply to this but couldn't find the time. Sorry.

What I'm asking for is consistency in the behavior of PANGO_LANGUAGE, because the current behavior isn't intuitive. Using PANGO_LANGUAGE/LANGUAGE may be rare; most people don't need or want it. So to the question of whether I'd want Japanese fonts with LANG=ja_JP, the answer is yes, I do. But for someone who sets those environment variables, maybe not; there may be requirements or needs for doing so.

As for the ideal solution, I'd love to have APIs to manage the list of languages that determines which fonts are preferred for glyphs, like what Pango currently does from PANGO_LANGUAGE/LANGUAGE, and to apply that list to a PangoContext on demand. That would make content-driven rather than locale-driven behavior possible. For example, one could view two text files with the proper fonts by changing the language from a menu, without restarting the application. Of course, referring to LANG is still helpful for determining the default language if nothing is given through those APIs or PANGO_LANGUAGE/LANGUAGE. IMHO, PANGO_LANGUAGE/LANGUAGE is good only for demonstration; to make this practical, such APIs would be desirable.

To get this done we would definitely have to modify gtk+ too, as you said, since it calls pango_context_set_language() with LANG. But as a first step, I think Pango may need to provide a way of doing so.

How does that sound?
Comment 11 Behdad Esfahbod 2007-06-12 14:29:30 UTC
Pango already provides pango_context_set_language() and pango_attr_language_new().  Those two mean the application writer has full control over font selection.  Check pango/pango-view/test-mixed.markup for example.

I still don't see what this bug is about.
Comment 12 Akira TAGOH 2007-06-12 17:16:40 UTC
Hmm, but it has to be done artificially, yes? and how does it work similarly like PANGO_LANGUAGE? and as you said, detecting the language is difficult. so PANGO_LANGUAGE was necessary to give fallback hints to Pango?

Then how come it doesn't work for the languages on the those locales? which actually has glyphs points to the same Unicode.
Comment 13 Behdad Esfahbod 2007-06-12 17:35:28 UTC
(In reply to comment #12)
> Hmm, but it has to be done artificially, yes?

What has to be done artificially?  language tagging?  Yes, it has to be done manually, because there is no way to detect it.

> and how does it work similarly like PANGO_LANGUAGE?

What do you mean, I'm totally not following.

> and as you said, detecting the language is difficult. so
> PANGO_LANGUAGE was necessary to give fallback hints to Pango?

Yes, such that me as a user, can tell Pango that whenever it sees text in Arabic script, it's most probably in Persian language, not Arabic language.

> Then how come it doesn't work for the languages on the those locales? which
> actually has glyphs points to the same Unicode.

Which locales, what doesn't work?

Comment 14 Akira TAGOH 2007-06-12 18:50:55 UTC
(In reply to comment #13)
> What has to be done artificially?  language tagging?  Yes, it has to be done
> manually, because there is no way to detect it.

Yes. that's it.

> > and how does it work similarly like PANGO_LANGUAGE?
> 
> What do you mean, I'm totally not following.

As I mentioned earlier, what I want is for PANGO_LANGUAGE to work for every language, so I'm not sure how what you're suggesting helps.

> 
> > and as you said, detecting the language is difficult. so
> > PANGO_LANGUAGE was necessary to give fallback hints to Pango?
> 
> Yes, such that me as a user, can tell Pango that whenever it sees text in
> Arabic script, it's most probably in Persian language, not Arabic language.
> 
> > Then how come it doesn't work for the languages on the those locales? which
> > actually has glyphs points to the same Unicode.
> 
> Which locales, what doesn't work?

That's in the initial report: even with PANGO_LANGUAGE=zh, Pango apparently prefers a Japanese font (plus a Chinese font for missing glyphs) under LANG=ja_JP, even though Chinese fonts could cover all of them.
And even with PANGO_LANGUAGE=ja, Pango prefers a Chinese font under LANG=zh_{CN,TW} anyway.

That's what I mean by PANGO_LANGUAGE having no effect. I don't know how many languages are in this situation, but if there are others, this issue may happen with them too.
Comment 15 Behdad Esfahbod 2007-06-12 19:18:31 UTC
So you want pango to ignore $LANG.  I don't think that's going to happen.

The more interesting question is, why would you set PANGO_LANGUAGE=zh_CN and LANG=ja_JP?  Why not set LANG=zh_CN and be happy?

Again, what you are asking for is impossible: you can't have both Japanese and Chinese work correctly without manual language tagging.
Comment 16 Vasile Gaburici 2008-07-17 12:15:07 UTC
What Akira asks for is actually reasonable. In my case, I want to enable the GSUB/latn/ROM/locl feature for OpenType fonts *without* changing the language of app UI. Below is a simple test (taken from http://en.wikipedia.org/wiki/Pango#Support_for_OpenType_features), which requires Verdana 5.01 (or some other font that supports ROM/locl -- any recent Adobe Pro font should).

GSUB/locl works as expected with pango-view --language:
for lang in en ro; do pango-view --font="Verdana 64" --text "şţ vs. șț in $lang" --language=$lang& done
But most apps don't have a --language parameter, e.g. gedit doesn't.

Setting LANG works too:
for lang in en_US ro_RO; do LANG=$lang.UTF-8 pango-view --font "Verdana 64" --text "şţ vs. șț in $lang"& done
Unfortunately LANG will also change the language of the UI for the app (say gedit).

A third way would be to send language markup to pango. This works too:
pango-view --font="Verdana 24"  --markup --text 'In the same text: <span lang="en">şţ</span>(en) and <span lang="ro">şţ</span>(ro).'
This method relies on the app to detect and flag Romanian, which hardly happens.

The best way to enable GSUB/latn/ROM/locl would be an environment (or .pangorc entry) that affects only the locl feature for pango, without changing the application UI. Ideally this should work too:
for lang in en ro; do PANGO_LANGUAGE=$lang.UTF-8 pango-view --font "Verdana 64" --text "şţ vs. șț in $lang"& done
But it DOESN'T. This is the desired feature!
Comment 17 Behdad Esfahbod 2008-07-19 07:55:31 UTC
(In reply to comment #16)
> What Akira asks for is actually reasonable. In my case, I want to enable the
> GSUB/latn/ROM/locl feature for OpenType fonts *without* changing the language
> of app UI. Below is a simple test (taken from
> http://en.wikipedia.org/wiki/Pango#Support_for_OpenType_features), which
> requires Verdana 5.01 (or some other font that supports ROM/locl -- any recent
> Adobe Pro font should).
> 
> GSUB/locl works as expected with pango-view --language:
> for lang in en ro; do pango-view --font="Verdana 64" --text "şţ vs. șț in
> $lang" --language=$lang& done
> But most apps don't have a --language parameter, e.g. gedit doesn't.
> 
> Setting LANG works too:
> for lang in en_US ro_RO; do LANG=$lang.UTF-8 pango-view --font "Verdana 64"
> --text "şţ vs. șț in $lang"& done
> Unfortunately LANG will also change the language of the UI for the app (say
> gedit).
> 
> A third way would be to send language markup to pango. This works too:
> pango-view --font="Verdana 24"  --markup --text 'In the same text: <span
> lang="en">şţ</span>(en) and <span lang="ro">şţ</span>(ro).'
> This method relies on the app to detect and flag Romanian, which hardly
> happens.
>
> The best way to enable GSUB/latn/ROM/locl would be an environment (or .pangorc
> entry) that affects only the locl feature for pango, without changing the
> application UI.


How about setting LC_MESSAGES to keep your UI language?

> Ideally this should work too:
> for lang in en ro; do PANGO_LANGUAGE=$lang.UTF-8 pango-view --font "Verdana 64"
> --text "şţ vs. șț in $lang"& done
> But it DOESN'T. This is the desired feature!
>
Actually it works here:

$ for lang in en sr; do PANGO_LANGUAGE=$lang pango-view --font "DejaVu Sans" --text "б in $lang"; done
Comment 18 Vasile Gaburici 2008-07-19 09:20:09 UTC
(In reply to comment #17)
> (In reply to comment #16)
... snipped ...

> How about setting LC_MESSAGES to keep your UI language?

Yeah, that works, e.g., LANG=ro_RO.UTF-8 LC_MESSAGES=C gedit does the substitutions properly. I'm worried that LANG=ro_RO may change other stuff, e.g. date format...

> Actually it works here:
> 
> $ for lang in en sr; do PANGO_LANGUAGE=$lang pango-view --font "DejaVu Sans"
> --text "б in $lang"; done

That works for me too! But it doesn't work for Romanian!! I tried it with three different fonts. All worked with LANG, but none worked with PANGO_LANGUAGE : http://www.cs.umd.edu/~gaburici/PANGO_LANGUAGE.png

The part of the script that does work (setting LANG):

for font in 'Verdana' 'Charis SIL 20' 'Minion Pro 22'; do
    for lang in en_US.UTF-8 ro_RO.UTF-8; do
        echo "$font" "$lang"
        LANG=$lang pango-view --font "$font" --text "şţ in $lang" &
    done
done

The part of the script that doesn't work (setting PANGO_LANGUAGE):

for font in 'Verdana' 'Charis SIL 20' 'Minion Pro 22'; do
    for lang in en ro; do
        echo "$font" "$lang"
        PANGO_LANGUAGE=$lang pango-view --font "$font" --text "şţ in $lang" &
    done
done

I also tried LANGUAGE instead of PANGO_LANGUAGE, but that doesn't work either.

Note that Adobe fonts (seen in the Minion example) and 90% of other commercial fonts don't have a real 't with cedilla' glyph (at U+0162/3) because Adobe decided that it doesn't show up in any language, so there's a non-configurable substitution to 't with comma' regardless of language.

I included the SIL font in the example because it doesn't use GSUB/latn/ROM/locl, but rather GSUB/latn/ROM/ccmp to do the substitutions. It works with pango in the same circumstances when locl does.

Hope this helps.
Comment 19 Vasile Gaburici 2008-07-19 09:21:58 UTC
Created attachment 114805 [details]
Screenshot for Romanian trouble with PANGO_LANGUAGE
Comment 20 GNOME Infrastructure Team 2018-05-22 12:29:03 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/pango/issues/78.