GNOME Bugzilla – Bug 442786
PANGO_LANGUAGE doesn't affect in some case
Last modified: 2018-05-22 12:29:03 UTC
I have been playing with the recent PANGO_LANGUAGE/LANGUAGE changes to see how the language priorities affect font selection. It works fine when I run pango-view with PANGO_LANGUAGE=C:zh LANG=en_US.UTF-8: Chinese fonts are used for Chinese characters even in the en_US locale with Japanese fonts installed. Thanks for that. I expected it to work without exceptions. However, when I run pango-view with PANGO_LANGUAGE=C:zh LANG=ja_JP.UTF-8 on test-mixed.txt, Chinese characters are still rendered with a mix of Japanese and Chinese fonts, even though they could all be displayed with Chinese fonts alone. Similarly, it should use Japanese fonts as much as possible when started with, say, PANGO_LANGUAGE=C:ja LANG=zh_CN.UTF-8.
Created attachment 89157: Screenshot. Actual result when I run with LANG=ja_JP.UTF-8 PANGO_LANGUAGE=C:zh
Created attachment 89158: Screenshot. Expected result when running with LANG=ja_JP.UTF-8 PANGO_LANGUAGE=C:zh
Created attachment 89159: Screenshot2. Actual result when I run with LANG=zh_CN.UTF-8 PANGO_LANGUAGE=C:ja. Apparently Japanese fonts aren't used for Chinese/Kanji characters.
Created attachment 89160: Screenshot2. Expected result when running with LANG=zh_CN.UTF-8 PANGO_LANGUAGE=C:ja
Akira,
First, with PANGO_LANGUAGE you can drop the "C" part. That trick is only useful with $LANGUAGE.
As for your real problem: one CANNOT automatically detect both Chinese and Japanese correctly, because they use the same Unicode code points. If you want both to work correctly side by side, you have to tag them with PangoAttrLanguage. See test-mixed.markup for an example. Try it with
pango-view --markup test-mixed.markup
and it always works correctly, no matter what $LANG or $LANGUAGE is.
In principle you're correct, but that's not what I meant. What I'm saying is that the behavior introduced by PANGO_LANGUAGE/LANGUAGE isn't consistent across values of LANG; it seems to work only for en_US and similar locales, which really looks like a bug to me. I don't expect both Japanese and Chinese to be rendered with the proper fonts in mixed text. But PANGO_LANGUAGE/LANGUAGE doesn't behave the way it does when I run something in, say, the en_US locale.
Let me rephrase... What I'm actually expecting is that Pango should consult PANGO_LANGUAGE -> LANGUAGE -> (LC_CTYPE? ->) LANG, in that order, to select fonts, and if no glyphs are available for one, fall back to the next.
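For illustration only, the lookup order proposed in this comment could be sketched as follows. This models the proposal, not what Pango does. The function names `proposed_priority` and `pick_font_language`, the `covers` callback, and the way locale strings are trimmed are all assumptions made for the example.

```python
def proposed_priority(env):
    """Return languages in the proposed order: PANGO_LANGUAGE, LANGUAGE, LC_CTYPE, LANG."""
    order = []
    for var in ("PANGO_LANGUAGE", "LANGUAGE", "LC_CTYPE", "LANG"):
        for entry in env.get(var, "").split(":"):
            # Drop the codeset (".UTF-8") and modifier ("@euro") parts, if any.
            lang = entry.split(".")[0].split("@")[0]
            if lang and lang not in order:
                order.append(lang)
    return order


def pick_font_language(env, covers):
    """Return the first language whose fonts cover the text, else None.

    `covers` is a hypothetical callback standing in for a glyph-coverage check.
    """
    for lang in proposed_priority(env):
        if covers(lang):
            return lang
    return None


env = {"PANGO_LANGUAGE": "zh", "LANG": "ja_JP.UTF-8"}
print(proposed_priority(env))
```

Under this proposal the environment variables would outrank the locale, so zh would be tried before ja_JP in the example environment.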
No. Pango will always first try to use the language set on the context via pango_context_set_language(), which gtk+ calls with the value of $LANG. Only if that is not useful does Pango use $PANGO_LANGUAGE/$LANGUAGE. You don't want to have a desktop where LANG=ja_JP doesn't make it use Japanese fonts, do you? In fact, here is the exact sequence of what happens:
"""So, to summarize, during itemization, an item's language is assigned in the following order:
- If a language attribute is present, use it, otherwise
- If a context language is present, use it, otherwise
- Use pango_language_get_default()
- Then, if the chosen language is not compatible with the script of the item, use the first language in $PANGO_LANGUAGE or $LANGUAGE that is compatible with the script. If none matches, use pango_language_from_string("xx")."""
I still don't understand what you want to do that you can't do now.
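The order quoted in this comment can be modeled in a few lines. This is a toy sketch, not Pango's code: `SCRIPTS_FOR_LANG` is a hand-written, hypothetical table (in particular, it assumes Japanese counts as compatible with the Han script), and `item_language` is a name invented for the example.

```python
# Hypothetical language -> scripts table, for illustration only.
SCRIPTS_FOR_LANG = {
    "en": {"Latin"},
    "ja": {"Han", "Hiragana", "Katakana"},
    "zh": {"Han"},
    "ar": {"Arabic"},
    "fa": {"Arabic"},
}


def compatible(lang, script):
    """True if the toy table says `lang` is written in `script`."""
    base = lang.split("_")[0].split("-")[0].lower()
    return script in SCRIPTS_FOR_LANG.get(base, set())


def item_language(script, attr_lang=None, context_lang=None,
                  default_lang="en", env_langs=()):
    # Steps 1-3: language attribute, then context language, then the default.
    chosen = attr_lang or context_lang or default_lang
    # Step 4 applies only when the chosen language cannot write the script.
    if compatible(chosen, script):
        return chosen
    for lang in env_langs:  # entries of $PANGO_LANGUAGE or $LANGUAGE, in order
        if compatible(lang, script):
            return lang
    return "xx"


# Context language en_US cannot write Han, so the environment list is consulted.
print(item_language("Han", context_lang="en_US", env_langs=["zh"]))
# Context language ja_JP can write Han, so the environment list is never reached.
print(item_language("Han", context_lang="ja_JP", env_langs=["zh"]))
```

Under these assumptions the second call never reaches the environment list, which matches the behavior reported for LANG=ja_JP with PANGO_LANGUAGE=zh.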
Awaiting Akira's response.
Thank you for reminding me. I was about to reply to this but couldn't find the time. Sorry.
Well, what I'm asking for is consistency in the behavior of PANGO_LANGUAGE, because the current behavior isn't intuitive. Using PANGO_LANGUAGE/LANGUAGE may be rare; most people don't need or want it. So to the question of whether I'd want Japanese fonts with LANG=ja_JP, the answer is yes, I do. But for someone who sets those environment variables, maybe not; there may be requirements or needs for that.
As for the ideal solution, I'd love to have APIs to manage the list of languages that determines which fonts are preferred for glyphs, like the list Pango currently builds from PANGO_LANGUAGE/LANGUAGE, and to apply it to a PangoContext on demand. That would make it possible for font choice to be content driven instead of locale driven. For example, one could view two text files with the proper fonts by changing the language from a menu, without restarting the application. Of course, referring to LANG is still helpful to determine the default language if nothing is given through those APIs or PANGO_LANGUAGE/LANGUAGE. IMHO, PANGO_LANGUAGE/LANGUAGE is good only for demonstration; to make this practical, such APIs would be desirable.
To get this done we definitely have to modify gtk+ too, as you said, since it calls pango_context_set_language() with LANG. But as a first step, Pango may need to provide a way of doing so, I think. How does that sound?
Pango already provides pango_context_set_language() and pango_attr_language_new(). Those two mean the application writer has full control over font selection. Check pango/pango-view/test-mixed.markup for example. I still don't see what this bug is about.
Hmm, but it has to be done artificially, yes? And how does it work similarly to PANGO_LANGUAGE? And as you said, detecting the language is difficult, so PANGO_LANGUAGE was needed to give fallback hints to Pango? Then how come it doesn't work for the languages of those locales, whose glyphs actually map to the same Unicode code points?
(In reply to comment #12)
> Hmm, but it has to be done artificially, yes?
What has to be done artificially? Language tagging? Yes, it has to be done manually, because there is no way to detect it.
> and how does it work similarly like PANGO_LANGUAGE?
What do you mean? I'm totally not following.
> and as you said, detecting the language is difficult. so
> PANGO_LANGUAGE was necessary to give fallback hints to Pango?
Yes, so that I, as a user, can tell Pango that whenever it sees text in Arabic script, it's most probably in the Persian language, not Arabic.
> Then how come it doesn't work for the languages on the those locales? which
> actually has glyphs points to the same Unicode.
Which locales? What doesn't work?
(In reply to comment #13)
> What has to be done artificially? language tagging? Yes, it has to be done
> manually, because there is no way to detect it.
Yes, that's it.
> > and how does it work similarly like PANGO_LANGUAGE?
>
> What do you mean, I'm totally not following.
As I mentioned earlier, what I want is to get PANGO_LANGUAGE working for every language, so I'm not quite sure how what you're suggesting achieves that.
> > and as you said, detecting the language is difficult. so
> > PANGO_LANGUAGE was necessary to give fallback hints to Pango?
>
> Yes, such that me as a user, can tell Pango that whenever it sees text in
> Arabic script, it's most probably in Persian language, not Arabic language.
>
> > Then how come it doesn't work for the languages on the those locales? which
> > actually has glyphs points to the same Unicode.
>
> Which locales, what doesn't work?
That's in the initial report here: even if I set PANGO_LANGUAGE=zh, Pango apparently prefers a Japanese font (plus a Chinese font for missing glyphs) under LANG=ja_JP, even though Chinese fonts can cover all of them. And even if I set PANGO_LANGUAGE=ja, Pango prefers a Chinese font under LANG=zh_{CN,TW} anyway. That's what I mean by PANGO_LANGUAGE having no effect. I don't know how many languages share code points like this, but if there are others, the issue may affect them too.
So you want pango to ignore $LANG. I don't think that's going to happen. The more interesting question is, why would you set PANGO_LANGUAGE=zh_CN and LANG=ja_JP? Why not set LANG=zh_CN and be happy? Again, what you are asking for is impossible: you can't have both Japanese and Chinese work correctly without manual language tagging.
What Akira asks for is actually reasonable. In my case, I want to enable the GSUB/latn/ROM/locl feature for OpenType fonts *without* changing the language of the app UI. Below is a simple test (taken from http://en.wikipedia.org/wiki/Pango#Support_for_OpenType_features), which requires Verdana 5.01 (or some other font that supports ROM/locl -- any recent Adobe Pro font should).
GSUB/locl works as expected with pango-view --language:
for lang in en ro; do pango-view --font="Verdana 64" --text "şţ vs. șț in $lang" --language=$lang& done
But most apps don't have a --language parameter, e.g. gedit doesn't.
Setting LANG works too:
for lang in en_US ro_RO; do LANG=$lang.UTF-8 pango-view --font "Verdana 64" --text "şţ vs. șț in $lang"& done
Unfortunately LANG will also change the language of the UI for the app (say gedit).
A third way would be to send language markup to pango. This works too:
pango-view --font="Verdana 24" --markup --text 'In the same text: <span lang="en">şţ</span>(en) and <span lang="ro">şţ</span>(ro).'
This method relies on the app to detect and flag Romanian, which hardly ever happens.
The best way to enable GSUB/latn/ROM/locl would be an environment variable (or .pangorc entry) that affects only the locl feature for pango, without changing the application UI. Ideally this should work too:
for lang in en ro; do PANGO_LANGUAGE=$lang.UTF-8 pango-view --font "Verdana 64" --text "şţ vs. șț in $lang"& done
But it DOESN'T. This is the desired feature!
(In reply to comment #16)
> What Akira asks for is actually reasonable. In my case, I want to enable the
> GSUB/latn/ROM/locl feature for OpenType fonts *without* changing the language
> of app UI. Below is a simple test (taken from
> http://en.wikipedia.org/wiki/Pango#Support_for_OpenType_features), which
> requires Verdana 5.01 (or some other font that supports ROM/locl -- any recent
> Adobe Pro font should).
>
> GSUB/locl works as expected with pango-view --language:
> for lang in en ro; do pango-view --font="Verdana 64" --text "şţ vs. șț in
> $lang" --language=$lang& done
> But most apps don't have a --language parameter, e.g. gedit doesn't.
>
> Setting LANG works too:
> for lang in en_US ro_RO; do LANG=$lang.UTF-8 pango-view --font "Verdana 64"
> --text "şţ vs. șț in $lang"& done
> Unfortunately LANG will also change the language of the UI for the app (say
> gedit).
>
> A third way would be to send language markup to pango. This works too:
> pango-view --font="Verdana 24" --markup --text 'In the same text: <span
> lang="en">şţ</span>(en) and <span lang="ro">şţ</span>(ro).'
> This method relies on the app to detect and flag Romanian, which hardly
> happens.
>
> The best way to enable GSUB/latn/ROM/locl would be an environment (or .pangorc
> entry) that affects only the locl feature for pango, without changing the
> application UI.
How about setting LC_MESSAGES to keep your UI language?
> Ideally this should work too:
> for lang in en ro; do PANGO_LANGUAGE=$lang.UTF-8 pango-view --font "Verdana 64"
> --text "şţ vs. șț in $lang"& done
> But it DOESN'T. This is the desired feature!
>
Actually it works here:
$ for lang in en sr; do PANGO_LANGUAGE=$lang pango-view --font "DejaVu Sans" --text "б in $lang"; done
(In reply to comment #17)
> (In reply to comment #16)
... snipped ...
> How about setting LC_MESSAGES to keep your UI language?
Yeah, that works, e.g.,
LANG=ro_RO.UTF-8 LC_MESSAGES=C gedit
does the substitutions properly. I'm worried that LANG=ro_RO may change other stuff, e.g. date format...
> Actually it works here:
>
> $ for lang in en sr; do PANGO_LANGUAGE=$lang pango-view --font "DejaVu Sans"
> --text "б in $lang"; done
That works for me too! But it doesn't work for Romanian!! I tried it with three different fonts. All worked with LANG, but none worked with PANGO_LANGUAGE:
http://www.cs.umd.edu/~gaburici/PANGO_LANGUAGE.png
The part of the script that does work (setting LANG):
for font in 'Verdana' 'Charis SIL 20' 'Minion Pro 22'; do
  for lang in en_US.UTF-8 ro_RO.UTF-8; do
    echo $font $lang
    LANG=$lang pango-view --font "$font" --text "şţ in $lang"&
  done
done
The part of the script that doesn't work (setting PANGO_LANGUAGE):
for font in 'Verdana' 'Charis SIL 20' 'Minion Pro 22'; do
  for lang in en ro; do
    echo $font $lang
    PANGO_LANGUAGE=$lang pango-view --font "$font" --text "şţ in $lang"&
  done
done
I also tried LANGUAGE instead of PANGO_LANGUAGE, but that doesn't work either.
Note that Adobe fonts (seen in the Minion example) and 90% of other commercial fonts don't have a real 't with cedilla' glyph (at U+0162/3) because Adobe decided that it doesn't show up in any language, so there's a non-configurable substitution to 't with comma' regardless of language.
I included the SIL font in the example because it doesn't use GSUB/latn/ROM/locl, but rather GSUB/latn/ROM/ccmp to do the substitutions. It works with pango in the same circumstances when locl does.
Hope this helps.
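A minimal sketch of the LC_MESSAGES workaround discussed in this comment, written as a helper that builds the environment for a child process. The name `shaping_env` and its `pins` parameter are hypothetical, and the sketch assumes, as stated earlier in this thread, that the toolkit derives the Pango language from LANG. Pinning LC_TIME as well is one way to address the date-format worry.

```python
import os


def shaping_env(shaping_locale, pins=None, base=None):
    """Return a copy of the environment with LANG changed and some LC_* categories pinned."""
    env = dict(os.environ if base is None else base)
    # LC_ALL, if set, overrides LANG and every LC_* variable, so remove it.
    env.pop("LC_ALL", None)
    env["LANG"] = shaping_locale
    # By default keep UI messages untranslated, as in the command above.
    env.update(pins if pins is not None else {"LC_MESSAGES": "C"})
    return env


env = shaping_env(
    "ro_RO.UTF-8",
    pins={"LC_MESSAGES": "C", "LC_TIME": "en_US.UTF-8"},
    base={"LC_ALL": "en_US.UTF-8", "HOME": "/tmp"},
)
print(env["LANG"], env["LC_MESSAGES"], env["LC_TIME"], "LC_ALL" in env)
```

The resulting dictionary could be passed as the `env` argument of `subprocess.run` when launching an application such as gedit.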
Created attachment 114805: Screenshot for Romanian trouble with PANGO_LANGUAGE
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/pango/issues/78.