Bug 604149 - pango enhance to use PANGO_SCRIPT_COMMON with non-latin fonts.
Status: RESOLVED OBSOLETE
Product: pango
Classification: Platform
Component: general
Version: unspecified
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: pango-maint
QA Contact: pango-maint
Depends on:
Blocks:
 
 
Reported: 2009-12-09 09:31 UTC by Takao Fujiwara
Modified: 2018-05-22 12:54 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch for fontconfig comcharset (20.22 KB, patch)
2009-12-09 09:41 UTC, Takao Fujiwara
Patch for pango compcharset (8.32 KB, patch)
2009-12-09 10:00 UTC, Takao Fujiwara

Description Takao Fujiwara 2009-12-09 09:31:37 UTC
Currently Pango assigns characters in the ASCII range to one of two scripts, PANGO_SCRIPT_LATIN or PANGO_SCRIPT_COMMON.
e.g. ASCII '@' belongs to PANGO_SCRIPT_COMMON.
e.g. ASCII 'a' belongs to PANGO_SCRIPT_LATIN.
Pango uses PANGO_SCRIPT_COMMON with latin fonts only in Monospace/Sans/Sans-Serif.
Neither PANGO_SCRIPT_COMMON nor PANGO_SCRIPT_LATIN is limited to ASCII; both also cover many multi-byte chars.

If users type ASCII "@a", pango_script_iter_next() treats the string as PANGO_SCRIPT_LATIN since 'a' is PANGO_SCRIPT_LATIN, and Pango gets latin fonts from fontconfig.
If users type ASCII "@ ", pango_script_iter_next() treats the string as PANGO_SCRIPT_COMMON, and Pango gets *non-latin* fonts from fontconfig.
If users type ASCII + Chinese "@\xe4\xb9\x8c", pango_script_iter_next() treats the string as PANGO_SCRIPT_HAN, and Pango gets *non-latin* fonts from fontconfig.
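
The three cases above can be reproduced with a small model of script-run resolution. This is only a sketch under simplifying assumptions, not Pango's actual code: `script_of()` is a hypothetical stand-in for Pango's per-character script lookup, it knows only ASCII letters and one CJK block, and a PANGO_SCRIPT_COMMON character simply joins the neighbouring run.

```python
# Simplified model of script-run resolution; NOT Pango's implementation.
# script_of() is a hypothetical classifier with illustrative ranges only.

def script_of(ch):
    """Return a coarse script name for one character."""
    cp = ord(ch)
    if ch.isascii() and ch.isalpha():
        return "LATIN"
    if 0x4E00 <= cp <= 0x9FFF:  # CJK Unified Ideographs block
        return "HAN"
    return "COMMON"


def script_runs(text):
    """Split text into (substring, script) runs; COMMON joins a neighbour."""
    runs = []  # each entry is [chars, script]
    for ch in text:
        s = script_of(ch)
        if runs:
            cur = runs[-1]
            if s == "COMMON" or cur[1] == "COMMON" or s == cur[1]:
                # A COMMON character joins the current run, and a real script
                # resolves a run that so far held only COMMON characters.
                cur[0] += ch
                if cur[1] == "COMMON":
                    cur[1] = s
                continue
        runs.append([ch, s])
    return [(chars, script) for chars, script in runs]


print(script_runs("@a"))       # one LATIN run
print(script_runs("@ "))       # one COMMON run
print(script_runs("@\u4e4c"))  # one HAN run ("\u4e4c" is "\xe4\xb9\x8c" in UTF-8)
```

In this model '@' never forms a run of its own next to a letter or a Han character, which is why the font chosen for it depends on its neighbours.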

Normally both latin and non-latin fonts include the ASCII PANGO_SCRIPT_COMMON chars.

The problem is that some Chinese fonts (e.g. "AR PL UKai CN") have worse-quality ASCII glyphs than latin fonts (e.g. "DejaVu").

# env LANG=zh_CN.UTF-8 fc-match
ukai.ttc: "AR PL UKai CN" "Book"

So users would like to use latin fonts for both PANGO_SCRIPT_LATIN and ASCII PANGO_SCRIPT_COMMON.

But currently Pango uses non-latin fonts for PANGO_SCRIPT_COMMON because that behavior is good for some languages.

Removing the ASCII glyphs from the Chinese fonts might not be a realistic way to resolve this issue, since users can choose either a generic family (Monospace/Sans/Serif) or a specific font.
The request is to use latin fonts for ASCII PANGO_SCRIPT_COMMON when Monospace/Sans/Serif is selected, and to keep using the non-latin font ("AR PL UKai CN") when "AR PL UKai CN" itself is selected.

Currently the behavior of PANGO_SCRIPT_COMMON is hard-coded in Pango.
My idea is to add a feature to customize the behavior with fontconfig.

I'm attaching patches for fontconfig and pango.
Comment 1 Takao Fujiwara 2009-12-09 09:41:21 UTC
Created attachment 149412
Patch for fontconfig comcharset

The idea is to customize charset in fontconfig.

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
	<match target="font">
		<test name="family" compare="contains">
			<string>AR PL</string>
		</test> 
		<edit name="compcharset" mode="assign">
			<compcharset>
			<complang>zh</complang>
			<compmode>weak</compmode>
			<string>0x00-0x19,0x21-0x7F</string>
			</compcharset>
		</edit>
	</match>
</fontconfig>

The compcharset applies to composite fonts like Monospace/Sans/Serif.
The complang takes effect when it matches the user's current locale.
The compmode is either "weak" or "strong". "weak" means that the priority is AR PL < latin fonts.
The string gives the Unicode code points.

If this patch and the next pango patch are applied, Pango can use latin fonts for the code points 0x00-0x19,0x21-0x7F in zh Monospace/Sans/Serif.
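
To make the range string concrete, here is a minimal sketch of how "0x00-0x19,0x21-0x7F" could be read as a set of inclusive code point ranges. The helper names `parse_ranges()` and `covers()` are hypothetical and are not taken from the attached patches; the syntax is assumed to be comma-separated hexadecimal ranges as in the example config.

```python
# Illustrative only: the real parsing lives in the attached fontconfig patch.

def parse_ranges(spec):
    """Parse '0x00-0x19,0x21-0x7F' into a list of inclusive (lo, hi) pairs."""
    ranges = []
    for part in spec.split(","):
        lo, _, hi = part.strip().partition("-")
        lo_cp = int(lo, 16)
        # Assumption: a bare value such as '0x40' means a one-element range.
        hi_cp = int(hi, 16) if hi else lo_cp
        ranges.append((lo_cp, hi_cp))
    return ranges


def covers(ranges, ch):
    """Return True if the character's code point falls in any range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


ranges = parse_ranges("0x00-0x19,0x21-0x7F")
print(ranges)
print(covers(ranges, "@"))       # '@' is 0x40
print(covers(ranges, " "))       # space is 0x20
print(covers(ranges, "\u4e4c"))  # a Han character
```

As written, the range string leaves out 0x1A-0x20, so the space character (0x20) is not covered while '@' and 'a' are.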
Comment 2 Takao Fujiwara 2009-12-09 10:00:03 UTC
Created attachment 149414
Patch for pango compcharset

With this patch, Pango can disable the code points that are specified in fontconfig.

# env LANG=zh_CN.UTF-8 fc-match
ukai.ttc: "AR PL UKai CN" "Book"

# env LANG=zh_CN.UTF-8 fc-match --verbose
	compcharset: 
	  Lang: zh
	  Mode: Weak
	  Range: 0x00-0x19,0x21-0x7F(w)

Then the text "@a" is drawn with latin fonts.
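
The intended effect of the "weak" mode can be sketched as a per-character font choice over a fallback list. This is a toy model, not the logic of the attached patch: the font names come from this report, but the coverage sets, the `FONTS` table and `pick_font()` are made up for illustration, and falling back to the weak font when nothing else has the glyph is my reading of "priority is AR PL < latin fonts".

```python
# Toy model of "weak" compcharset; all data below is hypothetical.
ASCII = set(range(0x00, 0x80))

FONTS = [
    # (family, covered code points, code points yielded to other fonts)
    ("AR PL UKai CN", ASCII | {0x4E4C},
     set(range(0x00, 0x1A)) | set(range(0x21, 0x80))),  # 0x00-0x19,0x21-0x7F
    ("DejaVu Sans", ASCII, set()),
]


def pick_font(ch, fonts=FONTS):
    """Pick the first font in the fallback order that should draw ch."""
    cp = ord(ch)
    # First pass: skip fonts that yield this code point to later fonts.
    for family, coverage, yielded in fonts:
        if cp in coverage and cp not in yielded:
            return family
    # Second pass: nobody else has the glyph, so a weak font is used anyway.
    for family, coverage, _ in fonts:
        if cp in coverage:
            return family
    return None


for ch in "@a \u4e4c":
    print(repr(ch), pick_font(ch))
```

In this model '@' and 'a' go to the latin font even though the Chinese font comes first in the fallback order, while the space and the Han character stay with "AR PL UKai CN".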

Another approach would be to list latin fonts in the non-latin config file in fontconfig, which could also resolve this. e.g.

<match target="pattern">
  <test name="lang" compare="contains">
  <string>zh</string>
  </test>
  <test qual="any" name="family">
    <string>monospace</string>
  </test>
  <edit name="family" mode="prepend" binding="strong">
    <string>Bitstream Vera Serif</string>
    <string>DejaVu Serif</string>
    <string>AR PL UMing CN</string>
    <string>AR PL UKai CN</string>
    <string>AR PL ZenKai Uni</string>
  </edit>
</match>

But it seems this way is not accepted at present by the fontconfig maintainers.

Do you have any ideas to resolve this problem?
If this approach is good, I will also try to file a bug in fontconfig for the discussion.

I think this approach could also fix another problem, described below.
If users use a commercial font, they'd like to use the font for all coverage: PANGO_SCRIPT_COMMON, PANGO_SCRIPT_LATIN, PANGO_SCRIPT_HAN, PANGO_SCRIPT_GREEK, PANGO_SCRIPT_CYRILLIC.
But currently Pango uses latin fonts for PANGO_SCRIPT_LATIN.

	<match target="font">
		<test name="family" compare="contains">
			<string>VL Gothic</string>
		</test> 
		<edit name="compcharset" mode="assign">
			<compcharset>
			<complang>ja</complang>
			<compmode>strong</compmode>
			<string>all</string>
			</compcharset>
		</edit>
	</match>

If users add this config to fontconfig, Pango can use the font for all scripts with this patch.
Comment 3 GNOME Infrastructure Team 2018-05-22 12:54:27 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed to further activity.

You can subscribe and participate further in the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/pango/issues/169.