Bug 101081 - Non-BMP (plane 1 thru plane 16) characters are not supported
Status: RESOLVED FIXED
Product: pango
Classification: Platform
Component: general
Version: 1.1.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: 1.4.0
Assigned To: pango-maint
QA Contact: pango-maint
Duplicates: 118792 140570
Depends on: 68435 107974
Blocks:
 
 
Reported: 2002-12-13 00:59 UTC by Jungshik Shin
Modified: 2004-12-22 21:47 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
patch to draw the boxes for > ffff, and to enable up to 10ffff in basic-fc.c (4.16 KB, patch)
2003-07-15 22:07 UTC, Noah Levitt
patch to optimize the pango map data structure (8.03 KB, patch)
2003-07-18 22:25 UTC, Noah Levitt

Description Jungshik Shin 2002-12-13 00:59:35 UTC
Currently, Pango only supports BMP (plane 0) characters. 
Although their use is not yet widespread, planes 1 and 2 
have begun to fill up (plane 2 rapidly, with CJK ideographs),
and a couple of TrueType fonts support non-BMP characters.
Code2001 by James Kass (http://home.att.net/~jameskass) has
glyphs for plane 1 characters, and Mac OS X comes with Japanese
fonts covering hundreds of plane 2 characters. There are also
commercial TrueType fonts with all the CJK ideographs encoded
so far in Unicode/10646. 
So, it may be time to consider supporting non-BMP characters. 
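
For readers unfamiliar with the terminology, the plane of a code point is its value shifted right by 16 bits: plane 0 is the BMP (U+0000 to U+FFFF), and planes 1 through 16 cover U+10000 to U+10FFFF. A minimal sketch in C follows; the helper names are hypothetical and are not part of the Pango API.

```c
#include <assert.h>
#include <stdint.h>

#define UNICODE_MAX 0x10FFFFu   /* last code point of plane 16 */

/* Plane number (0..16) of a code point. Hypothetical helper. */
unsigned unicode_plane(uint32_t cp)
{
    assert(cp <= UNICODE_MAX);
    return (unsigned)(cp >> 16);
}

/* Nonzero if cp lies in planes 1..16, i.e. outside the BMP. */
int is_non_bmp(uint32_t cp)
{
    return cp > 0xFFFFu && cp <= UNICODE_MAX;
}
```

For example, U+20000 (the start of CJK Unified Ideographs Extension B) is in plane 2, so any code that stores characters in 16 bits or stops its tables at 0xffff cannot represent it.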

There was a thread on the Linux-UTF8 list (for people other
than Owen, who was there :-):


http://mail.nl.linux.org/linux-utf8/2002-12/msg00000.html
http://mail.nl.linux.org/linux-utf8/2002-11/msg00148.html

Mozilla bug 182877 (http://bugzilla.mozilla.org/show_bug.cgi?id=182877) 
is also of some relevance. FreeType 2 had to be patched
to use TrueType fonts with a UCS-4 cmap, and that patch has been committed. 
An Xft patch is in the queue. 

Incidentally, with a patched FreeType, gedit (Pango) segfaulted
when Code2001 was chosen. I'll try to track down the cause.
Comment 1 Yao Zhang 2003-07-13 00:57:12 UTC
I tested it with a commercial font covering CJK Unified Ideographs
Extension B after the following change:

Index: pango/modules/basic/basic-fc.c
===================================================================
RCS file: /cvs/gnome/pango/modules/basic/basic-fc.c,v
retrieving revision 1.15
diff -u -r1.15 basic-fc.c
--- pango/modules/basic/basic-fc.c      14 Apr 2003 23:48:23 -0000      1.15
+++ pango/modules/basic/basic-fc.c      11 Jul 2003 21:06:22 -0000
@@ -66,7 +66,7 @@
   { 0xf900, 0xfa2d, "*" }, /* CJK Compatibility Ideographs */
   { 0xfe30, 0xfe6b, "*" }, /* CJK Compatibility Forms and Small Form Variants */
   { 0xff00, 0xffe3, "*" }, /* Halfwidth and Fullwidth Forms (partly) */
-  { 0x0000, 0xffff, "" },
+  { 0x0000, 0x2ffff, "" },
 };

Everything works flawlessly.
Comment 2 Noah Levitt 2003-07-15 22:07:13 UTC
Created attachment 18328
patch to draw the boxes for > ffff, and to enable up to 10ffff in basic-fc.c
Comment 3 Noah Levitt 2003-07-18 22:25:30 UTC
Created attachment 18420
patch to optimize the pango map data structure
Comment 4 Noah Levitt 2003-07-18 22:35:08 UTC
With the latter patch, the pango map structure takes 26k with the
basic engine covering up to 10ffff (that is, with the former patch
applied). For comparison, without the latter patch it takes 90k.
Without the patch and covering only up to ffff, it takes 40k. (These
are the numbers I measured; it's not impossible that I made a mistake.) 

It would be possible to optimize this patch even further by adding a
PangoMapEntry * to the PangoSubmap.d union, but it would probably
sacrifice some readability (I had enough trouble writing
map_add_engine). Looks like it would save another 6k or so.
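
To illustrate why a multi-level layout keeps a map over the full range up to 10ffff small, here is a minimal sketch of a three-level page table keyed by code point. It is not the attached patch and does not use the real PangoMap or PangoSubmap types: CpMap, Mid, Leaf, cp_map_set and cp_map_get are hypothetical names, and the 17 x 256 x 256 split is an assumption chosen for simplicity. Pages are allocated only when a code point in them is first set, so unused planes cost one NULL pointer each.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CP_MAX    0x10FFFFu
#define N_PLANES  17     /* planes 0..16 */
#define PAGE_SIZE 256

typedef struct { uint8_t value[PAGE_SIZE]; } Leaf;   /* low 8 bits    */
typedef struct { Leaf *leaf[PAGE_SIZE]; } Mid;       /* middle 8 bits */
typedef struct { Mid *plane[N_PLANES]; } CpMap;      /* top 5 bits    */

/* Store value for cp, allocating intermediate pages on demand.
 * Returns 0 on success, -1 if allocation fails.
 * Assumes calloc'd (all-zero) pointers compare equal to NULL,
 * which holds on mainstream platforms. */
int cp_map_set(CpMap *map, uint32_t cp, uint8_t value)
{
    assert(map != NULL && cp <= CP_MAX);
    Mid **mid = &map->plane[cp >> 16];
    if (*mid == NULL && (*mid = calloc(1, sizeof **mid)) == NULL)
        return -1;
    Leaf **leaf = &(*mid)->leaf[(cp >> 8) & 0xFF];
    if (*leaf == NULL && (*leaf = calloc(1, sizeof **leaf)) == NULL)
        return -1;
    (*leaf)->value[cp & 0xFF] = value;
    return 0;
}

/* Look up cp; pages that were never allocated read as 0. */
uint8_t cp_map_get(const CpMap *map, uint32_t cp)
{
    if (cp > CP_MAX)
        return 0;
    const Mid *mid = map->plane[cp >> 16];
    if (mid == NULL)
        return 0;
    const Leaf *leaf = mid->leaf[(cp >> 8) & 0xFF];
    return leaf != NULL ? leaf->value[cp & 0xFF] : 0;
}
```

The sketch omits freeing the pages; a real implementation would also need a destructor that walks both pointer levels.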
Comment 5 Owen Taylor 2003-07-31 19:00:53 UTC
*** Bug 118792 has been marked as a duplicate of this bug. ***
Comment 6 Owen Taylor 2003-11-17 23:21:58 UTC
The 6-digit hex square stuff looks fine. The other changes
shouldn't be necessary with the current script-based 
shaper selection code.

Does PangoCoverage need optimization / moving to a 3 level
page table like you did for PangoMap?
Comment 7 Noah Levitt 2003-11-18 22:57:32 UTC
2003-11-18  Noah Levitt  <nlevitt@columbia.edu>

	* pango/pangoxft-font.c (pango_xft_real_render): Draw 6-digit hex boxes
	for > U+FFFF. (#101081)
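
As an illustration of the digit-count rule behind the hex boxes, the following sketch picks 4 hex digits for BMP code points and 6 for anything above U+FFFF. It is not the code from pango_xft_real_render; hex_box_digits is a hypothetical helper, and how the digits are laid out inside the box is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Format cp as the digits shown in a "missing glyph" box:
 * 4 hex digits up to U+FFFF, 6 hex digits above that.
 * buf must hold at least 7 bytes. Returns the digit count,
 * or 0 if cp is beyond U+10FFFF. Hypothetical helper. */
int hex_box_digits(uint32_t cp, char buf[7])
{
    assert(buf != NULL);
    if (cp > 0x10FFFFu) {
        buf[0] = '\0';
        return 0;
    }
    int n = (cp > 0xFFFFu) ? 6 : 4;
    /* "%0*lX": zero-pad to n uppercase hex digits. */
    snprintf(buf, 7, "%0*lX", n, (unsigned long)cp);
    return n;
}
```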
Comment 8 Noah Levitt 2004-04-22 15:03:57 UTC
*** Bug 140570 has been marked as a duplicate of this bug. ***