Bug 125605 - Pango should support rendering of Khmer script
Status: RESOLVED FIXED
Product: pango
Classification: Platform
Component: general
Version: unspecified
Hardware: Other All
Importance: Normal normal
Target Milestone: Medium feature
Assigned To: pango-maint
QA Contact: pango-maint
Depends on:
Blocks: 141983
 
 
Reported: 2003-10-27 15:37 UTC by Christian Rose
Modified: 2005-06-22 15:06 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
convert the source to a diff against current indic module (11.73 KB, patch)
2004-07-15 16:18 UTC, Daniel Glassey

Description Christian Rose 2003-10-27 15:37:39 UTC
There are people interested in providing GNOME translations in Khmer [km],
but reportedly Pango doesn't support the rendering of the Khmer script yet
(http://lists.gnome.org/archives/gnome-i18n/2003-October/msg00232.html).
Comment 2 Owen Taylor 2003-10-27 15:54:22 UTC
Having free fonts available is a prerequisite to such work.

http://www.microsoft.com/typography/otfntdev/khmerot/default.htm

Describes the standard for OpenType Khmer fonts.
Comment 3 Javier SOLA 2003-10-28 01:38:52 UTC
There are OpenType fonts available for testing. Part of the project 
that I am developing includes buying fonts into the public domain, 
but they are not yet available as freeware; all we have is 
permission from the developer for copying for testing purposes.

I talked to Eric at IBM (ICU). They don't know if ICU will support 
Khmer or when. They are considering rewriting a part of ICU to make 
it easier to integrate in Pango, and as part of that effort they may 
include Khmer. But everything depends on later decisions.

Javier SOLA - fjsola@aui.es

Comment 4 Javier SOLA 2003-10-31 02:12:09 UTC
I have negotiated a set of Khmer fonts to be put in the public 
domain, but I don't know:

- What is the necessary basic set of fonts. Normal and cursive fonts 
have already been developed and are available. Do we also need 
bold and cursive bold?
- What information should be included in the fonts to inform that 
they are in the public domain?

Comment 5 Javier SOLA 2003-11-20 03:24:50 UTC
A public domain OpenType Khmer language font (KhmerOS) is now
available.  Bold, cursive and cursive-bold variants will be available
within a month. 

This is a very clear schematic font specially designed for small sizes
to be displayed in computer screens. I will soon put it up in a
website, meanwhile, I will send it to anybody who requests it.
Comment 6 Javier SOLA 2004-01-06 07:42:43 UTC
Khmer support has been developed by Lin Chear <linchear@rogers.com>
and it is available at
http://unicode.khmer.cc/khmerpango/indic-khmer.tar.gz

(approximately 16kb)

In its current incarnation, this module will 'break' Indic Pango
support and replace it with Khmer (it is a modification of the Indic
module and replaces it).

It now needs to be integrated in Pango. If integration by the
maintainer is not possible at this time, help or indications would be
most welcome.
Comment 7 Daniel Glassey 2004-07-15 16:18:16 UTC
Created attachment 29557
convert the source to a diff against current indic module

I had a quick look at this. Is the eventual aim to integrate it into the indic
module or to create a separate khmer module?

I'm attaching a diff of the modified module against the current indic module.

Issues I can see (if the khmer parts are going to be part of the indic module)
are that 
(1) the changes to indic-fc.c need to be cleaned up to merge with recent changes
to the indic modules
(2) in indic-ot.h the feature flags should not be broken for the indic scripts.
Is there a way to do this within the indic module or is this difference enough
to require a separate module?
Comment 8 Owen Taylor 2004-07-15 22:47:54 UTC
As I recall, one of the main reasons that "breaking" the current Indic feature
bits was needed is that there only used to be 16 total feature bits available.
With the switch to PangoOTBuffer I switched from a 16-bit quantity to an unsigned
long, so there should portably be 32 bits, so this should no longer be a problem.
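
To illustrate the point about feature bits, here is a minimal sketch in C. The
flag counts and helper names are hypothetical and are not taken from the Pango
sources; the only facts relied on are that a 16-bit mask holds 16 flags and
that the C standard guarantees at least 32 bits in an unsigned long.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical flag counts, for illustration only (not Pango's real ones). */
enum { SKETCH_INDIC_FLAGS = 16, SKETCH_KHMER_FLAGS = 4 };

/* Width in bits of the old 16-bit mask and of an unsigned long. */
static unsigned
bits_in_u16 (void)
{
  return (unsigned) (sizeof (uint16_t) * CHAR_BIT);
}

static unsigned
bits_in_ulong (void)
{
  return (unsigned) (sizeof (unsigned long) * CHAR_BIT);
}

/* Returns 1 if n distinct feature flags fit in a mask of `width` bits. */
static int
flags_fit (unsigned n, unsigned width)
{
  return n <= width;
}

/* Mask for feature flag number i, stored in an unsigned long. */
static unsigned long
flag_mask (unsigned i)
{
  assert (i < bits_in_ulong ());
  return 1UL << i;
}
```

With these assumed counts, 16 + 4 flags do not fit in the 16-bit mask but do fit
in an unsigned long, and a flag at bit 16 is lost if truncated to 16 bits.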

It would be interesting to see how much remains of the patch when the recent
changes are merged in. If it's a small patch, it seems to make more sense
to do it this way than to create an entirely separate module.

One thing is that Lin Chear also did a port of the Qt module as an independent
Pango module:

 http://mail.gnome.org/archives/gtk-i18n-list/2004-January/msg00042.html

It would be interesting to know how this compares for completeness, 
accuracy, etc, with the indic module modification.
Comment 9 Javier SOLA 2004-07-16 00:15:25 UTC
Owen. Yes, this is correct, the reason for the independent module was the flags.

Two or three more non-indian indic or similar CTL languages are trying to
develop Pango support: Lao and Dzongkha (Bhutan, tibetan script) and maybe
Myanmar. If a split was considered, maybe these languages should be bundled with
Khmer (non-indian indic module), but if Khmer is integrated into Indic, they
should also be easy to integrate.

Bug 141983 depends on this one. Its shape will depend on this decision.

There is another exception to be handled, the letter ROBAT. It was changed in
Unicode 4.0. I will file a bug on this after the module decision.
Comment 10 Owen Taylor 2004-12-15 02:41:51 UTC
I spent some time reading the Unicode and OpenType Khmer specs, and
I think it should be done as a separate module... while there are
clearly some strong parallels with the Indic scripts, Khmer also has
quite a few distinct features.

Trying to force Khmer into the Indic modules will produce a situation
where fixes for Khmer will break Indic and fixes for Indic break
Khmer. We see that even among the languages that the Indic modules
currently support.

I don't think there is any significant advantage to combining Lao,
Dzongkha support with Khmer. I suspect that Myanmar is also distinct
enough to really require a separate module. (Lao will be in 1.8 as
part of the Thai module... Lao and Thai are very similar. Tibetan
will be a separate module.)

If TrollTech formally agreed to allow the Qt module to be used under the
LGPL, then the Qt port could be used, otherwise Khmer support would have
to be written from scratch. 

(Another approach would be to take the Indic module, delete all the
unnecessary code, make changes like virama => coeng mark in the functions
and docs, and go from that. I don't know if that is easier than starting
from scratch.)
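
As a rough illustration of the coeng mark mentioned above: in Unicode, Khmer
consonants occupy U+1780 to U+17A2, and the coeng sign U+17D2 marks the
following consonant as a subscript. The sketch below only counts such pairs in
a UTF-32 string; the function names are hypothetical and this is not the
shaper's actual cluster logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_KHMER_COENG 0x17D2u  /* KHMER SIGN COENG */

/* Khmer consonants are encoded at U+1780..U+17A2. */
static int
sketch_is_khmer_consonant (uint32_t cp)
{
  return cp >= 0x1780u && cp <= 0x17A2u;
}

/* Counts coeng + consonant pairs, i.e. consonants that take subscript form. */
static size_t
sketch_count_coeng_subscripts (const uint32_t *text, size_t len)
{
  size_t count = 0;
  size_t i;

  assert (text != NULL || len == 0);

  for (i = 0; i + 1 < len; i++)
    if (text[i] == SKETCH_KHMER_COENG
        && sketch_is_khmer_consonant (text[i + 1]))
      count++;

  return count;
}
```

For example, the word U+1781 U+17D2 U+1798 U+17C2 U+179A contains one such
pair, so the function returns 1 for it.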

A Khmer module from scratch looks like roughly a week of work. The first
item of the work is to create a set of input test cases with images of
expected output.
Comment 11 Javier SOLA 2004-12-15 07:00:41 UTC
We reached the same conclusion. Khmer needs a separate module.

We have developed a Khmer module for ICU (starting from the ICU indic module),
and sent it to Eric for review. It works quite well for us, after extensive testing.

http://www.khmeros.info/download/issues/ICU-khmer-layout.zip

We are now trying to adapt this module to Pango.

We tried to use the old Pango-Khmer module as a shell structure, but it did not
integrate into present Pango, so we are working on integration. We expect to
finish before the end of the year.

Even if Dzongkha and Myanmar have similar characteristics, they are different
enough that trying to put them together in a single module would require a lot
of extra work.

Myanmar has two-code-point subscript vowels like Khmer... but (unlike Khmer)
two-code-point split matras, and an almost-indian-but-not group in which
NGO+virama+consonant ends up in consonant+diacritic shape (not in Khmer either).
It should be easy to work on, starting from a Khmer module, and taking a couple
of ideas from the indic module.
Comment 12 Behdad Esfahbod 2005-04-08 12:13:00 UTC
Date: Fri, 8 Apr 2005 01:43:06 -0400
From: Jens Herden <jens@khmeros.info>
To: gtk-i18n-list@gnome.org
Cc: Owen Taylor <otaylor@redhat.com>
Subject: Re: khmer modul for Pango

Hi List, hi Owen,

I found a bug and I also saw that my code layout was still not perfect. So I
updated the patch for the Khmer module again. You can get it still here:

    http://www.khmeros.info/download/pango-khmer-patch.zip

Please use this version for the integration into Pango, thanks

Jens
Comment 13 Owen Taylor 2005-06-21 16:03:11 UTC
I've committed code based on the last version of the patch. Changes
are:

 - Reordered changes to build files to keep modules in alphabetical order

 - Changed the copyright notices to include:

 * Partially based on Indic shaper
 * Copyright (C) 2001, 2002 IBM Corporation
 * Author: Eric Mader <mader@jtcsv.com>

 - Fixed various minor code-style problems (mostly excess {} around
   single line conditionals, and missing spaces in function calls:
   "foo(" rather than "foo (").

No substantive changes.

2005-06-21  Owen Taylor  <otaylor@redhat.com>

        * modules/khmer configure.in modules/Makefile.am
        modules/makefile.msc: Add a Khmer module by
        Jens Herden and Javier Sola. (#125605)
Comment 14 Behdad Esfahbod 2005-06-21 23:28:31 UTC
Owen:

Somehow the buffer initialization code in khmer_engine_shape has been dropped
from the patch you applied.  Both the patch in comment 12 and other modules have
a line like this:

buffer = pango_ot_buffer_new (fc_font);

As gcc warns, buffer may currently be used uninitialized, and I can confirm
that buffer is never assigned in khmer_engine_shape.
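
A minimal sketch of the pattern, using hypothetical stand-in types rather than
the real Pango API (SketchBuffer, sketch_buffer_new and sketch_shape are made
up for illustration): the pointer must be assigned before its first use, which
is what the dropped line does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the shaping buffer; not PangoOTBuffer. */
typedef struct
{
  int n_glyphs;
} SketchBuffer;

static SketchBuffer *
sketch_buffer_new (void)
{
  return calloc (1, sizeof (SketchBuffer));
}

/* Returns the number of glyphs recorded, or -1 on allocation failure. */
static int
sketch_shape (int n_chars)
{
  SketchBuffer *buffer;
  int result;

  assert (n_chars >= 0);

  /* Without this assignment, every later use of buffer reads an
     uninitialized pointer, which is what gcc warns about. */
  buffer = sketch_buffer_new ();
  if (buffer == NULL)
    return -1;

  buffer->n_glyphs = n_chars;
  result = buffer->n_glyphs;
  free (buffer);
  return result;
}
```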
Comment 15 Owen Taylor 2005-06-22 15:06:24 UTC
2005-06-22  Owen Taylor  <otaylor@redhat.com>

        * modules/khmer/khmer-fc.c (khmer_engine_shape): Add back
        accidentally dropped line (Pointed out by Behdad Esfahbod)