Bug 95708 – enhancing Hangul shaper (Xft) with Oxxx/Nxxx fonts


Summary:	enhancing Hangul shaper (Xft) with Oxxx/Nxxx fonts


Status:	RESOLVED WONTFIX

Product:	pango
Classification:	Platform
Component:	hangul
Version:	1.1.x
Hardware:	Other All

Importance:	High enhancement
Target Milestone:	1.4.0
Assigned To:	Changwoo Ryu
QA Contact:	pango-maint

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2002-10-14 09:33 UTC by Jungshik Shin
Modified:	2005-08-15 01:36 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
patch v1 (53.14 KB, patch) 2002-10-17 08:58 UTC, Jungshik Shin	none	Details \| Review
patch v2 (61.39 KB, patch) 2002-10-18 11:53 UTC, Jungshik Shin	none	Details \| Review
patch v3 (72.10 KB, patch) 2002-10-19 23:53 UTC, Jungshik Shin	none	Details \| Review
a new patch (tone mark handling routine removed) (22.20 KB, patch) 2002-10-20 11:09 UTC, Jungshik Shin	none	Details \| Review
a new patch(normalization routine put in a separate file) (18.59 KB, patch) 2002-10-21 13:30 UTC, Jungshik Shin	none	Details \| Review
hangul-utils.c (a new file) for jamo normalization (4.01 KB, text/plain) 2002-10-21 13:31 UTC, Jungshik Shin		Details
a new file(hangul-utils.h) for jamo normalization (1.31 KB, text/plain) 2002-10-21 13:32 UTC, Jungshik Shin		Details
tables-ext-jamos.i (a new file) for Oxxx/Nxxx mapping (31.93 KB, text/plain) 2002-10-21 14:09 UTC, Jungshik Shin		Details
tables-jamos2.i (a new file) for Jamo normalization (18.32 KB, text/plain) 2002-10-21 14:10 UTC, Jungshik Shin		Details
a new patch(following gnome coding-style convention) (19.59 KB, patch) 2002-10-22 14:20 UTC, Jungshik Shin	none	Details \| Review
hangul-utils.h(new : variable/fucn/type name change) (1.33 KB, text/plain) 2002-10-22 14:22 UTC, Jungshik Shin		Details
hangul-utils.c (new: gnome convention) (4.11 KB, text/plain) 2002-10-22 14:24 UTC, Jungshik Shin		Details
a new patch(use bsearch instead of lin. search reducing function calls by 30 ~ 100) (19.97 KB, patch) 2002-10-23 19:53 UTC, Jungshik Shin	none	Details \| Review
hangul-utils.h(new : modified for bsearch) (1.54 KB, text/plain) 2002-10-23 19:55 UTC, Jungshik Shin		Details
hangul-utils.c(new : modified for bsearch) (5.07 KB, text/plain) 2002-10-23 19:57 UTC, Jungshik Shin		Details
tables-ext-jamos.i (new: modified for bsearch : sorted) (32.08 KB, text/plain) 2002-10-23 19:59 UTC, Jungshik Shin		Details
tables-jamos2.i(new: modified for bsearch , sorted) (19.09 KB, text/plain) 2002-10-23 20:00 UTC, Jungshik Shin		Details
all in one patch (76.86 KB, patch) 2002-12-12 21:33 UTC, Jungshik Shin	none	Details \| Review
a new patch against the head (50.17 KB, patch) 2003-08-27 10:46 UTC, Jungshik Shin	none	Details \| Review
updaed patch (using smaller and revised arrays for cluster jamo mapping) (50.34 KB, patch) 2003-08-27 13:43 UTC, Jungshik Shin	none	Details \| Review
new files (hangul-utils.*, tables-ext-jamos.i, tables-jamos2.i) (51.08 KB, patch) 2003-08-27 13:45 UTC, Jungshik Shin	none	Details \| Review
a new patch (updated against the trunk) (31.92 KB, patch) 2003-12-04 18:53 UTC, Jungshik Shin	none	Details \| Review

Description Jungshik Shin 2002-10-14 09:33:13 UTC

This is spun off from bug 95569 and below is copy'n'pasted from
it. 

-----------
    Another possibility is to make Pango do what Yudit
    and Lambda do with the Ogulim/Obatang/Ogungseo
    fonts. These fonts
    are distributed in Korean MS Office 2000 and
    Ogulim is also available as 'Old Korean support
    kit' at MS web site. They do not have OT tables
    for Hangul Jamos. Ogulim has a set of glyphs
    for all known consonant and vowel clusters which can
    be assembled together to render a pretty generic
    sequence of Hangul Jamos. 

    There is another set of fonts in MS Word 2000 (Korean)
    and the Old Korean support kit, namely Ngulim/Nbatang/Ngungseo.
    They have precomposed glyphs for
    all known Hangul syllables (thousands of
    them) ever found in Korean literature. Producing the mapping
    from Hangul Jamo sequences to those precomposed syllables
    is tedious, but doable.
 
    I'm wondering whether this font-specific 'hack' can be
    included in Pango. This is sort of like the hack for
    KAIST/Iyagi BDF johab fonts. If there's a way to
    uniquely identify these fonts, I think it's possible,
    and it would dramatically improve Pango's ability to
    render pre-1933 orthography Korean text until
    Korean OTFs with Hangul Jamo support are
    widely available.
-----------

  In Unicode, a Hangul syllable S is defined as
a sequence of Hangul leading consonants (L+), a sequence
of Hangul vowels (V+) and an optional sequence of
Hangul trailing consonants (T*).

   S := L+ V+ T* M?               (1)


Although the number of
L's, V's and T's in each sequence can be any number in theory,
we can define a Hangul syllable S in practice
as follows:

  S:=  L{1,3} V{1,3} T{0,3} M?    (2)

where M is either U+302E or U+302F (Hangul tone mark)
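
Definition (2) can be checked mechanically. The following is a minimal C sketch, not code from the patch: the function name is_practical_syllable() and the jamo ranges (L: U+1100..U+115F, V: U+1160..U+11A7, T: U+11A8..U+11FF, i.e. the conjoining jamo block split three ways) are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed ranges: the U+1100..U+11FF block split into L, V and T. */
static int is_l(uint32_t c) { return c >= 0x1100 && c <= 0x115F; }
static int is_v(uint32_t c) { return c >= 0x1160 && c <= 0x11A7; }
static int is_t(uint32_t c) { return c >= 0x11A8 && c <= 0x11FF; }
static int is_m(uint32_t c) { return c == 0x302E || c == 0x302F; }

/* Count at most `max` consecutive code points, starting at s[i],
   that are accepted by `pred`. */
static size_t run(const uint32_t *s, size_t n, size_t i,
                  int (*pred)(uint32_t), size_t max)
{
    size_t k = 0;
    while (i + k < n && k < max && pred(s[i + k]))
        k++;
    return k;
}

/* Returns 1 if s[0..n) is exactly one syllable of the form
   L{1,3} V{1,3} T{0,3} M?, and 0 otherwise. */
int is_practical_syllable(const uint32_t *s, size_t n)
{
    size_t i = 0, k;

    k = run(s, n, i, is_l, 3);
    if (k < 1) return 0;
    i += k;

    k = run(s, n, i, is_v, 3);
    if (k < 1) return 0;
    i += k;

    i += run(s, n, i, is_t, 3);      /* trailing consonants are optional */

    if (i < n && is_m(s[i]))         /* optional tone mark */
        i++;

    return i == n;                   /* nothing may be left over */
}
```

A sequence with four leading consonants is rejected because the fourth L is found where a vowel is required.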

Using Ogulim/Obatang/Ogungseo, it's possible to render
about 500k syllables formed out of all known instances
of Jamo clusters found in existing Korean literature.
This is not fully generic, nor does it come close to covering
even the limited definition given by (2). Nonetheless,
it's a great step forward and should be sufficient
for virtually all Korean linguists and the general public, except
for a select few with wide-ranging imagination and
a desire to come up with and use novel (hitherto unused/not found)
Jamo clusters.

It seems not so hard to add this to pango/modules/hangul/hangul-xft.c.
One question I have is whether it's all right to
call pango_ft2_font_get_face()(which is necessary for
figuring out the name of a font face for which this font-specific
hack is applicable) in hangul-xft.c. That introduces
a dependency on FT2 in an otherwise Xft-only part.

FYI, Ogulim/Ngulim fonts are available at
http://office.microsoft.com/korea/assistance/2000/weboldhg.aspx

Comment 1 Owen Taylor 2002-10-14 15:01:58 UTC

No, FT2 code is not available in Xft modules. 

 pango_xft_lock_face() 

is pretty similar though.

Comment 2 Jungshik Shin 2002-10-16 08:40:04 UTC

Owen,
Thanks for info. on pango_xft_lock_face(). 

BTW, I put up the list of extra Jamos available in Oxxx.ttf
at http://jshin.net/i18n/korean/jamos_ogulim.txt 
The list of precomposed syllables (pre-1933-orthography)
in Nxxx.ttf is at http://jshin.net/i18n/korean/ngulim.html.
(Obviously, Ngulim.ttf has to be installed and Mozilla-Xft
works fine in that case.)

Comment 3 Jungshik Shin 2002-10-17 08:58:21 UTC

Created attachment 11622 [details] [review]
patch v1

Comment 4 Jungshik Shin 2002-10-17 09:31:14 UTC

this is still a work in progress. just wanted to put it somewhere safer
than my disk. Nonetheless, it works in the sense that :

  - fonts with spacing jamo glyphs and fonts with combining jamo
    glyphs are distinguished and jamo sequences are rendered
    accordingly

  - basic jamo sequences are automatically 'normalized' to
    jamo clusters with code points of their own. The backing store
    remains intact.

  - Oxxx should work if I can find a workaround for the following
    problem

  - tone mark (with fallback : still more work to do here)

I encountered a problem with Ogulim and fontconfig(?), though. Ogulim
(Gulim Old Hangul Jamo) has a hack encoding, and even if I explicitly
specify that it be used (in an application like gedit),
fontconfig (this is only my suspicion) comes back with 'New Gulim'.
Therefore, pango_xft_lock_face(font)->family_name is 'New Gulim'.
Perhaps this requires the kind of hackery Owen mentioned in the Mozilla
bug for Xft (http://bugzilla.mozilla.org/show_bug.cgi?id=126919#c87).


gedit may not be the best application to test this.. Is there
any Gtk application that makes use of 'fontset' in PangoXft
and 'Pango coverage map'(?) so that I can test Oxxx rendering? 


  * Work to do:

   - clean up and optimize 
   - add support for Nxxxx 
   - figure out how to use Oxxx code(already written)
   - fix 'spacing' problem with tone mark

Comment 5 Jungshik Shin 2002-10-17 09:39:39 UTC

Tone marks work well if a font (e.g. CODE2000) has (combining)
glyphs for them. A glyph positioning
problem arises when I have to resort to a fallback (':' and 'middle dot'
for U+302E and U+302F).

Comment 6 Jungshik Shin 2002-10-18 11:53:16 UTC

Created attachment 11660 [details] [review]
patch v2

Comment 7 Jungshik Shin 2002-10-19 23:53:56 UTC

Created attachment 11693 [details] [review]
patch v3

Comment 8 Jungshik Shin 2002-10-20 00:14:33 UTC

Now everything I wanted to implement is in place,
except that I have yet to complete the precomposed syllable
look-up table for New Gulim. There are about 5000
precomposed syllables in the font, and the table included
in the patch has about 250 (a 20th of the total). However,
with them I was able to test my look-up routine, and it
works as intended.

  Work to do:

   - More extensive testing
   - Clean-up (including indentation changes)
     and reorganization (hangul-x.c may need
     some of the utility functions currently in
     hangul-xft.c)
   - implement a 'best-possible' rendering approach
     for long jamo clusters unrenderable as syllables
     even with Ngulim and Ogulim. They must be 'new'
     syllables invented by creative minds :-)

As for using Ogulim despite its hack encoding, I'm pretty certain
that fontconfig could make things really easy if only the
'charset' and 'lang' properties of a font were
'editable' in the configuration (fonts.conf), as is the case for
other properties (e.g. family). Then upstream clients of
fontconfig like Pango would have very little to do
except for changes similar to what my patch does.

Comment 9 Jungshik Shin 2002-10-20 09:16:47 UTC

Currently, my patch does two extra things besides Nxxx/Oxxx support:
adding Hangul tone mark support, and distinguishing fonts with
combining glyphs from fonts with spacing glyphs for Hangul vowels and trailing
consonants.

I'll file two separate bugs for them if it's deemed necessary to
expedite things. Hangul tone mark handling seems to be an easier
target for this separation (I've just filed bug 96299 for
Hangul tone mark handling). It also involves somewhat complex
handling of width/x-offset setting (not yet implemented in
my patch), which makes the case for the separation even stronger.

Comment 10 Jungshik Shin 2002-10-20 09:31:58 UTC

In my patch, distinguishing fonts with combining glyphs from fonts with
spacing glyphs for Hangul vowels and trailing consonants
is done in the same routine, set_render_func(), that selects render_func
based on family name (for Nxxx and Oxxx). This makes it
hard to separate the former (distinguishing fonts with combining glyphs
from fonts with spacing glyphs) from this bug. Therefore, I prefer
to leave it as it is for now.

Later, when my patch is committed, we can file another bug
about selecting the rendering function in both hangul-x and hangul-xft,
because it seems that what's done in the Thai shaper has to be done here
(caching per font, defining a type or two for Hangul font type
and charset, using GQuark, etc.). For the record, I filed bug 96300
for this issue.

As bug 96299 is spun off from this for Hangul tone mark handling,
I'll upload a new patch without Hangul tone mark handling here soon.

Comment 11 Jungshik Shin 2002-10-20 11:09:43 UTC

Created attachment 11706 [details] [review]
a new patch (tone mark handling routine removed)

Comment 12 Jungshik Shin 2002-10-20 12:26:07 UTC

attachment 11706 [details] [review] does not include the tables-ext-jamos.i and tables-jamos2.i
files because they haven't changed since patch v3.

The Hangul tone mark handling routine (render_tone()) was separated out
to bug 96299. Beginning with attachment 11706 [details] [review], patches uploaded
here include calls to that function from various
render_func's, but render_tone() itself won't be included.
Besides, my patches (from now on) are against my personal tree
with the latest patch for bug 96299 applied. That is, patches
here are produced as if bug 96299 were resolved with my latest
patch.

Now, let me break down my patch here and explain what each part does:

* a new type  __jamo_norm_map defined in hangul-defs.h
  : used by jamo_srch_repl(), which is
  invoked by normalize_jamo() and og_transform().
  __u1100_jamo_clusters[] in tables-jamos2.i and
  __ext_xx_clusters[] (where xx is lc|vo|tc) in tables-ext-jamos.i
  are arrays of this type.

  It holds a mapping from a sequence of up to 3 (MAX_BASIC_JAMOS)
  basic Jamos to a cluster jamo. The definition of 'basic' jamo
  is a bit fluid. It means any jamo that can be regarded as
  a subcomponent of a cluster jamo.

* tables-jamos2.i : included by hangul-xft.c and used in
  normalize_jamo(). 
  - has __u1100_jamo_clusters[]. This array is automatically
    generated from the compatibility decomposition mapping
    in the Unicode 2.0 data file.

* tables-ext-jamos.i : included by hangul-xft.c and used 
  in og_transform().
  - includes various #define's for Oxxx and Nxxx fonts
   (OG_*'s and NG_*'s. OG and NG are for Ogulim and Ngulim,
    respectively)
  - __oj_to_ns is typedef'ed as __jamo_norm_map. A new name
    is used because the semantics are different. This type
    holds a mapping from a sequence of Oxxx Jamos
    (extended Jamo set including jamos not given codepoints
     of their own in U+1100 block) to a precomposed syllable
     glyph position in Nxxx (Ngulim) fonts. Thus the name:
     oj_to_ns stands for 'Ogulim Jamo to Ngulim Syllable'.

  - three arrays __ext_xx_clusters[] (xx : lc, vo, tc)
    hold mappings from basic jamo sequences to OG cluster jamos
    used in og_transform(). Copied from my implementation
    (extending CHO Jin-Hwan's implementation) in Lambda
     and Yudit.

  - __ogulim_xx_gidx (xx=lc,vo,tc) : mappings from 
    extended Jamo code points (temporary) to 
    'glyph code points' in Oxxx fonts. 

  - __ogulim_....map : 4 of them :
    Oxxx fonts have six glyphs for each LC, 2 glyphs for 
    VO and 4 glyphs for each TC. Which of these glyphs
    to use when forming a syllable depends on whether
    it has TC and what kind of vowel is used in a syllable
    (horizontal, vertical, or both horizontal and vertical).
    These 4 arrays hold mappings to use in selecting a glyph
    based on those factors. Worked out manually by CHO Jin-Hwan
    and extended to support extended Jamos by me. 

  - __og_jamos_to_ng_syllable[] : an array of type __oj_to_ns.
    A mapping table from a sequence of OG Jamos to an NG precomposed
    syllable. Nxxx (Ngulim) fonts have about 5000 precomposed
    syllables in PUA. All those syllables are formed out of
    OG-extended Jamos.

* functions in hangul-xft.c

  - jamo_srch_repl(__jamo_norm_map *cluster, gunichar *in, int *len):
    search for cluster->seq in 'in' and replace it with 
    cluster->liga   in place. returns the difference in length 
    between before and after the replacement.

    called by normalize_jamo, og_transform and render_...with_ngulim()

  - gunichar* normalize_jamo(const gunichar* in, int *len):
    1. Normalize (regularize) a jamo sequence to put it in the regular
       syllable form defined in Unicode 3.2, section 3.11, to the extent
       that it's useful in rendering by the render_func's.

    2. Replace a compatibly decomposed Jamo sequence (Unicode 2.0
       definition) with a 'precomposed' Jamo cluster (with a codepoint
       of its own in the U+1100 block). For instance, the sequence
       U+1100, U+1100 is replaced by U+1101. It actually does
       more than the Unicode 2.0 decomposition map suggests.
       For a Jamo cluster made up of three basic Jamos
       (e.g. U+1133 : Sios, Pieup, Kiyeok), not only
       the sequence Sios(U+1109), Pieup(U+1107),
       Kiyeok(U+1100) but also two more sequences,
       {U+1132(Sios-Pieup), U+1100(Kiyeok)} and {Sios(U+1109),
       U+111E(Pieup-Kiyeok)}, are mapped to U+1133.

    3. The result is returned in a newly allocated (g_new'd)
       gunichar*. The calling function has to g_free it.

  - typedef : void (* RenderSyllableFunc) : the same usage
    as in hangul-x.c. all render_syllable_xxx  funcs are
    of this type and used in set_render_func()

  - render_as_precomp_syllable() :
    invoked by render_syllable_with_(combining|spacing|ngulim).
    When a Jamo sequence can be converted to a precomposed
    syllable in U+AC00 block and a font has a glyph for it,
    this is invoked

  - render_syllable_base() : one additional argument
    to distinguish between a font with spacing jamo glyphs
    and a font with combining jamo glyphs.
    As discussed in bug 95569, when combining glyphs for
    simple overstriking are available in a font,
    the best-possible effort may lead to an undesirable
    result. So the two cases are treated differently.

    As in all other render_syllable_xxx()'s, tone mark
    is processed first and normalize_jamo() is invoked
    before processing further. I won't mention this
    for other render_syllable_xxx()'s.

  - render_syllable_with_(combining|spacing) :
    just calls render_syllable_base() with the last argument
    set depending on a type of font

  - static void og_transform (gunichar *text, int *length)

    1. shift jamo sequences to three disjoint code blocks in
       PUA (0xF000 for LC, 0xF100 for VO, 0xF200 for TC).

    2. replace a jamo sequence with a precomposed OG-extended 
       cluster jamo code point in PUA
    3. this replacement is done 'in place' 

  - render_syllable_with_ogulim()
    1. og_transform() a jamo sequence
    2. If renderable with OG-extended Jamo glyphs,
       do it using the various mapping tables defined
       in tables-ext-jamos.i
    3. otherwise, render it by enumerating glyphs for the jamos
       in the sequence. V's and T's are
       preceded by Lf to advance the cursor position
       because glyphs for V and
       T in Oxxx fonts are non-spacing.
 
  - oj_ns_comp() : a comparison function
    used to bsearch() for an OG-ext. Jamo
    sequence in the __og_jamos_to_ng_syllable[] array.

  - render_syllable_with_ngulim() :

    1. after the processing common to all render_syllable_xxx()'s,
       try to render the sequence with a precomposed syllable
       in the U+AC00 block.
    2. og_transform() it
    3. if the result is not a sequence that can form a syllable
       with two or three OG-ext. jamos, go to the fallback (#6)
    4. bsearch() for the OG-ext. Jamo sequence in
       __og_jamos_to_ng_syllable[]. If a match is found,
       use that precomposed syllable glyph
    5. if not, render the sequence as a sequence of OG-ext. jamo
       glyphs designed in such a way that simple overstriking
       results in a syllable glyph.
    6. enumerate stand-alone jamos as a fallback.
       As in render_syllable_with_ogulim(), Lf is put
       before V and T to advance the 'cursor' because V and
       T glyphs are not spacing in Nxxx fonts.

  -  set_render_func(PangoFont *font, RenderSyllableFunc *render_func)

      1. invoke FT_Face to figure out family name of font
      2. Set render_func() based on family name first
      3. inspect U+1161 (vowel A) glyph and see if it's 
         spacing or combining, set render_func accordingly.

  -  hangul_engine_shape() : 

     1. set_render_func() is called
     2. render_func() is invoked instead of render_syllable().
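
The font-type dispatch described for set_render_func() above might be sketched as follows. This is an illustration only, not the patch's code: FontInfo, RenderKind and choose_render_kind() are hypothetical names, FontInfo stands in for whatever the real code reads from the locked FT_Face, and the two family-name strings are simply the ones mentioned in comment 4 (the strings the patch actually matches are not shown here). Treating a zero advance for U+1161 as "combining" is likewise an assumption.

```c
#include <string.h>

/* Which family of render_syllable_xxx() functions to use. */
typedef enum {
    RENDER_COMBINING,
    RENDER_SPACING,
    RENDER_OGULIM,
    RENDER_NGULIM
} RenderKind;

/* Hypothetical stand-in for data obtained from the font face. */
typedef struct {
    const char *family_name;
    int advance_u1161;   /* horizontal advance of the U+1161 glyph */
} FontInfo;

RenderKind choose_render_kind(const FontInfo *font)
{
    /* Steps 1 and 2: the family name decides first. */
    if (font->family_name != NULL) {
        if (strcmp(font->family_name, "Gulim Old Hangul Jamo") == 0)
            return RENDER_OGULIM;
        if (strcmp(font->family_name, "New Gulim") == 0)
            return RENDER_NGULIM;
    }

    /* Step 3: otherwise look at the vowel A glyph. A glyph that does
       not advance the pen is assumed to be a combining glyph. */
    return font->advance_u1161 == 0 ? RENDER_COMBINING : RENDER_SPACING;
}
```

Checking the family name before the glyph metrics follows the order of the numbered steps, so the font-specific paths take priority over the generic spacing/combining split.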


I hope this explanation will help in understanding my patch and
expedite committing it. Comments are all welcome.

Comment 13 Jungshik Shin 2002-10-21 13:30:09 UTC

Created attachment 11728 [details] [review]
a new patch(normalization routine put in a separate file)

Comment 14 Jungshik Shin 2002-10-21 13:31:37 UTC

Created attachment 11729 [details]
hangul-utils.c (a new file) for jamo normalization

Comment 15 Jungshik Shin 2002-10-21 13:32:27 UTC

Created attachment 11730 [details]
a new file(hangul-utils.h) for jamo normalization

Comment 16 Jungshik Shin 2002-10-21 14:09:17 UTC

Created attachment 11731 [details]
tables-ext-jamos.i (a new file) for Oxxx/Nxxx mapping

Comment 17 Jungshik Shin 2002-10-21 14:10:05 UTC

Created attachment 11732 [details]
tables-jamos2.i (a new file) for Jamo normalization

Comment 18 Jungshik Shin 2002-10-21 14:17:25 UTC

In the latest patch (one patch against HEAD, 4 new files), 
the dependence on bug 96299 is completely gone. This patch can
go in independently of bug 96299. Jamo normalization related
routines are put in two new files (hangul-utils.c and hangul-utils.h;
I'm open to suggestions for a better name, if any) in the expectation
that these routines will be used by hangul-x.c as well in the future
(see bug 96314). Note that I can't deal with
Jamo-normalization in a new bug independent of this one because
og_transform in hangul-xft.c shares a function (jamo_srch_repl())
and data structure (__jamo_norm_map) with normalize_jamo().
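
For readers without the attachments at hand, the shared data structure and search-and-replace helper could look roughly like this. It is a sketch under stated assumptions, not the attached hangul-utils.c: JamoNormMap, jamo_search_replace(), sample_map and normalize_sample() are hypothetical names, uint32_t stands in for gunichar, and the table holds only the four mappings given as examples in comment 12. Unlike normalize_jamo() as described there, this version works in place rather than returning a newly allocated copy.

```c
#include <stdint.h>
#include <string.h>

#define MAX_BASIC_JAMOS 3

typedef struct {
    uint32_t seq[MAX_BASIC_JAMOS]; /* basic jamo sequence, padded with 0 */
    uint32_t liga;                 /* cluster jamo the sequence maps to */
} JamoNormMap;

/* Sample entries only. Longer sequences come first so that a
   three-jamo cluster is matched before any two-jamo prefix. */
static const JamoNormMap sample_map[] = {
    { {0x1109, 0x1107, 0x1100}, 0x1133 }, /* Sios, Pieup, Kiyeok  */
    { {0x1132, 0x1100, 0},      0x1133 }, /* Sios-Pieup, Kiyeok   */
    { {0x1109, 0x111E, 0},      0x1133 }, /* Sios, Pieup-Kiyeok   */
    { {0x1100, 0x1100, 0},      0x1101 }, /* Kiyeok, Kiyeok       */
};

static size_t seq_len(const JamoNormMap *m)
{
    size_t n = 0;
    while (n < MAX_BASIC_JAMOS && m->seq[n] != 0)
        n++;
    return n;
}

/* Replace every occurrence of m->seq in text[0..*len) with m->liga,
   in place. Returns the number of code points removed. */
int jamo_search_replace(const JamoNormMap *m, uint32_t *text, int *len)
{
    size_t n = seq_len(m), i = 0, total = (size_t)*len;
    int removed = 0;

    if (n == 0)
        return 0;
    while (i + n <= total) {
        if (memcmp(text + i, m->seq, n * sizeof *text) == 0) {
            text[i] = m->liga;
            /* Close the gap left by the replaced sequence. */
            memmove(text + i + 1, text + i + n,
                    (total - i - n) * sizeof *text);
            total -= n - 1;
            removed += (int)(n - 1);
        }
        i++;
    }
    *len = (int)total;
    return removed;
}

/* Apply every table entry in order; returns total code points removed. */
int normalize_sample(uint32_t *text, int *len)
{
    int removed = 0;
    for (size_t k = 0; k < sizeof sample_map / sizeof sample_map[0]; k++)
        removed += jamo_search_replace(&sample_map[k], text, len);
    return removed;
}
```

With this table, the sequence U+1109 U+1107 U+1100 U+1161 becomes U+1133 U+1161, and U+1100 U+1100 U+1161 becomes U+1101 U+1161.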

Comment 19 Jungshik Shin 2002-10-22 14:20:56 UTC

Created attachment 11755 [details] [review]
a new patch(following gnome coding-style convention)

Comment 20 Jungshik Shin 2002-10-22 14:22:40 UTC

Created attachment 11756 [details]
hangul-utils.h(new : variable/fucn/type name change)

Comment 21 Jungshik Shin 2002-10-22 14:24:02 UTC

Created attachment 11757 [details]
hangul-utils.c (new: gnome convention)

Comment 22 Jungshik Shin 2002-10-23 19:53:55 UTC

Created attachment 11787 [details] [review]
a new patch(use bsearch instead of lin. search reducing function calls by 30 ~ 100)

Comment 23 Jungshik Shin 2002-10-23 19:55:19 UTC

Created attachment 11788 [details]
hangul-utils.h(new : modified for bsearch)

Comment 24 Jungshik Shin 2002-10-23 19:57:50 UTC

Created attachment 11789 [details]
hangul-utils.c(new : modified for bsearch)

Comment 25 Jungshik Shin 2002-10-23 19:59:57 UTC

Created attachment 11790 [details]
tables-ext-jamos.i (new: modified for bsearch : sorted)

Comment 26 Jungshik Shin 2002-10-23 20:00:43 UTC

Created attachment 11791 [details]
tables-jamos2.i(new: modified for bsearch , sorted)
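
The switch to bsearch() in these attachments relies on the tables being sorted by key. A minimal sketch of such a lookup follows, under loud assumptions: SyllableMap, sample_table, syllable_map_cmp() and lookup_syllable() are hypothetical names, and every code point in the table is a placeholder rather than a real Ogulim/Ngulim value.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint32_t seq[3];  /* L, V, T jamo codes (T == 0 when absent) */
    uint32_t glyph;   /* precomposed syllable code point (placeholder) */
} SyllableMap;

/* Must stay sorted lexicographically by seq for bsearch() to work. */
static const SyllableMap sample_table[] = {
    { {0xF000, 0xF100, 0},      0xE001 },
    { {0xF000, 0xF100, 0xF200}, 0xE002 },
    { {0xF001, 0xF100, 0},      0xE003 },
    { {0xF001, 0xF101, 0xF203}, 0xE004 },
};

/* bsearch() comparator: key is a uint32_t[3], elem is a SyllableMap. */
static int syllable_map_cmp(const void *key, const void *elem)
{
    const uint32_t *k = key;
    const SyllableMap *e = elem;

    for (int i = 0; i < 3; i++) {
        if (k[i] < e->seq[i]) return -1;
        if (k[i] > e->seq[i]) return 1;
    }
    return 0;
}

/* Returns the mapped code point, or 0 if the sequence is not listed. */
uint32_t lookup_syllable(const uint32_t seq[3])
{
    const SyllableMap *hit = bsearch(seq, sample_table,
                                     sizeof sample_table / sizeof sample_table[0],
                                     sizeof sample_table[0],
                                     syllable_map_cmp);
    return hit ? hit->glyph : 0;
}
```

A miss returns 0 here, which a caller could use as the signal to fall through to the next rendering strategy.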

Comment 27 Changwoo Ryu 2002-10-26 21:26:45 UTC

-  /* Well, no unicode rendering engine could render Hangul Jamo area
-     _exactly_, I sure.  */
+  /* XXX : If font is Oxxx or Nxxx, set to PANGO_COVERAGE_EXACT for
U+1100 Jamos */

You can do that if the Oxxx/Nxxx code shows "U+1100 U+1100 U+1100 ...
(100 times) U+1161 U+1161 ... (10000 times)" as a reasonable syllable
form.

It's still PANGO_COVERAGE_FALLBACK.

Comment 28 Jungshik Shin 2002-10-26 23:32:37 UTC

Am I supposed to respond to your last comment? Did you want me to?
How about these lines in hangul-x.c?

-----------
      else if (render_func == render_syllable_with_ksx1001johab)
        {
          for (i = 0x1100; i <= 0x11ff; i++)
            pango_coverage_set (result, i, PANGO_COVERAGE_EXACT);
----------------

EXACT may well be a bit of an overstatement for Oxxx/Nxxx style fonts
for an example like yours, and APPROXIMATE may be about right.
On the other hand, in light of the above and other similar coverage
settings in hangul-x.c, EXACT is not much of an overstatement.
FALLBACK is clearly an understatement.

It also has to be noted that Oxxx/Nxxx style fonts cover over one
and a half million syllables
(all the syllable combinations that can be composed out of
all the known consonants and vowels in every single
book published since 1443. Of course, there may still be some omissions,
and some creative - or not so creative minds like mine - can
come up with new vowel clusters and consonant clusters at any time.)
Anyway, they have the best coverage and can be a good model for
developing new fonts for better support of Hangul Jamos.

Comment 29 Changwoo Ryu 2002-10-27 05:06:00 UTC

That also should be fixed.

The ksc5601.1992-3 stuff was not written by me.  Originally it's from
Sun Microsystems.  I did not notice the lines when applying the patch
from Sun.

Comment 30 Jungshik Shin 2002-10-27 13:42:37 UTC

Hmm.. I don't recall writing that you're responsible for that
in hangul-x.c. I didn't even imply it(because of Owen's
comment around that part of the file) although you apparently
thought that way.

Anyway, why don't you just tell me what the exact graphical form
(as mentioned in the Pango coverage level doc.) is for your example (100
U+1100's followed by 10,000 U+1161's)?  While you're at it, could you
tell me what the exact graphical form is for
U+0041 followed by 100 combining diacritic marks for Latin
alphabets, in turn followed by a combining enclosing
circle?  Can Pango claim 'EXACT' coverage for U+0041 and
diacritic combining marks? If it currently does, does the level
have to be degraded for them because Pango can only stack up to,
say, 3 diacritics over U+0041?

Comment 31 Owen Taylor 2002-11-02 05:42:54 UTC

Moving bugs to new hangul component

Comment 32 Changwoo Ryu 2002-12-09 19:31:26 UTC

Jungshik, could you make a *single* patch against current CVS HEAD?
I could not build pango with your patches.

BTW, Because of the End User License Agreement of the Microsoft Nxx/Oxx
fonts (which are permitted to be used only with Windows OS), I will
just commit the patches without testing.

Comment 33 Jungshik Shin 2002-12-12 21:33:32 UTC

Created attachment 12947 [details] [review]
all in one patch

Comment 34 Jungshik Shin 2002-12-12 21:40:14 UTC

Changwoo, 
Can you take a look and apply the patch? 
There are a couple of enhancements
I can make, but they can be put off until 1.2.1 or later,
I think.
(BTW, because I don't have CVS write access here, new files
are diffed against /dev/null without an RCS/Index heading,
but that shouldn't be a problem when you apply it.)

Comment 35 Owen Taylor 2002-12-17 02:13:34 UTC

I'm really not comfortable adding this patch at this
point. It's 600 new lines of code (ignoring the tables).

I think we're best off saving this for Pango-1.3.x, when
we can get some testing before the final release.
If it is working well in the 1.3.x branch, perhaps we
can consider a backport to the stable 1.2.x series.

Comment 36 Jungshik Shin 2002-12-17 02:40:30 UTC

I can understand you don't feel very comfortable committing
this long patch not long before the release, but
the length of the patch should not scare you much.

Most of the patch adds new features (old Korean
text rendering). They wouldn't affect
users who don't use them because existing features and
functionality are affected little, if at all, by
the change. Functions are shuffled around in
hangul-xft.c to add the new features (which makes the patch long), but
existing features are well preserved.

I've been using Pango with the patch for the last month
and a half, and it has worked rather well for me.
Besides, basically the same implementation has been
tested in three other programs (Omega/Lambda, Yudit and Mozilla). This
doesn't guarantee that there's no bug,
but at least I can tell you that there's no known
regression or major bug.

I feel rather strongly about implementing this, so
it'd be nice if you could consider this
issue one more time.
Thanks.

Comment 37 Owen Taylor 2002-12-17 04:27:07 UTC

It's sometimes possible that even if a patch is working well
for someone who has everything configured right on the
system, it might cause some unexpected side effect for
people without a good Korean configuration. We've actually
seen crashes like this earlier.

As I understand it, this is a fairly specialized addition;
I'm sure that it's very important for some users, but maybe
not for the typical Korean user?

I suspect that the average Korean user is probably more concerned
by the fact that the delete key deletes an entire syllable
than by the missing support for these fonts...

Also, from cwryu's comments apparently fonts that can be used 
with this patch and Linux aren't widely available.

Given that, I just don't see it as worth the risk of adding
this code right before the freeze, and without testing by
a wider group of people.

That's not to say I don't want this added, I just don't
want this added right now.

Comment 38 Jungshik Shin 2003-08-27 10:46:57 UTC

Created attachment 19543 [details] [review]
a new patch against the head

Comment 39 Jungshik Shin 2003-08-27 10:49:34 UTC

attachment 19543 [details] [review] does not include new files but has only diffs against
files in the cvs. New files (e.g. tables-jamos2.i) haven't changed
since last December.

Comment 40 Jungshik Shin 2003-08-27 13:43:18 UTC

Created attachment 19546 [details] [review]
updaed patch (using smaller and revised arrays for cluster jamo mapping)

Comment 41 Jungshik Shin 2003-08-27 13:45:30 UTC

Created attachment 19547 [details] [review]
new files (hangul-utils.*, tables-ext-jamos.i, tables-jamos2.i)

Comment 42 Jungshik Shin 2003-08-27 13:49:00 UTC

In the latest two attachments (attachment 19546 [details] [review] and attachment 19547 [details] [review]),
I back-ported changes I made in Mozilla
(http://bugzilla.mozilla.org/show_bug.cgi?id=176315). They include
fixing some mistakes in mapping tables and cutting down the size of
some arrays. 

Otherwise, they're identical to the previous patch.

Comment 43 Owen Taylor 2003-11-17 23:06:59 UTC

If you can find someone else to review this patch, I'm OK
with it going in, though it does seem like a lot of code
to add for fonts that are (?) only available as part of 
MS Office.

Comment 44 Jungshik Shin 2003-11-18 04:20:36 UTC

Fonts are available to everyone for download (e.g.
http://www.korean.go.kr has the link to it for old Hangul display)
except that in _some_ countries, the EULA binds you not to use them on
platforms other than Windows. In other countries, it doesn't. For
instance, in Germany, it appears that the EULA is not effective, but
I'm not a lawyer and I don't feel 'comfortable' either.

Anyway, this patch includes a lot of code that can be made use of
for other Hangul fonts.

As for a reviewer, I can't think of anyone other than cwryu and noah.
cwryu seems very busy, but it'd be nice if he could review (he sorta did
last year (see his comment : 2002-12-09 [1]), although the patch had to
be modified to fit the new pango framework(?)).

noah, would you take a look? It's large but the principle is rather
simple. Besides, basically the same patch has been in Mozilla since
1.4(?)  (except that Mozilla patch is for a different font) and in Yudit.

Actually, I may find someone from the Korean Linux community (e.g.
Choi Hwanjin who wrote gtk2 input modules for Korean) if necessary.





[1] gnome-bugzilla has to be upgraded for an easier reference to
comments (like comment #15)

Comment 45 Noah Levitt 2003-11-21 06:10:43 UTC

Hi Jungshik, could you update the patch so it applies to current HEAD?

Comment 46 Jungshik Shin 2003-11-21 06:14:14 UTC

Hi Noah, I'll do it next week. Would that work?

Comment 47 Jungshik Shin 2003-12-04 18:53:43 UTC

Created attachment 22104 [details] [review]
a new patch (updated against the trunk)

Comment 48 Jungshik Shin 2003-12-04 18:56:30 UTC

Hi Noah, I'm sorry it took longer than I told you. Just attached is a
patch against HEAD. The new files (uploaded on Aug 27) can be used as
they are.
It'd be great if you could take a look and commit as appropriate.

Comment 49 alexander.winston 2004-01-25 01:02:21 UTC

Adding the PATCH keyword and setting the priority level to high.

Comment 50 Changwoo Ryu 2004-01-25 09:47:09 UTC

I don't understand why we should make an effort to support these fonts.
FYI, below is the EULA of the Oxxx/Nxxx fonts.  It is NOT legal to
use these fonts in non-Windows environments.  Nobody can use these
fonts anyway.  Nobody can even test this patch without violating this
damn EULA.

OK, of course some free alternative fonts could use this encoding
scheme in the future.  But I haven't heard any news about such
development.

-------------------

MICROSOFT Old Hangul Support Pack

ADDENDUM TO END USER LICENSE AGREEMENT 
FOR MICROSOFT PRODUCT ("EULA")

The Old Hangul support package you have installed or downloaded 
("Language Support Software") enables you to use the versions of 
Microsoft products identified as eligible for the Old Hangul Support 
Software (SOFTWARE PRODUCT) to view, input, manipulate or otherwise 
make use of information presented in Old Hangul.  You may install and 
use one copy of the Old Hangul Support Pack solely as an integrated 
component of a validly licensed copy of the SOFTWARE PRODUCT and 
Windows 95 or Windows NT 4.0 or later versions thereof. Your use of the 
Old Hangul Support Software is governed by this Addendum and the End 
User License Agreement applicable to the SOFTWARE PRODUCT.

Comment 51 Owen Taylor 2004-02-21 18:10:45 UTC

I don't think we should apply this unless people are making
free versions of fonts with these encodings. And since
Microsoft's direction in this area is OpenType fonts
(http://www.microsoft.com/typography/otfntdev/hangulot/default.htm)
I don't expect people to make fonts to match what they were
doing in the past.