Bug 555000 - Wrong treatment on non-spacing marks dead keys in GtkIMContextSimple
Status: RESOLVED FIXED
Product: gtk+
Classification: Platform
Component: Input Methods
Version: 2.14.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Hidetoshi Tajima
QA Contact: gtk-bugs
Depends on:
Blocks:
 
 
Reported: 2008-10-04 14:56 UTC by Theppitak Karoonboonyanan
Modified: 2008-10-10 08:29 UTC
See Also:
GNOME target: ---
GNOME version: 2.23/2.24


Attachments
Check for relevant scripts for dead keys (1.08 KB, patch)
2008-10-04 15:17 UTC, Theppitak Karoonboonyanan
Status: none
Updated macro that detects dead keys (1.25 KB, patch)
2008-10-08 22:18 UTC, Simos Xenitellis
Status: committed
List of compose sequences eliminated due to check_algorithmically() (143.56 KB, text/plain)
2008-10-08 22:27 UTC, Simos Xenitellis

Description Theppitak Karoonboonyanan 2008-10-04 14:56:26 UTC
The newly introduced "check_algorithmically()" in GtkIMContextSimple wrongly assumes that all non-spacing-mark dead keys precede their base characters. This is not true for certain languages, including Lao and Thai, where combining characters follow base characters. As a result, many valid sequences for those languages are rejected, as found in LP #273856 [1].

  [1] https://bugs.launchpad.net/ubuntu/intrepid/+source/gtk+2.0/+bug/273856

Steps to reproduce:
1. setxkbmap us,th -option grp:alt_shift_toggle
2. Start gedit
3. Choose "Simple" input method
4. Switch keyboard group to th and type some Thai text, for example, with corresponding keys for qwerty keyboard:
   l;ylfu8iy[

Expected result:
Thai text: สวัสดีครับ

What actually happens:
Thai text: สวสดครบ

In GTK+ 2.12, the "Simple" input method was the default for Thai input on non-Thai locales, and it worked, if in a very primitive way. The problem in GTK+ 2.14 has surprised many Thai users who are used to the English locale.
Comment 1 Theppitak Karoonboonyanan 2008-10-04 15:17:16 UTC
Created attachment 119922
Check for relevant scripts for dead keys

Only relevant scripts should be counted for the algorithmic check. I'm not sure the script list is complete yet; some scripts are probably still missing.
Comment 2 Matthias Clasen 2008-10-07 17:52:42 UTC
Simos, can you look at this?
Comment 3 Simos Xenitellis 2008-10-08 20:54:38 UTC
Indeed, checking whether the character is of type G_UNICODE_NON_SPACING_MARK is too permissive: there are 755 Unicode characters of that type in the BMP.

What I am considering is to change from 

#define IS_DEAD_KEY(k) \
    (((k) >= GDK_dead_grave && (k) <= (GDK_dead_dasia+1)) || \
     ((g_unichar_type (gdk_keyval_to_unicode (k)) == G_UNICODE_NON_SPACING_MARK) && \
      ((k) < 0x1000000)))

to

#define IS_DEAD_KEY(k) \
    (((k) >= GDK_dead_grave && (k) <= (GDK_dead_dasia+1)) && \
      ((k) < 0x1000000))

or just 

#define IS_DEAD_KEY(k) \
    ((k) >= GDK_dead_grave && (k) <= (GDK_dead_dasia+1))

which will accommodate Thai.

I am going through the compose sequences from X.Org that get eliminated when we produce gtkimcontextsimpleseqs.h, so that we do not have regressions. With this change we might miss some opportunities to get extra compose sequences auto-supported, but then again, if these sequences were not available in X.Org in the first place, it should be OK.
Comment 4 Simos Xenitellis 2008-10-08 22:18:41 UTC
Created attachment 120238
Updated macro that detects dead keys

I propose this patch to fix the issue with the Thai keyboard layout. 

The compose sequences that get eliminated when parsing the X.Org Compose file are all related to dead keys.
Comment 5 Simos Xenitellis 2008-10-08 22:27:22 UTC
Created attachment 120239
List of compose sequences eliminated due to check_algorithmically()

For reference, these are the compose sequences that get eliminated when parsing the X.Org Compose file. 
All relate to dead keys.

Some sequences, such as 
<U0331> <t>                     : "ṯ"   U1E6F # LATIN SMALL LETTER T WITH LINE BELOW

do not pose an issue because the keyboard layouts look like

ng:   key <AD11> { [ bracketleft,  braceleft,   0x1000331, 0x1000331 ] }; // combining macron below

(that is, it's 0x1000000 + value)

and check_algorithmically() does not touch these (due to IS_DEAD_KEY() macro).
Comment 6 Matthias Clasen 2008-10-10 05:28:47 UTC
Sounds good. Please commit to trunk and gtk-2-14
Comment 7 Simos Xenitellis 2008-10-10 08:29:58 UTC
Committed to both trunk and gtk-2-14:

        Bug 555000 – Wrong treatment on non-spacing marks dead keys in
        GtkIMContextSimple

        * gtk/gtkimcontextsimple.c: Change IS_DEAD_KEY() macro so that
        it only checks if input is a deadkey.