GNOME Bugzilla – Bug 167940
Add more dead key sequences (Greek, Greek Polytonic) to gtk+/gtk/gtkimcontextsimple.c
Last modified: 2006-02-21 12:45:11 UTC
Source: http://mail.gnome.org/archives/gtk-i18n-list/2004-December/msg00044.html

BEGINS=============>
> > 5. I realised that Greek Polytonic does not work (dead keys do not produce
> > accented characters). Who should I bug on this?
>
> Check if proper Compose sequences exist in Gtk+ source code (note that
> Gtk+ can produce only precomposed characters, it cannot produce
> sequence of characters like X can [at least it was so last time I
> checked]). If they don't exist, try looking into your X Compose file,
> $XROOT/lib/X11/locale/el_GR.UTF-8. If they exist there, and are not
> using only precomposed glyphs, I think it should be treated as RFE for
> Gtk+. You can test this either in xterm, xev or in Gtk+ programs
> with "X Input Method" enabled.

From the gtk+ sources, I see that /gtk+/gtk/gtkimcontextsimple.c does not include the affected composed letters. The GTK+ sources currently cover only the basic modern Greek composed characters. Owen autogenerates this file from the compose tables of the Xorg files.

I changed to "X Input Method" in gEdit and I managed to input the missing composed characters.

Owen, in the RFE for GTK+, do I (A) simply ask "Please, autogenerate /gtk+/gtk/gtkimcontextsimple.c including information from $XROOT/lib/X11/locale/el_GR.UTF-8" or (B) create it myself from the scripts you mention in gtkimcontextsimple.c and send a patch? If (B), please send me the appropriate scripts.
=============== ENDS

So, do I make the patch myself (in which case, please give me the scripts to do so) or can someone generate it?
--- gtkimcontextsimple.c.ORIGINAL 2005-06-26 02:00:07.000000000 +0100
+++ gtkimcontextsimple.c 2005-07-13 17:53:57.000000000 +0100
@@ -58,6 +58,10 @@
  */
 static const guint16 gtk_compose_seqs[] = {
+ GDK_dead_acute, GDK_period, 0, 0, 0, 0x00B7,
+ GDK_dead_acute, GDK_greater, 0, 0, 0, 0x00BB,
+ GDK_dead_acute, GDK_less, 0, 0, 0, 0x00AB,
 GDK_Greek_accentdieresis, GDK_Greek_iota, 0, 0, 0, 0x0390, /* GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS */
 GDK_Greek_accentdieresis, GDK_Greek_upsilon, 0, 0, 0, 0x03B0, /* GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS */
 GDK_dead_grave, GDK_space, 0, 0, 0, 0x0060, /* GRAVE_A

Tried the above on latest gtk+ 2.7.0 (CVS) as an example, setting GTK+ Input Method to "Default" and I do not get any output.

Running "xev" gives me middle dot (0xb7 or 0xc2b7 in UTF-8), see:

KeyPress event, serial 29, synthetic NO, window 0x3000001,
    root 0x48, subw 0x0, time 3656881, (136,32), root:(140,102),
    state 0x2000, keycode 60 (keysym 0x2e, period), same_screen YES,
    XLookupString gives 1 bytes: (2e) "."
    XmbLookupString gives 1 bytes: (2e) "."
    XFilterEvent returns: True

KeyPress event, serial 29, synthetic NO, window 0x3000001,
    root 0x48, subw 0x0, time 3656881, (136,32), root:(140,102),
    state 0x2000, keycode 0 (keysym 0xb7, periodcentered), same_screen YES,
    XKeysymToKeycode returns keycode: 24
    XLookupString gives 0 bytes:
    XmbLookupString gives 2 bytes: (c2 b7) "·"
    XFilterEvent returns: False

KeyRelease event, serial 29, synthetic NO, window 0x3000001,
    root 0x48, subw 0x0, time 3656898, (136,32), root:(140,102),
    state 0x2000, keycode 47 (keysym 0xfe51, dead_acute), same_screen YES,
    XLookupString gives 2 bytes: (c2 b4) "´"

KeyRelease event, serial 29, synthetic NO, window 0x3000001,
    root 0x48, subw 0x0, time 3656987, (136,32), root:(140,102),
    state 0x2000, keycode 60 (keysym 0x2e, period), same_screen YES,
    XLookupString gives 1 bytes: (2e) "."
Bugday (http://live.gnome.org/BugSquad/BugDays) tomorrow?

The comments in http://cvs.gnome.org/viewcvs/gtk%2B/gtk/gtkimcontextsimple.c?view=markup mention that, to add more keysyms to the "gtk_compose_seqs" array, one would need to use the scripts by Owen Taylor to convert the Compose file automatically to the format that GTK+ understands.

The Greek compose file is http://cvs.freedesktop.org/xorg/xc/nls/Compose/el_GR.UTF-8?view=markup (please see "Part 2", around the middle, for the Greek section).
Non-technical explanation:

There are two main Input Methods (IM), functionalities that let you type and view text in your own language. These are:

A. X Input Method (XIM), an input method that comes with X.org. It supports simple scripts (but cannot handle CJK and other exotic languages).

B. GTK+ IM, which looks like a "re-implementation" of XIM, as GTK+ is available in environments other than X.org (Windows, OS X, etc).

So, what is this bug report about? The GTK+ IM is sadly out of sync with XIM in one specific functionality: composing (using dead keys to add accents). GTK+ IM does not cover quite a few such key combinations, so in effect you cannot add accents or produce characters using dead keys. Only very basic characters are supported.

What needs to be done? See the previous comment in this bug report.
I am referring to modern Greek but also Greek Polytonic (ancient Greek).
See also bug #165723 about other suggested additions to gtkimcontextsimple.c.
Thanks, Tor, for the link to the similar bug.

I assume my issue is with not being able to add more dead-key sequences by editing gtkimcontextsimple.c and recompiling. Therefore, if anyone wants to reproduce:

1. Add "Keyboard Switcher" to the panel:
   A. Right-click on the panel.
   B. Add Applet/Tools/Keyboard Switcher.
   C. Go to the properties and add Greek/Extended (Modern Greek) to your existing layout.
   (see: http://www.livejournal.com/users/simos74/32918.html, in Greek)
   (see: http://planet.hellug.gr/misc/ubuntu/ubuntu-hoary-preview-write-greek.html, Flash animation, Greek Ubuntu Linux)
2. Open gedit.
3. Change the input method to Greek.
4. Type in gedit ";", then "." (full stop) ==> Nothing appears; you should see MIDDLE DOT.
   At this point, right-click in gedit and change the input method module from Default to X Input Method (XIM).
5. Type again in gedit ";", then ".".
6. You get MIDDLE DOT as expected.

Having a test system (Linux) with the CVS versions of gtk+, glib, and all the other necessary libraries has not helped me to solve it.
Did you add your entries to the table in the correct place? The table must be sorted.
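A likely reason the table has to be sorted is that lookups use a binary search. That is an assumption here, not something stated in this thread. The sketch below illustrates the idea on a three-row table. The names demo_seqs, compare_row and lookup_compose are hypothetical, and the keysym values are hard-coded from gdkkeysyms.h so that the example stands alone; it is not the code in gtkimcontextsimple.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_SEQ_LEN 5                 /* keysym columns per row (hypothetical name) */
#define ROW_STRIDE  (MAX_SEQ_LEN + 1) /* five keysyms plus the resulting character */

/* Keysym values as listed in gdkkeysyms.h, hard-coded for this sketch. */
#define KS_PERIOD               0x002E
#define KS_GREEK_ACCENTDIERESIS 0x07AE
#define KS_GREEK_IOTA           0x07E9
#define KS_GREEK_UPSILON        0x07F5
#define KS_DEAD_ACUTE           0xFE51

/* A tiny table, sorted by keysym value, column by column. */
static const uint16_t demo_seqs[] = {
  KS_GREEK_ACCENTDIERESIS, KS_GREEK_IOTA,    0, 0, 0, 0x0390,
  KS_GREEK_ACCENTDIERESIS, KS_GREEK_UPSILON, 0, 0, 0, 0x03B0,
  KS_DEAD_ACUTE,           KS_PERIOD,        0, 0, 0, 0x00B7,
};

/* bsearch() callback: compare the typed keysyms against one table row. */
static int compare_row(const void *key, const void *row)
{
  const uint16_t *k = key;
  const uint16_t *r = row;
  for (int i = 0; i < MAX_SEQ_LEN; i++) {
    if (k[i] != r[i])
      return k[i] < r[i] ? -1 : 1;
  }
  return 0;
}

/* Returns the composed character, or 0 when the sequence is not in the table. */
static uint16_t lookup_compose(const uint16_t *table, size_t n_rows,
                               const uint16_t keys[MAX_SEQ_LEN])
{
  assert(table != NULL && keys != NULL);
  const uint16_t *hit = bsearch(keys, table, n_rows,
                                ROW_STRIDE * sizeof(uint16_t), compare_row);
  return hit ? hit[MAX_SEQ_LEN] : 0;
}
```

With a lookup of this kind, a row placed out of order can simply never be reached, which would match the "nothing appears" symptom described above.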
I did not know that. I tried adding to the beginning and then at the end. Which field is the sorting key? I believe it is not the Unicode codepoint (first two entries). Where would I need to place "GDK_dead_acute, GDK_period, 0, 0, 0, 0x00B7," (press ";", then "." to get MIDDLE DOT)?
The first two columns are *not* Unicode code points, but GDK keysym values (= X11 keysym values). See the gdkkeysyms.h (or something like that) file. They are not really related to Unicode code points. (That they are 16-bit keysyms and not 32-bit Unicode code points was the reason I had to add that other table with 32-bit entries in the suggested patch in bug #165723)
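Since the leading columns are keysym values, the ordering of rows can be checked numerically. The sketch below assumes that rows are ordered by comparing their five keysym columns from left to right; that ordering rule is an assumption made for illustration. The helper names row_cmp and first_unsorted_row are hypothetical, and the keysym values are copied from gdkkeysyms.h. The as_patched table reproduces the rows of the earlier diff in the order they were added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SEQ_LEN = 5, STRIDE = SEQ_LEN + 1 };

/* Keysym values from gdkkeysyms.h, hard-coded for this sketch. */
enum {
  KS_SPACE = 0x0020, KS_PERIOD = 0x002E, KS_LESS = 0x003C, KS_GREATER = 0x003E,
  KS_GREEK_ACCENTDIERESIS = 0x07AE, KS_GREEK_IOTA = 0x07E9,
  KS_GREEK_UPSILON = 0x07F5, KS_DEAD_GRAVE = 0xFE50, KS_DEAD_ACUTE = 0xFE51
};

/* Rows in the order used by the earlier diff: new rows at the top. */
static const uint16_t as_patched[] = {
  KS_DEAD_ACUTE,           KS_PERIOD,        0, 0, 0, 0x00B7,
  KS_DEAD_ACUTE,           KS_GREATER,       0, 0, 0, 0x00BB,
  KS_DEAD_ACUTE,           KS_LESS,          0, 0, 0, 0x00AB,
  KS_GREEK_ACCENTDIERESIS, KS_GREEK_IOTA,    0, 0, 0, 0x0390,
  KS_GREEK_ACCENTDIERESIS, KS_GREEK_UPSILON, 0, 0, 0, 0x03B0,
  KS_DEAD_GRAVE,           KS_SPACE,         0, 0, 0, 0x0060,
};

/* The same rows, ordered by keysym value. */
static const uint16_t resorted[] = {
  KS_GREEK_ACCENTDIERESIS, KS_GREEK_IOTA,    0, 0, 0, 0x0390,
  KS_GREEK_ACCENTDIERESIS, KS_GREEK_UPSILON, 0, 0, 0, 0x03B0,
  KS_DEAD_GRAVE,           KS_SPACE,         0, 0, 0, 0x0060,
  KS_DEAD_ACUTE,           KS_PERIOD,        0, 0, 0, 0x00B7,
  KS_DEAD_ACUTE,           KS_LESS,          0, 0, 0, 0x00AB,
  KS_DEAD_ACUTE,           KS_GREATER,       0, 0, 0, 0x00BB,
};

/* Compare two rows on their keysym columns only; the last column is ignored. */
static int row_cmp(const uint16_t *a, const uint16_t *b)
{
  for (int i = 0; i < SEQ_LEN; i++) {
    if (a[i] != b[i])
      return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

/* Index of the first row that sorts before its predecessor, or -1 if sorted. */
static long first_unsorted_row(const uint16_t *table, size_t n_rows)
{
  assert(table != NULL);
  for (size_t i = 1; i < n_rows; i++) {
    if (row_cmp(table + (i - 1) * STRIDE, table + i * STRIDE) > 0)
      return (long) i;
  }
  return -1;
}
```

Under that ordering rule, rows starting with GDK_dead_acute (0xFE51) belong after the GDK_Greek_accentdieresis (0x07AE) and GDK_dead_grave (0xFE50) rows, not at the top of the table.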
> Which field is the sorting key?
> I believe it is not the Unicode codepoint (first two entries).

I was not precise at all. I do not know which is the sorting field. It cannot be the Unicode codepoint (the last field) because the first two rows of the table have a very big value for the codepoint.

I have applied the patch (http://bugzilla.gnome.org/attachment.cgi?id=48332) as is, compiled and installed. I want to verify that my test system actually works. With the above patch I did not see a difference in the behaviour of GTK+ from before. I am not sure if my test system is screwed up.

I use leafpad (few dependencies) as my test editor. "ldd" on leafpad shows:

> ldd /usr/local/bin/leafpad
        libgtk-x11-2.0.so.0 => /usr/local/lib/libgtk-x11-2.0.so.0 (0x00a58000)
        libgdk-x11-2.0.so.0 => /usr/local/lib/libgdk-x11-2.0.so.0 (0x00190000)
        libatk-1.0.so.0 => /usr/local/lib/libatk-1.0.so.0 (0x00d1f000)
        libgdk_pixbuf-2.0.so.0 => /usr/local/lib/libgdk_pixbuf-2.0.so.0 (0x00111000)
        libpangoxft-1.0.so.0 => /usr/local/lib/libpangoxft-1.0.so.0 (0x00f50000)
        libpangocairo-1.0.so.0 => /usr/local/lib/libpangocairo-1.0.so.0 (0x00e78000)
        libpangox-1.0.so.0 => /usr/local/lib/libpangox-1.0.so.0 (0x006a8000)
        libpangoft2-1.0.so.0 => /usr/local/lib/libpangoft2-1.0.so.0 (0x007e0000)
        libfreetype.so.6 => /usr/lib/libfreetype.so.6 (0x0020a000)
        libz.so.1 => /usr/lib/libz.so.1 (0x00dab000)
        libcairo.so.1 => /usr/local/lib/libcairo.so.1 (0x009cf000)
        libpango-1.0.so.0 => /usr/local/lib/libpango-1.0.so.0 (0x0053e000)
        libgobject-2.0.so.0 => /usr/local/lib/libgobject-2.0.so.0 (0x0039a000)
        libgmodule-2.0.so.0 => /usr/local/lib/libgmodule-2.0.so.0 (0x00f66000)
        libdl.so.2 => /lib/libdl.so.2 (0x0099d000)
        libglib-2.0.so.0 => /usr/local/lib/libglib-2.0.so.0 (0x0049d000)
        libfontconfig.so.1 => /usr/lib/libfontconfig.so.1 (0x00125000)
        libpixman.so.1 => /usr/local/lib/libpixman.so.1 (0x006fa000)
        libXrender.so.1 => /usr/X11R6/lib/libXrender.so.1 (0x00614000)
        libpng12.so.0 => /usr/lib/libpng12.so.0 (0x0014c000)
        libglitz.so.1 => /usr/local/lib/libglitz.so.1 (0x00308000)
        libm.so.6 => /lib/tls/libm.so.6 (0x002b0000)
        libc.so.6 => /lib/tls/libc.so.6 (0x00803000)
        libX11.so.6 => /usr/X11R6/lib/libX11.so.6 (0x003c9000)
        libXrandr.so.2 => /usr/X11R6/lib/libXrandr.so.2 (0x0061e000)
        libXinerama.so.1 => /usr/X11R6/lib/libXinerama.so.1 (0x00170000)
        libXext.so.6 => /usr/X11R6/lib/libXext.so.6 (0x009a3000)
        libXft.so.2 => /usr/X11R6/lib/libXft.so.2 (0x00600000)
        libXcursor.so.1 => /usr/X11R6/lib/libXcursor.so.1 (0x005a6000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x00179000)
        libexpat.so.0 => /usr/lib/libexpat.so.0 (0x0026f000)

Whatever is under /usr/local/lib was compiled from CVS a few weeks ago. I use LD_LIBRARY_PATH=/usr/local/lib to point "leafpad" to the correct libraries.

I am going to recompile gtkimcontextsimple.c with a minimal table to see how it behaves (since I cannot add sequences successfully, I will see if removing them is easier).
[NOTE: Greek (modern/polytonic) has precomposed letters for each combination, if that makes any difference.]

I just tried with this table:

static const guint16 gtk_compose_seqs[] = {
  GDK_dead_acute, GDK_Greek_omega, 0, 0, 0, 0x03CE, /* GREEK SMALL LETTER OMEGA WITH TONOS */
};

and I could type only the above deadkey-letter sequence. That is, success, my test installation works!

I am now trying to add MIDDLE DOT. Still,
1) I do not know how the table is sorted, so I cannot yet place my additions appropriately.
2) Tor, since I am testing this, I would like to try your patch with Greek. I noticed that the 32-bit table does not replace the 16-bit table. If you have a hint on what to add to get Greek to work, I am happy to try.
I tried the following table:

static const guint16 gtk_compose_seqs[] = {
  GDK_dead_acute, GDK_period, 0, 0, 0, 0x00B7, /* That's middle dot, similar to the English ";" that makes a pause within a sentence. */
};

and it worked! I get middle dot (of course, no other deadkey+letter combination works). So I can add my combinations, pending my understanding of the order in which to add them.
Owen, could you please send me (http://simos.info/) the necessary scripts to aid me in converting Compose to the format gtkimcontextsimple.c requires? I have also just sent you a personal e-mail about this.
UPDATE Owen mentioned the script no longer exists, therefore I created one so that I can merge polytonic to the existing sequences and produce a well-formed table for inclusion in gtkimcontextsimple.c. I submit a patch below for gtkimcontextsimple.c which works for me, for both modern Greek and Polytonic. I also explain what the script does and how I solved any conflicts (same sequence but output a different resulting Unicode character). I used the Compose file that currently exists for el_GR.UTF-8 on X.org. If there is a further update to this Compose file, I'll update here accordingly.
Created attachment 49791 [details] [review]
Adds Greek polytonic sequences to "guint16 gtk_compose_seqs[]"

Extra care has been taken to keep the array gtk_compose_seqs[] sorted. In addition, any conflicts (the same sequence producing two different Unicode characters) have been removed. Where there was a conflict between Greek and Latin, the Latin sequence was kept. See below for more details on the decisions taken.
BACKGROUND

"gtk_compose_seqs[]" (http://cvs.gnome.org/viewcvs/gtk%2B/gtk/gtkimcontextsimple.c) is an array in which each entry is made up of six elements:

GDK_dead_grave, GDK_dead_iota, GDK_Greek_eta, 0, 0, 0x1FC2, /* GREEK_SMALL_LETTER_ETA_WITH_VARIA_AND_YPOGEGRAMMENI */

1-5: Individual keysyms that compose a sequence to produce a Unicode character.
6: The resulting Unicode character.

With this patch the array has around 1800 entries, which have to be sorted based on the values of the keysyms. Therefore, a script should consult gdk/gdkkeysyms.h (http://cvs.gnome.org/viewcvs/gtk%2B/gdk/gdkkeysyms.h) when comparing.

Each entry has a comment giving the Unicode name of the codepoint. Although not essential, it is good to maintain.

TASKS

This is what the script does:

1. Read gdk/gdkkeysyms.h and keep in a hash the pairs (keysym, value).
2. Read UnicodeData.txt (from Unicode.org) and keep in a hash the pairs (codepoint, description).
3. Read the existing values from gtk/gtkimcontextsimple.c and keep them in memory.
4. Read the Compose file for Greek Polytonic (from x.org) and keep it in memory.
5. Merge the arrays created in steps 3 and 4.
6. Uniq, that is, remove any identical entries: a) same compose sequence, and b) same Unicode character.
7. Run through the array and save any conflicts, that is, the same compose sequence but a different character.
8. Print the final array in a format suitable for inclusion in the .c file (this also refreshes the codepoint names from UnicodeData.txt, unicode.org).

Using the output of step 7 we resolve the conflicts (we choose which entry to keep) and update the final array.

I tested this with GTK+ (CVS) and it works for me (both modern Greek and polytonic).

See below for the conflicts file (full list) and the entries that I removed or kept.
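Steps 5 to 7 can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the attached Perl script: the ComposeRow type, the function names and the sample rows are hypothetical, rows are assumed to be ordered by comparing the five keysym columns left to right, and keysym values are hard-coded instead of being read from gdk/gdkkeysyms.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { SEQ_LEN = 5 };

typedef struct {
  uint16_t keys[SEQ_LEN]; /* keysym values, zero-padded */
  uint16_t ch;            /* resulting Unicode character */
} ComposeRow;

/* Sample input: rows already in the table, and rows from a Compose file. */
static const ComposeRow existing_rows[] = {
  { { 0xFE51, 0x07F9, 0, 0, 0 }, 0x03CE },      /* dead_acute, Greek_omega: tonos */
  { { 0x07AE, 0x07E9, 0, 0, 0 }, 0x0390 },
};
static const ComposeRow polytonic_rows[] = {
  { { 0xFE51, 0x07F9, 0, 0, 0 }, 0x1F7D },      /* same keys, oxia: a conflict */
  { { 0xFE51, 0x07F9, 0, 0, 0 }, 0x03CE },      /* exact duplicate */
  { { 0xFE50, 0xFE5D, 0x07E7, 0, 0 }, 0x1FC2 },
};

static int by_keys(const ComposeRow *a, const ComposeRow *b)
{
  for (int i = 0; i < SEQ_LEN; i++) {
    if (a->keys[i] != b->keys[i])
      return a->keys[i] < b->keys[i] ? -1 : 1;
  }
  return 0;
}

/* qsort() callback: order by keysyms, then by character so duplicates meet. */
static int row_order(const void *pa, const void *pb)
{
  const ComposeRow *a = pa, *b = pb;
  int c = by_keys(a, b);
  if (c != 0)
    return c;
  return (a->ch > b->ch) - (a->ch < b->ch);
}

/* Step 5: concatenate both inputs into `out` (caller provides na + nb slots). */
static size_t merge_rows(const ComposeRow *a, size_t na,
                         const ComposeRow *b, size_t nb, ComposeRow *out)
{
  assert(out != NULL);
  memcpy(out, a, na * sizeof *a);
  memcpy(out + na, b, nb * sizeof *b);
  return na + nb;
}

/* Step 6: sort, then drop rows that repeat both sequence and character. */
static size_t sort_and_uniq(ComposeRow *rows, size_t n)
{
  if (n == 0)
    return 0;
  qsort(rows, n, sizeof rows[0], row_order);
  size_t kept = 1;
  for (size_t i = 1; i < n; i++) {
    if (row_order(&rows[i], &rows[kept - 1]) != 0)
      rows[kept++] = rows[i];
  }
  return kept;
}

/* Step 7: count neighbours with the same sequence but different characters. */
static size_t count_conflicts(const ComposeRow *rows, size_t n)
{
  size_t conflicts = 0;
  for (size_t i = 1; i < n; i++) {
    if (by_keys(&rows[i - 1], &rows[i]) == 0 && rows[i - 1].ch != rows[i].ch)
      conflicts++;
  }
  return conflicts;
}
```

The sketch only reports conflicts. Choosing which row to keep remains a manual decision, as in the description above.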
Created attachment 49793 [details] Full list of conflicts, (that is, same compose sequence but different result)
Created attachment 49794 [details]
Compose sequences that got removed.

These are the sequences removed. Two issues to note:

A. If another language and Greek are in conflict, the other language is preferred (few cases).

B. Modern Greek has an accent called "TONOS", which in Polytonic is called "OXIA". Per the Unicode standard they are equivalent. However, since the same dead key (;) is used for both, there is a conflict. The same key is used because the accent is common to both. As far as I know, the X Input Method (XIM) cannot distinguish between layouts. I verified that both XIM and WinXP, when in Greek Polytonic, produce TONOS rather than OXIA. Therefore, I kept TONOS, removing the OXIA entries.

All in all, with this patch we get the same functionality with either GTK+ IM (Default) or XIM for Greek Modern and Greek Polytonic.
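The TONOS/OXIA equivalence can be written down as a small mapping. The sketch below lists only the seven lowercase vowels, whose OXIA code points in the Greek Extended block are canonically equivalent to the TONOS code points in the basic Greek block. The type and function names are hypothetical and the snippet is an illustration, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint16_t oxia;  /* Greek Extended code point (polytonic) */
  uint16_t tonos; /* equivalent code point in the basic Greek block */
} OxiaTonosPair;

/* Lowercase vowels only; capitals are left out of this sketch. */
static const OxiaTonosPair oxia_to_tonos[] = {
  { 0x1F71, 0x03AC }, /* alpha */
  { 0x1F73, 0x03AD }, /* epsilon */
  { 0x1F75, 0x03AE }, /* eta */
  { 0x1F77, 0x03AF }, /* iota */
  { 0x1F79, 0x03CC }, /* omicron */
  { 0x1F7B, 0x03CD }, /* upsilon */
  { 0x1F7D, 0x03CE }, /* omega */
};

/* Linear search; returns `ch` unchanged when it has no entry in `pairs`. */
static uint16_t map_char(const OxiaTonosPair *pairs, size_t n, uint16_t ch)
{
  assert(pairs != NULL || n == 0);
  for (size_t i = 0; i < n; i++) {
    if (pairs[i].oxia == ch)
      return pairs[i].tonos;
  }
  return ch;
}

/* Replace an OXIA vowel with its TONOS equivalent, as the patch prefers. */
static uint16_t prefer_tonos(uint16_t ch)
{
  return map_char(oxia_to_tonos,
                  sizeof oxia_to_tonos / sizeof oxia_to_tonos[0], ch);
}
```

For example, prefer_tonos(0x1F7D) gives 0x03CE, the same character that the existing GDK_dead_acute, GDK_Greek_omega entry produces.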
Created attachment 49795 [details] [review] This diff shows what was removed from the conflicts file. Use this to view the decisions taken to resolve any conflicts.
Changing status from NEEDINFO to NEW.
Just a reminder: the final patch is ready to apply, at http://bugzilla.gnome.org/attachment.cgi?id=49791&action=view

That's all from me.
Created attachment 51465 [details]
Perl script that merges new keysyms into gtk/gtkimcontextsimple.c

Read http://bugzilla.gnome.org/show_bug.cgi?id=167940#c16 on how the script works and how to generate the input files. The input files need a bit of processing; see also the script source for more references.

Once you run the script you will get information on how to generate the final version of the keysyms table found in gtk/gtkimcontextsimple.c.

In the next attachment I provide the sample input files that work for me.
Created attachment 51466 [details] Input files for script, can be used as test case. Contains input files and the resulting output files after you run the script. You can easily regenerate the input files for Greek or any other language.
X-linking with this bug report in Ubuntu: http://bugzilla.ubuntu.com/show_bug.cgi?id=15414
I am marking this bug as DUPLICATE, because:

1. I have started a new bug report about these issues, "Synch gdkkeysyms.h/gtkimcontextsimple.c with X.org 6.9/7.0": http://bugzilla.gnome.org/show_bug.cgi?id=321896
2. There is a new Compose file in Xorg 6.9/7.0 that contains Ancient Greek, Cyrillic and support for many more languages. Therefore, the issue now is not updating gtkimcontextsimple.c but rather recreating it easily from upstream. Also, gdkkeysyms.h needs to be updated as well.
3. This bug report has become very long; the first part, comments 1-13, is investigation, while the actual report starts from #14 onwards.

*** This bug has been marked as a duplicate of 321896 ***
A reminder: this bug report has been closed and a new one, bug 321896, has been created, which is cleaner and better followed. There are three people watching this closed bug report, one of whom just registered. I will be moving those three contacts to watch the new bug.