Bug 341341 – Compose mechanism in simple input method doesn't support decomposed forms


Summary:	Compose mechanism in simple input method doesn't support decomposed forms


Status:	RESOLVED OBSOLETE

Product:	gtk+
Classification:	Platform
Component:	Input Methods
Version:	2.9.x
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Hidetoshi Tajima
QA Contact:	gtk-bugs

URL:	https://gitlab.gnome.org/GNOME/gtk/is...
Whiteboard:

Depends on:
Blocks:

Reported:	2006-05-10 21:46 UTC by Danilo Segan
Modified:	2018-03-14 22:01 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Serbian ~/.Xcompose file (5.79 KB, text/plain) 2006-05-10 21:48 UTC, Danilo Segan	Details

Description Danilo Segan 2006-05-10 21:46:07 UTC

gtk/gtkimcontextsimple.h contains a table derived from en-US.UTF-8/Compose list.

However, it doesn't support deadkey combinations resolving to more than one Unicode character, which is needed for decomposed forms (especially when there are no precomposed forms, as is the case for Serbian Cyrillic).
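To illustrate why a single-character result is not enough, here is a minimal Python sketch (not taken from GTK+; the helper name `compose_nfc` is made up for this example). It uses Unicode NFC normalization to show that Latin "a" plus a combining acute has a precomposed form, while Cyrillic "а" plus the same mark does not and stays two code points.

```python
import unicodedata

ACUTE = "\u0301"  # COMBINING ACUTE ACCENT


def compose_nfc(base: str, mark: str = ACUTE) -> str:
    """Return base + mark in NFC, i.e. precomposed where Unicode has such a form."""
    return unicodedata.normalize("NFC", base + mark)


latin = compose_nfc("a")          # LATIN SMALL LETTER A
cyrillic = compose_nfc("\u0430")  # CYRILLIC SMALL LETTER A

# The Latin result collapses to one code point (U+00E1);
# the Cyrillic result remains base letter + combining mark.
print([f"U+{ord(c):04X}" for c in latin])
print([f"U+{ord(c):04X}" for c in cyrillic])
```

A compose table whose entries can hold only one code point can represent the first result but not the second.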

Comment 1 Danilo Segan 2006-05-10 21:48:18 UTC

Created attachment 65204
Serbian ~/.Xcompose file

This file lists the combinations needed for Serbian. With recent X.Org or XFree86, it can be used by saving it as ~/.Xcompose and selecting XIM as the Gtk+ input method.
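For readers unfamiliar with the Compose file format, the following Python sketch parses one entry and shows a result string of more than one code point. The sample entry is hypothetical and is not copied from the attachment; the parser (`parse_compose_line`) is a simplified illustration that ignores modifiers, `include` lines, and escape sequences.

```python
import re

_EVENT = re.compile(r"<([^>]+)>")            # key names such as <dead_acute>
_RESULT = re.compile(r'"((?:[^"\\]|\\.)*)"')  # the quoted result string


def parse_compose_line(line):
    """Parse '<key> <key> : "result"' into (keys, result), or None if not an entry."""
    line = line.strip()
    if not line or line.startswith("#") or ":" not in line:
        return None
    lhs, rhs = line.split(":", 1)
    keys = _EVENT.findall(lhs)
    match = _RESULT.search(rhs)
    if not keys or match is None:
        return None
    # Assumption: escapes inside the quoted string are left as written.
    return tuple(keys), match.group(1)


# Hypothetical entry: dead acute + Cyrillic a -> "а" followed by U+0301.
SAMPLE = '<dead_acute> <Cyrillic_a> : "\u0430\u0301"'

keys, result = parse_compose_line(SAMPLE)
print(keys, len(result))  # the result holds two code points, not one
```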

Comment 2 Simos Xenitellis 2006-05-10 22:22:07 UTC

I did not manage to find decomposed example forms in the existing Compose file,
http://webcvs.freedesktop.org/xorg/lib/X11/nls/en_US.UTF-8/Compose.pre?view=markup
Are there any used already?

Is there a document that lists which characters have no precomposed form and can only be written in decomposed form? In Latin, Greek and Cyrillic?
AFAIK, no new precomposed characters have been added since about Unicode 4.x (looking for a reference).
I know at least that Coptic has no precomposed characters, so it needs this functionality in GTK+.

(this is all new to me, trying to learn.)

Comment 3 Danilo Segan 2006-05-10 22:40:13 UTC

Simos, there are none in the en_US.UTF-8/Compose file because nobody bothered to add them or push for them. At one point, I simply ran out of time to chase all the things we needed for Serbian (you can see that I am the author of the Serbian GNU libc locale and the Serbian XKB layouts, did many Serbian translations, worked on DejaVu Cyrillic... at some point you simply lose the energy to chase all the maintainers around ;).

The attached example, which I used to append to en_US.UTF-8/Compose file on my systems (actually, I usually added my own sr_CS.UTF-8/Compose) before ~/.Xcompose support was available, contains several such decomposed forms (using Cyrillic).


Also, Unicode decided to stop including precomposed forms even earlier than 4.x (I know people asked for precomposed accented Serbian Cyrillic at least 3 years ago, only to be denied because they could get it as a combination).

Comment 4 Simos Xenitellis 2008-01-31 02:48:34 UTC

I believe that there is no existing function in glib that can determine if a sequence of Unicode characters is valid for a language (for example, CYRILLIC SMALL A WITH ACUTE).
I believe this information is available at ftp.unicode.org/Public/UNIDATA/NormalizationTest.txt and a subset can be extracted for the affected languages.

Comment 5 Samuel Thibault 2008-01-31 15:24:06 UTC

That file is only a test for normalization.
As said in bug 345254, all Unicode combinations are supposed to be valid, and there is no real reason why <dead_foo>, <combining_foo> and <Multi_key> <foo> shouldn't be automatically converted into the Unicode combining equivalent.
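A rough Python sketch of that idea, under stated assumptions: the table below maps a handful of dead-key keysym names to their combining characters, and `resolve_dead_key` is a hypothetical helper, not a GTK+ or Xlib function. The base letter and the combining mark are joined and NFC-normalized, so a precomposed character is produced where one exists and a decomposed sequence otherwise.

```python
import unicodedata

# Small illustrative subset; a real table would cover many more dead keys.
DEAD_KEY_TO_COMBINING = {
    "dead_grave": "\u0300",       # COMBINING GRAVE ACCENT
    "dead_acute": "\u0301",       # COMBINING ACUTE ACCENT
    "dead_circumflex": "\u0302",  # COMBINING CIRCUMFLEX ACCENT
    "dead_tilde": "\u0303",       # COMBINING TILDE
    "dead_macron": "\u0304",      # COMBINING MACRON
    "dead_diaeresis": "\u0308",   # COMBINING DIAERESIS
}


def resolve_dead_key(dead_key: str, base: str) -> str:
    """Combine a base character with the mark for dead_key, NFC-normalized."""
    mark = DEAD_KEY_TO_COMBINING[dead_key]  # KeyError for unknown dead keys
    return unicodedata.normalize("NFC", base + mark)


# Latin e + acute has a precomposed form; Cyrillic a + acute does not.
print(len(resolve_dead_key("dead_acute", "e")))
print(len(resolve_dead_key("dead_acute", "\u0430")))
```

With a fallback like this, no per-language table entry would be needed for sequences such as Cyrillic letters with an acute accent.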

Comment 6 Simos Xenitellis 2008-02-20 23:54:03 UTC

Danilo, would it make sense to use a modified keyboard layout which assigns combining diacritics to keys?
As in 
http://blogs.gnome.org/simos/2008/02/20/keyboard-layout-for-combining-diacritics/

Is your issue that you have a mixed environment (some precomposed already in use)?

Comment 7 Simos Xenitellis 2008-09-10 23:02:56 UTC

I have put together a patch (bug 537457) that can make these compose sequences work with GTK+ IM.

I have tested with Khmer and Arabic compose sequences (already upstream in en_US/Compose.pre), and with these compose sequences (.XCompose file).

If you want to go ahead and add your compose sequences to XOrg, I can make sure that GTK+ IM will work with them.

My guess for a timeline is that you could get this in the next stable release of GTK+, in about six months' time. You would have to plan the update to XOrg earlier than that, however.

Hope this helps.

Comment 8 Matthias Clasen 2018-02-10 04:34:33 UTC

We're moving to GitLab! As part of this move, we are closing bugs that haven't seen activity in more than 5 years. If this issue is still important to you and
still relevant with GTK+ 3.22 or master, please consider creating a GitLab issue
for it.