GNOME Bugzilla – Bug 165723
Indic language input with 3rd party keyboard layouts
Last modified: 2018-04-14 23:56:19 UTC
Please describe the problem: I cannot type Bangla (an Indic language) text in any GTK-dependent software. I cannot type Bangla directly in GAIM or GIMP; it shows me "????" only. But when I copy and paste Bangla text from another program, it displays correctly. So I think GTK has some problem. Would anyone check the problem and add this input support in the next releases?
Steps to reproduce: 1. Change the keyboard layout and start typing.
Does this happen every time? Yes.
Other information: Please add Indic input support.
*** Bug 165722 has been marked as a duplicate of this bug. ***
Unfortunately I only have Windows 2000 (which doesn't have any Bangla input locale) currently, and cannot debug this. Hmm, but Windows 2000 does have input locales for Hindi and Tamil, for instance, and I can reproduce the problem with them. Fixing that will presumably fix it for Bangla, too.
Fix in HEAD and gtk-2-6:

2005-02-01  Tor Lillqvist  <tml@novell.com>

	* gdk/win32/gdkkeys-win32.c (handle_special, set_shift_vks, reset_after_dead, handle_dead): New functions, code blocks refactored out of update_keymap(). No functionality change.
	(update_keymap): Use ToUnicodeEx() when available (on NT-based Windows) instead of ToAsciiEx(). Makes keyboard input work in Unicode-only input locales that don't have any ANSI codepage, for instance Hindi and Bengali. Use _gdk_input_codepage only on Win9x. (#165723)

	* gdk/win32/gdkevents-win32.c (gdk_event_translate): On WM_INPUTLANGCHANGE, use GetLocaleInfo() instead of TranslateCharsetInfo() to get the input locale's corresponding codepage, if any.
Bangla, Hindi, Tamil, Telugu and Sanskrit are all Indic languages. If you found Hindi working on Windows, then the other Indic languages will work too, depending on the Windows script processor.
Why reopen? Yes, I know those are Indic languages. To the best of my knowledge, the fix works for them all. The fix affects only keyboard input of these languages, output has worked already since long ago. Why did you change the target milestone to win32-1.3? That refers to the so-called gtk-1-3-win32-production branch, which is obsolete since long ago. Are you really using that?
[quote]Why did you change the target milestone to win32-1.3? That refers to the so-called gtk-1-3-win32-production branch, which is obsolete since long ago. Are you really using that?[/quote] Well, I had the flu and didn't notice what exactly I was doing. If you have any way to do so, please roll back and ignore what I posted as win32-1.3. I'm not using that. Also, I'm very new to Bugzilla, so it will take some time for me to understand it. By the way, when may we expect a fix for Indic input?
Developers, Is there any progress to sort out the issue I mentioned?
To the best of my knowledge this bug has now been fixed (in the GTK+ source code). The fix will be in the GTK+ 2.6.2 (source code) release. Hopefully I will be able to provide Windows binaries (in the form of zipfiles) of GTK+ 2.6.2 shortly after that. Some time after that there will be Windows installers.
I'm waiting for the ZIP files for Windows. :) Anyway, on Linux most applications depend on Pango for complex script processing, and on Windows applications depend on the Uniscribe engine. I found a folder called Pango inside the GTK folder. I don't have much knowledge of Pango, but I want to know one thing: on win32, does GTK output depend on Pango or on the Windows Uniscribe engine?
Linux: GTK+ => Pango library => Pango layout modules
Windows: GTK+ => Pango library => Uniscribe
Cool!!! Can't wait to meet the Win32 version of GTK 2.6.2 :)
I just downloaded ftp://ftp.gtk.org/pub/gtk/v2.6/win32/gtk+-2.6.2.zip and, after extracting it, overwrote all the files inside "Program Files\Common Files\GTK\2.0" with the extracted ones. But when I went to start GAIM, I got two errors. The first one is: Entry Point Not Found. The procedure entry point g_assert_warning could not be located in the dynamic link library libglib-2.0-0.dll. The second one: error loading gaim.dll, Error: 127. Something like that. Is there a problem with the process I followed, or are these actual errors?
GTK+ 2.6.2 requires GLib 2.6.2, Atk 1.8.0 and Pango 1.9.0. You will have to download and install them, too, in the same place where you installed GTK+ 2.6.2. ("Install" here meaning just "unzip".)
So input of Indic languages is quite OK now. But there is still one problem: the input does not understand dead key input. My layout has a dead key in it, but I still cannot type the letters which are supplied by the dead key. I think another update is needed.
Sorry, but I don't know what you mean by "dead key" in the context of Indic languages? As far as I can see from the http://www.microsoft.com/globaldev/reference/keyboards.aspx page, there aren't any dead keys on the Indic keyboards like Bengali, Hindi or Gujarati? (Compare to Latin keyboard layouts like French or Finnish, with several dead keys for accents and other diacritics.)
Bro, There are not deadkey in their default layouts, which are neither used by the Indians nor by the Bangladeshis. Bangladesh uses the typical customized layouts from http://www.ekushey.org/projects/shadhinota/ and the Ministry of IT (India) supplies layouts for its users. So there are dead keys. :) Microsoft is Microsoft. They think you will eat only what they supply. That is why people are switching to alternative solutions these days. I think another update is needed.
> There are not deadkey in their default layouts, which are neither > used by the Indians nor by the Bangladeshis. That sounds rather incredible. Why would Microsoft on purpose deliver keyboard layouts that "nobody" uses? I think they explicitly say in some FAQ that for each locale they want to use the most widely used keyboard layouts, including official standards if there are such. Are you sure you aren't overestimating the popularity of these 3rd-party layouts? (I.e., they might be popular among "hackers", but what about typical computer users, i.e. office workers?) That ekushey.org page claims that "the official Microsoft Bangla Layout is based on the Indian INSCRIPT layout". However, the Microsoft keyboard layout page has two layouts for Bengali, "Bengali" and "Bengali (INSCRIPT)". (Which is the preferred form, BTW, Bengali or Bangla? Or are they separate languages?) Are these third-party keyboard layouts such that have more shift states than the "normal" shift and altgr? Those aren't supported yet in GTK+/Win32, see bug #165385. Note that this affects also some European keyboard layouts, for instance Czech.
Well, I'm working on many Bangla computing and localization projects, and have been for a long time, and to be frank with you I'm also working with Microsoft on the official Bangla OpenType font. I agree with you that Microsoft has 2 layouts on their site. But although they have 2 different names, both layouts are the same. Please look at them carefully. Ekushey.org is right: the MS layout is based on INSCRIPT, but that is not popular in India either. In India, people use the Prabhat layout (you can find that on the ekushey.org page). The layouts ekushey.org is delivering are made with MSKLC (http://www.microsoft.com/globaldev/tools/msklc.mspx), so there is no hacking. The dead key feature has been available on Win32 for a long time (I don't know about Linux) and it works fine in all applications. I don't know why this is not working with GTK and what problems you have in implementing it in practice.
So, can we expect any advanced input support in the future, like dead keys, AltGr, etc.?
The more people working on this, the faster it will be fixed (hint, hint). Most probably, the right way to fix this is to rework the gdkkeys-win32.c file altogether, see bug #165385. You said that the two Bengali layouts on the Microsoft keyboard layouts page would be identical. They are definitely not. Look carefully yourself ;-) The "Bengali" one has arabic digits on the top row in the base state, the "Bengali - INSCRIPT" one has Bengali digits. The "Bengali" layout has AltGr (and the Bengali digits are accessed through AltGr), the "Bengali - INSCRIPT" does not have AltGr. Plus other differences.
We don't use that layout in Bangladesh. The official Bangla layout for Bangladesh is http://www.ekushey.org/projects/shadhinota/national.html which has DeadKey and AltGR keys. :(
Sigh, I tried to install the keyboard layout from that page, but running the setup.msi results in the error "The installer has encountered an unexpected error installing this package. This may indicate a problem with this package. The error code is 2103". This is on an English Windows 2000 machine.
You must log-in as administrator or you should have administrator privilege to install the software. I think you don’t need to download and use the software, download the layout definition from http://www.ekushey.org/files/Ekusheyr%20Shadhinota.pdf location.
I did run it as Administrator. (Not logged in, but in a command prompt started with runas /user:administrator.) I'll try as logged-in administrator later. Yes, I do need to actually have the keyboard layout installed to be able to test it. That pdf file is definitely not sufficient.
The conversation is going longer and longer. Please mail me directly to omiazad at gmail.com then I can help you to understand the layout Omi
But in order to fix GTK+ it isn't enough to just understand how to use the layout, I need to actually use it while running GTK+ test programs. I don't think there is any problem in keeping this conversation here, that way it will be archived for future reference.
Anyway, tell me about the progress with your installation and testing. If you face any problem, then please feel free to let me know. Also, you may need some Unicode Bangla fonts (as you are on Windows 2000, which doesn't have any Bangla fonts built in). You can find some fonts in http://bangla.ekushey.org 's OTF Bangla fonts section. Omi
Well, I now have a new machine with Windows XP, so the problem I had on Windows 2000 is irrelevant... The installer worked fine on this box. I haven't tried debugging this yet, though, it will take some time to set up a development environment on this machine (I don't want to simply copy all the historical baggage from the old machine).
Hello Tor, You didn't say anything about dead key input. What is the progress so far? Omi
As Tor is not responding, can anyone else respond to this issue? One more thing: I checked the same layout made for the m17n library and tested it with SCIM on Linux, and it works fine there. So I think the problem is with Windows only.
I am looking at this now. I have the Bengali keyboard layout installed, and a font that covers Bengali (SolaimanLipi). Bengali input in WordPad seems to work as it should, if I try the stuff linked to from the ekushey webpages, or from the Unicode book. You can't compare "bare" GTK+ on Windows to SCIM+GTK+ on Linux. If you use SCIM on Linux, the corresponding thing on Windows would be to use an IME (such as those used for Chinese, Japanese and Korean), I think. Comparing to WordPad, one thing that definitely doesn't work in GTK+ is what happens if you type (I'm using the "normal" English key names here) for instance JGJ or JGT (the first two entries in the lower table on http://www.ekushey.org/projects/shadhinota/technical/phonetic_bangla_typing.htm). I'll see if I can find some solution to this that wouldn't require a special input module for GTK+.
Fixing this was surprisingly trivial, just a one-line change (effectively) to gdkkeys-win32.c... Now the Bengali virama works as it should. The change makes handle_dead() by default return the keysym as such if it isn't one of the special cases that handle_dead() was written for (i.e. the ones where the keysym that comes in isn't "dead", but we want to pass on the corresponding "dead" keysym). I.e. if only GDK lets the keysym corresponding to U+09CD through normally, things work as they should. Fixed in gtk-2-6 and HEAD.

2005-05-23  Tor Lillqvist  <tml@novell.com>

	* gdk/win32/gdkkeys-win32.c (handle_dead): If the keysym isn't one of the special cases this function takes care of, use it as such. This takes care of for instance the Bengali Virama, see bug #165723.
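The behaviour described for handle_dead() can be illustrated with a minimal sketch. This is not the code from gdkkeys-win32.c: the function name handle_dead_sketch, its signature and the choice of special cases are assumptions made for illustration. The dead keysym values (0xfe50 and so on) are the ones GDK defines in gdkkeysyms.h, and 0x010009CD is the keysym GDK uses for U+09CD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Dead keysym values as defined in GDK's gdkkeysyms.h. */
#define KEYSYM_DEAD_GRAVE      0xfe50u
#define KEYSYM_DEAD_ACUTE      0xfe51u
#define KEYSYM_DEAD_CIRCUMFLEX 0xfe52u

/* Hypothetical stand-in for handle_dead(): the name, signature and the
 * set of special cases are illustrative assumptions, not the GDK source. */
static void handle_dead_sketch(uint32_t keysym, uint32_t *ksymp)
{
    assert(ksymp != NULL);

    switch (keysym) {
    case 0x60: /* grave: not "dead" itself, pass on the dead variant */
        *ksymp = KEYSYM_DEAD_GRAVE;
        break;
    case 0xb4: /* acute */
        *ksymp = KEYSYM_DEAD_ACUTE;
        break;
    case 0x5e: /* asciicircum */
        *ksymp = KEYSYM_DEAD_CIRCUMFLEX;
        break;
    default:
        /* The fix: anything that is not a special case is used as such,
         * so the keysym for U+09CD (Bengali virama) passes through. */
        *ksymp = keysym;
        break;
    }
}
```

Calling handle_dead_sketch(0x010009CD, &out) leaves out equal to 0x010009CD, which is the pass-through that lets the virama reach the text buffer unchanged.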
Is there any place from where I can take the testing version and check if it is working or not?
Yes, you can build it yourself from source, from CVS. But that is not easy on Windows, where setting up a working build system can be hard if one isn't accustomed to the tools and "culture". (For instance, experience with Microsoft's Visual C++ is mostly useless.) But certainly, it would be very nice if you would do this; what GTK+ on Win32 needs *is* more developers who can build and debug it themselves. Otherwise, not really. Distributing a snapshot build of the GDK DLL would mean that I would have to put together a corresponding source snapshot, too. That is just too much work for just one bug. Wait for GTK+ 2.6.8 and its Win32 build.
I'll wait for the release. Then check for any problem...
Sigh, there apparently still are problems after the above fix. The bug reporter now tells me that the sequences GH and GA for instance don't work. And I can confirm that in WordPad. Why couldn't he give exact instructions what keys to type in the first place? Oh well.
The situation with the JGJ (U+0995 U+09CD U+0995) or JGT (U+0995 U+09CD U+099F) -style sequences is very different from the GH (U+09CD U+09BE) sequence. The U+0995 U+09CD U+0995 code points are stored in the text that has been input as such, and it is then Uniscribe that takes care of doing its magic and presenting them as a unique grapheme cluster (or whatever the correct term is). The GH key sequence is quite different. If I type that into WordPad, what gets stored in the text buffer is not U+09CD U+09BE, but the single code point U+0986. Apparently it is Windows's common controls that do this? Anyway, the right place to do the same in GTK+ would be the gtk_compose_seqs table in gtkimcontextsimple.c, I presume. However, that table uses guint16 values for GDK keysyms, and GDK keysyms cover only a small part of Unicode code points. Other keys are represented as their Unicode code point plus 0x01000000, i.e. they don't fit into a guint16. So in order to get the <U+09CD U+09BE> => U+0986 mapping into that table, the type will have to be changed to guint32. Before doing this I would like input from Owen or Matthias, though. Will attach a suggested patch soon.
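The size problem can be checked with a few lines of arithmetic. The sketch below is a simplification: GDK's own gdk_unicode_to_keyval() also maps Latin-1 and other legacy keysyms directly, which is left out here, and the helper names unicode_to_keysym and fits_in_16_bits are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Keysyms without a legacy value are the code point with this bit set. */
#define UNICODE_KEYSYM_FLAG 0x01000000u

/* Simplified, hypothetical helper: always applies the Unicode flag. */
static uint32_t unicode_to_keysym(uint32_t code_point)
{
    assert(code_point <= 0x10FFFFu); /* must be a valid Unicode code point */
    return code_point | UNICODE_KEYSYM_FLAG;
}

/* Returns nonzero if the keysym could be stored in a guint16 table cell. */
static int fits_in_16_bits(uint32_t keysym)
{
    return keysym <= UINT16_MAX;
}
```

With these helpers, unicode_to_keysym(0x09CD) gives 0x010009CD, for which fits_in_16_bits() returns 0, while a legacy keysym such as GDK_dead_grave (0xfe50) still fits. That is why a table of 16-bit values cannot hold the Bengali sequences.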
It might be interesting to see what happens when you turn the IME support on .. I think trying to copy what windows does for input methods exactly within the scope of gtkimcontextsimple.c is a losing battle ... in fact, I think for good Indic/Thai/etc. input methods simple composition isn't going to be sufficient.
Well, no IME is involved in Bengali input with this 3rd-party keyboard layout, anyway. Presumably not with the Microsoft-supplied keyboard layouts, or other Indic languages either. Don't know about Thai.
Created attachment 48332 Suggested patch OK, so I didn't change the data types in the existing table, instead I introduced a new type of table with 32-bit data, and added such a table with just the Bengali character sequences in question. (Omi, are there more two-key sequences than "GH" and "GA" that should get turned into a single Unicode character?) I have a single U+09CD mapping into itself, too, in the table, so that if it isn't followed by U+09BE or U+09C3 it will insert itself. Owen, would this patch be acceptable? BTW, has there been any enhancement request for ways of entering combining diacriticals by themselves (other than the control+shift+hex way)? I noticed that by simply adding rows to the gtk_compose_seqs table with for instance GDK_dead_grave by itself mapping to U+0300, this would work. One can then add accents to letters they normally don't occur with (and for which there is no precomposed letter or sequence in the table). Of course, one then has to type the accent after the letter with which it should combine, as this is how Unicode combining characters work, which is rather unnatural if one is accustomed to using the normal dead accents on Latin keyboards. Hmm, what about a hack like this: if a dead key is followed by a key that doesn't match any sequence, instead of beeping, the code would commit the second character followed by the combining character corresponding to the dead key? Then one could type things like <dead-acute, atsign> and get an atsign with an acute accent... Very useful ;-) Just kidding, I guess.
Let me explain how the layouts and DeadKey works with Microsoft. If you go through the source, you'll find this:

KEYNAME_DEAD
09cd "BENGALI SIGN VIRAMA" //This means 09CD is a dead key.

DEADKEY 09cd
09c7 098f // ে -> এ (Means: If you press deadkey then 09C7 then you'll get 098F as output.)
09cb 0993 // ো -> ও (Means: If you press deadkey then 09CB then you'll get 0993 as output.)
09c3 098b // ৃ -> ঋ (Means: If you press deadkey then 09C3 then you'll get 098B as output.)
09be 0986 // া -> আ (Means: If you press deadkey then 09BE then you'll get 0986 as output.)
09bf 0987 // ি -> ই
09c1 0989 // ু -> উ
09c8 0990 // ৈ -> ঐ
09cc 0994 // ৌ -> ঔ
09c0 098a // ী -> ঊ
09c2 098a // ূ -> ঊ
0020 09cd // -> ্

Going through your post I found you went too deep. The deadkey has nothing to do with IME. It's just a magic for older style layouts.
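The DEADKEY section above is effectively a lookup table keyed on the dead key and the key that follows it. A minimal sketch of that lookup, using 32-bit entries so the Bengali code points fit, might look like this. The type ComposeEntry32, the table name and the function compose_after_dead are hypothetical, only a subset of the rows listed above is included, and the fallback (emit the dead key, then the second key) is an assumption about the desired behaviour, not something taken from the GTK+ patch or from Windows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit compose entry: dead key, following key, result. */
typedef struct {
    uint32_t dead;
    uint32_t next;
    uint32_t result;
} ComposeEntry32;

/* A subset of the DEADKEY 09cd rows quoted in the comment above. */
static const ComposeEntry32 bengali_virama_table[] = {
    { 0x09CD, 0x09C7, 0x098F },
    { 0x09CD, 0x09CB, 0x0993 },
    { 0x09CD, 0x09C3, 0x098B },
    { 0x09CD, 0x09BE, 0x0986 },
    { 0x09CD, 0x09BF, 0x0987 },
    { 0x09CD, 0x09C1, 0x0989 },
    { 0x09CD, 0x0020, 0x09CD }, /* dead key + space gives the virama itself */
};

/* Writes the composed output to out and returns how many code points
 * were written: 1 on a table match, 2 when falling back (assumed
 * behaviour) to the dead key followed by the second key. */
static size_t compose_after_dead(uint32_t dead, uint32_t next, uint32_t out[2])
{
    size_t i;
    size_t n = sizeof bengali_virama_table / sizeof bengali_virama_table[0];

    assert(out != NULL);

    for (i = 0; i < n; i++) {
        if (bengali_virama_table[i].dead == dead &&
            bengali_virama_table[i].next == next) {
            out[0] = bengali_virama_table[i].result;
            return 1;
        }
    }

    out[0] = dead;
    out[1] = next;
    return 2;
}
```

For the GH sequence discussed earlier, compose_after_dead(0x09CD, 0x09BE, out) stores the single code point U+0986, while a virama followed by a consonant such as U+0995 is left as two code points for the shaping engine to handle.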
Hmm, you should know that the Windows-specific part of GTK+ (gdk/win32) doesn't handle dead keys itself. It passes the dead keys on to the platform-independent GTK+, which handles the combining. So those sequences will have to be added to the table in gtkimcontextsimple.c. gdk/win32 does its keyboard handling at a rather low level, handling just the WM_KEYDOWN and WM_KEYUP messages resulting from individual keys being pressed and released. This is because GDK needs to generate separate events for key press and release, and also because it's simpler, and matches the way X11 works more closely.
What does that mean? What about layouts that are not supported by X11? We have lots of Asian layouts supported by SCIM, and they work fine with GTK on Linux. What did you decide, then?
I guess I am waiting for Owen or Matthias to comment on the patch. Will it break something or be against user expectation on X11, for instance?
I'm really not at all comfortable with using gtkimcontextsimple.c as a place to fix this problem. ... it sounds like this is something that is configuration in the keyboard layout ... different keyboard layouts could map the same combinations to different things, and GTK+ should be picking that up.
GTK should not handle anything itself. It should depend on the system's default way, which could be anything. But who is going to read this and take that step? It's not about Gaim (when I wrote this bug, that was the only GTK-dependent application I used); all GNOME Office products (though I don't know why AbiWord doesn't depend on GTK/Pango) should depend on GTK.
Well, no action so far. I think you need to have a look at http://sourceforge.net/support/tracker.php?aid=1474165
Um, why? I don't see anything useful, like a patch, in there?
The issue is 1 year+ old and I think it's my problem that I could not make the developer understand what steps should be taken. We can I do? GTK applications are not working well in Windows.
> [What] can I do? Ask for your money back? Write a patch?
I'm not a developer, I'm just an end user. So I cannot make a patch myself. I thought if the problem is fixed, Bangla speaking people may find GTK+ applications useful. If not.....
Patch applies cleanly. Reviewing it would take a while. (Working on http://mail.gnome.org/archives/gtk-devel-list/2007-March/msg00148.html)
Can't test this with Windows 10 Bengali keyboard layouts (any of them). That is, I can't find a single dead key in these layouts. Unicode character composition, yes, that seems to be working. But none of the keys that I've tried seems to work as a dead key.
The layouts in question were mentioned in Comment 16.
We're moving to gitlab! As part of this move, we are moving bugs to NEEDINFO if they haven't seen activity in more than a year. If this issue is still important to you and still relevant with GTK+ 3.22 or master, please reopen it and we will migrate it to gitlab.
As announced a while ago, we are migrating to gitlab, and bugs that haven't seen activity in the last year or so will not be migrated, but closed out in bugzilla. If this bug is still relevant to you, you can open a new issue describing the symptoms and how to reproduce it with gtk 3.22.x or master in gitlab: https://gitlab.gnome.org/GNOME/gtk/issues/new