GNOME Bugzilla – Bug 348107
backspace changes independent indic characters
Last modified: 2006-09-08 17:39:12 UTC
Please describe the problem:
Opened by Jatin Nansi (jnansi@redhat.com) on 2005-01-18 08:03 EST

Description of problem:
When the backspace key is hit after a consonant with a dot at the bottom, the dot vanishes and the result is an independent and unrelated letter. The dot in this case is not a vowel sign; it is part of the consonant itself. The first example is Bengali Yaa - <09DF>. A backspace changes it to Bengali Ya - <09AF>. See attached image for example. The image uses the Bengali Probhat keyboard layout.

Version-Release number of selected component (if applicable):
1.6.0-7

How reproducible:
Every time

Steps to Reproduce:
1. Start gedit in Bengali locale
2. Ctrl+space, F6
3. Press 'z', then backspace

Actual results:
The 'dot' below the Yaa character gets deleted, and it becomes a Ya.

Expected results:
The complete Yaa character should get deleted.

Additional info:
Tested on RHEL4-RC-0107.0 WS
At first this bug was taken to be a Pango bug, filed at: http://bugzilla.gnome.org/show_bug.cgi?id=345066 But I debugged it and found that it is not a Pango bug but a GTK bug. gtk_entry_backspace() and gtk_text_buffer_backspace() need to be modified. I will write a patch for this bug.
I debugged this bug and found that it is not a Pango bug but a GTK bug. gtk_entry_backspace() and gtk_text_buffer_backspace() need to be modified. I will write a patch for this bug. I filed a new GTK bug in GNOME Bugzilla: http://bugzilla.gnome.org/show_bug.cgi?id=348107
Sorry, the above message can be deleted. :)
Created attachment 69303
my patch

I wrote a patch for this bug. This bug has been fixed.
I don't think special-casing in the widget is at all a good idea. Can't this be handled by setting the backspace_deletes_character Pango logical attribute appropriately? For example: if we have an Indic syllable consisting of only a base consonant, then set backspace_deletes_character to false after that position?
This bug is not related to Pango. It is caused by the call to g_utf8_normalize() in gtk_entry_backspace() and gtk_text_buffer_backspace(). When 0x09df is passed to g_utf8_normalize(), it returns 0x09af and 0x09bc, so the single character becomes two code points. It looks like this bug is related to glib, because g_utf8_normalize() finds the decomposition in decomp_table[] of gunidecomp.h.
Could you reassign this bug to glib?
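The decomposition described above can be reproduced outside of glib. The following is a minimal sketch that uses Python's standard unicodedata module as a stand-in for g_utf8_normalize() with NFD; the helper name code_points is only illustrative and is not part of glib or GTK.

```python
import unicodedata

def code_points(text):
    """Return the code points of text as U+XXXX strings (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]

yya = "\u09df"  # BENGALI LETTER YYA, the precomposed form from this report
nfd = unicodedata.normalize("NFD", yya)

print(code_points(yya))                # the single precomposed code point
print(code_points(nfd))                # base consonant followed by the nukta
print(unicodedata.decomposition(yya))  # canonical decomposition from the Unicode data
```

After normalization the text holds two code points, so any code that removes "the last code point" of the normalized string removes only the dot.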
Created attachment 69456
modified patch

Sorry, I found that this bug is not related to glib after all; it is still a GTK bug. gtk_entry_backspace() and gtk_text_buffer_backspace() call g_utf8_normalize() when handling the backspace event. At that point the string 0x09df (and not only 09df, but thousands of other code points, for example 06d3, 0929, 09dd, 1db4 and so on) becomes 0x09af 0x09bc. Then, when the backspace key is pressed, the deletion operates on the string 0x09af 0x09bc instead of on 0x09df. In this patch I removed the g_utf8_normalize() call from the backspace handling in gtk_entry_backspace() and gtk_text_buffer_backspace().
The operation of GTK+ and Pango should not depend on the normalization form of the input text. So, hitting backspace after the combination:

U+9AF U+9BC

must do *exactly* the same thing as hitting backspace after

U+9DF

So your patch is not the right way of going about things. If a backspace after U+9DF should delete both the base character and the combining mark, then a backspace after U+9AF U+9BC must delete both characters as well.

Please look at the backspace_deletes_character field of PangoLogAttr, as I mentioned in comment 5.

(Note that two sequences above may not render the same ... see bug 139950. But just because we have bugs in rendering, we shouldn't introduce them in editing.)
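The requirement that both encodings behave identically can be illustrated with a small sketch. This is not GTK's or Pango's code: the two backspace functions below are hypothetical, and the "cluster" rule (trailing combining marks plus one base character) is a deliberate simplification of real Indic cluster handling. The point is only that either policy gives the same result for both inputs once the text is normalized first.

```python
import unicodedata

def backspace_last_code_point(text):
    """Hypothetical policy: normalize to NFD, then drop the final code point."""
    nfd = unicodedata.normalize("NFD", text)
    return nfd[:-1]

def backspace_whole_cluster(text):
    """Hypothetical policy: drop trailing combining marks and their base character."""
    nfd = unicodedata.normalize("NFD", text)
    end = len(nfd)
    # Skip back over combining marks (non-zero canonical combining class).
    while end > 0 and unicodedata.combining(nfd[end - 1]) != 0:
        end -= 1
    # Then remove the base character the marks were attached to.
    return nfd[:max(end - 1, 0)]

precomposed = "\u09df"       # U+9DF
decomposed = "\u09af\u09bc"  # U+9AF U+9BC

for backspace in (backspace_last_code_point, backspace_whole_cluster):
    # Whichever policy is chosen, the stored encoding must not change the outcome.
    assert backspace(precomposed) == backspace(decomposed)
```

Which of the two policies is right for this consonant is exactly the question under discussion; the sketch only shows that skipping normalization for one encoding would make the two inputs diverge.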
Must a backspace after U+9AF U+9BC delete both characters? I think a backspace after U+9AF U+9BC should delete one character (U+9BC), but a backspace after U+9DF should delete both the base character and the combining mark.
The editing behavior cannot depend on internal encoding differences that aren't visible to the user. (While you may consider U+9DF the more "normal" way to represent this form, not everybody agrees. For example, Mac OS X filenames are stored in Unicode NFD, so the text would appear as U+9AF U+9BC there.)
LingNing, the widget must behave as if the text were given in NFD; that's why we normalize there.
*** This bug has been marked as a duplicate of 345066 ***