Bug 345066 - backspace changes independent indic characters
Status: RESOLVED FIXED
Product: pango
Classification: Platform
Component: indic
Version: unspecified
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Pango Indic
QA Contact: Pango Indic
Duplicates: 348107
Depends on:
Blocks:
 
 
Reported: 2006-06-16 02:06 UTC by LingNing Zhang
Modified: 2012-08-09 06:35 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
screenshot of gedit (23.44 KB, image/png)
2006-06-16 02:07 UTC, LingNing Zhang
Fix based on the arabic lang engine (1.05 KB, patch)
2007-06-17 06:39 UTC, Sayamindu Dasgupta
patch for handling indic NFC (1.46 KB, application/octet-stream)
2009-12-21 09:07 UTC, Pravin Satpute
total indic characters require fix (276 bytes, text/plain)
2009-12-21 09:14 UTC, Pravin Satpute
patch to fix backspace behaviour for Indic characters (2.63 KB, patch)
2009-12-23 05:00 UTC, Parag AN

Description LingNing Zhang 2006-06-16 02:06:34 UTC
Please describe the problem:
Opened by Jatin Nansi (jnansi@redhat.com) on 2005-01-18 08:03 EST

Description of problem:
When the backspace key is hit after a consonant with a dot at the bottom,
the dot vanishes and the result is an independent and unrelated
letter. The dot in this case is not a vowel sign; it is part of
the consonant itself.
The first example is Bengali Yaa, <09DF>. A backspace changes it to
Bengali Ya, <09AF>.
See the attached image for an example. The image uses the Bengali Probhat
keyboard layout.


Version-Release number of selected component (if applicable):
1.6.0-7


How reproducible:
Every time


Steps to Reproduce:
1. Start gedit in a Bengali locale
2. Ctrl+space, F6
3. Press 'z', then backspace
  
Actual results:
The 'dot' below the yaa character gets deleted, and it becomes a ya.


Expected results:
The complete yaa character should get deleted.


Additional info:
Tested on RHEL4-RC-0107.0 WS

Comment 1 LingNing Zhang 2006-06-16 02:07:22 UTC
Created attachment 67457
screenshot of gedit
Comment 2 LingNing Zhang 2006-06-16 02:17:33 UTC
The same bug in Red Hat Bugzilla:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=145431
Comment 3 Behdad Esfahbod 2006-06-17 01:45:58 UTC
Yeah, this should be fixed in the Indic Pango module by correctly setting the backspace_deletes_character (or similarly named) bit.
Comment 4 LingNing Zhang 2006-07-20 09:50:04 UTC
I debugged this and found that it is not a Pango bug but a GTK bug.
gtk_entry_backspace() and gtk_text_buffer_backspace() need to be modified.
I will write a patch for this bug.
This bug can be closed, since it is not a Pango bug.
I filed a new bug against GTK:
http://bugzilla.gnome.org/show_bug.cgi?id=348107
Comment 5 LingNing Zhang 2006-07-21 03:13:48 UTC
I wrote a patch for this bug.
The patch is below:
http://bugzilla.gnome.org/show_bug.cgi?id=348107
Comment 6 Matthias Clasen 2006-07-22 13:46:14 UTC
See Owen's comment on the other bug. This should be fixed in the indic module,
not by special-casing inside some widgets.
Comment 7 LingNing Zhang 2006-07-24 03:00:56 UTC
This bug is not related to Pango. It is caused by the call to
g_utf8_normalize() in gtk_entry_backspace() or
gtk_text_buffer_backspace(). When 0x09df is passed to g_utf8_normalize(), it
returns 0x09af and 0x09bc. It then becomes two glyphs.
It looks like this bug is related to GLib, because g_utf8_normalize()
finds these conjunctions in decomp_table[] of gunidecomp.h.
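
The mechanism described in this comment can be reproduced outside GTK. The following is a minimal sketch in Python, assuming the backspace path normalizes the character before the cursor to NFD and then drops only the last code point. The helper name `backspace_nfd` is hypothetical and is not GTK code.

```python
import unicodedata


def backspace_nfd(text: str) -> str:
    """Hypothetical model of the behaviour described above:
    decompose the final character (NFD) and drop only its last code point."""
    if not text:
        return text
    head, last = text[:-1], text[-1]
    decomposed = unicodedata.normalize("NFD", last)
    return head + decomposed[:-1]


yya = "\u09df"  # BENGALI LETTER YYA

# NFD splits YYA into YA (U+09AF) followed by the nukta sign (U+09BC).
print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", yya)])

# Dropping the last code point leaves a bare YA, as in the report.
print([f"U+{ord(c):04X}" for c in backspace_nfd(yya)])
```

Under this assumption a single backspace on U+09DF leaves U+09AF behind, which matches the "Actual results" in the description.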
Comment 8 LingNing Zhang 2006-07-24 08:06:14 UTC
Sorry, I found that this bug is not related to GLib, but is still related to
GTK.
I have written a new patch for this bug.
http://bugzilla.gnome.org/show_bug.cgi?id=348107
Comment 9 Behdad Esfahbod 2006-09-08 17:39:12 UTC
*** Bug 348107 has been marked as a duplicate of this bug. ***
Comment 10 Behdad Esfahbod 2006-09-08 17:39:58 UTC
Seems like an Indic language engine is needed.
Comment 11 Behdad Esfahbod 2006-09-18 22:14:48 UTC
Ok, we now have an Arabic lang engine in HEAD that implements the exact same feature requested in this bug but for Arabic.

LingNing, can you write the Indic module?

See bug 350132 for the Arabic module.
Comment 12 Behdad Esfahbod 2006-10-12 18:50:54 UTC
The Indic lang engine is there.  Just see what the Arabic lang engine is doing and do similarly in the Indic engine.
Comment 13 Sayamindu Dasgupta 2007-06-17 06:39:45 UTC
Created attachment 90121
Fix based on the arabic lang engine

The attached patch handles the following characters 
* Bengali RRA (U+09DC)
* Bengali RHA (U+09DD) 
* Bengali YYA (U+09DF).
It is based on the Arabic lang engine fix, as suggested in comment #12.
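
A rough sketch of the idea behind such a fix, in Python rather than the C of the actual patch. It assumes that selected code points are marked so that backspace removes the whole character, while everything else keeps the decompose-and-trim behaviour. The names `DELETE_WHOLE` and `backspace` are illustrative only and are not taken from the patch.

```python
import unicodedata

# The three code points listed in this comment (hypothetical set name).
DELETE_WHOLE = {0x09DC, 0x09DD, 0x09DF}  # Bengali RRA, RHA, YYA


def backspace(text: str) -> str:
    """Illustrative model: delete listed characters whole, and otherwise
    drop only the last code point of the NFD form of the final character."""
    if not text:
        return text
    head, last = text[:-1], text[-1]
    if ord(last) in DELETE_WHOLE:
        return head
    decomposed = unicodedata.normalize("NFD", last)
    return head + decomposed[:-1]


# KA followed by YYA: the whole YYA disappears, KA stays.
print([f"U+{ord(c):04X}" for c in backspace("\u0995\u09df")])
```

Any character outside the set still loses only its final code point in this model, so coverage depends entirely on which code points are listed.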
Comment 14 Rahul Bhalerao 2008-06-16 09:37:15 UTC
I think this is a bug in normalization and should be fixed there only. Bengali Yaa need not be normalized to a nukta form if it is an independent character. But this is defined in the Unicode Character Database and needs to be fixed there first.

09DF;09AF 09BC;09AF 09BC;09AF 09BC;09AF 09BC; # (য়; য◌়; য◌়; য◌়; য◌়; ) BENGALI LETTER YYA

http://www.unicode.org/Public/UNIDATA/NormalizationTest.txt
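
For readers unfamiliar with the file, each data line of NormalizationTest.txt holds five fields: the source string followed by its NFC, NFD, NFKC and NFKD forms. The sketch below parses the line quoted above (with the comment part abridged) and checks it against Python's `unicodedata`. The helper name `parse_fields` is an assumption made for this illustration.

```python
import unicodedata

# The line quoted above, with the trailing comment shortened.
LINE = "09DF;09AF 09BC;09AF 09BC;09AF 09BC;09AF 09BC; # BENGALI LETTER YYA"


def parse_fields(line: str) -> list[str]:
    """Turn the five semicolon-separated hex fields into Python strings."""
    data = line.split("#", 1)[0]
    fields = [field.strip() for field in data.split(";")][:5]
    return ["".join(chr(int(cp, 16)) for cp in field.split()) for field in fields]


source, nfc, nfd, nfkc, nfkd = parse_fields(LINE)

# Every normalization form of U+09DF is the two-code-point sequence,
# so even NFC does not keep the single precomposed character.
for form, expected in (("NFC", nfc), ("NFD", nfd), ("NFKC", nfkc), ("NFKD", nfkd)):
    print(form, unicodedata.normalize(form, source) == expected)
```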
Comment 15 Pravin Satpute 2009-12-21 09:07:40 UTC
Created attachment 150160
patch for handling indic NFC

Behdad, as per your comment at
https://bugzilla.gnome.org/show_bug.cgi?id=350132#c20

I am attaching this here just for review, since this bug already covers the same kind of problem.

However, it somehow does not work for split matras (IS_SPLIT_MATRA_BRAHMI): after NFC, (0995 + 09cb) becomes (09c7 + 0995 + 09be), and a single backspace deletes all of (0995 + 09cb) :(

It would be nice if we could keep this going here now :)
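
To make the split-matra case concrete, here is a small sketch using Python's `unicodedata`. It assumes the three-code-point sequence mentioned in this comment corresponds to the canonical decomposition (NFD), which Unicode stores in logical order with the consonant first. The helper name `codepoints` is hypothetical.

```python
import unicodedata


def codepoints(text: str) -> list[str]:
    """Format each code point as four hex digits for easy comparison."""
    return [f"{ord(c):04X}" for c in text]


ko = "\u0995\u09cb"  # BENGALI LETTER KA + BENGALI VOWEL SIGN O

# NFD splits the vowel sign O into its two parts, E (09C7) and AA (09BE).
decomposed = unicodedata.normalize("NFD", ko)
print(codepoints(decomposed))

# NFC puts the two parts back together into the single vowel sign.
recomposed = unicodedata.normalize("NFC", decomposed)
print(codepoints(recomposed))
```

Unlike U+09DF, the vowel sign U+09CB survives NFC, so the two cases may need different handling.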
Comment 16 Pravin Satpute 2009-12-21 09:14:50 UTC
Created attachment 150162
total indic characters require fix

This is the full list of characters that require the backspace fix.
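
The attachment itself is not reproduced here. As a hedged illustration of how candidate characters could be enumerated, the sketch below scans a code point range for characters that have a canonical decomposition in Python's `unicodedata`. The range 0x0900 to 0x0DFF and the function name `canonically_decomposable` are assumptions made for this example, and the result is not claimed to match the attached list.

```python
import unicodedata


def canonically_decomposable(start: int, end: int) -> list[int]:
    """Return code points in [start, end] that have a canonical decomposition.
    Compatibility mappings are tagged like '<compat> ...' and are skipped."""
    found = []
    for cp in range(start, end + 1):
        mapping = unicodedata.decomposition(chr(cp))
        if mapping and not mapping.startswith("<"):
            found.append(cp)
    return found


# Assumed range: Devanagari through Sinhala.
candidates = canonically_decomposable(0x0900, 0x0DFF)
for cp in candidates:
    ch = chr(cp)
    print(f"U+{cp:04X} {unicodedata.name(ch)} -> {unicodedata.decomposition(ch)}")
```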
Comment 17 Parag AN 2009-12-23 04:57:20 UTC
Thanks. I have created a patch that covers all these SPLIT_MATRAS and COMPOSITE characters, which need correct backspace behavior.

One can test the Fedora 12 build of pango from http://koji.fedoraproject.org/koji/taskinfo?taskID=1885360
Comment 18 Parag AN 2009-12-23 05:00:54 UTC
Created attachment 150272
patch to fix backspace behaviour for Indic characters
Comment 19 Behdad Esfahbod 2010-03-04 01:44:09 UTC
Patch committed.
Comment 20 Pravin Satpute 2012-08-08 07:08:02 UTC
Somehow we missed character U+0929 in this patch. Should I provide a patch to add that character?

https://bugzilla.redhat.com/show_bug.cgi?id=501900

Or will harfbuzz-ng deprecate all these fixes?
Comment 21 Rahul Bhalerao 2012-08-08 10:17:14 UTC
Problem here is, what if the user intends to delete only the nukta (dot sign for which normalization is done)? In such case it does not make a good experience if the whole character is deleted. In any case most of such nukta added characters are typed using two keystrokes and most keylayouts do not have a single direct key to input these. Hence to support normalization and still not create too much of user experience glitch, a good tradeoff would be to keep it as it is and deprecate these fixes at least for the nukta cases.
Comment 22 Pravin Satpute 2012-08-08 10:34:59 UTC
(In reply to comment #21)
> Problem here is, what if the user intends to delete only the nukta (dot sign
> for which normalization is done)? In such case it does not make a good
> experience if the whole character is deleted. In any case most of such nukta
> added characters are typed using two keystrokes and most keylayouts do not have
> a single direct key to input these. 

With the patch applied, things happen as you expect, i.e. backspace deletes characters according to the user's input.
Comment 23 Behdad Esfahbod 2012-08-08 17:51:05 UTC
HarfBuzz doesn't fix cursoring and deletion issues, but now that I have a better understanding of the Indic scripts, I expect to rewrite the Pango Indic language module in a few months...
Comment 24 Pravin Satpute 2012-08-09 06:35:22 UTC
That is good to know. Thank you.