After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 353877 - Sinhala is_cursor_position and backspace_deletes_character issues
Sinhala is_cursor_position and backspace_deletes_character issues
Status: RESOLVED FIXED
Product: pango
Classification: Platform
Component: indic
unspecified
Other Linux
: Normal normal
: ---
Assigned To: Pango Indic
Pango Indic
Depends on:
Blocks:
 
 
Reported: 2006-09-01 17:32 UTC by Behdad Esfahbod
Modified: 2006-10-12 18:46 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
proposed patch (4.01 KB, patch)
2006-09-28 15:11 UTC, Akira TAGOH
none Details | Review
proposed patch (4.01 KB, patch)
2006-09-29 08:31 UTC, Akira TAGOH
rejected Details | Review
proposed patch for HEAD (7.41 KB, patch)
2006-10-04 08:23 UTC, Akira TAGOH
none Details | Review
proposed patch for 1.14.4 (7.54 KB, patch)
2006-10-04 08:26 UTC, Akira TAGOH
none Details | Review
updated patch for #353877 and #345069 for HEAD (7.53 KB, patch)
2006-10-05 10:13 UTC, Akira TAGOH
committed Details | Review

Description Behdad Esfahbod 2006-09-01 17:32:16 UTC
From:

  https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=204934

The cursoring and the delete opereaion using DELETE key not behaving properly.

Version-Release number of selected component (if applicable):
pango-1.14.2-1.fc6
gedit-2.15.9-1.fc6

How reproducible:
Always

Steps to Reproduce:
1. Open the URL :
http://batman.brisbane.redhat.com/~indic/IndicTC/lang/si_LK/font/IndicFontTestCaseGSUB-Sinhala.html
2. Open gedit
3. Copy and paste the composed char from the URL text box given.
4. Place the cursor either right or left hand side of the char.
5. allow the cursor to go through from right side to left side or vice-varsa by 
   left or right arrow key.
6. Observe the result

  
Actual results:
cursor is going through wrongly accross the char.

Expected results:
cursor should go extreme left and right side with only one left or right arrow
keysrtoke.
Comment 1 Akira TAGOH 2006-09-28 15:11:58 UTC
Created attachment 73564 [details] [review]
proposed patch

pango_default_break doesn't support Sinhala. Pango needs an exception to deal with Sinhala because it's using ZWJ for incorrect purpose and there are some special case to take care of the chunk of characters as one character.

For reference, it's documented in Sinhala Standards (http://fonts.lk/doc/sinhala%20standards.pdf)
Comment 2 Akira TAGOH 2006-09-29 08:31:18 UTC
Created attachment 73616 [details] [review]
proposed patch

just fix a typo like s/Raphaya/Rephaya/g ...
Comment 3 Behdad Esfahbod 2006-10-02 20:50:45 UTC
This is not acceptable.  Please write a lang engine for the script, like I did for Arabic in HEAD.  Check out pango/modules/arabic/arabic-lang.c
Comment 4 Akira TAGOH 2006-10-03 15:06:57 UTC
Thanks for the info. BTW I'm seeing the strange behavior on frequently turned is_cursor_position on/off when I'm monitoring in my own script_break function. it looks like sync'ed with the cursor blink. I presumed that it should always take along the same thing. is it an expected behavior?
Without this assumption, probably I need to do same thing like pango_default_break() does, because I'm just going to backport my initial patch to the pango language module, which just turned is_cursor_position off for some Sinhala letters and relying on the default behavior for other cases.
Comment 5 Behdad Esfahbod 2006-10-03 16:54:39 UTC
You should be testing with pango head, not 1.14.x.  Language modules in all versions before current head were seriously broken out of repair.

With head, just look at what the arabic-lang.c module is doing.  Pango has already called pango_default_break() for you.  You just need to fixup the result.
Comment 6 Akira TAGOH 2006-10-04 08:23:39 UTC
Created attachment 73997 [details] [review]
proposed patch for HEAD

Yes, I noticed that too pango_break() was improved/fixed in HEAD.
Comment 7 Akira TAGOH 2006-10-04 08:26:31 UTC
Created attachment 73998 [details] [review]
proposed patch for 1.14.4

for 1.14.4, just invoking pango_default_break() in own script_break before correct.
Comment 8 Behdad Esfahbod 2006-10-04 16:27:36 UTC
Thanks Akira.  Looks good.  Now, can you update the module to fix Bug 345069 and Bug 345066 too?
Comment 9 Akira TAGOH 2006-10-05 00:52:00 UTC
Sure. let me try.
Comment 10 Akira TAGOH 2006-10-05 10:13:09 UTC
Created attachment 74041 [details] [review]
updated patch for #353877 and #345069 for HEAD

After some looking, I don't think we really need any extra code to solve Bug#345069 if we has _empty_ language module for Devanagari and others perhaps, which means do nothing for them.  If any language module is available for the given character, Pango is going to split them off to deal with them in the language module. and the dependent vowels itself should be marked as the cursor movable.

So attached patch should help.

For Bug#345066, need to investigate much further. it looks to me like it always enters into the normalized-block unless one press the backspace within the composed character, because the current position should points to the next character, which is most likely be turned backspace_deletes_character on. so doesn't it look at log_attrs[current_pos - 1] or similar, otherwise doesn't it evaluate PangoLogAttr with the normalized one?
Comment 11 Behdad Esfahbod 2006-10-05 15:43:34 UTC
(In reply to comment #10)
> Created an attachment (id=74041) [edit]
> updated patch for #353877 and #345069 for HEAD
> 
> After some looking, I don't think we really need any extra code to solve
> Bug#345069 if we has _empty_ language module for Devanagari and others perhaps,
> which means do nothing for them.  If any language module is available for the
> given character, Pango is going to split them off to deal with them in the
> language module. and the dependent vowels itself should be marked as the cursor
> movable.

Yeah, currently (in HEAD), at a boundary I'm doing:

     log_attrs[0].backspace_deletes_character  = attr_before.backspace_deletes_character;

     log_attrs[0].is_line_break      |= attr_before.is_line_break;
     log_attrs[0].is_mandatory_break |= attr_before.is_mandatory_break;
     log_attrs[0].is_cursor_position |= attr_before.is_cursor_position;

So, is_cursor_position will be set if it was set at the end of the previous segment.  And that's most probably the case.

> So attached patch should help.
> 
> For Bug#345066, need to investigate much further. it looks to me like it always
> enters into the normalized-block unless one press the backspace within the
> composed character, because the current position should points to the next
> character, which is most likely be turned backspace_deletes_character on. so
> doesn't it look at log_attrs[current_pos - 1] or similar, otherwise doesn't it
> evaluate PangoLogAttr with the normalized one?

I'm not sure I fully understand your question, but lemme explain how backspace works in Gtk+ as it confused me when writing the very simple Arabic language engine too.  This is in gtk_text_buffer_backspace().

When you press backspace, Gtk+ first checks whether backspace_deletes_character is set for the current cursor position (that is, after the character).  If backspace_deletes_character is not set, it deletes the entire grapheme before the cursor position and done.  Otherwise, it normalizes the grapheme to NFD (that is, decomposes it), and deletes the grapheme, reinsert all but last character in the decomposed sequence.
Comment 12 Akira TAGOH 2006-10-06 13:38:55 UTC
(In reply to comment #11)
> I'm not sure I fully understand your question, but lemme explain how backspace
> works in Gtk+ as it confused me when writing the very simple Arabic language
> engine too.  This is in gtk_text_buffer_backspace().
> 
> When you press backspace, Gtk+ first checks whether backspace_deletes_character
> is set for the current cursor position (that is, after the character).  If
> backspace_deletes_character is not set, it deletes the entire grapheme before
> the cursor position and done.  Otherwise, it normalizes the grapheme to NFD
> (that is, decomposes it), and deletes the grapheme, reinsert all but last
> character in the decomposed sequence.
> 

Yes, right. so how about this case - one input U+09DF and 'a', and going to delete U+09DF with the backspace. so the current cursor position should be

U+09DF | a

and GTK+ checks backspace_deletes_character for the current cursor position, which the instance of PangoLogAttr for 'a' has. so it's always set to true. then U+09DF will becomes U+09AF U+09BC by g_utf8_normalize() and it deletes U+09BC.

That's the actual behavior right now (at least on pango 1.14.4 and gtk+ 2.10.4). this also happens without even 'a', because null character also turns backspace_deletes_character on.  So I'm really wondering if this could be fixed in only the language module.

So I suppose that it may works correctly if GTK+ checks backspace_deletes_character for the previous character at the current cursor position, or should invoke pango_layout_get_log_attrs() with normalized text and deletes the proper characters from the current cursor position?
Comment 13 Behdad Esfahbod 2006-10-12 18:26:05 UTC
Akira, if you set the correct backspace_deletes_character in your module, it should all work correctly.  In HEAD and in Pango 1.14.6, I'm taking backspace_deletes_character property from the previous run.  Going to commit your module.
Comment 14 Behdad Esfahbod 2006-10-12 18:46:26 UTC
2006-10-12  Behdad Esfahbod  <behdad@gnome.org>

        Bug 353877 – Sinhala is_cursor_position and
        backspace_deletes_character issues
        Patch from Akira TAGOH

        * configure.in:
        * modules/indic/Makefile.am:
        * modules/indic/indic-lang.c: Add a simple Indic language engine.



Closing this bug.  For other changes please attach patches to other bugs, or open new ones.