GNOME Bugzilla – Bug 353877
Sinhala is_cursor_position and backspace_deletes_character issues
Last modified: 2006-10-12 18:46:26 UTC
From: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=204934 The cursoring and the delete opereaion using DELETE key not behaving properly. Version-Release number of selected component (if applicable): pango-1.14.2-1.fc6 gedit-2.15.9-1.fc6 How reproducible: Always Steps to Reproduce: 1. Open the URL : http://batman.brisbane.redhat.com/~indic/IndicTC/lang/si_LK/font/IndicFontTestCaseGSUB-Sinhala.html 2. Open gedit 3. Copy and paste the composed char from the URL text box given. 4. Place the cursor either right or left hand side of the char. 5. allow the cursor to go through from right side to left side or vice-varsa by left or right arrow key. 6. Observe the result Actual results: cursor is going through wrongly accross the char. Expected results: cursor should go extreme left and right side with only one left or right arrow keysrtoke.
Created attachment 73564 [details] [review] proposed patch pango_default_break doesn't support Sinhala. Pango needs an exception to deal with Sinhala because it's using ZWJ for incorrect purpose and there are some special case to take care of the chunk of characters as one character. For reference, it's documented in Sinhala Standards (http://fonts.lk/doc/sinhala%20standards.pdf)
Created attachment 73616 [details] [review] proposed patch just fix a typo like s/Raphaya/Rephaya/g ...
This is not acceptable. Please write a lang engine for the script, like I did for Arabic in HEAD. Check out pango/modules/arabic/arabic-lang.c
Thanks for the info. BTW I'm seeing the strange behavior on frequently turned is_cursor_position on/off when I'm monitoring in my own script_break function. it looks like sync'ed with the cursor blink. I presumed that it should always take along the same thing. is it an expected behavior? Without this assumption, probably I need to do same thing like pango_default_break() does, because I'm just going to backport my initial patch to the pango language module, which just turned is_cursor_position off for some Sinhala letters and relying on the default behavior for other cases.
You should be testing with pango head, not 1.14.x. Language modules in all versions before current head were seriously broken out of repair. With head, just look at what the arabic-lang.c module is doing. Pango has already called pango_default_break() for you. You just need to fixup the result.
Created attachment 73997 [details] [review] proposed patch for HEAD Yes, I noticed that too pango_break() was improved/fixed in HEAD.
Created attachment 73998 [details] [review] proposed patch for 1.14.4 for 1.14.4, just invoking pango_default_break() in own script_break before correct.
Thanks Akira. Looks good. Now, can you update the module to fix Bug 345069 and Bug 345066 too?
Sure. let me try.
Created attachment 74041 [details] [review] updated patch for #353877 and #345069 for HEAD After some looking, I don't think we really need any extra code to solve Bug#345069 if we has _empty_ language module for Devanagari and others perhaps, which means do nothing for them. If any language module is available for the given character, Pango is going to split them off to deal with them in the language module. and the dependent vowels itself should be marked as the cursor movable. So attached patch should help. For Bug#345066, need to investigate much further. it looks to me like it always enters into the normalized-block unless one press the backspace within the composed character, because the current position should points to the next character, which is most likely be turned backspace_deletes_character on. so doesn't it look at log_attrs[current_pos - 1] or similar, otherwise doesn't it evaluate PangoLogAttr with the normalized one?
(In reply to comment #10) > Created an attachment (id=74041) [edit] > updated patch for #353877 and #345069 for HEAD > > After some looking, I don't think we really need any extra code to solve > Bug#345069 if we has _empty_ language module for Devanagari and others perhaps, > which means do nothing for them. If any language module is available for the > given character, Pango is going to split them off to deal with them in the > language module. and the dependent vowels itself should be marked as the cursor > movable. Yeah, currently (in HEAD), at a boundary I'm doing: log_attrs[0].backspace_deletes_character = attr_before.backspace_deletes_character; log_attrs[0].is_line_break |= attr_before.is_line_break; log_attrs[0].is_mandatory_break |= attr_before.is_mandatory_break; log_attrs[0].is_cursor_position |= attr_before.is_cursor_position; So, is_cursor_position will be set if it was set at the end of the previous segment. And that's most probably the case. > So attached patch should help. > > For Bug#345066, need to investigate much further. it looks to me like it always > enters into the normalized-block unless one press the backspace within the > composed character, because the current position should points to the next > character, which is most likely be turned backspace_deletes_character on. so > doesn't it look at log_attrs[current_pos - 1] or similar, otherwise doesn't it > evaluate PangoLogAttr with the normalized one? I'm not sure I fully understand your question, but lemme explain how backspace works in Gtk+ as it confused me when writing the very simple Arabic language engine too. This is in gtk_text_buffer_backspace(). When you press backspace, Gtk+ first checks whether backspace_deletes_character is set for the current cursor position (that is, after the character). If backspace_deletes_character is not set, it deletes the entire grapheme before the cursor position and done. Otherwise, it normalizes the grapheme to NFD (that is, decomposes it), and deletes the grapheme, reinsert all but last character in the decomposed sequence.
(In reply to comment #11) > I'm not sure I fully understand your question, but lemme explain how backspace > works in Gtk+ as it confused me when writing the very simple Arabic language > engine too. This is in gtk_text_buffer_backspace(). > > When you press backspace, Gtk+ first checks whether backspace_deletes_character > is set for the current cursor position (that is, after the character). If > backspace_deletes_character is not set, it deletes the entire grapheme before > the cursor position and done. Otherwise, it normalizes the grapheme to NFD > (that is, decomposes it), and deletes the grapheme, reinsert all but last > character in the decomposed sequence. > Yes, right. so how about this case - one input U+09DF and 'a', and going to delete U+09DF with the backspace. so the current cursor position should be U+09DF | a and GTK+ checks backspace_deletes_character for the current cursor position, which the instance of PangoLogAttr for 'a' has. so it's always set to true. then U+09DF will becomes U+09AF U+09BC by g_utf8_normalize() and it deletes U+09BC. That's the actual behavior right now (at least on pango 1.14.4 and gtk+ 2.10.4). this also happens without even 'a', because null character also turns backspace_deletes_character on. So I'm really wondering if this could be fixed in only the language module. So I suppose that it may works correctly if GTK+ checks backspace_deletes_character for the previous character at the current cursor position, or should invoke pango_layout_get_log_attrs() with normalized text and deletes the proper characters from the current cursor position?
Akira, if you set the correct backspace_deletes_character in your module, it should all work correctly. In HEAD and in Pango 1.14.6, I'm taking backspace_deletes_character property from the previous run. Going to commit your module.
2006-10-12 Behdad Esfahbod <behdad@gnome.org> Bug 353877 – Sinhala is_cursor_position and backspace_deletes_character issues Patch from Akira TAGOH * configure.in: * modules/indic/Makefile.am: * modules/indic/indic-lang.c: Add a simple Indic language engine. Closing this bug. For other changes please attach patches to other bugs, or open new ones.