GNOME Bugzilla – Bug 119891
GtkEntry and GtkTextView need support for PangoLogAttr.backspace_deletes_character
Last modified: 2011-02-04 16:16:49 UTC
GtkEntry and GtkTextView need support for PangoLogAttr.backspace_deletes_character (bug #114483)
*** Bug 124514 has been marked as a duplicate of this bug. ***
Created attachment 23597 [details] [review] Not to delete Marks with their base for arabic language
The correct fix is described in the Summary.
Requires a semi-API addition of another binding signal; but since it's pretty important, I may try to streamline this into GTK+-2.4.1 or GTK+-2.4.2.
Created attachment 27270 [details] [review] A quick-hack patch (GtkEntry only)
Should the fix be something like this? However, one question arises regarding the logic of pango_default_break(), which keeps the backspace_deletes_character flag *only* at cursor positions. To handle backspace events, should the actual attribute value (whether to delete the whole cluster or just the last character) be read from the *previous* cursor position instead of the current one, as is done in the patch? If so, wouldn't it be more efficient to store the flag on *every* character, including combining marks?
I don't quite understand the question. As I understand how the Pango attribute is set, it applies to a delete key *at* the cursor position; that is, to the grapheme that precedes the cursor position. Two other comments: What I did in a now lost patch for GtkEntry and GtkTextView was to add a separate binding signal for "backspace" rather than extending the delete enumeration. It's also necessary to actually convert the preceding grapheme to NFD before deleting one character off of it; the deletion behavior should not depend on the normalization form of the text.
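The normalization step described above can be illustrated with a minimal Python sketch. It is not the GTK+ patch: the function names are hypothetical, and the cluster detection (a base character followed by characters whose Unicode category starts with "M") is a simplifying assumption standing in for Pango's real break analysis.

```python
import unicodedata


def last_cluster_start(text: str) -> int:
    """Return the index where the final cluster starts.

    Simplifying assumption: a cluster is one base character followed by
    any number of mark characters (Unicode category M*).
    """
    i = len(text) - 1
    while i > 0 and unicodedata.category(text[i]).startswith("M"):
        i -= 1
    return max(i, 0)


def backspace_one_character(text: str) -> str:
    """Delete one character from the end, after converting the final
    cluster to NFD, so the result does not depend on whether the input
    was precomposed or decomposed."""
    if not text:
        return text
    start = last_cluster_start(text)
    cluster = unicodedata.normalize("NFD", text[start:])
    return text[:start] + cluster[:-1]


# Precomposed and decomposed "e with acute" give the same result.
print(backspace_one_character("\u00e9") == backspace_one_character("e\u0301"))
```

Under these assumptions, a precomposed U+00E9 and the sequence "e" + U+0301 both leave a bare "e" after one backspace, which is the normalization-independent behaviour the comment asks for.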
OK. Let me elaborate on the question. Assume B = base character, M = combining mark, L = Latin character (for which backspace_deletes_character = false), | = cursor:

  BBMBMM|L   (text buffer)
  110100 1   (is_cursor_position)
  110100 0   (backspace_deletes_char)

When Backspace is pressed, the last M before the cursor should be deleted, according to the backspace_deletes_char of the B in the previous cluster, rather than that of the L at the cursor position. Another case (which is quite usual) is:

  BBMBMM|    (text buffer)
  110100     (is_cursor_position)
  110100     (backspace_deletes_char)

where the cursor is at the end of the text buffer. This makes me think it might have been more efficient had backspace_deletes_character been stored this way:

  BBMBMM|L   (text buffer)
  110100 1   (is_cursor_position)
  111111 0   (backspace_deletes_char)

so that the appropriate action can be determined immediately, without moving back to the previous cursor position. However, as you said that the cluster before the cursor needs to be analyzed anyway to determine the last character to delete, the idea of storing flags for every character, including combining marks, may make no difference.
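The two lookups compared in this comment can be sketched in Python using the flag rows from the diagrams. This models only the commenter's reasoning; the list names and the helper `previous_cursor_position` are hypothetical and are not Pango API.

```python
def previous_cursor_position(is_cursor_position, offset):
    """Walk back from offset to the nearest earlier cursor position."""
    pos = offset - 1
    while pos > 0 and not is_cursor_position[pos]:
        pos -= 1
    return pos


# Flag rows copied from the first diagram (text "BBMBMM|L").
text = "BBMBMML"
is_cursor_position = [1, 1, 0, 1, 0, 0, 1]
backspace_deletes_char = [1, 1, 0, 1, 0, 0, 0]
cursor = 6  # the cursor sits just before "L"

# Lookup at the cursor itself sees the flag of the Latin letter.
flag_at_cursor = backspace_deletes_char[cursor]

# Lookup at the previous cursor position sees the flag of the last base.
prev = previous_cursor_position(is_cursor_position, cursor)
flag_at_prev = backspace_deletes_char[prev]

# Alternative storage from the third diagram: the flag is set on every
# character of the cluster, so the character just before the cursor
# can be consulted directly, with no walk back.
backspace_deletes_char_alt = [1, 1, 1, 1, 1, 1, 0]
flag_alt = backspace_deletes_char_alt[cursor - 1]

print(prev, flag_at_cursor, flag_at_prev, flag_alt)
```

In this model the walk back costs at most the length of one cluster, which is consistent with the comment's closing remark that the alternative storage may make no practical difference once the cluster has to be examined anyway.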
Created attachment 27323 [details] [review] GtkEntry patch (using new signal + normalization)
Created attachment 27330 [details] [review] Similar patch for GtkTextView
However, I doubt whether it's appropriate to try to resolve the normalization issue for backspace. I think what most users expect is that backspace undoes the previously typed character. Deleting something else may be beyond expectation.
IIRC backspace is supposed to delete marks for Arabic marks only, or similar cases, which may happen to have no pre-composed equivalents. At least that's the case for Arabic.
Same for Thai. And my two patches above just perform what Thai users expect: backspace deletes the mark only. (I've just found, however, that the GtkTextView patch causes a problem with the GtkSourceView undo buffer.)

My question in #10 may be made more specific, case by case:

1. In a non-normalized string with a non-canonical order of combining marks, what should backspace delete: the last typed or the last logical character? For example: if 'Bmn' is the NFD, but what appears before the cursor is 'Bnm' (which just reflects the recent typing order), what is the user's expectation when pressing backspace, deleting 'n' or 'm'? (With normalization, 'n' is removed, but my opinion is that 'm' should be removed, to undo the last keystroke.)

2. Where the grapheme before the cursor was converted into precomposed form by the input method, what is expected when pressing backspace: delete the whole precomposed character, or decompose it before deleting the last logical component? (With normalization, the latter is assumed. But I'm not sure if it's expected, since there is no such case in my script/language.)
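The two cases can be made concrete with a small Python sketch that compares the two candidate policies on Latin combining sequences. The function names are hypothetical, and the Latin characters are chosen only because their canonical ordering is easy to verify; they do not appear in the thread.

```python
import unicodedata


def delete_last_typed(cluster: str) -> str:
    """Policy A: remove the last stored character (the last keystroke)."""
    return cluster[:-1]


def delete_last_logical(cluster: str) -> str:
    """Policy B: convert to NFD first, then remove the last character."""
    return unicodedata.normalize("NFD", cluster)[:-1]


# Case 1: marks typed in non-canonical order.
# U+0301 (acute, combining class 230) was typed before
# U+0323 (dot below, combining class 220); NFD reorders them.
typed = "a\u0301\u0323"
print(delete_last_typed(typed) == "a\u0301")    # dot below removed
print(delete_last_logical(typed) == "a\u0323")  # acute removed

# Case 2: the input method produced a precomposed character.
precomposed = "\u00e9"
print(delete_last_typed(precomposed) == "")     # whole character removed
print(delete_last_logical(precomposed) == "e")  # only the accent removed
```

All four comparisons hold, so the two policies really do disagree in both cases: which one matches user expectation is exactly the open question in the comment.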
I must add to my previous opinion on whether normalization should be a concern: in the case of Thai, it makes no difference, since most Thai IMs are implemented such that text input is guaranteed to be always normalized. So either behavior is OK for Thai.
Mass changing gtk+ bugs with target milestone of 2.4.2 to target 2.4.4, as Matthias said he was trying to do himself on IRC and was asking for help with. If you see this message, it means I was successful at fixing the borken-ness in bugzilla :) Sorry for the spam; just query on this message and delete all emails you get with this message, since there will probably be a lot.
Note that GtkTextView patch in Attachment #27330 [details] requires Bug #141993 to be fixed.
I've committed the patches to cvs HEAD now.
Works great, but the undo manager needs some work. This is what I get when I remove a mark in gedit:

(gedit:1327): GtkSourceView-CRITICAL **: file gtksourceundomanager.c: line 617 (gtk_source_undo_manager_insert_text_handler): assertion `strlen (text) == (guint)length' failed

And if I undo, I get both the mark *and a copy of its base character* inserted, resulting in two base characters, which is wrong. For example, say I have this: بهِداد and I remove the Kasre mark, I get: بهداد and then Undo and get: بهِهداد

Should be easy to fix though.
Sorry, it's in gtksourceview. Filed as bug 149128.
The same bug for GtkSourceView was already filed as Bug #141993, as said in Comment #15.
> ------- Additional Comment #10 From Theppitak Karoonboonyanan 2004-05-04 19:31
>
> However, I doubt if it's appropriate to try to resolve the normalization thing
> for backspace. I think what most users expect is that backspace undo the
> previously typed character. Deleting something else may be beyond expectation.

Actually, this issue has just shown itself to me for Arabic: I type U+0622 ARABIC LETTER ALEF WITH MADDA ABOVE, which is indeed one letter in Persian/Arabic, but when I then press backspace, the U+0653 ARABIC MADDAH ABOVE is removed and a U+0627 ARABIC LETTER ALEF remains. So, thinking again, I think it's quite important to try to keep what the user typed in. Any idea? Open a new bug?