GNOME Bugzilla – Bug 345069
Indic dependent vowel's assumption on character cluster
Last modified: 2009-07-15 10:52:31 UTC
Please describe the problem:
Opened by Leon Ho (llch@redhat.com) on 2005-01-28 15:06 EST

Description of problem:
When a dependent vowel is typed without a preceding consonant, it is treated as if the previous character were a consonant.

Version-Release number of selected component (if applicable):
1.6.0-7

How reproducible:
Every time

Steps to Reproduce:
1. Type "a"
2. Type "ो"
3a. Press backspace
3b. Press the left arrow key

Actual results:
a. 'a' is deleted along with the Devanagari vowel sign
b. The cursor moves to before 'a'

Expected results:
a. Only the vowel sign is deleted
b. The cursor moves only to before the vowel sign
The same bug in Red Hat Bugzilla: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=146489
Created attachment 69714
my patch

I wrote a patch for this bug that adds some code to pango_default_break().
The patch is definitely wrong. It adds a cursor position between every letter and combining mark. What we want instead is to force a cursor position between items. Working on it.
After some diagnosis, I'm not sure why the Latin and Devanagari characters are put in the same item :(.
Ah ok, this is because we merge items when breaking:

    while (items)
      {
        PangoItem tmp_item = *(PangoItem *)items->data;

        /* Accumulate all the consecutive items that match in language
         * characteristics, ignoring font, style tags, etc.
         */
        while (items->next)
          {
            PangoItem *next_item = items->next->data;

            /* FIXME: Handle language tags */
            if (next_item->analysis.lang_engine != tmp_item.analysis.lang_engine)
              break;
            else
              {
                tmp_item.length += next_item->length;
                tmp_item.num_chars += next_item->num_chars;
              }

            items = items->next;
          }

        /* Break the paragraph delimiters with the last item */
        if (items->next == NULL)
          {
            tmp_item.num_chars += g_utf8_strlen (text + index + tmp_item.length, para_delimiter_len);
            tmp_item.length += para_delimiter_len;
          }

        pango_break (text + index, tmp_item.length, &tmp_item.analysis,
                     log_attrs + offset, tmp_item.num_chars + 1);

        offset += tmp_item.num_chars;
        index += tmp_item.length;

        items = items->next;
      }

Not sure how to fix it. Unicode does move characters used by multiple scripts into Common, so it should be theoretically Ok to force a break between items. However, the problem is that items don't keep their script tag; just language. That hits us in a couple of places...

Anyway, I think we will eventually need a language engine for Indic. That automatically will fix this problem. In that case, we may as well add a stub engine right now. Owen, what do you think?
Ok, we now have an Arabic lang engine in HEAD. Going to add an Indic one. All we need to do for this bug is to set is_cursor_position to true for the first log_attr. See bug 350132 for the Arabic module.
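To make the idea above concrete, here is a minimal, self-contained sketch of "set is_cursor_position to true for the first log_attr". It is not the code that was committed: SketchLogAttr is a hypothetical stand-in for PangoLogAttr that carries only the one flag discussed here, and sketch_force_item_start_cursor is an illustrative name, not a Pango function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PangoLogAttr; only the flag discussed in
 * this bug is modelled. */
typedef struct {
    bool is_cursor_position;
} SketchLogAttr;

/* Assumption: attrs points at the attributes for one item's range, so
 * attrs[0] describes the boundary in front of the item's first character.
 * Forcing that boundary to be a cursor position means a leading combining
 * mark can no longer attach to the last character of the previous item. */
void sketch_force_item_start_cursor(SketchLogAttr *attrs, size_t n_attrs)
{
    assert(attrs != NULL || n_attrs == 0);

    if (n_attrs > 0)
        attrs[0].is_cursor_position = true;
}
```

Only the first attribute is touched, so boundaries inside the item keep whatever the default break algorithm decided. That is the difference from the rejected patch, which added a cursor position between every letter and combining mark.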
Should be fixed with what I committed for bug 353877 (the Indic lang engine)
The problem is still there. It does not occur if we enter, say, a + U+0915 or any other consonant, but it does occur for vowel signs such as U+093F, e.g. aि aी
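The distinction between the two cases can be sketched as a small check on code points. This is a rough illustration only: the function names are hypothetical, and the ranges cover just the core Devanagari consonants (U+0915 to U+0939) and dependent vowel signs (U+093E to U+094C). Nukta, virama, and the other Indic scripts are deliberately ignored, so a real cluster test would need more than this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Core Devanagari consonants KA..HA (simplified; extended letters omitted). */
bool sketch_is_devanagari_consonant(uint32_t cp)
{
    return cp >= 0x0915 && cp <= 0x0939;
}

/* Core Devanagari dependent vowel signs AA..AU (simplified). */
bool sketch_is_devanagari_matra(uint32_t cp)
{
    return cp >= 0x093E && cp <= 0x094C;
}

/* True when cur is a vowel sign that has no Devanagari consonant directly
 * before it, e.g. 'a' followed by U+093F. This is the case reported above. */
bool sketch_is_orphan_matra(uint32_t prev, uint32_t cur)
{
    return sketch_is_devanagari_matra(cur) &&
           !sketch_is_devanagari_consonant(prev);
}
```

With this check, 'a' followed by U+093F is flagged as an orphan vowel sign, while U+0915 followed by U+093F and 'a' followed by U+0915 are not.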
Looking closer at it:

Things work properly for all valid combinations, such as a + an Indic consonant like क, ख, etc.

The problem occurs only for a + Indic matras, as in the examples above (aि, aी, etc.), but this is an invalid combination and I have never seen it in practice. The possible combinations are ones like aकि, where things work properly.

Also, while debugging with a GTK text entry box, even these cases worked properly there, so this might be a gedit problem.

IMHO we can close this bug as not a bug, since this combination never comes up in practical use. What do you say, Behdad?