Bug 345069 – Indic dependent vowel's assumption on character cluster



Summary:	Indic dependent vowel's assumption on character cluster


Status:	RESOLVED FIXED

Product:	pango
Classification:	Platform
Component:	indic
Version:	unspecified
Hardware:	Other All

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Pango Indic
QA Contact:	Pango Indic

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2006-06-16 02:15 UTC by LingNing Zhang
Modified:	2009-07-15 10:52 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
my patch (633 bytes, patch) 2006-07-27 08:16 UTC, LingNing Zhang	rejected

Description LingNing Zhang 2006-06-16 02:15:47 UTC

Please describe the problem:
Opened by Leon Ho (llch@redhat.com) on 2005-01-28 15:06 EST

Description of problem:
When a dependent vowel is typed without a preceding consonant, it is
treated as if the previous character were a consonant.

Version-Release number of selected component (if applicable):
1.6.0-7

How reproducible:
every time

Steps to Reproduce:
1. type "a"
2. type "ो"
3a. press backspace
3b. press left arrow-key
  
Actual results:
a. 'a' gets deleted along with the Devanagari vowel sign
b. the cursor ends up before 'a'

Expected results:
a. only the vowel sign gets deleted
b. the cursor moves only to before the vowel sign

Comment 1 LingNing Zhang 2006-06-16 02:16:24 UTC

The same bug in Red Hat Bugzilla:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=146489

Comment 2 LingNing Zhang 2006-07-27 08:16:26 UTC

Created attachment 69714
my patch

I wrote a patch for this bug that adds some code to pango_default_break().

Comment 3 Behdad Esfahbod 2006-09-08 17:15:32 UTC

The patch is definitely wrong.  It adds a cursor position between every letter and combining mark.

What we want instead is to force a cursor position between items.  Working on it.

Comment 4 Behdad Esfahbod 2006-09-08 17:24:20 UTC

After some diagnosis, I'm not sure why the Latin and Devanagari characters are put in the same item :(.

Comment 5 Behdad Esfahbod 2006-09-08 17:36:10 UTC

Ah ok, this is because we merge items when breaking:

  while (items)
    {
      PangoItem tmp_item = *(PangoItem *)items->data;

      /* Accumulate all the consecutive items that match in language
       * characteristics, ignoring font, style tags, etc.
       */
      while (items->next)
        {
          PangoItem *next_item = items->next->data;

          /* FIXME: Handle language tags */
          if (next_item->analysis.lang_engine != tmp_item.analysis.lang_engine)
            break; 
          else
            {
              tmp_item.length += next_item->length;
              tmp_item.num_chars += next_item->num_chars;
            }

          items = items->next;
        }

      /* Break the paragraph delimiters with the last item */
      if (items->next == NULL)
        {
          tmp_item.num_chars += g_utf8_strlen (text + index + tmp_item.length, para_delimiter_len);
          tmp_item.length += para_delimiter_len;
        }
  
      pango_break (text + index, tmp_item.length, &tmp_item.analysis,
                   log_attrs + offset, tmp_item.num_chars + 1);

      offset += tmp_item.num_chars;
      index += tmp_item.length;

      items = items->next;
    }


Not sure how to fix it.  Unicode does move characters used by multiple scripts into Common, so it should be theoretically Ok to force a break between items.  However, the problem is that items don't keep their script tag; just language.  That hits us in a couple of places...
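
To make the merging concrete, here is a minimal, self-contained sketch of the same accumulation logic. SketchItem and merge_runs are hypothetical stand-ins, not Pango API, and the sketch assumes that the Latin and Devanagari items both carry the same lang_engine (modelled here as NULL), which is what the loop above implies for this bug.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PangoItem: only the fields the merge loop reads. */
typedef struct {
  const void *lang_engine; /* NULL models "no language engine" (assumption) */
  int length;              /* length in bytes */
  int num_chars;           /* length in characters */
} SketchItem;

/* Merge consecutive items that share a lang_engine, mirroring the loop
 * quoted above.  Writes the merged runs to out (capacity >= n) and
 * returns how many runs were produced. */
static size_t
merge_runs (const SketchItem *items, size_t n, SketchItem *out)
{
  size_t n_runs = 0;
  size_t i = 0;

  assert (items != NULL && out != NULL);

  while (i < n)
    {
      SketchItem run = items[i];

      /* Absorb every following item with the same language engine. */
      while (i + 1 < n && items[i + 1].lang_engine == run.lang_engine)
        {
          run.length += items[i + 1].length;
          run.num_chars += items[i + 1].num_chars;
          i++;
        }

      out[n_runs++] = run;
      i++;
    }

  return n_runs;
}
```

With an item for "a" (1 byte, 1 char) followed by an item for U+094B (3 bytes in UTF-8, 1 char), both with a NULL engine, the sketch yields a single run of 4 bytes and 2 chars, so the break code sees the letter and the vowel sign as one piece of text. Giving the second item a distinct engine pointer keeps the two runs separate.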

Anyway, I think we will eventually need a language engine for Indic.  That automatically will fix this problem.  In that case, we may as well add a stub engine right now.

Owen, what do you think?

Comment 6 Behdad Esfahbod 2006-09-18 22:14:04 UTC

Ok, we now have an Arabic lang engine in HEAD.  Going to add an Indic one.  All we need to do for this bug is to set is_cursor_position to true for the first log_attr.

See bug 350132 for the Arabic module.
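
A rough model of that idea, for illustration only. SketchLogAttr, sketch_default_break and sketch_engine_break are hypothetical names, not the Pango implementation, and the "default" rule here is deliberately oversimplified (a cursor stop before every character that is not a combining mark). The only point is to show the effect of forcing is_cursor_position on the first log_attr of a run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PangoLogAttr with just the one flag we need. */
typedef struct {
  unsigned is_cursor_position : 1;
} SketchLogAttr;

/* Oversimplified default rule (assumption): a cursor stop before every
 * non-mark character, plus one at the end.  attrs must hold n + 1 entries. */
static void
sketch_default_break (const int *is_mark, size_t n, SketchLogAttr *attrs)
{
  size_t i;

  for (i = 0; i < n; i++)
    attrs[i].is_cursor_position = !is_mark[i];
  attrs[n].is_cursor_position = 1;
}

/* Engine-style break: run the default rule, then force a cursor stop
 * at the start of the run, as described in the comment above. */
static void
sketch_engine_break (const int *is_mark, size_t n, SketchLogAttr *attrs)
{
  assert (n > 0);
  sketch_default_break (is_mark, n, attrs);
  attrs[0].is_cursor_position = 1;
}

/* Nearest cursor stop strictly before pos (0 if there is none):
 * this is where Backspace or Left would land in this model. */
static size_t
sketch_prev_cursor_position (const SketchLogAttr *attrs, size_t pos)
{
  while (pos > 0)
    {
      pos--;
      if (attrs[pos].is_cursor_position)
        break;
    }
  return pos;
}
```

In this model, breaking "a" plus a vowel sign as one merged run leaves no stop between them, so moving back from the end lands at offset 0. Breaking "a" with the default rule and the vowel sign with the engine-style function, each writing at its own offset into a shared attrs array, leaves a stop at offset 1, which matches the expected results in the description.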

Comment 7 Behdad Esfahbod 2006-10-12 18:48:24 UTC

Should be fixed with what I committed for bug 353877 (the Indic lang engine)

Comment 8 Pravin Satpute 2009-07-14 07:14:55 UTC

The problem is still there.

It does not happen if we enter, say,
a + U+0915 or any other consonant,

but it does happen for dependent vowel signs such as U+093F:
aि  aी
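
For reference, the distinction between the two cases can be sketched as a code point check. This is a partial classification under stated assumptions: it covers only the main Devanagari consonant range U+0915..U+0939 and the main dependent vowel sign range U+093E..U+094C, and sketch_devanagari_class is a hypothetical helper, not something Pango provides.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
  SKETCH_OTHER,
  SKETCH_CONSONANT,
  SKETCH_VOWEL_SIGN
} SketchIndicClass;

/* Partial classification of a Unicode code point (assumption: only the
 * main Devanagari consonant and dependent vowel sign ranges are handled;
 * everything else, including Latin letters, is SKETCH_OTHER). */
static SketchIndicClass
sketch_devanagari_class (uint32_t cp)
{
  assert (cp <= 0x10FFFF);

  if (cp >= 0x0915 && cp <= 0x0939)
    return SKETCH_CONSONANT;   /* KA .. HA */
  if (cp >= 0x093E && cp <= 0x094C)
    return SKETCH_VOWEL_SIGN;  /* vowel sign AA .. vowel sign AU */
  return SKETCH_OTHER;
}
```

Under this sketch U+0915 is a consonant, which starts a cluster of its own, while U+093F, U+0940 and U+094B are vowel signs, which normally attach to a preceding consonant. The sequences reported here have a Latin letter in that position instead.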

Comment 9 Pravin Satpute 2009-07-15 10:52:31 UTC

Looking closer at it:
1) Things work properly for all valid combinations, such as

a + Indic consonant (क, ख, etc.)

The problem happens only for
a + Indic matras as mentioned above,
for example aि, aी, etc.
But this is an invalid combination, and I have never seen it used.

A possible combination is

aकि, where things work properly.

Also, while debugging with a GTK text entry box, even the invalid case works properly there, so this might be a gedit problem.

IMHO we can close this bug as not a bug, since this combination never occurs in practical use.

What do you say, Behdad?