After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 451682 - Cursor positioning for Sinhala is broken
Cursor positioning for Sinhala is broken
Status: RESOLVED FIXED
Product: pango
Classification: Platform
Component: indic
1.16.x
Other Linux
: Normal normal
: ---
Assigned To: Pango Indic
pango-maint
Depends on:
Blocks:
 
 
Reported: 2007-06-27 18:05 UTC by Harshula Jayasuriya
Modified: 2007-08-02 16:17 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch to correct Sinhala cursor positioning (3.70 KB, patch)
2007-06-27 18:09 UTC, Harshula Jayasuriya
none Details | Review
Testcase (124 bytes, text/plain)
2007-06-27 18:12 UTC, Harshula Jayasuriya
  Details
Updated patch to correct Sinhala cursor positioning (4.66 KB, patch)
2007-07-31 17:36 UTC, Harshula Jayasuriya
none Details | Review
committed patch (4.95 KB, patch)
2007-07-31 18:59 UTC, Behdad Esfahbod
committed Details | Review

Description Harshula Jayasuriya 2007-06-27 18:05:53 UTC
Hi,

The code added to support cursor positioning for Sinhala in Bug 353877 is
incorrect.

I'll attached a patch containing:

* BUGFIX: Correct the broken Repaya handling code.
* BUGFIX: Consonant clusters do NOT result in implicit conjuncts in
  SINHALA orthography.
* The cursor should treat as a single glyph:
    SINHALA CONS + 0x0DCA + 0x200D + SINHALA CONS
    SINHALA CONS + 0x200D + 0x0DCA + SINHALA CONS

Originally reported at:
http://mail.gnome.org/archives/gtk-i18n-list/2007-June/msg00031.html

Regards,
Harshula
Comment 1 Harshula Jayasuriya 2007-06-27 18:09:22 UTC
Created attachment 90768 [details] [review]
Patch to correct Sinhala cursor positioning

* BUGFIX: Correct the broken Repaya handling code.
* BUGFIX: Consonant clusters do NOT result in implicit conjuncts in
          SINHALA orthography.
* The cursor should treat as a single glyph:
    SINHALA CONS + 0x0DCA + 0x200D + SINHALA CONS
    SINHALA CONS + 0x200D + 0x0DCA + SINHALA CONS
Comment 2 Harshula Jayasuriya 2007-06-27 18:12:18 UTC
Created attachment 90769 [details]
Testcase
Comment 3 Behdad Esfahbod 2007-07-26 17:25:01 UTC
Why do you need to condition on being Sinhala?  Isn't the character checks enough?

And even if you need to check, isn't checking analysis.script == PANGO_SCRIPT_SINHALA easier?

While at it, please first submit a patch to move this repeated four lines into a function:

                  attrs[i].is_cursor_position = FALSE;
                  attrs[i].is_char_break = FALSE;
                  attrs[i].is_line_break = FALSE;
                  attrs[i].is_mandatory_break = FALSE;

In fact, they probably should just modify is_cursor_position and leave the rest as is.
Comment 4 Harshula Jayasuriya 2007-07-27 18:51:50 UTC
> ------- Comment #3 from Behdad Esfahbod  2007-07-26 17:25 UTC -------
> Why do you need to condition on being Sinhala?  Isn't the character checks
> enough?

Unfortunately not, Sinhala orthography is quite different to North Indian 
orthographies. The problem is 0x200D is used by all the scripts and it is not 
possible to control the behaviour of the Sinhala script without 
changing/breaking the behaviour of the Indic scripts.

This should also avoid regressions in the Sinhala script when someone makes 
changes to the behaviour of the Indic scripts. That's the biggest problem at 
the moment.

> And even if you need to check, isn't checking analysis.script ==
> PANGO_SCRIPT_SINHALA easier?

Can you do that? PangoAnalysis (pango/pango-item.h) looks like:

struct _PangoAnalysis
{
  PangoEngineShape *shape_engine;
  PangoEngineLang  *lang_engine;
  PangoFont *font;

  guint8 level;
  guint8 gravity; /* PangoGravity */
  guint8 flags;

  PangoLanguage *language;
  GSList *extra_attrs;
};

If there's a better way, I'm more than willing to change it.

> While at it, please first submit a patch to move this repeated four lines into
> a function:
>
>                   attrs[i].is_cursor_position = FALSE;
>                   attrs[i].is_char_break = FALSE;
>                   attrs[i].is_line_break = FALSE;
>                   attrs[i].is_mandatory_break = FALSE;

I followed the conventions already used by the code you have checked in. There's a lot of existing code that does exactly what you describe above. Are you asking me to clean up up the file in general?

> In fact, they probably should just modify is_cursor_position and leave the
> rest as is.

What's your preference? I can create a macro which sets all 4 flags or if you are sure it is safe to delete all but the is_cursor_position flag, I can do that.

cya,
#
Comment 5 Behdad Esfahbod 2007-07-27 21:24:26 UTC
(In reply to comment #4)
> > ------- Comment #3 from Behdad Esfahbod  2007-07-26 17:25 UTC -------
> > Why do you need to condition on being Sinhala?  Isn't the character checks
> > enough?
> 
> Unfortunately not, Sinhala orthography is quite different to North Indian 
> orthographies. The problem is 0x200D is used by all the scripts and it is not 
> possible to control the behaviour of the Sinhala script without 
> changing/breaking the behaviour of the Indic scripts.
>
> This should also avoid regressions in the Sinhala script when someone makes 
> changes to the behaviour of the Indic scripts. That's the biggest problem at 
> the moment.
> 
> > And even if you need to check, isn't checking analysis.script ==
> > PANGO_SCRIPT_SINHALA easier?
> 
> Can you do that? PangoAnalysis (pango/pango-item.h) looks like:
> 
> struct _PangoAnalysis
> {
>   PangoEngineShape *shape_engine;
>   PangoEngineLang  *lang_engine;
>   PangoFont *font;
> 
>   guint8 level;
>   guint8 gravity; /* PangoGravity */
>   guint8 flags;
> 
>   PangoLanguage *language;
>   GSList *extra_attrs;
> };
> 
> If there's a better way, I'm more than willing to change it.

Your Pango is too old.  Master has:

struct _PangoAnalysis
{
  PangoEngineShape *shape_engine;
  PangoEngineLang  *lang_engine;
  PangoFont *font;

  guint8 level;
  guint8 gravity; /* PangoGravity */
  guint8 flags;

  guint8 script; /* PangoScript */
  PangoLanguage *language;
  
  GSList *extra_attrs;
};



 
> > While at it, please first submit a patch to move this repeated four lines into
> > a function:
> >
> >                   attrs[i].is_cursor_position = FALSE;
> >                   attrs[i].is_char_break = FALSE;
> >                   attrs[i].is_line_break = FALSE;
> >                   attrs[i].is_mandatory_break = FALSE;
> 
> I followed the conventions already used by the code you have checked in.
> There's a lot of existing code that does exactly what you describe above. Are
> you asking me to clean up up the file in general?
> 
> > In fact, they probably should just modify is_cursor_position and leave the
> > rest as is.
> 
> What's your preference? I can create a macro which sets all 4 flags or if you
> are sure it is safe to delete all but the is_cursor_position flag, I can do
> that.

Yes, remove it all and just leave the is_cursor_position lines.

> cya,
> #

Comment 6 Behdad Esfahbod 2007-07-27 21:30:28 UTC
> > > While at it, please first submit a patch to move this repeated four lines into
> > > a function:
> > >
> > >                   attrs[i].is_cursor_position = FALSE;
> > >                   attrs[i].is_char_break = FALSE;
> > >                   attrs[i].is_line_break = FALSE;
> > >                   attrs[i].is_mandatory_break = FALSE;
> > 
> > I followed the conventions already used by the code you have checked in.
> > There's a lot of existing code that does exactly what you describe above. Are
> > you asking me to clean up up the file in general?
> > 
> > > In fact, they probably should just modify is_cursor_position and leave the
> > > rest as is.
> > 
> > What's your preference? I can create a macro which sets all 4 flags or if you
> > are sure it is safe to delete all but the is_cursor_position flag, I can do
> > that.
> 
> Yes, remove it all and just leave the is_cursor_position lines.

Actually reading the code, it's easier to just set all four to false.  Just move it to a function and should be fine.
Comment 7 Harshula Jayasuriya 2007-07-31 17:36:56 UTC
Created attachment 92809 [details] [review]
Updated patch to correct Sinhala cursor positioning

* Create a new function bind_chars to set the flags of a PangoLogAttr.
* Port patch from 16.4 to 17.5
* Use the new field 'script' in PangoLogAttr.
Comment 8 Behdad Esfahbod 2007-07-31 18:59:47 UTC
Created attachment 92812 [details] [review]
committed patch

I'm not sure that you are correctly handling is_conjunct.  To me, you should be setting it to FALSE in more places. (in the else to the entire block maybe?)
Comment 9 Behdad Esfahbod 2007-07-31 19:02:53 UTC
2007-07-31  Behdad Esfahbod  <behdad@gnome.org>

        Bug 451682 – Cursor positioning for Sinhala is broken
        Based on patch from Harshula

        * modules/indic/indic-lang.c (not_cursor_position),
        (indic_engine_break):
        Clean up cursor position stuff.

Comment 10 Harshula Jayasuriya 2007-08-01 15:34:29 UTC
------- Comment #8 from Behdad Esfahbod  2007-07-31 18:59 UTC -------
> Created an attachment (id=92812)
>  --> (http://bugzilla.gnome.org/attachment.cgi?id=92812&action=view)
> committed patch

The purpose of my patch was to:

/*
 * Consonant clusters do NOT result in implicit conjuncts
 * in SINHALA orthography.
 */
else if (!is_conjunct && prev_wc == 0x0DCA && this_wc != 0x200D)
{
  attrs[i].is_cursor_position = TRUE;
}

Note that I'm setting is_cursor_position to *TRUE*. The modified patch you 
committed sets it to *FALSE*.

> I'm not sure that you are correctly handling is_conjunct.  To me, you should be
> setting it to FALSE in more places. (in the else to the entire block maybe?)

For valid sequences of chars the handling of is_conjunct is correct. For invalid
sequences of chars the cursor positioning is undefined and is the least of our
problems. i.e. The rendering will be a mess.

BTW, your changes to the non-Sinhala Indic cursor positioning code was brave! ;-)

cya,
#
Comment 11 Behdad Esfahbod 2007-08-01 18:34:07 UTC
(In reply to comment #10)
> ------- Comment #8 from Behdad Esfahbod  2007-07-31 18:59 UTC -------
> > Created an attachment (id=92812) [edit]
> >  --> (http://bugzilla.gnome.org/attachment.cgi?id=92812&action=view)
> > committed patch
> 
> The purpose of my patch was to:
> 
> /*
>  * Consonant clusters do NOT result in implicit conjuncts
>  * in SINHALA orthography.
>  */
> else if (!is_conjunct && prev_wc == 0x0DCA && this_wc != 0x200D)
> {
>   attrs[i].is_cursor_position = TRUE;
> }
> 
> Note that I'm setting is_cursor_position to *TRUE*. The modified patch you 
> committed sets it to *FALSE*.

That slipped my eyes.  Still, I'm not very happy about that.  Is default rules setting cursor position to FALSE that you need to set it to TRUE here?  If not, you better just leave it as is.  I don't mind the change though.  Committing the fix. 
Comment 12 Behdad Esfahbod 2007-08-01 18:35:46 UTC
2007-08-01  Behdad Esfahbod  <behdad@gnome.org>

        Bug 451682 – Cursor positioning for Sinhala is broken
        Based on patch from Harshula

        * modules/indic/indic-lang.c (indic_engine_break): Fix bug that
        I introduced in the patch.

Comment 13 Behdad Esfahbod 2007-08-01 18:36:11 UTC
Closing again.
Comment 14 Harshula Jayasuriya 2007-08-02 16:17:53 UTC
------- Comment #11 from Behdad Esfahbod  2007-08-01 18:34 UTC -------
> Is default rules setting cursor position to FALSE that you need to set  it 
> to TRUE here?

Yep, that's right. The orthography rules for Sinhala are very different to the North Indian scripts. Indic support tends to be North Indian script centric.

cya,
#