Bug 150883 - Unicode LRO defect
Status: RESOLVED FIXED
Product: pango
Classification: Platform
Component: general
Version: 1.4.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: Medium fix
Assigned To: pango-maint
Depends on:
Blocks:
 
 
Reported: 2004-08-23 20:34 UTC by Felipe Heidrich
Modified: 2007-09-29 03:15 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
snippet (1.37 KB, text/plain)
2004-08-23 20:36 UTC, Felipe Heidrich
bad shaping (5.96 KB, image/png)
2004-08-23 20:37 UTC, Felipe Heidrich
good shaping (11.41 KB, image/png)
2004-08-23 20:40 UTC, Felipe Heidrich

Description Felipe Heidrich 2004-08-23 20:34:39 UTC
When LRO (LEFT-TO-RIGHT OVERRIDE) is used, PangoLayout shaping fails.
Compile and run the snippet and notice that the shaping is wrong.

The Unicode sequence is:
\u202d\u0637\u0627\u0644\u0628\u0020\u0633\u0644\u0627\u0645\u0020\u0645\u062d\u0645\u062f
Where \u202d == LRO
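
For anyone who wants to reproduce this without the attachment, here is a minimal sketch that turns the code points above into a UTF-8 string. It is not the attached snippet; the names lro_sample, utf8_encode and utf8_encode_all are made up for illustration, and the encoder does not reject surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Code points from the report: LRO followed by three Arabic words. */
static const uint32_t lro_sample[] = {
    0x202D,                         /* LRO, LEFT-TO-RIGHT OVERRIDE */
    0x0637, 0x0627, 0x0644, 0x0628, /* first word */
    0x0020,
    0x0633, 0x0644, 0x0627, 0x0645, /* second word */
    0x0020,
    0x0645, 0x062D, 0x0645, 0x062F  /* third word */
};
static const size_t lro_sample_len = sizeof lro_sample / sizeof lro_sample[0];

/* Encode one code point as UTF-8; returns the number of bytes written (1-4). */
static size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}

/* Encode n code points into out (capacity cap bytes) and NUL-terminate.
 * Returns the byte length excluding the NUL, or 0 if the buffer is too small. */
static size_t utf8_encode_all(const uint32_t *cps, size_t n, char *out, size_t cap)
{
    size_t len = 0;
    if (cap == 0)
        return 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char tmp[4];
        size_t k = utf8_encode(cps[i], tmp);
        if (len + k + 1 > cap)
            return 0;
        for (size_t j = 0; j < k; j++)
            out[len + j] = (char)tmp[j];
        len += k;
    }
    out[len] = '\0';
    return len;
}
```

The resulting NUL-terminated buffer is the kind of string one would hand to pango_layout_set_text() to see the problem.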
Comment 1 Felipe Heidrich 2004-08-23 20:36:02 UTC
Created attachment 30873 [details]
snippet

Snippet

Related to https://bugs.eclipse.org/bugs/show_bug.cgi?id=72413
Comment 2 Felipe Heidrich 2004-08-23 20:37:18 UTC
Created attachment 30874 [details]
bad shaping
Comment 3 Felipe Heidrich 2004-08-23 20:40:08 UTC
Created attachment 30875 [details]
good shaping
Comment 4 Felipe Heidrich 2004-08-23 20:42:33 UTC
I used the same font (Tahoma) in both screenshots so it is easier for us
(English speakers) to see the differences.
Comment 5 Felipe Heidrich 2004-08-23 20:48:38 UTC
Related to https://bugs.eclipse.org/bugs/show_bug.cgi?id=72413
Comment 6 Owen Taylor 2004-08-24 14:04:31 UTC
Unicode standard reference:

http://www.unicode.org/reports/tr9/#Shaping

To get the behavior specified there, I think the right approach is to,
in arabic_engine_shape(), if the direction of the run is LTR, just
reverse the characters in it before feeding it to the rest of the
shaping process. Then you'd have to fix up the logical clusters 
afterwards. (You might want to use character offsets not indices
as the clusters input to pango_ot_buffer_add_glyph() and then map
back to character indices later... might be easier.)

What won't work, and would be very hard to get working within
the Pango framework, is to have shaping across multiple directional
runs.
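
A rough sketch of the reversal idea, written standalone rather than against the real arabic_engine_shape() code: reverse the characters of an LTR run while remembering each one's original character offset as its cluster, then map those offsets back to UTF-8 byte indices afterwards. ShapeItem, reverse_run and clusters_to_byte_indices are hypothetical names, and the actual shaping step that would sit between the two calls is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shaper item: a character plus the cluster it belongs to. */
typedef struct {
    uint32_t ch;
    int cluster; /* character offset first, byte index after fix-up */
} ShapeItem;

/* Number of UTF-8 bytes needed for a code point. */
static int utf8_len(uint32_t cp)
{
    return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

/* Fill items with the run's characters in reverse order. Each item keeps
 * the character offset it had in the original (logical) order. */
static void reverse_run(const uint32_t *chars, int n, ShapeItem *items)
{
    for (int i = 0; i < n; i++) {
        items[i].ch = chars[n - 1 - i];
        items[i].cluster = n - 1 - i;
    }
}

/* Byte index of the character at the given offset in the original run. */
static int byte_index_of(const uint32_t *chars, int n, int offset)
{
    int pos = 0;
    assert(offset >= 0 && offset < n);
    for (int i = 0; i < offset; i++)
        pos += utf8_len(chars[i]);
    return pos;
}

/* After shaping, turn character-offset clusters into byte-index clusters. */
static void clusters_to_byte_indices(const uint32_t *chars, int n,
                                     ShapeItem *items, int n_items)
{
    for (int i = 0; i < n_items; i++)
        items[i].cluster = byte_index_of(chars, n, items[i].cluster);
}
```

Working in character offsets keeps the reversal trivial; the byte indices only need to be computed once, at the end.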
Comment 7 Behdad Esfahbod 2004-08-25 00:29:08 UTC
I'm supposed to fix this.  Fortunately the consensus is to limit shaping to
directional runs, but it is still problematic, since ZWJ needs special handling.
In normal day-to-day text you may have a ZWJ adjacent to Arabic text, where the
ZWJ gets an even embedding level and the Arabic text gets an odd one.  Unicode
says that ZWJ and ZWNJ should affect the adjacent base letters, no matter what
the embedding level is. Any ideas?
Comment 8 Behdad Esfahbod 2004-08-25 00:37:35 UTC
To be exact, ZWJ (and other BN chars) are removed in the bidi process.  What
happens right now is that FriBidi assigns some sane embedding level values to
these chars, and that is what we get.  Since they are Boundary Neutral
characters, by definition, any assignment of an embedding level can break the
adjacency to either the previous or the next char.  In the current FriBidi CVS
code, I do shaping on the whole paragraph in one pass, using the embedding
levels as input.  Do you think there's a way to do this in Pango in the near
future?
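
To make the adjacency problem concrete, here is a small sketch that splits text into runs of equal embedding level and reports which run a position falls in. The level values in the example are made up for illustration and are not FriBidi output; run_of is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the level run containing position pos, where a run is a maximal
 * stretch of characters sharing the same embedding level. */
static int run_of(const uint8_t *levels, int n, int pos)
{
    int run = 0;
    assert(pos >= 0 && pos < n);
    for (int i = 1; i <= pos; i++) {
        if (levels[i] != levels[i - 1])
            run++;
    }
    return run;
}
```

For a text like "Latin letter, ZWJ, Arabic, Arabic", levels {0, 0, 1, 1} put the ZWJ in a different run from the Arabic letter it should affect, while {0, 1, 1, 1} keep them together. Either assignment is legal for a BN character, which is why per-run shaping can lose the ZWJ.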

[I get all reports on pango, no need to CC, just that I don't have internet at
home this week :(]
Comment 9 Owen Taylor 2004-09-22 20:06:50 UTC
I don't think shaping across directional levels is in the future for Pango;
it would require a major change in the shaping pipeline.
And, after all, you might have:

 Indic text ZWJ Arabic text

You can't pass that all to the Arabic shaper! What might be possible
to do is have an extended version of the script_shape() virtual function
that tags a "flags" argument with a ZWJ_BEFORE flag, or something like
that. (This might be useful for dealing with special behavior at the end
of lines like hanging punctuation as well.)
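
A sketch of what computing such a flags argument could look like. None of this is existing Pango API: ShapeFlags, SHAPE_FLAG_ZWJ_BEFORE and shape_flags_for_run are hypothetical names, and the ZWJ_AFTER counterpart is an assumption added for symmetry.

```c
#include <assert.h>
#include <stdint.h>

#define CH_ZWJ 0x200D /* ZERO WIDTH JOINER */

/* Hypothetical flags handed to a script shaper along with its run. */
typedef enum {
    SHAPE_FLAG_NONE       = 0,
    SHAPE_FLAG_ZWJ_BEFORE = 1 << 0, /* ZWJ immediately precedes the run */
    SHAPE_FLAG_ZWJ_AFTER  = 1 << 1  /* assumed counterpart: ZWJ follows the run */
} ShapeFlags;

/* Compute the flags for the run [start, end) of a text of n characters by
 * looking at the character just outside each edge of the run. */
static unsigned shape_flags_for_run(const uint32_t *text, int n,
                                    int start, int end)
{
    unsigned flags = SHAPE_FLAG_NONE;
    assert(start >= 0 && start < end && end <= n);
    if (start > 0 && text[start - 1] == CH_ZWJ)
        flags |= SHAPE_FLAG_ZWJ_BEFORE;
    if (end < n && text[end] == CH_ZWJ)
        flags |= SHAPE_FLAG_ZWJ_AFTER;
    return flags;
}
```

With "Indic letter, ZWJ, Arabic, Arabic", the Arabic run would be shaped with SHAPE_FLAG_ZWJ_BEFORE set, so the Arabic shaper can pick the joining form without ever seeing the Indic text.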

Comment 10 Felipe Heidrich 2007-07-16 19:54:36 UTC
I started testing Bidi on the Mac and ATSUI got this right.
Behdad/Owen, no progress here?
Comment 11 Behdad Esfahbod 2007-07-24 23:11:10 UTC
2007-07-24  Behdad Esfahbod  <behdad@gnome.org>

        Bug 150883 – Unicode LRO defect

        * modules/arabic/arabic-fc.c (arabic_engine_shape):
        * modules/arabic/arabic-ot.c (Get_Joining_Class),
        (Arabic_Assign_Properties):
        * modules/arabic/arabic-ot.h:
        Correctly handle Arabic shaping in left-to-right runs.

Comment 12 Felipe Heidrich 2007-07-25 15:24:42 UTC
Thanks Behdad
Comment 13 LingNing Zhang 2007-09-29 03:15:31 UTC
Behdad, could you please close the same bug in the Red Hat Bugzilla?
https://bugzilla.redhat.com/show_bug.cgi?id=185490
Thank you, :)