GNOME Bugzilla – Bug 306639
Zero width characters in arabic shaper
Last modified: 2005-07-23 19:25:26 UTC
Version details: also 1.6.x/2.8
Distribution/Version: Fedora Core 3 and 3.92
Download this file and open it in gedit:
http://www.bamdad.org/~behnam/persian/gedit/gedit-bidi-test-01.txt
You can see the [200C] and [200D] characters in the text area. I don't know exactly when this bug occurs yet.
They are not [200C] and [200D], but [202C], [202D], and [202E], i.e. PDF, LRO, and RLO.
Oops, you're right, behdad. I also remember sometimes seeing [202A] and [202B], meaning LRE and RLE. They're not always shown on my FC-3 or FC-3.92. But when testing BidiAssist on FreeBSD with GNOME-2.10/GTK-2.6, we saw all of the bidi control characters that we inserted.
related: http://bugzilla.gnome.org/show_bug.cgi?id=150883
Compare checks for zero-width characters in arabic_engine_shape() with ZERO_WIDTH_CHAR() in modules/basic/basic-common.h. The ZERO_WIDTH_CHAR() macro is already duplicated between the basic and indic modules, so it probably needs to be moved into the Pango API in some form, probably just as a convenience function. But that's not necessary for a basic fix.
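For readers following along, the kind of check under discussion can be sketched as one shared predicate. This is only an illustration under stated assumptions, not the code from the attached patches: the name `sketch_is_zero_width` is hypothetical, and the set of code points is an assumption chosen to cover the bidi controls reported above (U+202A to U+202E), the zero-width space, joiners and direction marks (U+200B to U+200F), and U+FEFF.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, NOT the Pango function.
 * Returns true for code points that should produce no visible glyph.
 * The exact set below is an assumption made for illustration. */
static bool
sketch_is_zero_width (uint32_t ch)
{
  return (ch >= 0x200B && ch <= 0x200F)   /* ZWSP, ZWNJ, ZWJ, LRM, RLM */
      || (ch >= 0x202A && ch <= 0x202E)   /* LRE, RLE, PDF, LRO, RLO */
      || ch == 0xFEFF;                    /* ZERO WIDTH NO-BREAK SPACE */
}
```

Keeping the test in one place means a shaper that forgets a range (as the Arabic one apparently did for the bidi controls) cannot drift out of sync with the others.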
Owen, don't you think these checks should be moved out of the individual shapers altogether? The same goes for mirroring: that code is duplicated in several modules, while it should be performed no matter which shaper is chosen.
Created attachment 47739
Patch that defines pango_is_zero_width for internal use

The attached patch defines pango_is_zero_width in pango-utils.[ch], and all shapers (basic, Indic, Arabic) use it.
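As a rough sketch of what "all shapers use it" amounts to, a shaper's per-character loop can consult the shared predicate and emit an empty glyph instead of looking the character up in the font. Everything here is hypothetical and self-contained: `sketch_is_zero_width`, `sketch_lookup_glyph`, `sketch_shape`, and `SKETCH_EMPTY_GLYPH` are illustrative names, the code-point set is an assumption, and none of this reflects Pango's real glyph types or the contents of the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical glyph id meaning "draw nothing". */
#define SKETCH_EMPTY_GLYPH 0u

/* Hypothetical shared predicate; the code-point set is an assumption. */
static bool
sketch_is_zero_width (uint32_t ch)
{
  return (ch >= 0x200B && ch <= 0x200F)
      || (ch >= 0x202A && ch <= 0x202E)
      || ch == 0xFEFF;
}

/* Placeholder for a font lookup: pretends the glyph id equals the code point. */
static uint32_t
sketch_lookup_glyph (uint32_t ch)
{
  return ch;
}

/* Fills glyphs[0..n-1]; zero-width characters get the empty glyph
 * so they never reach the font and never show up as hex boxes. */
static void
sketch_shape (const uint32_t *text, size_t n, uint32_t *glyphs)
{
  for (size_t i = 0; i < n; i++)
    glyphs[i] = sketch_is_zero_width (text[i])
                  ? SKETCH_EMPTY_GLYPH
                  : sketch_lookup_glyph (text[i]);
}
```

With a single predicate, each module's loop reduces to the same one-line test, which is the duplication the earlier comments wanted to remove.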
Behnam, can you test the patch please?
Created attachment 48175
Attached patch defines pango_is_zero_width in Syriac shaper
Created attachment 49639
Patch with documentation

Attached is the patch that I'm going to apply. Added documentation and an optimization.
Committed to HEAD:

2005-07-23  Behdad Esfahbod  <pango@behdad.org>

	* pango/pango-utils.c, pango/pango-utils.h (pango_is_zerowidth):
	New function added.

	* modules/basic/basic-common.h, modules/basic/basic-fc.c,
	modules/basic/basic-win32.c, modules/basic/basic-x.c,
	modules/hangul/hangul-fc.c, modules/arabic/arabic-fc.c,
	modules/indic/indic-fc.c, modules/indic/indic-ot.h,
	modules/syriac/syriac-fc.c: Use the new pango_is_zerowidth function.
	(#306639, Behnam Esfahbod)