Bug 472657 - Renders U+200B (zero width space) visibly under certain conditions
Status: RESOLVED FIXED
Product: pango
Classification: Platform
Component: general
Version: 1.16.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: pango-maint
QA Contact: pango-maint
Depends on:
Blocks:
 
 
Reported: 2007-09-01 21:06 UTC by Sven Arvidsson
Modified: 2012-08-18 17:46 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Sven Arvidsson 2007-09-01 21:06:34 UTC
[ From http://bugs.debian.org/439767 by Rich Felker ]

"After upgrading my system, the latest Pango renders U+200B (zero-width
space) visibly under certain conditions, as a "missing glyph" box
containing the hex value. In particular, Pango seems to be looking
for a glyph for this character matching the current language/script.
In my case, Tibetan characters adjacent to U+200B cause the
misrendering to happen.

I first observed the problem on a Google search (in Iceweasel) but
have since been able to reproduce it in GTK+ text widgets by first
typing U+200B (using an input method) then moving the cursor before
the U+200B character and typing any Tibetan character. Thus, I am
fairly confident that the bug is in Pango itself and not GTK+ or
Iceweasel.

I suspect the Tibetan fonts I am using lack a glyph for U+200B, but
Pango should not be insisting on trying to find a "Tibetan version" of
this character. It probably shouldn't even look for glyphs at all, but
instead always treat it as a zero-width character with no visible
glyph... but if it is going to use a glyph it should grab one from any
available font.

Screenshot of the issue:
http://www.aerifal.cx/~dalias/images/200b.png

I'm using Monlam Uni Ochan1, from lobsangmonlam.org. Indeed, the
problem goes away if I remove it so that Jomolhari is used. But in the
past, the problem didn't exist even though I was using Monlam's font.
And, opening up both fonts in FontForge, I see that Jomolhari has
blank glyphs for all the various spaces while Monlam's fonts lack
them.

Still, I think the old behavior was correct. Space characters
(especially zero-width whose "glyph" is 100% defined by Unicode and
cannot vary) should never appear as [200B] etc. replacement glyphs
regardless of what's present or missing in the font, and especially
when the font wasn't even selected manually but rather used as a
fallback for scripts not in the selected font."
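
The behavior requested above can be illustrated with a minimal sketch. This is not Pango's actual code: the names ZERO_WIDTH, render_plan and font_coverage are hypothetical, the set of zero-width code points is illustrative rather than exhaustive, and font coverage is modeled as a plain set of code points. The idea is simply that zero-width characters are handled before any font lookup, so a font without a U+200B glyph can never produce a hex box for it.

```python
# Illustrative (not exhaustive) set of zero-width code points.
ZERO_WIDTH = {
    0x200B,  # ZERO WIDTH SPACE
    0x200C,  # ZERO WIDTH NON-JOINER
    0x200D,  # ZERO WIDTH JOINER
    0x2060,  # WORD JOINER
    0xFEFF,  # ZERO WIDTH NO-BREAK SPACE
}


def render_plan(text, font_coverage):
    """Decide, per character, what a renderer would draw.

    font_coverage is a hypothetical set of code points for which the
    chosen font has a glyph. Returns a list of (code point, action)
    pairs, where action is "skip", "glyph" or "hexbox".
    """
    plan = []
    for ch in text:
        cp = ord(ch)
        if cp in ZERO_WIDTH:
            # Checked first: the font is never consulted for these.
            plan.append((cp, "skip"))
        elif cp in font_coverage:
            plan.append((cp, "glyph"))
        else:
            # Genuinely missing glyph: show the hex-value box.
            plan.append((cp, "hexbox"))
    return plan


# A font that covers TIBETAN LETTER KA (U+0F40) but has no U+200B glyph,
# mirroring the Monlam case described above.
tibetan_only = {0x0F40}
plan = render_plan("\u0f40\u200b", tibetan_only)
```

With this ordering, the U+200B next to the Tibetan letter is skipped even though the font lacks a glyph for it, while a character that is really missing from the font still gets the hex box.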
Comment 1 Behdad Esfahbod 2007-09-07 18:42:13 UTC
Do you also see it in gedit?
Comment 2 Sven Arvidsson 2007-09-08 20:47:11 UTC
This is the response I got:

"Yes, see screenshot attached. Typing any Latin character between the
Tibetan text and the U+200B makes the visible glyph vanish.

The Monlam Uni Ochan1 font I'm using can be obtained from the Monlam
Bod-yig v2 zip file on my Tibetan fonts site,
http://www.aerifal.cx/~dalias/bodyig/fonts/

(The alternative is the upstream site that distributes it only as a
Windows .exe file, www.lobsangmonlam.org.)

My site has it listed under legacy fonts, but this is just because it
has glyphs on top of other script ranges that do not belong to
Tibetan, as well as other non-Unicode fonts packaged with it; the font
in question here does have valid OpenType tables and works fine for
displaying Unicode-encoded Tibetan on both Linux/Pango and Windows."

Screenshot:
http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=22;filename=gedit_pango_tib_bug.png;att=1;bug=439767
Comment 3 Behdad Esfahbod 2012-08-18 17:46:39 UTC
We've merged the HarfBuzz branch now, which fixes this.