Bug 153546 - arabic shaping is not done when glyphs come from different fonts
Status: RESOLVED WONTFIX
Product: pango
Classification: Platform
Component: general
Version: 1.2.x
Hardware: Other Linux
Importance: Normal major
Target Milestone: ---
Assigned To: pango-maint
QA Contact: pango-maint
Depends on:
Blocks:
 
 
Reported: 2004-09-23 14:02 UTC by Pablo Saratxaga
Modified: 2005-10-13 09:49 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
An image showing the same word rendered with pango and yudit using two different fonts, to illustrate the problem (13.09 KB, image/png)
2004-09-23 14:04 UTC, Pablo Saratxaga

Description Pablo Saratxaga 2004-09-23 14:02:24 UTC
When contiguous letters are displayed with glyphs from different fonts, they are
incorrectly shaped.
Tested in pango 1.2.5. I haven't looked at more recent CVS versions, but there is
nothing in the ChangeLogs that says it has been fixed, so most likely the problem
is still there.
The internal cause of the problem is probably also somewhat related to bug #83058.

For example, the word ئۇيغۇرچە (Uyghurche: 064a, 0654, 06c7, 064a, 063a, 06c7,
0631, 0686, 06d5) is displayed in the attached image.

The first rendering is done with pango, using the "Sans" pseudo-font, which takes
its glyphs primarily from the "Arial" font; the Arabic letter U (06c7) is missing
from Arial and is taken from the "Code2000" font (the only font I have with that
letter). Note how the boundaries between 06c7 and the preceding letter (wrongly)
become non-joining boundaries.
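
A minimal Python sketch of the per-character font fallback described above. The
coverage sets and helper names are hypothetical and purely illustrative; they are
not real font data and not Pango's API. It only shows how one missing letter
splits a word into separate font runs, which is where the shaping context is lost:

```python
from itertools import groupby

# Hypothetical coverage tables (assumption: illustrative only, not real font data).
COVERAGE = {
    "Arial": {0x064A, 0x0654, 0x063A, 0x0631, 0x0686, 0x06D5},     # no U+06C7
    "Code2000": {0x064A, 0x0654, 0x06C7, 0x063A, 0x0631, 0x0686},  # no U+06D5
}


def pick_font(ch, preference):
    """Return the first font in `preference` that covers `ch`, else None."""
    for font in preference:
        if ord(ch) in COVERAGE[font]:
            return font
    return None


def font_runs(text, preference):
    """Group consecutive characters that resolve to the same font."""
    tagged = ((pick_font(ch, preference), ch) for ch in text)
    return [
        (font, "".join(ch for _, ch in group))
        for font, group in groupby(tagged, key=lambda pair: pair[0])
    ]


if __name__ == "__main__":
    # GHAIN, U, REH with an Arial-first preference, as in the "Sans" case.
    for font, run in font_runs("\u063a\u06c7\u0631", ["Arial", "Code2000"]):
        print(font, [f"{ord(c):04x}" for c in run])
```

If each run is then shaped on its own, the letter at a run boundary has no
neighbour to join to, which matches the broken joins in the first rendering.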

The second rendering is done with pango and the "Code2000" font. The last letter
(06d5) is missing from Code2000, so we again see the same unshaping between the
last two letters. The rendering of the first letter is wrong too, and that may
be a bug in pango (the hamza above (0654) should be ignored in shaping, and
pango should look at the following letter (here 06c7) to decide whether the first
one should connect or not). But note how this time the letter 06c7 is properly
shaped as joining on its right, and how the letter 063a properly joins on
its left too, contrary to the first rendering.

The third rendering was done with the yudit editor, with a font selection similar
to the "Sans" one used by pango; that is how pango should render Arabic script
with glyphs from multiple fonts.
Note how all glyphs are properly shaped.
Granted, their joining is not perfect, as they come from different fonts where the
joining line is at different heights; but it is still much better than the
pango rendering on line 1. That is how the first line should look.

The last rendering is done with yudit, with a font selection similar to the
"Code2000" one used by pango (that is, Code2000 for all glyphs but the last one).
Note how all letters are properly shaped: the hamza above between the first and
third letters causes no problem, and neither does the last glyph coming from a
different font. That is how the second rendering should look.
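
The behaviour expected in the last two renderings can be sketched in Python,
assuming joining forms are resolved from Unicode joining types over the whole
string before any font is chosen. The table and function names are illustrative
and are not Pango or yudit code; U+0654 (HAMZA ABOVE) is used for the hamza, and
the word's code point sequence is an assumption about the intended spelling:

```python
# Joining types for the letters of the example word:
# D = dual-joining, R = right-joining, T = transparent (combining mark).
JOINING_TYPE = {
    0x064A: "D",  # YEH
    0x0654: "T",  # HAMZA ABOVE
    0x06C7: "R",  # U
    0x063A: "D",  # GHAIN
    0x0631: "R",  # REH
    0x0686: "D",  # TCHEH
    0x06D5: "R",  # AE
}

FORM_NAMES = {
    (False, False): "isolated",
    (True, False): "final",
    (False, True): "initial",
    (True, True): "medial",
}


def joining_type(ch):
    """Joining type of `ch`; anything not in the table is treated as non-joining."""
    return JOINING_TYPE.get(ord(ch), "U")


def neighbour(text, i, step):
    """Joining type of the nearest non-transparent character in direction `step`."""
    j = i + step
    while 0 <= j < len(text):
        jt = joining_type(text[j])
        if jt != "T":
            return jt
        j += step
    return "U"


def joining_forms(text):
    """Contextual form per character (None for marks and non-joining characters)."""
    forms = []
    for i, ch in enumerate(text):
        jt = joining_type(ch)
        if jt not in ("D", "R"):
            forms.append(None)
            continue
        # Only a dual-joining letter can connect to the letter that follows it.
        joins_prev = neighbour(text, i, -1) == "D"
        joins_next = jt == "D" and neighbour(text, i, +1) in ("D", "R")
        forms.append(FORM_NAMES[(joins_prev, joins_next)])
    return forms


WORD = "\u064a\u0654\u06c7\u064a\u063a\u06c7\u0631\u0686\u06d5"

if __name__ == "__main__":
    for ch, form in zip(WORD, joining_forms(WORD)):
        print(f"{ord(ch):04x}", form)
```

Resolving the forms on the full word gives 06c7 its joined (final) form and lets
the first letter look past the hamza, whereas calling `joining_forms` on a
one-letter run such as `"\u06c7"` can only ever yield the isolated form.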
Comment 1 Pablo Saratxaga 2004-09-23 14:04:04 UTC
Created attachment 31872
An image showing the same word rendered with pango and yudit using two different fonts, to illustrate the problem
Comment 2 Pablo Saratxaga 2004-09-23 14:06:19 UTC
(too bad that bugzilla doesn't allow UTF-8 input... for pango and i18n problems
it would help to be able to type non-ascii in the bug report)
Comment 3 Owen Taylor 2004-09-23 14:18:38 UTC
Very hard to fix without major restructuring of Pango.

(I've generally seen no problem with UTF-8 in Bugzilla recently.)

Comment 4 Munzir Taha 2004-09-24 10:49:34 UTC
Owen, would you please reconsider this? You can make it low priority or delay
it, but it needs to be fixed one day. Sans fonts would not be useful in such
situations, and they would lose the purpose they were made for, after all.
Comment 5 Owen Taylor 2004-09-24 14:38:41 UTC
To me it's a fairly serious bug or misconfiguration if the font automatically
selected for the Arabic script can't properly display your language. Throwing in a
character from a different font is a poor substitute, even if it is properly
shaped. It's vastly better to use the different font in the first place.

If someone is working in Urdu, the Arabic script font they should get for
Sans should be one with all the necessary Arabic characters for Urdu.

So, I don't see it as a huge problem, and as I said, it's very hard to fix.
I'd like to restrict open bugs to bugs that either:

 - I'm planning to fix

or that:

 - If someone asked, I could tell them how to go about fixing, even if it 
   would be months of work.

If a bug doesn't fit into either of those categories, I think keeping it
open is just clutter.
Comment 6 Behdad Esfahbod 2005-10-13 09:49:59 UTC
I'm actually planning to work on this.  The problem is that most of the text
rendered using Pango is not tagged, so for example when I typically deal with
Persian text, most of the time the font chosen is the best for rendering Arabic,
not Persian...

I see Mozilla doing this, and I know I appreciate it when browsing sites that
(mis)use weird characters they should not be using in Persian (and so our
Persian fonts do not have.)  Anyway, can stay closed until a patch comes up.