Bug 63633 - Use UCD compat decompositions
Status: RESOLVED FIXED
Product: pango
Classification: Platform
Component: general
Version: 0.x
Hardware: Other All
Importance: Normal normal
Target Milestone: Medium feature
Assigned To: pango-maint
QA Contact: pango-maint
Duplicates: 468334 556079 560562 572627 581350 595633
Depends on:
Blocks: 83935 85341 595615 621639
 
 
Reported: 2001-11-02 20:22 UTC by Havoc Pennington
Modified: 2012-08-18 17:07 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Havoc Pennington 2001-11-02 20:22:48 UTC
We should allow people to use ellipses and maybe bullets 
without having to require special fonts.

Ellipses are easy, just shape them to three period glyphs.
Everyone will use three periods anyhow, if we don't support 
ellipses reliably.
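
A minimal sketch of the substitution proposed above, not Pango code: Python's standard `unicodedata` module exposes the Unicode Character Database, and NFKD normalization applies its compatibility decompositions. For U+2026 HORIZONTAL ELLIPSIS, that gives three U+002E FULL STOP characters.

```python
import unicodedata

ellipsis = "\u2026"  # HORIZONTAL ELLIPSIS

# NFKD applies the compatibility decompositions recorded in the UCD.
fallback = unicodedata.normalize("NFKD", ellipsis)

print(fallback)       # prints ...
print(len(fallback))  # prints 3
```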
Comment 1 Owen Taylor 2001-11-19 22:23:25 UTC
An interesting question: if your font description has several
fonts, some being fallbacks that possibly don't match very
well, when is it better to use synthesized glyphs rather than glyphs
from a fallback font?
Comment 2 Behdad Esfahbod 2005-11-23 14:11:11 UTC
To wildly generalize, should we go on with compatibility decomposition for any
missing character?  That information is in UnicodeData.txt, not hard to put in a
compact table.

I think after finding the answer to Owen's question, the rest is not hard at
all. Thoughts?
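
To illustrate what that table would hold, here is a rough sketch using Python's standard `unicodedata` module, whose `decomposition()` function returns the raw decomposition field of UnicodeData.txt. The helper name `compat_mapping` is made up for this example; it keeps only tagged (compatibility) mappings and ignores untagged canonical ones.

```python
import unicodedata

def compat_mapping(ch):
    """Return the compatibility mapping of ch as a string, or None.

    unicodedata.decomposition() returns the raw field, for example
    '<compat> 002E 002E 002E'. Canonical mappings carry no <tag>.
    """
    fields = unicodedata.decomposition(ch).split()
    if not fields or not fields[0].startswith("<"):
        return None  # no mapping at all, or a canonical one
    return "".join(chr(int(cp, 16)) for cp in fields[1:])

# Ellipsis, degree Celsius, and four-per-em space all have tagged mappings.
for ch in "\u2026\u2103\u2005":
    print(f"U+{ord(ch):04X} -> {compat_mapping(ch)!r}")
```

Note that any tagged mapping is accepted here (`<compat>`, `<super>`, `<font>`, and so on); a real table would probably want to pick which tags are safe to substitute.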
Comment 3 Alexander “weej” Jones 2005-11-27 03:45:57 UTC
I noticed an interrobang (I think it's called) on a site the other day that looked /rubbish/ cause it was the fallback bitmap jobby. Would be nice if that glyph was generated automatically from a ! and a ? somehow...
Comment 4 Behdad Esfahbod 2005-11-27 03:58:08 UTC
*That* kind of character is really not in the scope of this bug.  It's what
fonts are for, after all :).
Comment 5 Behdad Esfahbod 2007-06-28 14:45:38 UTC
Note that the bullet doesn't have a compat equivalent in UCD.

vte has been doing something like this, drawing box-drawing characters, bullets, and a bunch of other characters graphically.  The question is, with fonts like DejaVu now common, do we still need to go this far?

And it's probably easier to ship a font with pango that has all the characters we want to guarantee instead of coding it in.

So there are two different issues here:

  * Use UCD compat data.  This should be done in the same place that we want to support NFC/NFD conversion.

  * Graphically draw some chars like bullets, thin space, etc.  Not sure this is worth the effort.
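
The split between the two issues can be checked against the UCD itself. This is a quick sketch using Python's standard `unicodedata` module: the bullet (and the interrobang mentioned earlier) have an empty decomposition field, so compat data cannot help with them, while the ellipsis has one.

```python
import unicodedata

for name in ("BULLET", "INTERROBANG", "HORIZONTAL ELLIPSIS"):
    ch = unicodedata.lookup(name)
    # An empty string means UnicodeData.txt has no decomposition for ch.
    mapping = unicodedata.decomposition(ch)
    print(f"U+{ord(ch):04X} {name}: {mapping or 'no decomposition'}")
```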
Comment 6 Behdad Esfahbod 2007-06-28 14:46:07 UTC
*** Bug 145275 has been marked as a duplicate of this bug. ***
Comment 7 Behdad Esfahbod 2007-08-20 03:44:35 UTC
*** Bug 468334 has been marked as a duplicate of this bug. ***
Comment 8 Alexander “weej” Jones 2007-08-20 18:47:20 UTC
D'oh, I didn't realise it was this bug, I'm already CC'd.

Using the approximate sequence is *definitely* preferable when the missing glyph is already a composition of existing glyphs.

Basically, seeing something like "The temperature is 15 ℃" will look terrible if the C doesn't match up. Try this in different fonts: "C℃C℃C℃".

In a good font (for example Bitstream Vera Sans or DejaVu Sans), it is preferable to use "℃" rather than "°C", as the appearance is better (and it's not just a kerning issue either; the two parts are intended to lock together rather than just be placed next to each other). We could use this by default in, e.g., Weather Applet, but people using fonts without a ℃ glyph will be faced with an odd-looking "C". Letting pango do the detection and replacement transparently would be pretty awesome.

As for graphically drawing stuff like bullets, as you say I think it would be much better (not to mention simpler for your codebase) to distribute a fallback font.
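
For comparison, this is roughly what an application would have to do by hand today if it wanted the ℃ behaviour described above. It is only a sketch: `temperature_label` and the `font_has_char` callback are hypothetical names, standing in for whatever coverage query the toolkit offers.

```python
import unicodedata

DEGREE_CELSIUS = "\u2103"  # the single-character form

def temperature_label(celsius, font_has_char):
    """Format a temperature, preferring U+2103 when the font covers it.

    font_has_char is a hypothetical callback: char -> bool.
    """
    unit = DEGREE_CELSIUS
    if not font_has_char(unit):
        # Compatibility decomposition: DEGREE SIGN followed by "C".
        unit = unicodedata.normalize("NFKD", unit)
    return f"{celsius} {unit}"

print(temperature_label(15, lambda ch: True))
print(temperature_label(15, lambda ch: False))
```

Doing the same thing inside the text layout layer would spare every application from carrying a check like this.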
Comment 9 Behdad Esfahbod 2007-08-20 19:47:45 UTC
(In reply to comment #8)
> As for graphically drawing stuff like bullets, as you say I think it would be
> much better (not to mention simpler for your codebase) to distribute a fallback
> font.

And to not reinvent the wheel, let's agree that that font is DejaVu (LGC).  Problem solved.
Comment 10 Behdad Esfahbod 2007-10-10 23:47:14 UTC
Relevant FAQ from Unicode: http://www.unicode.org/faq/unsup_char.html
Comment 11 Behdad Esfahbod 2008-10-14 03:32:23 UTC
*** Bug 556079 has been marked as a duplicate of this bug. ***
Comment 12 Sven Neumann 2008-11-13 07:56:26 UTC
*** Bug 560562 has been marked as a duplicate of this bug. ***
Comment 13 Robert Margulski 2008-11-14 01:10:38 UTC
Dear Developers,
I'm sorry that I stirred up this discussion when I tried to submit #556079.
Please, my unanswered question is still valid.
Gimp 2.4.7 displays 33% as "33%"
Gimp 3.0.0 displays 33% as "332005%" with 2005 in a glyph.
I have been told it is (just with 3.0.0+) my system font.
Q: What font should I be using?
please.
thank you - RMargulski@tellurian.net
Comment 14 Sven Neumann 2008-11-14 07:09:53 UTC
Robert, please don't post such questions here. If you don't understand what this bug report is about, then you may ask questions in the bug report that you filed.
Comment 15 Sven Neumann 2008-11-14 07:15:41 UTC
Behdad, as this is a really annoying problem, in particular for users on the Windows platform, I would like to give this a shot and try to come up with a patch that fixes this at least for characters that have the White_Space property. Can you give me a hint where this should be implemented?
Comment 16 Behdad Esfahbod 2008-11-18 14:37:55 UTC
Sven, I'm currently thinking about how to best implement this.  I'll comment here in a few days.
Comment 17 Owen Taylor 2008-11-18 23:22:38 UTC
I thought there was another bug specifically about whitespace like U+2005
(FOUR-PER-EM SPACE) - my argument was that we should *never* fall back for
whitespace even when a different font on the system has the glyph; just use a
space glyph and advance font_size/4 (or whatever is appropriate for that whitespace).

I can't find that bug though.
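
A sketch of the "advance font_size/4" idea, assuming the em is equal to the font size. The table only lists spaces whose width follows from their Unicode name; THIN SPACE and HAIR SPACE are left out because their widths vary by font. `SPACE_EM_FRACTION` and `synthesized_advance` are names invented for this example.

```python
from fractions import Fraction

# Fraction of the em for spaces whose width is implied by their name.
SPACE_EM_FRACTION = {
    "\u2002": Fraction(1, 2),  # EN SPACE
    "\u2003": Fraction(1, 1),  # EM SPACE
    "\u2004": Fraction(1, 3),  # THREE-PER-EM SPACE
    "\u2005": Fraction(1, 4),  # FOUR-PER-EM SPACE
    "\u2006": Fraction(1, 6),  # SIX-PER-EM SPACE
}

def synthesized_advance(ch, font_size):
    """Advance for a synthesized space, or None if ch is not in the table."""
    fraction = SPACE_EM_FRACTION.get(ch)
    if fraction is None:
        return None
    return float(fraction * font_size)

print(synthesized_advance("\u2005", 12))  # prints 3.0
```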
Comment 18 Sven Neumann 2009-02-21 11:34:10 UTC
*** Bug 572627 has been marked as a duplicate of this bug. ***
Comment 19 Michael Schumacher 2009-05-04 23:39:52 UTC
*** Bug 581350 has been marked as a duplicate of this bug. ***
Comment 20 Matthias Clasen 2009-09-18 22:10:24 UTC
*** Bug 595633 has been marked as a duplicate of this bug. ***
Comment 21 Morten Welinder 2009-10-02 17:29:16 UTC
In regards to comment 17:

Owen, that is probably my bug 416526 which was duped to bug 145275.
Comment 22 Philip Withnall 2010-05-03 18:50:55 UTC
(In reply to comment #16)
> Sven, I'm currently thinking about how to best implement this.  I'll comment
> here in a few days.

Behdad, did you come up with a plan for this? If you can give me a starting point, I'd like to hack on this bug.
Comment 23 Behdad Esfahbod 2010-05-04 14:10:46 UTC
(In reply to comment #22)
> (In reply to comment #16)
> > Sven, I'm currently thinking about how to best implement this.  I'll comment
> > here in a few days.
> 
> Behdad, did you come up with a plan for this? If you can give me a starting
> point, I'd like to hack on this bug.

Actually no, I didn't.  We may be able to solve this in HarfBuzz, but then remains fixing the font selection in Pango.
Comment 24 Philip Withnall 2010-05-08 00:27:03 UTC
(In reply to comment #23)
> (In reply to comment #22)
> > (In reply to comment #16)
> > > Sven, I'm currently thinking about how to best implement this.  I'll comment
> > > here in a few days.
> > 
> > Behdad, did you come up with a plan for this? If you can give me a starting
> > point, I'd like to hack on this bug.
> 
> Actually no, I didn't.  We may be able to solve this in HarfBuzz, but then
> remains fixing the font selection in Pango.

I've taken a look at harfbuzz-ng; am I correct in thinking that the solution should be in hb_map_glyphs()? If it can't find a glyph for the current font and a decomposition exists for the current codepoint, it could insert the decomposed glyphs instead. Unfortunately, this means the buffer might have to be reallocated. :-\

I don't know how this would fit in with falling back on different fonts in Pango rather than decompositions.
Comment 25 Behdad Esfahbod 2010-05-08 03:15:24 UTC
(In reply to comment #24)

> I've taken a look at harfbuzz-ng; am I correct in thinking that the solution
> should be in hb_map_glyphs()? If it can't find a glyph for the current font and
> a decomposition exists for the current codepoint, it could insert the
> decomposed glyphs instead. Unfortunately, this means the buffer might have to
> be reallocated. :-\

Buffer reallocation is not a problem as we already have the facility for that.  And yes, that's the right function.  Decomposing is easy.  Harder is to try composing multiple characters into one.  In that case, we should normalize the string first.  It sounds like a good idea to normalize anyway.

I'll keep it in mind for the upcoming HarfBuzz hackfest.

> I don't know how this would fit in with falling back on different fonts in
> Pango rather than decompositions.

So yes, that's the part I don't know either.  If we know the layout engine does decomposition, pango's font selection can also do the same...
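
The approach discussed in this exchange can be sketched as follows. This is an illustration in Python, not HarfBuzz code: `map_glyphs` and the `cmap` dict (character to glyph ID) are hypothetical stand-ins, and NFKD is used as the source of decompositions.

```python
import unicodedata

NOTDEF = 0  # glyph ID 0 is the conventional "missing glyph"

def map_glyphs(text, cmap):
    """Map characters to glyph IDs, decomposing characters the font lacks.

    cmap is a hypothetical dict from character to glyph ID.
    """
    glyphs = []
    for ch in text:
        if ch in cmap:
            glyphs.append(cmap[ch])
            continue
        pieces = unicodedata.normalize("NFKD", ch)
        if pieces != ch and all(p in cmap for p in pieces):
            # The output grows here: one character becomes several glyphs.
            glyphs.extend(cmap[p] for p in pieces)
        else:
            glyphs.append(NOTDEF)
    return glyphs

cmap = {".": 1, "a": 2}
print(map_glyphs("a\u2026", cmap))  # prints [2, 1, 1, 1]
print(map_glyphs("\u2022", cmap))   # prints [0]
```

The decomposition is only used when every piece is covered by the font; otherwise the character still ends up as the missing glyph, which is where font fallback would have to take over.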
Comment 26 Philip Withnall 2010-06-25 21:14:18 UTC
(In reply to comment #25)
> I'll keep it in mind for the upcoming HarfBuzz hackfest.

Did anything come about for this as a result of the hackfest?
Comment 27 Philip Withnall 2012-08-05 09:04:43 UTC
The use of compat decompositions has been fixed in HarfBuzz as https://bugs.freedesktop.org/show_bug.cgi?id=41095. Is there anything left to do for this bug report (ignoring the fact that Pango doesn’t yet use HarfBuzz)?
Comment 28 Behdad Esfahbod 2012-08-07 19:12:20 UTC
Well, the itemizer also needs to know that harfbuzz can handle the sequence.  Right now the itemizer will try to find a fallback font supporting the character.  Only if no font has that character will it try rendering it using the default font, and magically harfbuzz will decompose it.  To be honest, I'm not too disturbed by this.  I think it's good enough.  So this can be closed when harfbuzz lands in pango.
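
The ordering described in this comment can be sketched like this. It is a simplified model, not Pango's itemizer: `pick_font` and the `fonts` list of (name, coverage set) pairs are hypothetical, with the first entry treated as the default font.

```python
def pick_font(ch, fonts):
    """Return the name of the first font covering ch, else the default font.

    fonts is a hypothetical ordered list of (name, coverage_set) pairs;
    the first entry is treated as the default font.
    """
    for name, coverage in fonts:
        if ch in coverage:
            return name
    # No font covers ch: hand it to the shaper with the default font,
    # where it may be decomposed.
    return fonts[0][0]

fonts = [
    ("Primary", {"C", "\u00b0", "."}),
    ("Fallback", {"\u2103"}),
]
print(pick_font("\u2103", fonts))  # prints Fallback
print(pick_font("\u2026", fonts))  # prints Primary
```

The first call shows the limitation accepted here: ℃ goes to the fallback font even though the primary font could have rendered the decomposed "°C".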
Comment 29 Behdad Esfahbod 2012-08-18 17:07:42 UTC
We've merged the HarfBuzz branch.  Closing fixed.