GNOME Bugzilla – Bug 63633
Use UCD compat decompositions
Last modified: 2012-08-18 17:07:42 UTC
We should allow people to use ellipses and maybe bullets without having to require special fonts. Ellipses are easy, just shape them to three period glyphs. Everyone will use three periods anyhow, if we don't support ellipses reliably.
An interesting question: if your font description lists several fonts, some of them fallbacks that possibly don't match very well, when is it better to use synthesized glyphs rather than glyphs from a fallback font?
To wildly generalize, should we go on with compatibility decomposition for any missing character? That information is in UnicodeData.txt, not hard to put in a compact table. I think after finding the answer to Owen's question, the rest is not hard at all. Thoughts?
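To make the lookup concrete, here is a minimal sketch in Python, whose standard `unicodedata` module exposes the decomposition field of UnicodeData.txt. The helper name `compat_decomposition` is made up for illustration and is not Pango or HarfBuzz API. It applies a single level of mapping and treats any tagged mapping (`<compat>`, `<noBreak>`, and so on) as a compatibility decomposition.

```python
import unicodedata

def compat_decomposition(ch):
    """Return the one-level compatibility decomposition of ch, or None.

    Hypothetical helper: the field looks like '<compat> 002E 002E 002E'.
    Untagged (canonical) mappings and empty fields yield None.
    """
    field = unicodedata.decomposition(ch)
    if not field.startswith("<"):
        return None
    _tag, *codepoints = field.split()
    return "".join(chr(int(cp, 16)) for cp in codepoints)

ellipsis = compat_decomposition("\u2026")  # HORIZONTAL ELLIPSIS
bullet = compat_decomposition("\u2022")    # BULLET
```

With this, the ellipsis maps to three periods, while the bullet has no mapping at all and would still need a real glyph.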
I noticed an interrobang (I think it's called) on a site the other day that looked /rubbish/ cause it was the fallback bitmap jobby. Would be nice if that glyph was generated automatically from a ! and a ? somehow...
*That* kind of character is really not in the scope of this bug. It's what fonts are for, after all :).
Note that the bullet doesn't have a compat equivalent in UCD. vte has been doing something like this, drawing box-drawing characters, bullets, and a bunch of other characters graphically. The question is, with fonts like DejaVu now common, do we still need to go this far? And it's probably easier to ship a font with pango that has all the characters we want to guarantee instead of coding it in. So there are two different issues here:
* Use UCD compat data. This should be done in the same place that we want to support NFC/NFD conversion.
* Graphically draw some chars like bullets, thin space, etc. Not sure this is worth the effort.
*** Bug 145275 has been marked as a duplicate of this bug. ***
*** Bug 468334 has been marked as a duplicate of this bug. ***
D'oh, I didn't realise it was this bug, I'm already CC'd. Using the approximate sequence is *definitely* preferable when the missing glyph is already a composition of existing glyphs. Basically, seeing something like "The temperature is 15 ℃" will look terrible if the C doesn't match up. Try this in different fonts: "C℃C℃C℃". In a good font (for example Bitstream Vera Sans or DejaVu Sans), it is preferable to use "℃" rather than "°C", as the appearance is better (and it's not just a kerning issue either, it's intended to lock together rather than just be placed next to each other). We could use this by default in, e.g., Weather Applet, but people using fonts without a ℃ glyph will be faced with an odd-looking "C". Letting pango do the detection and replacement transparently would be pretty awesome. As for graphically drawing stuff like bullets, as you say I think it would be much better (not to mention simpler for your codebase) to distribute a fallback font.
(In reply to comment #8)
> As for graphically drawing stuff like bullets, as you say I think it would be much better (not to mention simpler for your codebase) to distribute a fallback font.
And to not reinvent the wheel, let's agree that that font is DejaVu (LGC). Problem solved.
Relevant FAQ from Unicode: http://www.unicode.org/faq/unsup_char.html
*** Bug 556079 has been marked as a duplicate of this bug. ***
*** Bug 560562 has been marked as a duplicate of this bug. ***
Dear Developers, I'm sorry that I stirred up this discussion when I tried to submit #556079. Please, my unanswered question is still valid. Gimp 2.4.7 displays 33% as "33%" Gimp 3.0.0 displays 33% as "332005%" with 2005 in a glyph. I have been told it is (just with 3.0.0+) my system font. Q: What font should I be using? please. thank you - RMargulski@tellurian.net
Robert, please don't post such questions here. If you don't understand what this bug report is about, then you may ask questions in the bug report that you filed.
Behdad, as this is a really annoying problem, in particular for users on the Windows platform, I would like to give this a shot and try to come up with a patch that fixes this at least for characters that have the White_Space property. Can you give me a hint where this should be implemented?
Sven, I'm currently thinking about how to best implement this. I'll comment here in a few days.
I thought there was another bug specifically about whitespace like U+2005 (FOUR-PER-EM SPACE). My argument was that we should *never* fall back for whitespace even when a different font on the system has the glyph; just use a space glyph and advance font_size/4 (or whatever is appropriate for that whitespace). I can't find that bug though.
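A rough sketch of that idea in Python, assuming one em equals the font size. The table and the `space_advance` helper are hypothetical names for illustration, and only the spaces whose width follows directly from their Unicode names are listed.

```python
from fractions import Fraction

# Fixed-width spaces whose em fraction is given by their Unicode name.
EM_FRACTIONS = {
    "\u2002": Fraction(1, 2),  # EN SPACE
    "\u2003": Fraction(1, 1),  # EM SPACE
    "\u2004": Fraction(1, 3),  # THREE-PER-EM SPACE
    "\u2005": Fraction(1, 4),  # FOUR-PER-EM SPACE
    "\u2006": Fraction(1, 6),  # SIX-PER-EM SPACE
}

def space_advance(ch, font_size):
    """Return a synthesized advance for ch, or None if ch is not in the table.

    Assumption: 1 em == font_size, in whatever unit the caller uses.
    """
    fraction = EM_FRACTIONS.get(ch)
    if fraction is None:
        return None
    return float(fraction * font_size)

four_per_em = space_advance("\u2005", 12)
```

A caller would draw the font's ordinary space glyph (or nothing) and advance by the returned amount instead of looking for a fallback font.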
*** Bug 572627 has been marked as a duplicate of this bug. ***
*** Bug 581350 has been marked as a duplicate of this bug. ***
*** Bug 595633 has been marked as a duplicate of this bug. ***
In regards to comment 17: Owen, that is probably my bug 416526 which was duped to bug 145275.
(In reply to comment #16)
> Sven, I'm currently thinking about how to best implement this. I'll comment here in a few days.
Behdad, did you come up with a plan for this? If you can give me a starting point, I'd like to hack on this bug.
(In reply to comment #22)
> Behdad, did you come up with a plan for this? If you can give me a starting point, I'd like to hack on this bug.
Actually no, I didn't. We may be able to solve this in HarfBuzz, but then remains fixing the font selection in Pango.
(In reply to comment #23)
> Actually no, I didn't. We may be able to solve this in HarfBuzz, but then remains fixing the font selection in Pango.
I've taken a look at harfbuzz-ng; am I correct in thinking that the solution should be in hb_map_glyphs()? If it can't find a glyph for the current font and a decomposition exists for the current codepoint, it could insert the decomposed glyphs instead. Unfortunately, this means the buffer might have to be reallocated. :-\ I don't know how this would fit in with falling back on different fonts in Pango rather than decompositions.
(In reply to comment #24)
> I've taken a look at harfbuzz-ng; am I correct in thinking that the solution should be in hb_map_glyphs()? If it can't find a glyph for the current font and a decomposition exists for the current codepoint, it could insert the decomposed glyphs instead. Unfortunately, this means the buffer might have to be reallocated. :-\
Buffer reallocation is not a problem as we already have the facility for that. And yes, that's the right function. Decomposing is easy. The harder part is composing multiple characters into one. In that case, we should normalize the string first. It sounds like a good idea to normalize anyway. I'll keep it in mind for the upcoming HarfBuzz hackfest.
> I don't know how this would fit in with falling back on different fonts in Pango rather than decompositions.
So yes, that's the part I don't know either. If we know the layout engine does decomposition, pango's font selection can also do the same...
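The normalize-then-decompose approach can be sketched in Python as follows. This is only an illustration of the idea, not what hb_map_glyphs() does: `map_glyphs` and the `cmap` dictionary (character to glyph ID) are hypothetical stand-ins for a real font's character map, and glyph 0 is used as the missing-glyph ID as in TrueType/OpenType fonts.

```python
import unicodedata

NOTDEF = 0  # glyph ID 0 is the missing-glyph (.notdef) slot

def map_glyphs(text, cmap):
    """Map text to glyph IDs, decomposing characters the font lacks.

    cmap is a hypothetical dict from character to glyph ID.
    """
    glyphs = []
    # Normalizing to NFC first handles the "composing" direction.
    for ch in unicodedata.normalize("NFC", text):
        if ch in cmap:
            glyphs.append(cmap[ch])
            continue
        # Fall back to the full (canonical + compat) decomposition,
        # but only if the font covers every resulting character.
        parts = unicodedata.normalize("NFKD", ch)
        if parts != ch and all(p in cmap for p in parts):
            glyphs.extend(cmap[p] for p in parts)  # the output list grows here
        else:
            glyphs.append(NOTDEF)
    return glyphs

cmap = {"e": 1, "\u00e9": 2, ".": 3}
composed = map_glyphs("e\u0301", cmap)   # e + COMBINING ACUTE ACCENT
ellipsis = map_glyphs("\u2026", cmap)
```

The output can be longer than the input, which is the buffer growth mentioned above. A Python list grows on its own, whereas a C buffer would need the reallocation facility.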
(In reply to comment #25)
> I'll keep it in mind for the upcoming HarfBuzz hackfest.
Did anything come about for this as a result of the hackfest?
The use of compat decompositions has been fixed in HarfBuzz as https://bugs.freedesktop.org/show_bug.cgi?id=41095. Is there anything left to do for this bug report (ignoring the fact that Pango doesn’t yet use HarfBuzz)?
Well, the itemizer also needs to know that harfbuzz can handle the sequence. Right now the itemizer will try to find a fallback font supporting the character. Only if no font has that character will it try rendering it using the default font, and magically harfbuzz will decompose it. To be honest, I'm not too disturbed by this. I think it's good enough. So this can be closed when harfbuzz lands in pango.
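The ordering described here can be sketched in Python. This is a toy model, not Pango's itemizer: `pick_font` is a hypothetical helper, and each font is represented as a (name, coverage set) pair with the default font first.

```python
import unicodedata

def pick_font(ch, fonts):
    """Return (font name, characters to shape) for a single character.

    fonts is a hypothetical list of (name, coverage set), default font first.
    """
    # 1. Any font that covers the character wins, fallbacks included.
    for name, coverage in fonts:
        if ch in coverage:
            return name, ch
    # 2. Only when no font has it: use the default font and decompose.
    name, coverage = fonts[0]
    parts = unicodedata.normalize("NFKD", ch)
    if parts != ch and all(p in coverage for p in parts):
        return name, parts
    # 3. Nothing works; the default font will show its missing glyph.
    return name, ch

fonts = [("Primary", {".", "C"}), ("Fallback", {"\u2103"})]
ellipsis_choice = pick_font("\u2026", fonts)
celsius_choice = pick_font("\u2103", fonts)
```

In this model the ℃ still comes from the fallback font even though the primary font could have rendered a decomposed form had it covered "°", which is the "good enough" trade-off. Swapping steps 1 and 2 would give the decomposition-first behaviour asked about earlier in the thread.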
We've merged the HarfBuzz branch. Closing fixed.