Bug 103938 - feature request: better handling of invalid combinations
Product: pango
Classification: Platform
Component: general
Other other
: Normal normal
: Medium feature
Assigned To: pango-maint
Duplicates: 121095 127176 132378
Depends on:
Reported: 2003-01-20 03:44 UTC by Noah Levitt
Modified: 2012-08-18 17:10 UTC
See Also:
GNOME target: ---
GNOME version: ---

patch to improve the current behavior somewhat (2.74 KB, patch)
2003-07-20 03:27 UTC, Noah Levitt
needs-work

Description Noah Levitt 2003-01-20 03:42:12 UTC
Package: pango
Severity: enhancement
Version: 1.2.x
Synopsis: feature request: better handling of invalid combinations
Bugzilla-Product: pango
Bugzilla-Component: general


OpenType documentation[1] seems to say that invalid combinations should
be handled thusly:

 - If a combining character is preceded by a space and a zero width
joiner, render it all by itself.

 - Otherwise, render it on top of U+25CC DOTTED CIRCLE. This way it will
look the way it does in the code charts.
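
The two rules above can be sketched in a few lines of Python. This is only an illustration of the logic, not Pango code: the function name `insert_dotted_circles` is hypothetical, and it assumes that a mark is "invalid" whenever it has no base to attach to, with control, format, and separator characters (including a plain space) not counting as a base.

```python
import unicodedata

DOTTED_CIRCLE = "\u25cc"
ZWJ = "\u200d"
SPACE = " "


def is_mark(ch: str) -> bool:
    # General categories Mn, Mc and Me are the combining marks.
    return unicodedata.category(ch).startswith("M")


def insert_dotted_circles(text: str) -> str:
    """Insert U+25CC before marks that have no base (hypothetical helper).

    Assumption: control (C*) and separator (Z*) characters are not bases.
    A mark preceded by SPACE + ZWJ is left alone so it renders by itself.
    """
    out = []
    has_base = False  # True while there is something for a mark to sit on
    for i, ch in enumerate(text):
        if is_mark(ch):
            if not has_base:
                standalone = i >= 2 and text[i - 2] == SPACE and text[i - 1] == ZWJ
                if not standalone:
                    out.append(DOTTED_CIRCLE)
                # Later marks in the same run stack on this one.
                has_base = True
        else:
            cat = unicodedata.category(ch)
            has_base = not (cat.startswith("C") or cat.startswith("Z"))
        out.append(ch)
    return "".join(out)
```

With these assumptions, a lone U+0301 comes back as U+25CC followed by U+0301, while SPACE, ZWJ, U+0301 and "e" followed by U+0301 are returned unchanged.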

Also worth noting is that the Unicode Standard 3.2, section 2.6, says:
"By convention, diacritical marks used by the Unicode Standard may be
exhibited in (apparent) isolation by applying them to U+0020 SPACE or to


------- Bug moved to this database by 2003-01-19 22:42 -------

Reassigning to the default owner of the component,

Comment 1 Noah Levitt 2003-07-20 03:27:01 UTC
Created attachment 18445
patch to improve the current behavior somewhat
Comment 2 Noah Levitt 2003-07-20 03:39:21 UTC
The patch above doesn't do anything with dotted circles. It seems like
that'd be pretty hard. Instead, it attempts to allot an appropriate
amount of room for a sequence of combining characters applied to a
space or not applied to anything (at the beginning of a line for example).

Incidentally, as I read the standard, a space followed by a combining
character isn't technically an invalid combination. Section 3.6 D17a:
"Defective combining character sequence: a combining character
sequence that does not start with a base character.  - Defective
combining character sequences occur when a sequence of combining
characters appears at the start of a string or follows a control or
format character."
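
As a rough illustration of the definition quoted above, the following Python sketch finds defective combining character sequences in a string. The function name `defective_sequences` is hypothetical, and the check follows only the wording of the quote: a run of combining marks is reported when it starts the string or follows a control (Cc) or format (Cf) character. Under that reading a space counts as a base, so space plus diacritic is not reported.

```python
import unicodedata


def is_combining(ch: str) -> bool:
    # General categories Mn, Mc and Me are the combining marks.
    return unicodedata.category(ch).startswith("M")


def is_control_or_format(ch: str) -> bool:
    return unicodedata.category(ch) in ("Cc", "Cf")


def defective_sequences(text: str) -> list[tuple[int, int]]:
    """Return (start, end) index pairs of defective mark runs (hypothetical helper)."""
    spans = []
    i = 0
    n = len(text)
    while i < n:
        if is_combining(text[i]):
            start = i
            # Consume the whole run of consecutive combining marks.
            while i < n and is_combining(text[i]):
                i += 1
            # Defective if nothing precedes the run, or only a control/format char.
            if start == 0 or is_control_or_format(text[start - 1]):
                spans.append((start, i))
        else:
            i += 1
    return spans
```

For example, a string that begins with U+0301 yields the span (0, 1), while a space followed by U+0301 yields no spans.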
Comment 3 Noah Levitt 2003-07-20 20:32:17 UTC
My patch above is wrong. It doesn't handle the case where the base
character and combining character, or different combining characters,
come from different fonts.
Comment 4 Owen Taylor 2003-11-17 23:28:07 UTC
What was the conclusion of the discussion about this 
on the Unicode mailing list? I thought some people were
arguing that the OpenType interpretation was clearly 
incompatible with the Unicode spec. 

(Though it may well be the case that we should deviate
from the Unicode spec as well if that is going to make
things better for our users.)

Comment 5 Noah Levitt 2003-11-18 07:59:41 UTC
In my opinion, the conclusion of the thread was that space+diacritic
should show the diacritic in isolation, and a diacritic in isolation
should be shown on a dotted circle. It's not crystal clear though.
John Cowan does say, "This is a clear demonstration that Uniscribe
fails to implement a standard correctly, a property unique neither to
Microsoft nor to the Unicode Standard," in reference to the space+ZWJ
Comment 6 Noah Levitt 2003-11-18 23:05:48 UTC
*** Bug 121095 has been marked as a duplicate of this bug. ***
Comment 7 Owen Taylor 2004-02-23 18:40:58 UTC
*** Bug 132378 has been marked as a duplicate of this bug. ***
Comment 8 Behdad Esfahbod 2006-07-08 03:51:41 UTC
*** Bug 127176 has been marked as a duplicate of this bug. ***
Comment 9 Behdad Esfahbod 2006-07-08 03:52:20 UTC
Bug 127176 contains patch to do dotted-circle for invalid mark sequences in the Arabic shaper.
Comment 10 Behdad Esfahbod 2006-08-11 16:10:53 UTC
Red Hat bug about Punjabi and dotted circle (which is implemented in Qt):
Comment 11 Behdad Esfahbod 2012-08-18 17:10:26 UTC
Closing obsolete.  Should be addressed in HarfBuzz if ever.