Bug 350610 - Unicode Bidirectional types and functions
Status: RESOLVED WONTFIX
Product: glib
Classification: Platform
Component: general
Version: 2.12.x
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: ---
Assigned To: gtkdev
QA Contact: gtkdev
Depends on:
Blocks:
 
 
Reported: 2006-08-09 15:09 UTC by Ed Catmur
Modified: 2008-11-29 00:02 UTC
See Also:
GNOME target: ---
GNOME version: 2.15/2.16



Description Ed Catmur 2006-08-09 15:09:27 UTC
Currently, if I want to find the direction type of a gunichar or the base direction of a string, I have to link against Pango. These functions would be good candidates for inclusion in GLib alongside the other gunichar functions.

Specifically:

enum GUnicodeDirection {...} (from PangoDirection)
GUnicodeDirection g_unichar_direction (gunichar c) (from pango_unichar_direction)
GUnicodeDirection g_utf8_base_direction (const gchar *str, gssize len) (from pango_find_base_dir)

Also useful might be an implementation of the Bidirectional Algorithm.
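
A minimal sketch of the first-strong-character heuristic that a function like the proposed g_utf8_base_direction() could use. All sketch_* names are hypothetical and are not GLib or Pango API. The classifier is an assumption made for illustration: it only knows ASCII letters, Hebrew letters (U+05D0..U+05EA), Arabic letters (U+0620..U+064A) and the LRM/RLM marks, and treats everything else as neutral, whereas a real implementation would consult the full Unicode Bidi_Class data. The input is assumed to be valid, NUL-terminated UTF-8, so the len parameter from the proposal is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not GLib or Pango API. */
typedef enum {
    SKETCH_DIR_NEUTRAL, /* no strong character known to this sketch */
    SKETCH_DIR_LTR,
    SKETCH_DIR_RTL
} SketchDirection;

/* Coarse classifier covering a tiny subset of Unicode (assumption:
 * everything outside these ranges is treated as neutral). */
SketchDirection sketch_unichar_direction(uint32_t c)
{
    if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == 0x200E)
        return SKETCH_DIR_LTR;          /* ASCII letters, LRM */
    if ((c >= 0x05D0 && c <= 0x05EA) || /* Hebrew letters */
        (c >= 0x0620 && c <= 0x064A) || /* Arabic letters */
        c == 0x200F)                    /* RLM */
        return SKETCH_DIR_RTL;
    return SKETCH_DIR_NEUTRAL;
}

/* Decode one code point from valid UTF-8 and return a pointer to the
 * next sequence. No validation is done (assumption: input is valid). */
static const char *sketch_utf8_next(const char *p, uint32_t *out)
{
    const unsigned char *s = (const unsigned char *)p;

    if (s[0] < 0x80) {
        *out = s[0];
        return p + 1;
    }
    if ((s[0] & 0xE0) == 0xC0) {
        *out = ((s[0] & 0x1Fu) << 6) | (s[1] & 0x3Fu);
        return p + 2;
    }
    if ((s[0] & 0xF0) == 0xE0) {
        *out = ((s[0] & 0x0Fu) << 12) | ((s[1] & 0x3Fu) << 6) | (s[2] & 0x3Fu);
        return p + 3;
    }
    *out = ((s[0] & 0x07u) << 18) | ((s[1] & 0x3Fu) << 12) |
           ((s[2] & 0x3Fu) << 6) | (s[3] & 0x3Fu);
    return p + 4;
}

/* Return the direction of the first strong character, or NEUTRAL if the
 * string contains none that this sketch recognises. */
SketchDirection sketch_utf8_base_direction(const char *str)
{
    assert(str != NULL);
    while (*str != '\0') {
        uint32_t c;
        str = sketch_utf8_next(str, &c);
        SketchDirection d = sketch_unichar_direction(c);
        if (d != SKETCH_DIR_NEUTRAL)
            return d;
    }
    return SKETCH_DIR_NEUTRAL;
}
```

With this sketch, a string that begins with an Arabic letter resolves to RTL even if the rest is Latin text, while a string of digits alone has no strong character and stays neutral.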
Comment 1 Behdad Esfahbod 2006-08-10 15:21:17 UTC
Out of curiosity, what are you planning to use the bidirectional algorithm for?
Comment 2 Ed Catmur 2006-08-10 16:29:08 UTC
Well, actually I wasn't. It just seemed like a good idea, seeing as it's specified in Unicode but non-trivial to implement correctly.

The basic directionality stuff, though, is your fault, actually :). I read Planet GNOME in Liferea, and because you sign your name in Arabic script it messes up the directionality of the item list (<name>: <item>): your name is at the front, so the base direction is detected as RTL. To fix this you obviously have to detect the directionality of the overall feed itself (from the title, say), which means linking against Pango. That seems unfair given that most other Unicode data is in GLib.

Actually, that's another thing: Liferea now overrides item directionality by inserting an LRM or RLM at the start of the text. Would it make sense to ask for a widget property to override the text base direction? I'm thinking of four options:
GTK_WIDGET_TEXT_BASE_DIRECTION_DEFAULT /* default: use direction from text */
GTK_WIDGET_TEXT_BASE_DIRECTION_WIDGET  /* as gtk_widget_get_direction()/gtk_widget_get_default_direction() */
GTK_WIDGET_TEXT_BASE_DIRECTION_LTR     /* left-to-right */
GTK_WIDGET_TEXT_BASE_DIRECTION_RTL     /* right-to-left */
This would be fed through to Pango via pango_layout_set_auto_dir() and pango_context_set_base_dir(). Should I file an enhancement bug on this?
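
The LRM/RLM workaround mentioned above can be sketched in plain C. This is an illustration of the idea only, not Liferea's actual code; the SketchBaseDir enum and sketch_force_base_dir() are hypothetical names, and the WIDGET option from the list is left out because it depends on GTK state. The trick relies on U+200E and U+200F being strong characters, so a first-strong detector sees the mark before any of the text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical enum for illustration; not a GTK or Pango type. */
typedef enum {
    SKETCH_BASE_DEFAULT, /* leave the text alone: direction comes from the text */
    SKETCH_BASE_LTR,     /* force left-to-right */
    SKETCH_BASE_RTL      /* force right-to-left */
} SketchBaseDir;

#define SKETCH_LRM "\xE2\x80\x8E" /* U+200E LEFT-TO-RIGHT MARK in UTF-8 */
#define SKETCH_RLM "\xE2\x80\x8F" /* U+200F RIGHT-TO-LEFT MARK in UTF-8 */

/* Copy text into out, prefixed with a direction mark when one is requested.
 * Returns 0 on success, or -1 if out_size is too small for the result. */
int sketch_force_base_dir(SketchBaseDir dir, const char *text,
                          char *out, size_t out_size)
{
    const char *mark = "";
    int n;

    assert(text != NULL && out != NULL);
    if (dir == SKETCH_BASE_LTR)
        mark = SKETCH_LRM;
    else if (dir == SKETCH_BASE_RTL)
        mark = SKETCH_RLM;

    n = snprintf(out, out_size, "%s%s", mark, text);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

A caller would pass the item text and a buffer at least three bytes larger than the text plus its terminator. The drawback, and presumably the motivation for asking for a real property, is that the mark becomes part of the string itself.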
Comment 3 Behdad Esfahbod 2006-08-10 20:57:04 UTC
We have bugs open for adding Pango markup/attributes that allow things like that. 
Comment 4 Ed Catmur 2006-08-11 16:55:36 UTC
Ah: bug 70399 and bug 168108. Thanks.
Comment 5 Dan Winship 2008-08-22 18:13:53 UTC
The IDNA algorithms impose certain restrictions on the bidi properties of characters in internationalized domain names, to avoid cases where two distinct hostnames would display identically because of directionality issues. (E.g., www.אבג123.com and www.123אבג.com, where the first is "aleph bet gimel 1 2 3" and the second is "1 2 3 aleph bet gimel".)

Enforcing the restriction is more important for nameserver implementations than it is for clients (since if a client looks up an invalid name, it will just get a "not found"), but the specs say that clients are supposed to do the checks as well anyway. (Security reasons?) So this is something that would require access to the bidi properties. (But not the whole bidi algorithm.)

The current rule (from RFC 3454) is that if any character in a segment of a domain name has bidi character type R or AL, then the segment must start and end with an R or AL character, and cannot contain any L characters. However, this rule doesn't work with some languages and is being updated (http://tools.ietf.org/wg/idnabis/draft-ietf-idnabis-bidi/). The currently proposed new rule makes use of even more distinct bidi types than the old one, so this would definitely require more than just the current PangoDirection values.
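
The RFC 3454 rule as stated above can be sketched as a check over one label's code points. The function names are hypothetical, and the two classifiers are deliberately tiny assumptions (Hebrew and Arabic letters count as R/AL, ASCII letters count as L, everything else is neither); a real check needs the complete bidi tables from the RFC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: only Hebrew letters (R) and Arabic letters (AL) are known. */
static bool sketch_is_r_or_al(uint32_t c)
{
    return (c >= 0x05D0 && c <= 0x05EA) || (c >= 0x0620 && c <= 0x064A);
}

/* Assumption: only ASCII letters are known to have bidi type L. */
static bool sketch_is_l(uint32_t c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Apply the rule to one label (one dot-separated segment of a hostname),
 * given as an array of len Unicode code points. */
bool sketch_label_passes_bidi_rule(const uint32_t *label, size_t len)
{
    bool has_r_or_al = false;
    bool has_l = false;
    size_t i;

    assert(label != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (sketch_is_r_or_al(label[i]))
            has_r_or_al = true;
        if (sketch_is_l(label[i]))
            has_l = true;
    }

    if (!has_r_or_al)
        return true;  /* the rule only applies to labels with R or AL */
    if (has_l)
        return false; /* R/AL and L must not be mixed */

    /* has_r_or_al implies len > 0, so both indices are valid. */
    return sketch_is_r_or_al(label[0]) && sketch_is_r_or_al(label[len - 1]);
}
```

Under this sketch both labels from the hostname example are rejected: "aleph bet gimel 1 2 3" ends with a digit, and "1 2 3 aleph bet gimel" starts with one. A label consisting only of the three Hebrew letters passes.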
Comment 6 Behdad Esfahbod 2008-08-22 20:29:27 UTC
Pango now exports the bidi type.  And there's of course always GNU FriBidi.  I'd rather not add these to glib right now.