GNOME Bugzilla – Bug 563156
Document printing and scanning gunichar values
Last modified: 2008-12-08 02:58:47 UTC
There are two conventions for writing out a Unicode codepoint in ASCII:

* For valid Unicode characters, U+XXXX, U+XXXXX, or U+XXXXXX is used, depending on how many hex digits are needed.
* For invalid or unassigned codepoints, the same format is used but with a '-' instead of the '+'.

Another useful representation is "\uxxxx", which is used by Java and some other languages.

It would be useful to document the printf/scanf formats for these. This requires a format modifier for guint32 (bug 563150), but other than that:

* For scanf: "U%*[-+]%06X"
* For printf: "U+%04X" and "U-%04X"

I'm not convinced that these deserve a macro, but mentioning them in the docs would be nice.
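The formats proposed above can be tried out with plain `unsigned int` values, setting aside the guint32 modifier question. The following is only a minimal sketch: `format_codepoint` and `scan_codepoint` are hypothetical helper names, not GLib API, and the sketch assumes `unsigned int` is at least 32 bits wide.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not GLib API): writes "U+XXXX" when valid is
 * non-zero, "U-XXXX" otherwise. Uses at least four uppercase hex digits,
 * more if the value needs them. Returns the snprintf() result. */
static int
format_codepoint (char *buf, size_t size, unsigned int cp, int valid)
{
  if (valid)
    return snprintf (buf, size, "U+%04X", cp);

  return snprintf (buf, size, "U-%04X", cp);
}

/* Hypothetical helper (not GLib API): accepts either the "U+" or the "U-"
 * prefix and reads at most six hex digits. Returns 1 on success. */
static int
scan_codepoint (const char *text, unsigned int *cp)
{
  /* %*[-+] consumes the sign character(s) without storing them. */
  return sscanf (text, "U%*[-+]%06X", cp) == 1;
}
```

Note that in the scanf format the 6 is a maximum field width rather than a required digit count, so shorter inputs such as "U+41" are accepted as well, and that `%*[-+]` matches a run of sign characters rather than exactly one.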
Sounds like a nice addition. Feel free to add this information at a suitable place, e.g. the long description in the "Unicode Manipulation" section.
Ok, wrote some docs for this. I ignored the U-XXXX notation, as that's not documented in the Unicode standard. Adding the guint32 modifier, these become quite ugly:

* For scanf: "U+%06"G_GINT32_FORMAT"X"
* For printf: "U+%04"G_GINT32_FORMAT"X"

Anyway. Fixed.

2008-12-07  Behdad Esfahbod  <behdad@gnome.org>

	Bug 563156 – Document printing and scanning gunichar values

	* glib/tmpl/unicode.sgml: Document printing and scanning gunichar
	values.
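For comparison, the same string-literal splicing can be sketched in standard C with the `<inttypes.h>` macros, which bundle the length modifier and the conversion letter together. This is an analogue under stated assumptions, not the text that went into unicode.sgml: `uint32_t` stands in for gunichar, and `format_codepoint32` and `scan_codepoint32` are hypothetical names. One caveat worth checking against the GLib headers: as far as I know, the `G_*_FORMAT` macros expand to a complete conversion such as "i" rather than to a bare length modifier, so appending "X" after one may not give hex output.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not GLib API): writes "U+XXXX" with at least four
 * uppercase hex digits. PRIX32 supplies both the length modifier and the
 * 'X' conversion for a 32-bit unsigned value. */
static int
format_codepoint32 (char *buf, size_t size, uint32_t cp)
{
  return snprintf (buf, size, "U+%04" PRIX32, cp);
}

/* Hypothetical helper (not GLib API): reads at most six hex digits after
 * a literal "U+". SCNx32 accepts uppercase and lowercase digits alike.
 * Returns 1 on success. */
static int
scan_codepoint32 (const char *text, uint32_t *cp)
{
  return sscanf (text, "U+%6" SCNx32, cp) == 1;
}
```

The adjacent string literals are concatenated at compile time, which is the same mechanism the GLib-style format strings above rely on.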