GNOME Bugzilla – Bug 696407
add a function to convert arbitrary data to valid JSON strings
Last modified: 2017-09-05 10:39:23 UTC
JSON strings are defined in the RFC as:

The representation of strings is similar to conventions used in the C family of programming languages. A string begins and ends with quotation marks. All Unicode characters may be placed within the quotation marks except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).

Any character may be escaped. If the character is in the Basic Multilingual Plane (U+0000 through U+FFFF), then it may be represented as a six-character sequence: a reverse solidus, followed by the lowercase letter u, followed by four hexadecimal digits that encode the character's code point. The hexadecimal letters A though F can be upper or lowercase. So, for example, a string containing only a single reverse solidus character may be represented as "\u005C".

Alternatively, there are two-character sequence escape representations of some popular characters. So, for example, a string containing only a single reverse solidus character may be represented more compactly as "\\".

To escape an extended character that is not in the Basic Multilingual Plane, the character is represented as a twelve-character sequence, encoding the UTF-16 surrogate pair. So, for example, a string containing only the G clef character (U+1D11E) may be represented as "\uD834\uDD1E".

see: http://www.ietf.org/rfc/rfc4627.txt?number=4627

the parsing code can deal with escaped Unicode code points in both UTF-8 and UTF-16 surrogate pairs, but we don't have anything that can generate escaped sequences from arbitrary data.

all functions in JSON-GLib dealing with strings assume that the string is UTF-8 encoded and contains no control characters; we cannot change that in a compatible way: we'd have to add a "length" argument to all functions dealing with strings, or we'd have to duplicate each entry point dealing with strings.
instead, we could add a function like:

    char *json_escape_string (const guint8 *data, gsize len);

that behaves like g_markup_escape_text(), and escapes all Unicode characters (including control characters) into \uXXXX and \uXXXX\uXXXX sequences.
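A rough sketch of what such a function might look like, written in plain C so it compiles without GLib. This is not JSON-GLib code: the name escape_json_string, the decode_utf8 helper, and the use of uint8_t/size_t in place of guint8/gsize are all assumptions made for illustration. It also assumes the input is meant to be UTF-8 and replaces any malformed byte with U+FFFD, and it leaves printable ASCII unescaped; the bug report does not specify either behaviour.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: decode one UTF-8 sequence from s (n bytes available).
 * Stores the code point in *cp and returns the number of bytes consumed.
 * Malformed input yields U+FFFD and consumes one byte (a choice made for
 * this sketch, not something the bug report specifies). */
static size_t decode_utf8(const uint8_t *s, size_t n, uint32_t *cp)
{
    size_t need;      /* number of continuation bytes expected */
    uint32_t c, min;  /* accumulated value, smallest non-overlong value */

    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { need = 1; c = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { need = 2; c = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { need = 3; c = s[0] & 0x07; min = 0x10000; }
    else { *cp = 0xFFFD; return 1; }

    if (need >= n) { *cp = 0xFFFD; return 1; }  /* truncated sequence */
    for (size_t i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80) { *cp = 0xFFFD; return 1; }
        c = (c << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values beyond U+10FFFF. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
        *cp = 0xFFFD;
        return 1;
    }
    *cp = c;
    return need + 1;
}

/* Hypothetical stand-in for the proposed json_escape_string().
 * Returns a newly allocated, NUL-terminated string without the enclosing
 * quotation marks, or NULL on allocation failure. Free it with free(). */
char *escape_json_string(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);

    /* Worst case: every input byte expands to a six-character escape. */
    if (len > (SIZE_MAX - 1) / 6)
        return NULL;
    char *out = malloc(len * 6 + 1);
    if (out == NULL)
        return NULL;

    char *p = out;
    size_t i = 0;
    while (i < len) {
        uint32_t cp;
        i += decode_utf8(data + i, len - i, &cp);

        if (cp == '"' || cp == '\\') {
            /* Two-character escapes for quotation mark and reverse solidus. */
            *p++ = '\\';
            *p++ = (char) cp;
        } else if (cp >= 0x20 && cp < 0x7F) {
            /* Printable ASCII passes through unchanged. */
            *p++ = (char) cp;
        } else if (cp < 0x10000) {
            /* Control characters and the rest of the BMP: \uXXXX. */
            p += sprintf(p, "\\u%04X", (unsigned) cp);
        } else {
            /* Outside the BMP: encode as a UTF-16 surrogate pair. */
            cp -= 0x10000;
            p += sprintf(p, "\\u%04X\\u%04X",
                         (unsigned) (0xD800 + (cp >> 10)),
                         (unsigned) (0xDC00 + (cp & 0x3FF)));
        }
    }
    *p = '\0';
    return out;
}
```

With this sketch, passing the four UTF-8 bytes of the G clef character (F0 9D 84 9E) gives \uD834\uDD1E, matching the RFC example quoted above. Because the length is passed explicitly, embedded NUL bytes are handled and come out as \u0000.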
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/json-glib/issues/5.