GNOME Bugzilla – Bug 790034
Using Glib::ustring for URIs
Last modified: 2020-11-12 09:29:08 UTC
Libxml++ seems to use Glib::ustring extensively, even for URIs. Are URIs supposed to be Unicode strings at all? Glib::uri_parse_scheme, Glib::file_get_contents, and other functions in glibmm operate on plain std::string. Since URIs typically need most non-alphanumeric characters percent-encoded, and filenames have no defined encoding at all (i.e. they should be treated as byte strings), I propose changing the interfaces to use std::string for passing URIs instead.
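To illustrate the point about encoding, here is a minimal sketch in standard C++ only. The helper name percent_encode_path is hypothetical and is not part of libxml++ or glibmm. It assumes RFC 3986 style escaping where every byte outside the unreserved set is percent-encoded, and '/' is kept so path separators survive. The result is pure ASCII, so a std::string holds it without any notion of Unicode.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper for illustration only; not a libxml++ or glibmm API.
// Treats the input as raw bytes (no assumption about its encoding) and
// percent-encodes every byte that is not an RFC 3986 "unreserved"
// character. '/' is passed through so that path separators are preserved.
std::string percent_encode_path(std::string_view bytes)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    out.reserve(bytes.size());
    for (unsigned char c : bytes) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved || c == '/') {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];    // high nibble
            out += hex[c & 0x0F];  // low nibble
        }
    }
    return out;
}
```

Calling percent_encode_path on a filename with a space or a non-ASCII byte sequence yields a string containing only ASCII characters, whatever encoding the filename was in.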
From libxml2/include/libxml/xmlstring.h:

    /**
     * xmlChar:
     *
     * This is a basic byte in an UTF-8 encoded string.
     * It's unsigned allowing to pinpoint case where char * are assigned
     * to xmlChar * (possibly making serialization back impossible).
     */
    typedef unsigned char xmlChar;

In most functions libxml2 uses xmlChar* or const xmlChar* for URIs, announcing that the strings are UTF-8 encoded. The textreader parser is an exception: URIs are const char* there. The situation is unclear. It remains unclear when I check what is used for URIs in glibmm and gtkmm: std::string in some places, Glib::ustring in others.
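As a rough sketch of what a std::string based interface would have to do at the libxml2 boundary: xmlChar is just unsigned char, so moving between const xmlChar* and std::string is a pointer cast over the same bytes. The typedef below is a local stand-in that mirrors the one quoted above, so the snippet compiles without libxml2 headers. The helper names to_std_string and to_xml_char are hypothetical.

```cpp
#include <string>

// Local stand-in mirroring the typedef from libxml/xmlstring.h.
typedef unsigned char xmlChar;

// Hypothetical helper: copy a NUL-terminated xmlChar string into a
// std::string without interpreting the bytes. A null pointer yields "".
inline std::string to_std_string(const xmlChar* s)
{
    return s ? std::string(reinterpret_cast<const char*>(s)) : std::string();
}

// Hypothetical helper: view a std::string as const xmlChar*. The pointer
// is only valid while the std::string is alive and unmodified.
inline const xmlChar* to_xml_char(const std::string& s)
{
    return reinterpret_cast<const xmlChar*>(s.c_str());
}
```

Neither helper validates or converts the encoding. Whether the bytes are UTF-8, as the xmlChar comment announces, is left to the caller.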
libxml++ has moved to https://github.com/libxmlplusplus/libxmlplusplus

If this ticket is still valid in a recent version of libxml++, then please create a ticket at https://github.com/libxmlplusplus/libxmlplusplus/issues - thanks a lot!