GNOME Bugzilla – Bug 341016
libxml2 fails to parse auto-detected UCS-4LE in-memory data (and likely others)
Last modified: 2021-07-05 13:26:36 UTC
Please describe the problem:
libxml2 fails to parse in-memory data whose encoding is auto-detected as UCS-4LE (and likely other encodings as well). It determines the encoding to be "UCS-4LE", but then asks iconv for "ISO-10646-UCS-4", a name iconv does not know, and falls back to letting iconv find a codec for "UCS-4", which is a different encoding from "UCS-4LE". Parsing the data as "UCS-4" then fails.

Steps to reproduce:
1. Parse a UCS-4LE buffer: parsing fails.
2. Ask libxml2 to autodetect the encoding: the result is XML_CHAR_ENCODING_UCS4LE.
3. Ask libxml2 for the name of that encoding: the result is "ISO-10646-UCS-4" (note the missing "LE").

Parsing the buffer with an explicit encoding of "UCS-4LE" works.
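To illustrate why a "UCS-4" codec cannot stand in for "UCS-4LE", here is a minimal sketch that uses neither libxml2 nor iconv. It assumes the generic "UCS-4" codec reads each 4-byte unit as big-endian, which is a common iconv default but not something this report confirms. The names ucs4_first_code_point, ucs4_order and ucs4le_sample are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte order used to assemble each 4-byte code unit (hypothetical type). */
typedef enum { UCS4_BIG_ENDIAN, UCS4_LITTLE_ENDIAN } ucs4_order;

/* Highest code point defined by Unicode. */
#define MAX_CODE_POINT 0x10FFFFu

/*
 * Decode the first 4-byte unit of buf in the given byte order.
 * Returns 0 and stores the value in *out on success.
 * Returns -1 if fewer than 4 bytes are available or if the
 * assembled value exceeds U+10FFFF.
 */
int ucs4_first_code_point(const unsigned char *buf, size_t len,
                          ucs4_order order, uint32_t *out)
{
    uint32_t cp;

    assert(buf != NULL && out != NULL);
    if (len < 4)
        return -1;

    if (order == UCS4_LITTLE_ENDIAN) {
        /* Least significant byte comes first. */
        cp = (uint32_t)buf[0]
           | ((uint32_t)buf[1] << 8)
           | ((uint32_t)buf[2] << 16)
           | ((uint32_t)buf[3] << 24);
    } else {
        /* Most significant byte comes first. */
        cp = ((uint32_t)buf[0] << 24)
           | ((uint32_t)buf[1] << 16)
           | ((uint32_t)buf[2] << 8)
           | (uint32_t)buf[3];
    }

    if (cp > MAX_CODE_POINT)
        return -1;

    *out = cp;
    return 0;
}

/* The characters "<?xm" (start of an XML declaration) encoded as UCS-4LE. */
const unsigned char ucs4le_sample[16] = {
    0x3C, 0x00, 0x00, 0x00,
    0x3F, 0x00, 0x00, 0x00,
    0x78, 0x00, 0x00, 0x00,
    0x6D, 0x00, 0x00, 0x00
};
```

Read as little-endian, the first unit of ucs4le_sample is U+003C ("<"). Read as big-endian, the same four bytes give 0x3C000000, which is far outside the Unicode range, so a decoder working under that assumption rejects the input at the first character. This matches the symptom described above, although the actual failure path inside libxml2 and iconv may differ.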
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/libxml2/-/issues/ Thank you for your understanding and your help.