Bug 721239 - xmlSaveTree() will ignore encoding from context
Status: RESOLVED OBSOLETE
Product: libxml2
Classification: Platform
Component: general
Version: git master
Hardware: Other All
Importance: Normal normal
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Reported: 2013-12-30 15:24 UTC by milo
Modified: 2021-07-05 13:24 UTC

Description milo 2013-12-30 15:24:37 UTC
Hello,

If I have a document like this:

***********************************************************
<?xml version="1.0"?>
<h>
   <c tooltip="N&#xFC;rburg Grand Prix Kurs"/>
   <c tooltip="Aut&#xF3;dromo Jos&#xE9; Carlos Pace"/>
</h> 
***********************************************************

is loaded into pxdocin, then pxdocin->encoding is NULL, because the XML declaration does not specify an encoding.

So when you create a save context with the output encoding set to utf8 and call xmlSaveDoc(), the document is converted automatically, and the output in a Windows shell looks like this:

***********************************************************
<?xml version="1.0" encoding="utf8"?>
<h>
   <c tooltip="N├╝rburg Grand Prix Kurs"/>
   <c tooltip="Aut├│dromo Jos├® Carlos Pace"/>
</h>
***********************************************************

Here is the code for this output:

***********************************************************
xmlNodePtr pRoot=xmlDocGetRootElement(pxdocin);
xmlSaveCtxtPtr pxsctxt=xmlSaveToFd(1,"utf8",XML_SAVE_FORMAT|XML_SAVE_AS_XML);
xmlSaveDoc(pxsctxt,pRoot->doc);
xmlSaveClose(pxsctxt);
***********************************************************

This is OK: the Windows shell defaults to cp850, so the UTF-8 bytes are correct and are merely displayed as cp850 characters.
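To make the cp850 remark concrete, here is a minimal sketch of UTF-8 encoding (plain C, not libxml2 code; the function name utf8_encode is made up for this illustration, and it does not reject surrogates). U+00FC encodes to the bytes C3 BC, which cp850 displays as "├╝", and the same goes for U+00F3 (C3 B3, "├│") and U+00E9 (C3 A9, "├®").

```c
#include <stddef.h>

/* Encode one Unicode code point (cp <= 0x10FFFF) as UTF-8.
 * Writes 1 to 4 bytes into out and returns how many were written.
 * Hypothetical helper for illustration only. */
size_t utf8_encode(unsigned int cp, unsigned char out[4])
{
    if (cp < 0x80) {                 /* 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {              /* 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

Calling utf8_encode(0xFC, buf) fills buf with the two bytes a cp850 console then renders as two separate box-drawing characters.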

BUT: if you try to dump a node or a node list using xmlSaveTree(), the output is this:

***********************************************************
<h>
   <c tooltip="N&#xFC;rburg Grand Prix Kurs"/>
   <c tooltip="Aut&#xF3;dromo Jos&#xE9; Carlos Pace"/>
</h> 
***********************************************************

Here is the code for this (note the absence of the declaration, because only the root node is dumped):

***********************************************************
xmlNodePtr pRoot=xmlDocGetRootElement(pxdocin);
xmlSaveCtxtPtr pxsctxt=xmlSaveToFd(1,"utf8",XML_SAVE_FORMAT|XML_SAVE_AS_XML);
xmlSaveTree(pxsctxt,pRoot);
xmlSaveClose(pxsctxt);
***********************************************************

So basically it is the same save context, but with xmlSaveTree() called instead of xmlSaveDoc().

The problem here is in xmlsave.c at line 2101:

***********************************************************
        } else if ((*cur >= 0x80) && ((doc == NULL) ||
                                      (doc->encoding == NULL))) {
***********************************************************

because:
- calling xmlSaveDoc() sets doc->encoding to ctxt->encoding, so the output is dumped correctly encoded
- calling xmlSaveTree() does not set doc->encoding, so it escapes the characters as numeric entities.
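The effect of that condition can be sketched with a small standalone model (this is not libxml2 code; dump_text and its doc_encoding parameter are hypothetical stand-ins for the serializer and for doc->encoding). When no encoding is recorded, every non-ASCII UTF-8 sequence is written as a numeric character reference; otherwise the bytes pass through unchanged.

```c
#include <stdio.h>
#include <stddef.h>

/* Simplified model of the quoted check: copy in to out, escaping
 * bytes >= 0x80 as &#xHH; only when doc_encoding is NULL.
 * Returns the number of characters written (excluding the NUL). */
size_t dump_text(const char *in, const char *doc_encoding,
                 char *out, size_t outsz)
{
    const unsigned char *cur = (const unsigned char *)in;
    size_t used = 0;

    if (outsz == 0) return 0;
    while (*cur != 0) {
        if ((*cur >= 0x80) && (doc_encoding == NULL)) {
            unsigned int cp;
            int extra, n;

            /* Decode one UTF-8 sequence (minimal validation only). */
            if ((*cur & 0xE0) == 0xC0)      { cp = *cur & 0x1F; extra = 1; }
            else if ((*cur & 0xF0) == 0xE0) { cp = *cur & 0x0F; extra = 2; }
            else if ((*cur & 0xF8) == 0xF0) { cp = *cur & 0x07; extra = 3; }
            else                            { cp = *cur;        extra = 0; }
            cur++;
            while ((extra-- > 0) && ((*cur & 0xC0) == 0x80))
                cp = (cp << 6) | (unsigned int)(*cur++ & 0x3F);

            /* Emit the numeric character reference; stop if it does not fit. */
            n = snprintf(out + used, outsz - used, "&#x%X;", cp);
            if ((n < 0) || ((size_t)n >= outsz - used)) break;
            used += (size_t)n;
        } else {
            if (used + 1 >= outsz) break;   /* keep room for the NUL */
            out[used++] = (char)*cur++;
        }
    }
    out[used] = '\0';
    return used;
}
```

With doc_encoding set to NULL the model reproduces the xmlSaveTree() output shown above; with any non-NULL value it behaves like the xmlSaveDoc() case.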

So:
my proposed fix is in xmlsave.c at line 1953:


long
xmlSaveTree(xmlSaveCtxtPtr ctxt, xmlNodePtr node)
{
    long ret = 0;
    const xmlChar *pSavedDocEncoding = NULL;

    if ((ctxt == NULL) || (node == NULL)) return(-1);

    if (node->doc != NULL) {
        /* save the document encoding and use the ctxt encoding instead */
        pSavedDocEncoding = node->doc->encoding;
        node->doc->encoding = ctxt->encoding;
    }

    /* dump */
    xmlNodeDumpOutputInternal(ctxt, node);

    /* restore the document encoding (node->doc may be NULL) */
    if (node->doc != NULL)
        node->doc->encoding = pSavedDocEncoding;

    return(ret);
}

This ensures that the context encoding is in effect while the node is dumped and that the document's own encoding is restored afterwards.

bye.
Comment 1 GNOME Infrastructure Team 2021-07-05 13:24:42 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version, then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/libxml2/-/issues/

Thank you for your understanding and your help.