Bug 523187 - xmlSaveFormatFileEnc doesn't work if new nodes were added to an existing xmlDoc
Status: RESOLVED NOTABUG
Product: libxml
Classification: Deprecated
Component: general
Version: unspecified
Hardware: Other Linux
Importance: Normal normal
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Reported: 2008-03-18 14:49 UTC by Ignacio Espinosa
Modified: 2008-03-19 16:03 UTC

Attachments
libxmlpp_formatted.cc (964 bytes, text/plain), 2008-03-18 15:13 UTC, Murray Cumming
test.xml (106 bytes, text/plain), 2008-03-18 15:14 UTC, Murray Cumming
libxml_formatted.c (765 bytes, text/plain), 2008-03-18 15:30 UTC, Murray Cumming

Description Ignacio Espinosa 2008-03-18 14:49:44 UTC
write_to_file_formatted doesn't indent new nodes added to an existing Document.
Here are the source files and the result of the program:

=== program.cc ===
#include <iostream>
#include <libxml++/libxml++.h>

using namespace std;
using namespace xmlpp;

int main (void)
{
        try {
                DomParser parser;
                parser.parse_file("test.xml");
                if(parser)
                {
                        Document *doc = parser.get_document();
                        Node *root = doc->get_root_node();

                        // Append <newnode00> to the parsed root and give it two text children.
                        Element *new_node = root->add_child("newnode00");
                        Element *tmp_elem = new_node->add_child("childnode01");
                        tmp_elem->set_child_text("text 0.1");
                        tmp_elem = new_node->add_child("childnode02");
                        tmp_elem->set_child_text("text 0.2");

                        // Expected: indented output. Actual: the new nodes end up on one line.
                        doc->write_to_file_formatted("result.xml");
                }
                else
                        cout << "Error reading test.xml\n";
        }
        catch(const std::exception &ex)
        {
              cout << "Exception caught: " << ex.what() << endl;
        }

        return 0;
}

=== test.xml === 
<?xml version="1.0"?>
<test>
	<testchild>
		<another/>
	</testchild>
</test>

=== result.xml ===
<?xml version="1.0"?>
<test>
	<testchild>
		<another/>
	</testchild>
<newnode00><childnode01>text 0.1</childnode01><childnode02>text 0.2</childnode02></newnode00></test>
Comment 1 Murray Cumming 2008-03-18 15:13:23 UTC
Created attachment 107545: libxmlpp_formatted.cc

Confirmed. And here is the test as an attachment to make life easier when re-testing.
Comment 2 Murray Cumming 2008-03-18 15:14:01 UTC
Created attachment 107546: test.xml

The XML file used by the test program.
Comment 3 Murray Cumming 2008-03-18 15:30:15 UTC
Created attachment 107547: libxml_formatted.c

Here is a C test case that seems to show that the problem is in libxml.
Comment 4 Daniel Veillard 2008-03-18 17:12:05 UTC
If libxml2 detects that a node might have mixed content, it will
refuse to add more nodes for 'formatting', as it can't guess whether
the text nodes present are there for formatting or as legitimate content.

The test document has text children, hence libxml2 disables 'formatting'
in that subtree.

All spaces in content are significant by default in XML. Years of
SGML experience proved that no heuristic can reliably tell
'significant' white space from 'indenting' white space, hence the
POV of the XML spec, and hence libxml2 is just being careful.

If you think you know better, just go through the document before
calling libxml2 and remove all those 'not significant' text nodes.
Libxml2 won't try this because a generation of markup hackers failed
to find a proper algorithm, but if you have time to spend and are not
afraid of breaking your users' documents, go for it!

  From a libxml2 POV, not a bug,

Daniel
Comment 5 Murray Cumming 2008-03-18 17:26:38 UTC
That makes sense. Thanks, Daniel. There's no need to be nasty about it though.
Comment 6 Daniel Veillard 2008-03-19 15:29:41 UTC
Sorry if this sounds nasty. Maybe it's because I'm seeing this pointed
out more often than it should be. Maybe it is possible to do, but I'm afraid
it's not in the general case. Sometimes the default behaviour could be
enhanced, that's for sure, for example if you have a DTD and you know the
content model of the elements. But this is a risky business.
There is also a cost issue: if you start doing a lot of analysis
when saving just because someone switched a 'please indent' flag
somewhere, this can have serious consequences in terms of throughput,
while this could be handled in a deterministic way if the program
had done the indentation while modifying the tree.

Daniel
Comment 7 Murray Cumming 2008-03-19 16:03:11 UTC
I guess there should theoretically be some way to specify indenting even with child text nodes, and maybe even a way to say that text should be wrapped at a line limit and indented, but that's not something I'm going to work on, of course.

Thanks, Daniel.