GNOME Bugzilla – Bug 75244
xmlSaveFormatFile ("-", ?, 1) acts like xmlSaveFormatFile ("-", ?, 0)
Last modified: 2009-08-15 18:40:50 UTC
In theory, using a "1" as the last parameter to this call (according to www.xmlsoft.org/html/libxml-tree.html#XMLSAVEFORMATFILE) will cause indent spaces to be added to the output. Empirically, this is not true; there is no difference in behavior. Tested using a real file rather than stdout; no difference. NOTE: This bug is a 2.4.16 bug, but the bug database only knows of a "version 2.4.13".
If your input document already had some "formatting" spaces, libxml will not modify them. Knowing which spaces are "formatting" is impossible in theory; as a result, the XML specification states that *all* spaces outside markup are significant. libxml is therefore very conservative: it will not modify existing formatting spaces. So if your document has newlines but no indentation, it will not try to indent by modifying existing text nodes. Sorry, indenting support has to be very cautious. If you know that some text nodes are only formatting, remove them before serialization and libxml2 will do the indenting. But libxml will not take the risk of modifying valid content. Please also read entry #2 of the developer FAQ. Daniel
My document is generated by a sequence of:

    if (NULL != ( wk = xmlNewChild (package, ns, "pmaintainer", NULL) )) {
        xmlSetProp (wk, "name", xmlEncodeEntitiesReentrant(doc, (xmlChar *) "Allan Clark"));
        xmlSetProp (wk, "email", xmlEncodeEntitiesReentrant(doc, (xmlChar *) "<easysoft-deb-12@chickenandporn.com>"));
    }

...and is printed as

    <item/><item/>

i.e. no chars, spaces or newlines between XML markup at all. Which of those is "formatting" characters? :)
Immediate warning: your code is leaking like hell! Read the man page of xmlEncodeEntitiesReentrant(): the caller must free the string. Daniel
    paphio:~/XML -> cat tst.xml
    <doc><a><b/> </a> </doc>
    paphio:~/XML -> ./xmllint --format tst.xml
    <?xml version="1.0"?>
    <doc>
      <a>
        <b/>
      </a>
    </doc>
    paphio:~/XML -> gdb xmllint
    [...]
    (gdb) b xmlSaveFormatFile
    Breakpoint 1 at 0x807b32e: file tree.c, line 6894.
    (gdb) r --format tst.xml
    Starting program: /u/veillard/XML/xmllint --format tst.xml
    Breakpoint 1, xmlSaveFormatFile (filename=0x80ce822 "-", cur=0x8111da0, format=1) at tree.c:6894
    6894        return ( xmlSaveFormatFileEnc( filename, cur, NULL, format ) );
    (gdb) c
    Continuing.
    <?xml version="1.0"?>
    <doc>
      <a>
        <b/>
      </a>
    </doc>
    Program exited normally.
    (gdb)

Can you explain what you're doing differently, then? It seems to work fine here. The only difference may be that --format calls xmlKeepBlanksDefault(0); which makes the parser drop the formatting blank nodes on input. But since, as you say, you are generating the tree from scratch, I can neither understand nor reproduce your problem! Daniel
Found. The indentation of nodes is activated by either setting xmlIndentTreeOutput = 1; or calling xmlKeepBlanksDefault(0); otherwise the formatting is limited to simple unindented formatting. This should solve it. I just need to update the documentation of xmlSaveFormatFile() to make this explicit. Daniel
Got more information. The content model of the elements is mixed content. Libxml won't modify nodes with mixed content to do formatting; this is deliberate, to avoid the risk of losing or modifying user data. Not a bug, closed. Daniel