GNOME Bugzilla – Bug 681822
Blank nodes are always removed from HTML (regardless of the option HTML_PARSE_NOBLANKS)
Last modified: 2017-06-17 10:51:55 UTC
Regardless of whether the option HTML_PARSE_NOBLANKS is set, blank nodes are removed from an HTML document. For example:

<html> <head> <title>This is a test.</title> </head> <body> <p>This is a test.</p> </body> </html>

is read as:

<html><head><title>This is a test.</title></head><body> <p>This is a test.</p> </body></html>

(Reproduced with version 2.8.0.)
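To make the difference between the two documents above easier to see, here is a minimal sketch that lists the parent element of every whitespace-only text node in each string. It uses Python's standard html.parser rather than libxml2, so it does not reproduce the bug itself; it only inspects the two serializations as quoted in this report. The class and function names (BlankNodeFinder, blank_parents) are made up for illustration.

```python
from html.parser import HTMLParser


class BlankNodeFinder(HTMLParser):
    """Records the parent tag of each whitespace-only text node (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.stack = []          # currently open elements
        self.blank_parents = []  # parent tag of each blank text node

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # The sample markup is well nested, so a simple pop is enough here.
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()

    def handle_data(self, data):
        if data and data.strip() == "":
            self.blank_parents.append(self.stack[-1] if self.stack else None)


def blank_parents(markup):
    finder = BlankNodeFinder()
    finder.feed(markup)
    finder.close()
    return finder.blank_parents


# The two documents exactly as quoted in the report.
original = ("<html> <head> <title>This is a test.</title> </head> "
            "<body> <p>This is a test.</p> </body> </html>")
as_read = ("<html><head><title>This is a test.</title></head>"
           "<body> <p>This is a test.</p> </body></html>")

print("input :", blank_parents(original))
print("output:", blank_parents(as_read))
```

On the strings quoted here, the input has blank text nodes under html, head and body, while the parsed result keeps only the two inside body.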
Created attachment 221122: Preliminary bug fix

Probably this bug fix is incorrect, but it does what I need.
Okay, I agree that having one useless parser flag set by default and forcing loss of data is not a good situation, so I applied a patch based on yours that fixes the issue. I will check how it goes upstream; hopefully we can keep it as-is without adding a new parser flag specifically for that behaviour.

http://git.gnome.org/browse/libxml2/commit/?id=f933c898132f20a50ba39ac6116378b71a01c700

Thanks!

Daniel
*** Bug 642191 has been marked as a duplicate of this bug. ***
*** Bug 319716 has been marked as a duplicate of this bug. ***
*** Bug 675351 has been marked as a duplicate of this bug. ***
*** Bug 728997 has been marked as a duplicate of this bug. ***