GNOME Bugzilla – Bug 675351
Does not understand whitespace around HTML5 elements
Last modified: 2017-06-17 10:50:58 UTC
Whitespace around HTML5 elements is not handled correctly. Example using xmllint:

$ cat sample.html
<!DOCTYPE html>
<html>
<body>
<mark>hello</mark> <mark>world</mark>
</body>
</html>

$ xmllint --html sample.html
sample.html:4: HTML parser error : Tag mark invalid
<mark>hello</mark> <mark>world</mark>
^
sample.html:4: HTML parser error : Tag mark invalid
<mark>hello</mark> <mark>world</mark>
^
<!DOCTYPE html>
<html><body>
<mark>hello</mark><mark>world</mark>
</body></html>

As you can see, the whitespace between the two mark elements is now missing from the output. The same happens when parsing HTML documents with libxml2 directly, so the resulting tree is wrong.

We're using libxml2 to preprocess HTML documents, and whitespace disappearing in the middle of sentences is causing trouble for us.
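For anyone hitting the same problem in a preprocessing pipeline, the following is a minimal sketch of how the lost whitespace could be detected. It does not call libxml2; it assumes Python's standard html.parser is an acceptable independent reference for extracting text, and the names TextCollector, visible_text and whitespace_preserved are hypothetical helpers made up for this illustration. The two strings are the input and output from the report above.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the character data of an HTML document, ignoring all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(html):
    """Return the text content with every whitespace run collapsed to one space."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()  # flush any buffered trailing text
    return " ".join("".join(collector.chunks).split())


def whitespace_preserved(before, after):
    """True if both documents yield the same whitespace-normalized text."""
    return visible_text(before) == visible_text(after)


# Input file from the report (sample.html).
original = (
    "<!DOCTYPE html>\n<html>\n<body>\n"
    "<mark>hello</mark> <mark>world</mark>\n"
    "</body>\n</html>\n"
)
# Serialized output reported for `xmllint --html sample.html`.
reported = (
    "<!DOCTYPE html>\n<html><body>\n"
    "<mark>hello</mark><mark>world</mark>\n"
    "</body></html>\n"
)

print(visible_text(original))                    # hello world
print(visible_text(reported))                    # helloworld
print(whitespace_preserved(original, reported))  # False
```

Comparing whitespace-normalized text rather than raw bytes keeps the check insensitive to harmless reformatting (such as the newline dropped between <html> and <body>) while still flagging words that have been run together.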
*** This bug has been marked as a duplicate of bug 681822 ***