Bug 675351 - Does not understand whitespace around HTML5 elements
Status: RESOLVED DUPLICATE of bug 681822
Product: libxml2
Classification: Platform
Component: htmlparser
Version: 2.7.8
Hardware: Other Mac OS
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Depends on:
Blocks:
Reported: 2012-05-03 08:13 UTC by Magnus Bergmark
Modified: 2017-06-17 10:50 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Magnus Bergmark 2012-05-03 08:13:10 UTC
Whitespace around HTML5 elements is not understood. Example using xmllint:

$ cat sample.html
<!DOCTYPE html>
<html>
  <body>
    <mark>hello</mark> <mark>world</mark>
  </body>
</html>

$ xmllint --html sample.html 
sample.html:4: HTML parser error : Tag mark invalid
    <mark>hello</mark> <mark>world</mark>
         ^
sample.html:4: HTML parser error : Tag mark invalid
    <mark>hello</mark> <mark>world</mark>
                            ^
<!DOCTYPE html>
<html><body>
    <mark>hello</mark><mark>world</mark>
</body></html>

As you can see, the whitespace between the mark elements is now missing. This happens when parsing HTML documents as well, which makes the resulting tree wrong.

We're using libxml2 to preprocess HTML documents, and whitespace disappearing in the middle of sentences is causing trouble for us.
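
For reference, here is a minimal sketch of the behaviour we expect. It uses Python's standard html.parser instead of libxml2, purely as an independent tokenizer for comparison; the class name TextBetweenMarks and the helper gaps_between_marks are hypothetical names made up for this illustration. It records the text found between a closing </mark> and the next opening <mark>:

```python
from html.parser import HTMLParser

# Same document as sample.html above.
SAMPLE = """<!DOCTYPE html>
<html>
  <body>
    <mark>hello</mark> <mark>world</mark>
  </body>
</html>
"""


class TextBetweenMarks(HTMLParser):
    """Collect the text between each </mark> and the <mark> that follows it."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.after_close = False  # True right after a </mark> end tag
        self.pending = ""         # text seen since that end tag
        self.gaps = []            # one entry per adjacent pair of mark elements

    def handle_starttag(self, tag, attrs):
        if tag == "mark" and self.after_close:
            self.gaps.append(self.pending)
        self.after_close = False
        self.pending = ""

    def handle_endtag(self, tag):
        self.after_close = (tag == "mark")
        self.pending = ""

    def handle_data(self, data):
        if self.after_close:
            self.pending += data


def gaps_between_marks(html):
    """Return the text separating adjacent mark elements in the given HTML."""
    parser = TextBetweenMarks()
    parser.feed(html)
    parser.close()
    return parser.gaps


if __name__ == "__main__":
    # repr() makes the single space visible in the output.
    print(repr(gaps_between_marks(SAMPLE)))
```

On sample.html the gap is a single space. Feeding the xmllint output shown above through the same helper gives an empty string instead, which is the lost whitespace this report is about.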
Comment 1 Nick Wellnhofer 2017-06-17 10:50:58 UTC

*** This bug has been marked as a duplicate of bug 681822 ***