GNOME Bugzilla – Bug 642191
Html parser ignores whitespace after empty tags.
Last modified: 2017-06-17 10:49:56 UTC
If the start tag of an empty element (<input>, <img>, etc.) is followed only by whitespace, that whitespace is ignored.

erik@debian:~$ echo '<p><input> <input></p>' | xmllint -html -
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><p><input><input></p></body></html>

For all other elements, whitespace is preserved.

erik@colinux:~$ echo "<p> <em></em> <input> </p>" | xmllint -html -
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><p> <em></em> <input></p></body></html>

Why make an exception for empty tags? The HTML will render differently with and without the space between the <input> tags.

erik@debian:~$ xmllint -version
xmllint: using libxml version 20708
   compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ISO8859X Unicode Regexps Automata Expr Schemas Schematron Modules Debug Zlib
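
To check other libxml builds for the same behaviour, the first command above can be wrapped in a small test. This is only a sketch: check_input_space is a hypothetical helper name, not part of xmllint, and it looks only for the literal text "<input> <input>" in the serialized output.

```shell
#!/bin/sh
# Hypothetical helper (not part of xmllint): reads serialized HTML on stdin
# and reports whether the space between two adjacent <input> tags survived.
check_input_space() {
    if grep -q '<input> <input>'; then
        echo "preserved"
    else
        echo "dropped"
    fi
}
```

Usage: echo '<p><input> <input></p>' | xmllint -html - | check_input_space
Going by the output shown above, libxml version 20708 should report "dropped".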
*** This bug has been marked as a duplicate of bug 681822 ***