GNOME Bugzilla – Bug 334669
Broken XML generated by HTML parser
Last modified: 2006-10-13 12:43:50 UTC
Please describe the problem:
An empty tag is generated when parsing HTML with code like <a://b/>

Steps to reproduce:

    <html>
    bad tag <http://bla.com/>
    </html>

    [pesenti@dev articles]$ ~/stable/libxml/bin/xmllint --html /tmp/o.html
    /tmp/o.html:2: HTML parser error : error parsing attribute name
    bad tag <http://bla.com/>
    ^
    /tmp/o.html:2: warning: Namespace prefix http is not defined
    bad tag <http://bla.com/>
    ^
    /tmp/o.html:2: HTML parser error : Tag http: invalid
    bad tag <http://bla.com/>
    ^
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    <html><body><p>
    bad tag <></></p></body></html>

Actual results:

Expected results:

Does this happen every time?
Yes

Other information:
Okay, I found the problem and fixed it in CVS:

    paphio:~/XML -> xmllint --html tst.html
    tst.html:2: HTML parser error : error parsing attribute name
    bad tag <http://bla.com/>
    ^
    tst.html:2: HTML parser error : Tag http: invalid
    bad tag <http://bla.com/>
    ^
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
    <html><body><p>
    bad tag <http:></http:></p></body></html>
    paphio:~/XML ->

thanks,
Daniel
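For anyone who wants to confirm the difference between the two serializations above, here is a minimal sketch that feeds each one to a non-namespace-aware XML parser. It uses Python's standard `xml.parsers.expat` module rather than libxml2 itself, and the names `is_well_formed`, `BEFORE_FIX` and `AFTER_FIX` are hypothetical, chosen only for this illustration. The two strings are the `<html>` elements from the xmllint outputs quoted in this report, with the DOCTYPE line left out. Namespace processing is deliberately off, so the check only covers basic well-formedness; a namespace-aware parser may still object to the `http:` name.

```python
import xml.parsers.expat


def is_well_formed(markup: str) -> bool:
    """Return True if expat accepts `markup` as a complete XML document.

    ParserCreate() is called without a namespace separator, so expat
    does no namespace processing and treats ':' as an ordinary name
    character.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(markup, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Serialization from the original report: the bad tag became an empty name.
BEFORE_FIX = "<html><body><p>\nbad tag <></></p></body></html>"

# Serialization after the fix: the element keeps the name "http:".
AFTER_FIX = "<html><body><p>\nbad tag <http:></http:></p></body></html>"

if __name__ == "__main__":
    print("before fix well-formed:", is_well_formed(BEFORE_FIX))
    print("after fix well-formed: ", is_well_formed(AFTER_FIX))
```

The point of the check is only that `<></>` can never be read back by an XML parser, while `<http:></http:>` at least has a non-empty element name.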