Bug 655218 - HTMLParser does not support HTML5-like <meta charset> encoding declaration
Status: RESOLVED FIXED
Product: libxml2
Classification: Platform
Component: general
Version: 2.7.8
OS: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2011-07-24 19:01 UTC by Jasper St. Pierre (not reading bugmail)
Modified: 2012-05-10 07:39 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Add html5 charset meta tag (3.33 KB, patch)
2011-10-23 18:43 UTC, Denis Pauk

Description Jasper St. Pierre (not reading bugmail) 2011-07-24 19:01:20 UTC
From:

  http://www.w3.org/TR/2011/WD-html5-20110525/semantics.html#the-meta-element

"""
The charset attribute specifies the character encoding used by the document. This is a character encoding declaration. If the attribute is present in an XML document, its value must be an ASCII case-insensitive match for the string "UTF-8" (and the document is therefore forced to use UTF-8 as its encoding).
"""

However, while <meta http-equiv="Content-Type" content="text/html; charset=utf8"> works, <meta charset="utf8"> does not.
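
To make the two declaration forms concrete, here is a minimal sketch of a sniffer that accepts both. It is not libxml2's code and does not reflect the attached patch; it uses Python's standard `html.parser` only to illustrate the lookup, and the names `MetaCharsetSniffer` and `sniff_meta_charset` are hypothetical helpers invented for this example.

```python
from html.parser import HTMLParser
import re

# Matches "charset=<value>" inside a Content-Type string.
_CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([^\s;\"']+)", re.IGNORECASE)


class MetaCharsetSniffer(HTMLParser):
    """Records the first encoding declared by a <meta> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.encoding = None

    def handle_starttag(self, tag, attrs):
        # Only the first declaration counts; ignore everything after it.
        if tag != "meta" or self.encoding is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if "charset" in attr:
            # HTML5 form: <meta charset="...">
            self.encoding = attr["charset"].strip() or None
        elif attr.get("http-equiv", "").lower() == "content-type":
            # HTML4 form: <meta http-equiv="Content-Type" content="...; charset=...">
            match = _CHARSET_RE.search(attr.get("content", ""))
            if match:
                self.encoding = match.group(1)


def sniff_meta_charset(markup):
    """Return the declared encoding name, or None if no <meta> declares one."""
    sniffer = MetaCharsetSniffer()
    sniffer.feed(markup)
    return sniffer.encoding
```

With this sketch, both `<meta charset="utf8">` and the `http-equiv` form above yield the same encoding name, which is the behavior this report asks the HTML parser to provide.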
Comment 2 Daniel Veillard 2012-05-10 07:39:21 UTC
While the libxml2 HTML parser is not tuned for HTML5, this is a simple
addition. I made some indenting changes, added a test case for
completeness, and committed:

http://git.gnome.org/browse/libxml2/commit/?id=868d92da8915fc5dc5e329d93cc7882370a28475

  Thanks for the suggestion and the patch!

Daniel