Bug 357992 - HTMLParser misses HTML_PARSE_NOCDATA option
Status: RESOLVED OBSOLETE
Product: libxml2
Classification: Platform
Component: htmlparser
Version: 2.6.x
Hardware: Other All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Depends on:
Blocks:
Reported: 2006-09-27 16:14 UTC by Stefan Behnel
Modified: 2021-07-05 13:26 UTC
See Also:
GNOME target: ---
GNOME version: Unversioned Enhancement

Description Stefan Behnel 2006-09-27 16:14:09 UTC
The XML parser supports an XML_PARSE_NOCDATA option that prevents CDATA nodes from appearing in the tree. This simplifies the tree structure and text handling quite a bit.

The problem is that the HTML parser does not support an equivalent option. Even worse, it generates CDATA nodes for the content of style and script tags, although the original document contains no CDATA sections there. This can be prevented "by hand" by setting the "sax.cdataBlock" function to NULL, in which case the libxml2 code correctly falls back to creating plain text nodes. However, a parser option (e.g. HTML_PARSE_NOCDATA) would make this much cleaner.
Comment 1 GNOME Infrastructure Team 2021-07-05 13:26:28 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version, then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/libxml2/-/issues/

Thank you for your understanding and your help.