Bug 553511 - entity in attribute parse error
Status: RESOLVED OBSOLETE
Product: libxml2
Classification: Platform
Component: general
Version: 2.6.27
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2008-09-24 04:33 UTC by Marc Mongenet
Modified: 2021-07-05 13:27 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Marc Mongenet 2008-09-24 04:33:05 UTC
Please describe the problem:
When parsing a document with an attribute value containing an entity reference (like attr="&eacute;"), libxml2 builds an incorrect DOM when no special options are given: the entity reference is removed from the attribute and placed before the element.

Steps to reproduce:
1. Create this valid XHTML file (libxml2.txt):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
                      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head><title>Example</title></head>
 <body><p title="&eacute;">.</p></body>
</html>

2. Run xmllint with no option on this file (xmllint libxml2.txt)
3. Observe that an error message is printed (for a valid document!) and that a result is still produced, but it contains &eacute;<p title="">.</p> instead of <p title="&eacute;">.</p>


Actual results:
06:24:53 marc@kameha /tmp
xmllint libxml2.txt 
libxml2.txt:5: parser error : Entity 'eacute' not defined
 <body><p title="&eacute;">.</p></body>
                         ^
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Example</title></head>
 <body>&eacute;<p title="">.</p></body>
</html>
06:25:03 marc@kameha /tmp


Expected results:
06:24:53 marc@kameha /tmp
xmllint libxml2.txt 
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 <head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><title>Example</title></head>
 <body><p title="&eacute;">.</p></body>
</html>
06:25:03 marc@kameha /tmp
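
The difference between the actual and expected results above can be checked mechanically. The following is a minimal sketch in Python, not part of libxml2 or xmllint: the function name `entity_hoisted` and its regular expression are assumptions made for illustration. It only inspects serialized output as a string and looks for the symptom described in this report, an entity reference sitting directly in front of a start tag that has an empty attribute value.

```python
import re


def entity_hoisted(serialized: str, entity: str = "eacute") -> bool:
    """Return True if `&entity;` appears as content directly before a start
    tag that carries an empty attribute value (the symptom in this report).

    This is a heuristic string check (hypothetical helper), not an XML parser.
    """
    ref = re.escape(f"&{entity};")
    # Entity reference, optional whitespace, then a start tag containing ="".
    pattern = re.compile(ref + r'\s*<[^<>/!?][^<>]*=""[^<>]*>')
    return pattern.search(serialized) is not None


# Lines taken from the "Actual results" and "Expected results" sections.
actual = '<body>&eacute;<p title="">.</p></body>'
expected = '<body><p title="&eacute;">.</p></body>'

print(entity_hoisted(actual))    # True: the reference was moved out of the attribute
print(entity_hoisted(expected))  # False: the reference is still inside the attribute
```

The string passed to the function could be the captured output of `xmllint libxml2.txt`; how that output is captured is left out of the sketch.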


Does this happen every time?
Yes

Other information:
xmllint is the easiest way to see/reproduce the bug, but it really is in libxml2.
Comment 1 fantasai 2010-07-26 22:03:18 UTC
I'm seeing the same problem using the lxml library.
Comment 2 Nick Wellnhofer 2017-06-15 12:17:42 UTC
If xmlParseEntityRef finds an undeclared entity and [ WFC: Entity Declared ] isn't violated, it must not call sax->reference (this doesn't work in attributes). Instead, either

1. xmlParseEntityRef should create a dummy extSubset in the document and add an entity without content, or
2. the callers should check for XML_WAR_UNDECLARED_ENTITY and handle the situation themselves.

Solution 1 might confuse existing users that don't expect such dummy external subsets. For solution 2 the callers of xmlParseEntityRef have to get the name of the entity somehow. But xmlParseEntityRef throws the name away if it can't resolve the entity.

I'm not sure what's the best approach.
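
To make the trade-off in solution 2 concrete, here is a toy model in Python. It is not libxml2 code and does not use the libxml2 API: `EntityRefResult`, `parse_entity_ref`, and `parse_attribute_value` are hypothetical names chosen for illustration. The sketch assumes the reference parser hands the entity name back to its caller even when the entity cannot be resolved, so the attribute-value caller can keep the reference in place.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class EntityRefResult:
    name: str             # always returned, even when the entity is unresolved
    value: Optional[str]  # None means "undeclared"


def parse_entity_ref(text: str, pos: int,
                     declared: Dict[str, str]) -> Tuple[EntityRefResult, int]:
    """Parse `&name;` at text[pos]; return the result and the index after ';'.

    Raises ValueError if the reference is not terminated by ';'.
    """
    if text[pos] != "&":
        raise ValueError("expected '&' at position %d" % pos)
    end = text.index(";", pos)
    name = text[pos + 1:end]
    return EntityRefResult(name, declared.get(name)), end + 1


def parse_attribute_value(raw: str, declared: Dict[str, str]) -> str:
    """Caller-side handling: an undeclared reference stays in the attribute."""
    out = []
    pos = 0
    while pos < len(raw):
        if raw[pos] == "&":
            ref, pos = parse_entity_ref(raw, pos, declared)
            if ref.value is None:
                # Undeclared: re-emit the reference using the preserved name.
                out.append(f"&{ref.name};")
            else:
                out.append(ref.value)
        else:
            out.append(raw[pos])
            pos += 1
    return "".join(out)


print(parse_attribute_value("&eacute;", {}))               # &eacute;
print(parse_attribute_value("a &amp; b", {"amp": "&"}))    # a & b
```

The point of the toy is only that the caller needs the name: if the reference parser dropped it on failure, the `f"&{ref.name};"` branch could not be written.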
Comment 3 GNOME Infrastructure Team 2021-07-05 13:27:07 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version, then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/libxml2/-/issues/

Thank you for your understanding and your help.