Bug 596184 - RELAX NG validation fails due to default attribute value
Status: RESOLVED WONTFIX
Product: libxml2
Classification: Platform
Component: relaxng
Version: git master
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2009-09-24 12:42 UTC by Vincent Lefevre
Modified: 2017-06-12 19:06 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
testcase (archive containing debbug288149.rng and debbug288149.xml) (462 bytes, application/gzip)
2009-09-24 12:42 UTC, Vincent Lefevre

Description Vincent Lefevre 2009-09-24 12:42:44 UTC
Created attachment 143907 [details]
testcase (archive containing debbug288149.rng and debbug288149.xml)

Note: this bug has been reported against Debian here:

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=288149

When the DTD has a default attribute value, xmllint ignores it for RELAX NG
validation. According to the RELAX NG spec (see below), this is incorrect. With
the attached testcase:

$ --relaxng debbug288149.rng debbug288149.xml
<?xml version="1.0"?>
<!DOCTYPE root [
<!ELEMENT root (#PCDATA)>
<!ATTLIST root type (text | number) "text">
]>
<root>Test</root>
debbug288149.xml:8: element root: Relax-NG validity error : Element root failed
to validate attributes
debbug288149.xml fails to validate
zsh: exit 3     xmllint --relaxng debbug288149.rng debbug288149.xml

Note that jing validates the XML file.

The explanations I gave in the Debian bug report:

Yes, I think it's a bug in xmllint for the following reason.

The question is whether an attribute given in the DTD with a default
value (but not in the start tag of an element) is regarded as present
or not for RELAX NG validation. According to the RELAX NG spec, the
data model is based on the infoset obtained when all declarations of
the DTD are processed. According to the infoset spec[*], attributes
that have a default value are part of the infoset. The information
concerning a default value is provided by the [specified] property:

  [specified] A flag indicating whether this attribute was actually
  specified in the start-tag of its element, or was defaulted from
  the DTD.

Let's get back to RELAX NG. Its spec doesn't say anything concerning
this [specified] property. This means that the origin of an attribute
(start tag or default value in DTD) is ignored, i.e. an attribute
specified in a start tag and an attribute with default value in the
DTD are regarded as equivalent for RELAX NG.

[*] http://www.w3.org/TR/xml-infoset/
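
The effect of the [specified] property can be observed with any parser that exposes it. The following is a minimal sketch, assuming Python's standard xml.parsers.expat module as a stand-in parser (it is not libxml2, and no RELAX NG validation is performed). Its specified_attributes switch selects whether attributes defaulted from the DTD are reported. The helper name root_attributes is hypothetical; the document is the one echoed by xmllint above.

```python
import xml.parsers.expat

# The test document as echoed by xmllint in the report above.
DOC = """<?xml version="1.0"?>
<!DOCTYPE root [
<!ELEMENT root (#PCDATA)>
<!ATTLIST root type (text | number) "text">
]>
<root>Test</root>
"""


def root_attributes(document, specified_only):
    """Return the attributes expat reports for the first start tag."""
    seen = []

    def on_start(name, attrs):
        seen.append(dict(attrs))

    parser = xml.parsers.expat.ParserCreate()
    # When non-zero, expat reports only attributes written in the start tag
    # and drops those that were defaulted from the DTD.
    parser.specified_attributes = 1 if specified_only else 0
    parser.StartElementHandler = on_start
    parser.Parse(document, True)
    return seen[0]


full_infoset = root_attributes(DOC, specified_only=False)
start_tag_only = root_attributes(DOC, specified_only=True)
print(full_infoset)
print(start_tag_only)
```

The first result contains the defaulted type attribute, which is the view of the document argued for in this report; the second contains no attributes, which corresponds to a tree built without applying DTD defaults.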
Comment 1 Vincent Lefevre 2009-09-24 12:53:25 UTC
The command above should be:

  xmllint --relaxng debbug288149.rng debbug288149.xml
Comment 2 Daniel Veillard 2009-09-24 18:20:12 UTC
Not a bug. Whether you fetch the DTD is ultimately your choice.
Relax-NG validates a tree, and the tree is the result of your parsing.
xmllint, unlike jing, lets you choose whether or not to look up attributes
coming from the DTD; just use the appropriate option:

	--dtdattr : loaddtd + populate the tree with inherited attributes 

The fact that all Java parsers actually tend to force you to always
load the DTD is a nuisance rather than anything else. An XML parser
can operate in both modes; libxml2 defaults to the less intrusive and less
dangerous one. I won't change this, but the user can.
Forcing --dtdattr by default when processing Relax-NG could cause a lot of
disruption in existing processes, and the user has the choice,
programmatically or on the command line.

Feature, not bug

Daniel
Comment 3 Vincent Lefevre 2009-09-24 19:14:09 UTC
Reopening because xmllint doesn't behave strictly as documented. The man page says:

       --relaxng SCHEMA
              Use RelaxNG file named SCHEMA for validation.

No more. As documented, if xmllint doesn't use the full infoset (with attribute default values) by default, it breaks the RELAX NG specification, and that's a bug. I can understand that you may want to force the user to use --dtdattr in such a case, but then, this must explicitly be documented. The user can't guess that. Something like:

  In order to conform to the RELAX NG specification, the user may need to use the --dtdattr option too.

Actually I think it would be better to make --relaxng imply --dtdattr by default and have --nodtdattr to disable that (a bit like xsltproc, which has --nodtdattr instead of requiring the user to add --dtdattr for full conformance). I wonder why there is an inconsistency between xmllint --relaxng and xsltproc. Anyway, just completing the documentation as above is OK, IMHO.

Two other points:

1. The API documentation may need to be completed.

2. I think it should be made clear that it is not an error to use --dtdattr when there is no DTD.
Comment 4 Daniel Veillard 2009-09-25 12:06:48 UTC
No. In addition to the reasons given previously:

   - using a DTD for attribute defaulting in an environment where
     Relax-NG is used means people will validate the instance and find
     it fine, but then, when using a parser in non-validating mode, the
     attribute will be missing. This is a bug-inducing behaviour and a
     bad practice; it increases the gap between how validating and
     non-validating parsers will process a document.

   - xmllint and libxml2 in general default to not loading the
     external subset. It is a *good* thing, I stand by it, and I
     prefer a small deviation, in the face of a nonsensical combined
     use of DTD and RNG, to changing this default. There is still a way
     to do the opposite.

   - libxml2 is designed as an editing toolkit; by default it won't
     modify the document when processing it. This is in contrast to SAX
     and many other tools that just didn't care about this case. Sorry,
     libxml2 won't do this. -- nodtdattr is the default; it's the
     rule of least surprise, and the best way to preserve the data.

   - people have been using libxml2/xmllint for years to validate
     with Relax-NG without loading the DTD, I don't want to break
     this for obscure and misleading reasons. 
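
The gap described in the first two points can be sketched with a parser that, like libxml2 by default, does not load the external subset. This is an illustration only, assuming Python's standard xml.parsers.expat module as the stand-in parser; the helper name first_start_tag_attrs and the file name root.dtd are hypothetical, and root.dtd is never fetched.

```python
import xml.parsers.expat


def first_start_tag_attrs(document):
    """Parse with expat's defaults (external DTD subset not loaded) and
    return the attributes reported for the first start tag."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append(dict(attrs))
    parser.Parse(document, True)
    return seen[0]


# Default declared in the internal subset: the parser always reads it.
INTERNAL = """<!DOCTYPE root [
<!ATTLIST root type (text | number) "text">
]>
<root>Test</root>"""

# Same default assumed to live in an external file (hypothetical name);
# a parser that does not load the external subset never sees it.
EXTERNAL = """<!DOCTYPE root SYSTEM "root.dtd">
<root>Test</root>"""

internal_attrs = first_start_tag_attrs(INTERNAL)
external_attrs = first_start_tag_attrs(EXTERNAL)
print(internal_attrs)
print(external_attrs)
```

The same instance therefore yields a different attribute set depending on where the default is declared and on whether the parser loads it, which is the kind of divergence between processing modes that the points above warn about.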

 If you want to argue that libxml2 is not RelaxNG compliant, fine,
use something else, lobby for people to use something else if you 
feel so but I stand where I am, I do feel it's the best for a large
majority of my users.

Daniel
Comment 5 Vincent Lefevre 2009-09-25 12:38:53 UTC
I still don't understand why you don't want to document xmllint properly.
Comment 6 Daniel Veillard 2009-09-25 14:11:28 UTC
Well, if it's just about changing the docs, fine: just suggest a
patch on the mailing list or here. But there is no --nodtdattr
option, and I don't plan to change --relaxng and add one as you suggested
in comment #3.
I.e. changing the behaviour, which is what this bug asks for, is WONTFIX;
clarifying the docs is fine (but probably too verbose for --help, more for
the man page).

Daniel
Comment 7 Vincent Lefevre 2009-09-25 14:38:19 UTC
As I've said, in the xmllint man page, after

       --relaxng SCHEMA
              Use RelaxNG file named SCHEMA for validation.

you could add:

  In order to conform to the RELAX NG specification, the user may need to use
the --dtdattr option too.