Bug 148326 - xsltproc fails to process standalone parsed external entities
Status: VERIFIED NOTABUG
Product: libxslt
Classification: Platform
Component: general
Version: 1.1.4
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2004-07-24 08:11 UTC by Akos Maroy
Modified: 2009-08-15 18:40 UTC
See Also:
GNOME target: ---
GNOME version: 2.5/2.6



Description Akos Maroy 2004-07-24 08:11:26 UTC
When a document passed to xsltproc refers to a parsed external entity that is
declared as standalone, xsltproc fails.

consider the following simple document, called a.xml:

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<!DOCTYPE a [
<!ENTITY b SYSTEM "b.xml">
]>
<a>
&b;
</a>

and the document b.xml:

<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<b/>


when processing this document, with any XSLT, the following error is generated:

$ xsltproc a.xsl a.xml
b.xml:1: parser error : parsing XML declaration: '?>' expected
<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
                                     ^
a.xml:7: error: Failure to process entity b
&b;
   ^
a.xml:7: parser error : Entity 'b' not defined
&b;
   ^
unable to parse a.xml


if b.xml is not specified as standalone, it is processed fine as a parsed
external entity by xsltproc.
Comment 1 Daniel Veillard 2004-07-24 13:56:09 UTC
If you indicate the document is standalone while it references external
parsed entities, this is a well-formedness error; the processing *must*
stop, as the document is not XML. I am 100% sure of this.

  The behaviour is normal, in accordance with the XML 1.0 specification,
and again you should rather read and understand the specs before reporting
erroneous bugs. If it was working for you before on other tools, those
tools were just not compliant with the specifications.

Daniel
Comment 2 Akos Maroy 2004-07-24 14:19:06 UTC
Sorry, but I have to disagree with you.

First, let me make a slight change to my example above, to remove a source of
confusion. Let a.xml be:

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE a [
<!ENTITY b SYSTEM "b.xml">
]>
<a>
&b;
</a>

(note there is no standalone declaration there anymore).

Nevertheless, that's not the heart of the problem, as it lies with
b.xml. b.xml is marked as standalone and does _not_ refer to anything
external. Thus, it is a correct standalone document. Please note the error
message given by xsltproc above: it gives an error for b.xml, not for a.xml.

Second, the standalone declaration in the XML specification states that there
are no external _markup declarations_ to an XML file. In my example, there are
no external markup declarations, in fact, there aren't any markup declarations
at all, neither in the original a.xml, the new a.xml or b.xml. An external
parsed entity is not a markup declaration. Thus, all the sample XML files
presented here are valid XML files.

Moreover, there is a validity constraint in the XML specification for the
standalone document declaration, but the above error is given also if the
--novalid flag is used to invoke xsltproc. This flag, according to the
documentation, turns off validation. So if there were in fact external
markup declarations (which there aren't), then the files should still pass with
--novalid. (Actually the validity constraint is not even that picky, see the XML
specification.)

Before being 100% sure, please read up on the XML specification. In this
case, section 2.9 Standalone Document Declaration,
http://www.w3.org/TR/2004/REC-xml-20040204/#sec-rmd
Comment 3 Daniel Veillard 2004-07-24 14:44:06 UTC
In your initial example a.xml was not well-formed. Processing could not
work, that's normal. The fact that you're hitting the problem on b instead of
a first is just an implementation issue.
In the second example it's still not well formed because your external
parsed entity is not well formed. You're reading the wrong part of the
spec:
  http://www.w3.org/TR/2004/REC-xml-20040204/#TextEntities

  "External parsed entities SHOULD each begin with a text declaration."

[77] TextDecl ::= '<?xml' VersionInfo? EncodingDecl S? '?>'

  A text declaration is *not* an XML declaration: it doesn't allow standalone,
and the encoding is mandatory instead of optional.
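
To make the difference between the two productions concrete, here is a minimal Python sketch that encodes XMLDecl [23] and TextDecl [77] as regular expressions and checks the first line of b.xml against both. The function names are made up for illustration, the version number pattern follows the later `'1.' [0-9]+` form of VersionNum (an assumption; the 2004 edition linked above only allows `1.0`), and this is not how libxml2 implements the check.

```python
import re

# Building blocks from the XML 1.0 grammar: S [3], Eq [25], VersionInfo [24],
# EncodingDecl [80], SDDecl [32]. Names below are illustrative only.
S = r"[ \t\r\n]+"
EQ = r"[ \t\r\n]*=[ \t\r\n]*"
VERSION_INFO = S + "version" + EQ + r"""(?:'1\.[0-9]+'|"1\.[0-9]+")"""
ENC_NAME = r"[A-Za-z][A-Za-z0-9._-]*"
ENCODING_DECL = S + "encoding" + EQ + rf"""(?:'{ENC_NAME}'|"{ENC_NAME}")"""
SD_DECL = S + "standalone" + EQ + r"""(?:'(?:yes|no)'|"(?:yes|no)")"""
END = r"[ \t\r\n]*\?>"

# [23] XMLDecl  ::= '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'
XML_DECL = re.compile(rf"<\?xml{VERSION_INFO}(?:{ENCODING_DECL})?(?:{SD_DECL})?{END}")
# [77] TextDecl ::= '<?xml' VersionInfo? EncodingDecl S? '?>'
TEXT_DECL = re.compile(rf"<\?xml(?:{VERSION_INFO})?{ENCODING_DECL}{END}")


def is_xml_decl(text: str) -> bool:
    """True if text starts with something matching XMLDecl (document entity)."""
    return XML_DECL.match(text) is not None


def is_text_decl(text: str) -> bool:
    """True if text starts with something matching TextDecl (external parsed entity)."""
    return TEXT_DECL.match(text) is not None


if __name__ == "__main__":
    b_xml = '<?xml version="1.0" encoding="utf-8" standalone="yes" ?>\n<b/>\n'
    print("XMLDecl: ", is_xml_decl(b_xml))
    print("TextDecl:", is_text_decl(b_xml))
```

The first line of b.xml is acceptable as an XML declaration but not as a text declaration, which is the situation the parser error on b.xml points at.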

  Again, libxslt is right (or rather libxml2), it reports the failure
correctly.
  W.r.t. --novalid, you are confusing validity checking and reporting
with well-formedness checking. The first one consists of checking that
the document conforms to the DTD; it's optional, libxml2 does not do
it by default and libxslt doesn't either since the XSLT (or rather XPath)
spec does not require it. The errors you are seeing are well-formedness
errors, i.e. fatal failures to conform to the XML grammar. They must be checked
and the parser is forbidden to recover from them.
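
The same distinction can be seen with any non-validating parser. The sketch below uses Python's standard xml.etree.ElementTree (built on expat, not libxml2, so it only illustrates the principle): a document that violates its own DTD is still accepted because validity is never checked, while a document that breaks the grammar is rejected outright. The helper name and the sample documents are invented for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if a non-validating parser accepts the document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Well-formed but NOT valid: the DTD says <a> must be empty, yet it has a child.
# A non-validating parser does not check this, so parsing succeeds.
invalid_but_well_formed = "<!DOCTYPE a [<!ELEMENT a EMPTY>]><a><b/></a>"

# Not well-formed: <b> is never closed. This is a fatal error for every
# conforming parser, validating or not.
not_well_formed = "<a><b></a>"

if __name__ == "__main__":
    print(is_well_formed(invalid_but_well_formed))
    print(is_well_formed(not_well_formed))
```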

  The spec is not trivial. Try to have a full grasp of it, libxml2 is
compliant, really, and I know what I'm talking about since I'm in the W3C
group which maintains the specification. There might be bugs, but not
trivial ones.

Daniel
Comment 4 Akos Maroy 2004-07-25 06:36:42 UTC
Indeed, b.xml fails the TextDecl rule, that's the problem. Thanks for pointing
it out.

(But then again, this has nothing to do with the standalone validity constraint,
as posted in your first reply. On the contrary, standalone documents are free to
reference external parsed entities.)
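
For anyone landing here with the same error: the practical fix that follows from this thread is to drop the standalone pseudo-attribute from the declaration at the top of the external parsed entity (b.xml), so that it matches the TextDecl rule. A minimal Python sketch of that rewrite, with a hypothetical helper name and deliberately simple regular expressions, might look like this:

```python
import re

# Leading '<?xml ... ?>' declaration only; '<?xml-stylesheet ...?>' is not matched.
_DECL = re.compile(r"\A<\?xml\s.*?\?>", re.DOTALL)
# The standalone pseudo-attribute, as in SDDecl [32].
_SD_DECL = re.compile(r"""\s+standalone\s*=\s*(?:"(?:yes|no)"|'(?:yes|no)')""")


def strip_standalone(entity_text: str) -> str:
    """Remove standalone="..." from a leading declaration, if there is one.

    Hypothetical helper: it leaves everything after the declaration untouched
    and returns the input unchanged when no declaration is present.
    """
    match = _DECL.match(entity_text)
    if match is None:
        return entity_text
    fixed_decl = _SD_DECL.sub("", match.group(0), count=1)
    return fixed_decl + entity_text[match.end():]


if __name__ == "__main__":
    b_xml = '<?xml version="1.0" encoding="utf-8" standalone="yes" ?>\n<b/>\n'
    print(strip_standalone(b_xml), end="")
```

The encoding declaration is kept because TextDecl requires it; only the part that a text declaration does not allow is removed.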