GNOME Bugzilla – Bug 333522
Error message for malformed internal parsed entities
Last modified: 2007-06-12 12:20:39 UTC
Quote from http://www.w3.org/TR/2004/REC-xml-20040204/#wf-entities:

An internal general parsed entity is well-formed if its replacement text matches the production labeled content. All internal parameter entities are well-formed by definition.

It appears that libxml2 release 2.6.23 doesn't reflect this paragraph:

<?xml version="1.0"?>
<!DOCTYPE root [
<!ELEMENT root (#PCDATA|child)*>
<!ELEMENT child (#PCDATA)*>
<!ENTITY a "<child>">
<!ENTITY b "</child>">
]>
<root>&a;test&b;</root>

When executing "xmllint -valid --noout" you get:

Entity: line 1: parser error : Premature end of data in tag child line 1
<child>
       ^
test.xml:10: parser error : Unregistered error message
<root>&a;test&b;</root>
         ^

The first error message is misleading because it doesn't point to the problematic location: instead of "Entity: line 1" it should read "test.xml: Line 6".

I guess the second error message reveals another internal bug in libxml2?
libxml2's handling is right. Entities must be pushed on the stack to detect this kind of error. For example, the following is well-formed:

<?xml version="1.0"?>
<!DOCTYPE root [
<!ELEMENT root (#PCDATA|child)*>
<!ELEMENT child (#PCDATA)*>
<!ENTITY a "<child>">
<!ENTITY b "</child>">
]>
<root>test</root>

An entity can span multiple lines, so you get the line number where the error occurred within the entity. At best, for entities defined in the internal subset, the name of the entity could be reported, but that would require increasing the amount of data kept for entities pushed on the stack at runtime. It doesn't look like a serious problem.

The second message precisely allows you to tell where in the entity stack the error occurred; you get the full context for all pushed entities.

Daniel
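The point above, that the malformed entities only cause an error once they are referenced, can be reproduced with a small sketch. Note the assumptions: it uses Python's standard-library xml.etree.ElementTree (backed by expat), not libxml2, so the error messages differ from the xmllint output in this report, and the helper name is_well_formed and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Internal subset taken from this report: entity "a" opens <child>
# and entity "b" closes it, so neither replacement text is balanced.
PROLOG = """<?xml version="1.0"?>
<!DOCTYPE root [
<!ELEMENT root (#PCDATA|child)*>
<!ELEMENT child (#PCDATA)*>
<!ENTITY a "<child>">
<!ENTITY b "</child>">
]>
"""


def is_well_formed(document):
    """Return True if the parser accepts the document (hypothetical helper)."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        return False
    return True


# Entities declared but never referenced: the replacement text is not parsed.
unreferenced = PROLOG + "<root>test</root>"

# Entities referenced: the replacement text of "a" is parsed as content
# and ends while <child> is still open.
referenced = PROLOG + "<root>&a;test&b;</root>"

# A variant where the replacement text of "a" is balanced on its own.
balanced = (
    PROLOG.replace('"<child>"', '"<child>test</child>"') + "<root>&a;</root>"
)

if __name__ == "__main__":
    for name, doc in [
        ("unreferenced", unreferenced),
        ("referenced", referenced),
        ("balanced", balanced),
    ]:
        print(name, is_well_formed(doc))
```

The sketch only checks acceptance or rejection; it does not try to reproduce libxml2's entity stack or its line-number reporting.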
Created attachment 60715
libxml2-2.6.23-bug-333522.patch

Fixes the "Unregistered error message" problem and emits an additional error message, "Referencing malformed entity. See previous error message for details", at the cost of being more verbose: two error messages are generated when the malformed entity is referenced.
I accept that libxml's behaviour is correct with regard to XML handling. Nevertheless, the patch should be applied: improving the error reporting here would save other users of libxml some time.
The current error message is:

paphio:~/XML -> xmllint test.xml
Entity: line 1: parser error : Premature end of data in tag child line 1
<child>
       ^
test.xml:10: parser error : Entity 'a' failed to parse
<root>&a;test&b;</root>
         ^
paphio:~/XML ->

I think it is sufficient; at least no unregistered error is raised.

Daniel