GNOME Bugzilla – Bug 402222
xmllint refuses unnormalized W3C Schema strings
Last modified: 2017-06-14 12:17:47 UTC
Please describe the problem:
I have the following RelaxNG schema (but I use W3C Schema types, so it should be the same with an XSD):

<?xml version="1.0" encoding="UTF-8"?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
  <start>
    <ref name="code"/>
  </start>
  <define name="code">
    <element name="code">
      <data type="NMTOKEN">
        <param name="minLength">2</param>
        <param name="maxLength">3</param>
        <param name="pattern">[a-z]+</param>
      </data>
    </element>
  </define>
</grammar>

And the following XML file:

<?xml version="1.0" encoding="iso-8859-1"?>
<code> fr</code>

xmllint refuses it:

bidon.xml:2: element code: Relax-NG validity error : Error validating datatype NMTOKEN
bidon.xml:2: element code: Relax-NG validity error : Element code failed to validate content
bidon.xml fails to validate

According to XML guru Eric van der Vlist and to the standard (http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#rf-whiteSpace), it should be accepted, since only xsd:string values are not normalized before the pattern is tested. NMTOKEN values (and integers) are normalized, so the above file should be accepted.

(xmllint also refuses integers with a pattern "[0-9]+" when they have trailing whitespace. For the same reason, it should accept them, IMHO.)

Steps to reproduce:
1. Use the above files
2. % xmllint --noout --relaxng bidon.rng bidon.xml

Actual results:
See the above error message.

Expected results:
I would expect the XML file to be declared valid.

Does this happen every time?
Yes
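To make the expected behaviour concrete, here is a minimal Python sketch of the order of operations the report argues for: collapse whitespace first, then test the facets. It is not libxml2's code; the helper names `collapse` and `check_code` are hypothetical, and the facet values are copied from the schema above. The NMTOKEN lexical check itself is left out, on the assumption that anything matching [a-z]+ is already a valid NMTOKEN.

```python
import re


def collapse(value: str) -> str:
    """Approximate XSD whiteSpace="collapse": map tab, LF and CR to a
    space, squeeze runs of spaces, then trim leading/trailing spaces."""
    replaced = re.sub(r"[\t\n\r]", " ", value)
    return re.sub(r" +", " ", replaced).strip(" ")


def check_code(raw: str) -> bool:
    """Hypothetical stand-in for validating the <code> element content:
    normalize first, then apply minLength=2, maxLength=3 and the pattern."""
    value = collapse(raw)
    # XSD patterns are implicitly anchored, hence fullmatch.
    return 2 <= len(value) <= 3 and re.fullmatch(r"[a-z]+", value) is not None


if __name__ == "__main__":
    print(check_code(" fr"))    # content of <code> in bidon.xml
    print(check_code(" f r "))  # inner space survives collapsing
```

With this ordering the leading space in " fr" is gone before the pattern is applied, which is why the report expects the file to validate. Whitespace inside the value is only squeezed, not removed, so it still fails the pattern.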
Works for me with libxml 2.9.4. Probably fixed with this commit: https://git.gnome.org/browse/libxml2/commit/?id=cad102b861f74d56e3f6e710c466cf1a38a5db56