Bug 542666 - Duplicate resolution of CharRefs in regexp (XML Schema)
Status: RESOLVED FIXED
Product: libxml2
Classification: Platform
Component: general
Version: git master
Hardware: Other All
Importance: Normal major
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2008-07-12 14:01 UTC by Volker Diels-Grabsch
Modified: 2008-08-26 07:46 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Solves the bug and adds two regression tests (5.05 KB, patch)
2008-07-12 14:05 UTC, Volker Diels-Grabsch

Description Volker Diels-Grabsch 2008-07-12 14:01:20 UTC
Please describe the problem:
When an XML Schema contains a line like this:

    <xs:pattern value="[56;&amp;#]"/>

the regexp should be understood as:

    [56;&#]

i.e. a pattern that matches "5", "6", ";", "&" and "#".
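
As a quick cross-check of the intended meaning, here is a minimal sketch using Python's `re` module. It is not the XML Schema regexp engine; it only relies on the fact that a plain character class such as this one is read the same way in both dialects. The candidate list is made up for illustration.

```python
import re

# The pattern as it should reach the regexp compiler, i.e. after the
# XML parser has resolved "&amp;" in the attribute value exactly once.
pattern = re.compile(r"[56;&#]")

# Hypothetical candidate values, chosen only for this illustration.
candidates = ["5", "6", ";", "&", "#", "A", "56"]

# fullmatch mirrors the schema semantics: the whole value must match.
accepted = [c for c in candidates if pattern.fullmatch(c)]
print(accepted)
```

Only the five single characters are accepted; "A" and the two-character value "56" are rejected.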

However, libxml2 tries to resolve the ampersand again
and complains about the illegal CharRef in the schema:

     &#]

Even worse, when the characters are given in a different order:

    <xs:pattern value="[&amp;#65;]"/>

the pattern is first resolved to:
 
    [&#65;]

and then again to:

    [A]

In that case libxml2 does not even complain!
Instead, it silently checks against a completely
different regexp!
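
The double resolution can be sketched in a few lines of Python. This is an illustration only, not libxml2's code: `resolve_once` is a hypothetical helper that resolves character references and the five predefined entities in one pass, standing in for what the XML parser already does to the attribute value.

```python
import re

# The five predefined XML entities.
_PREDEFINED = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "apos": "'"}

# Hexadecimal CharRef, decimal CharRef, or a named entity reference.
_REF = re.compile(r"&(?:#x([0-9A-Fa-f]+)|#([0-9]+)|([A-Za-z]+));")


def resolve_once(text):
    """Resolve each reference in text once; replaced text is not rescanned."""
    def repl(match):
        hex_digits, dec_digits, name = match.groups()
        if hex_digits:
            return chr(int(hex_digits, 16))
        if dec_digits:
            return chr(int(dec_digits))
        # Unknown entity names are left untouched in this sketch.
        return _PREDEFINED.get(name, match.group(0))
    return _REF.sub(repl, text)


attribute_source = "[&amp;#65;]"            # as written in the schema file
pattern = resolve_once(attribute_source)    # what the XML parser hands over
twice = resolve_once(pattern)               # the unwanted second pass

print(pattern)  # [&#65;]
print(twice)    # [A]
```

For the first schema, the same second pass would meet "&#]", which is not a well-formed CharRef. This sketch leaves such text unchanged, whereas libxml2 reports it as an error.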

The reason for this strange behaviour seems to be
grammar rule [19] of an old XML Schema specification:

http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dt-charrange

In the current specification this misleading grammar
rule has been removed:

http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/#dt-charrange

The bug can be fixed by simply removing all code that
handles grammar rule [19].

The attached patch does exactly that and adds two
regression tests.


Steps to reproduce:
xmllint --noout --schema schema1.xsd doc.xml

-- and --

xmllint --noout --schema schema2.xsd doc.xml

Content of "doc.xml":

<?xml version="1.0"?>
<test>5</test>

Content of "schema1.xsd":

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="test">
        <xs:simpleType>
            <xs:restriction base="xs:string">
                <xs:pattern value="[56;&amp;#]"/>
            </xs:restriction>
        </xs:simpleType>
    </xs:element>
</xs:schema>

Content of "schema2.xsd":

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="test">
        <xs:simpleType>
            <xs:restriction base="xs:string">
                <xs:pattern value="[&amp;#65;]"/>
            </xs:restriction>
        </xs:simpleType>
    </xs:element>
</xs:schema>
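
For convenience, the three files above can be generated by a small script. The following is a sketch under stated assumptions: the helper names `write_test_case` and `run_xmllint` are made up for this example, and the xmllint invocation is the one given under "Steps to reproduce". If xmllint is not on the PATH, the second helper returns None instead of running anything.

```python
import shutil
import subprocess
from pathlib import Path

DOC = '<?xml version="1.0"?>\n<test>5</test>\n'

SCHEMA_TEMPLATE = '''<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="test">
        <xs:simpleType>
            <xs:restriction base="xs:string">
                <xs:pattern value="{pattern}"/>
            </xs:restriction>
        </xs:simpleType>
    </xs:element>
</xs:schema>
'''


def write_test_case(directory):
    """Write doc.xml, schema1.xsd and schema2.xsd into directory."""
    directory = Path(directory)
    files = {
        "doc.xml": DOC,
        "schema1.xsd": SCHEMA_TEMPLATE.format(pattern="[56;&amp;#]"),
        "schema2.xsd": SCHEMA_TEMPLATE.format(pattern="[&amp;#65;]"),
    }
    for name, content in files.items():
        (directory / name).write_text(content, encoding="utf-8")
    return sorted(files)


def run_xmllint(directory):
    """Run both validations; return exit codes, or None without xmllint."""
    if shutil.which("xmllint") is None:
        return None
    codes = {}
    for schema in ("schema1.xsd", "schema2.xsd"):
        proc = subprocess.run(
            ["xmllint", "--noout", "--schema", schema, "doc.xml"],
            cwd=directory, capture_output=True, text=True,
        )
        codes[schema] = proc.returncode
    return codes
```

Calling `write_test_case(".")` followed by `run_xmllint(".")` reproduces both steps in the current directory.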


Actual results:
regexp error : failed to compile: Char ref: expecting [0-9]
schema1.xsd:6: element pattern: Schemas parser error : Element '{http://www.w3.org/2001/XMLSchema}pattern': The value '[56;&#]' of the facet 'pattern' is not a valid regular expression.
WXS schema schema1.xsd failed to compile

-- and --

doc.xml:2: element test: Schemas validity error : Element 'test': [facet 'pattern'] The value '5' is not accepted by the pattern '[&#65;]'.
doc.xml:2: element test: Schemas validity error : Element 'test': '5' is not a valid value of the local atomic type.
doc.xml fails to validate


Expected results:
doc.xml validates

-- and --

doc.xml validates


Does this happen every time?
yes

Other information:
Comment 1 Volker Diels-Grabsch 2008-07-12 14:05:03 UTC
Created attachment 114436
Solves the bug and adds two regression tests

In addition to this patch, you need to create some empty files:

    touch result/schemas/regexp-char-ref_0_0.err
    touch result/schemas/regexp-char-ref_1_0.err
Comment 2 Daniel Veillard 2008-08-26 07:46:11 UTC
Very good patch and explanations, perfect!

Applied and committed to SVN,

many thanks!

Daniel