GNOME Bugzilla – Bug 704017
Bug in the regular expression parser; character range with escaped characters not working
Last modified: 2021-07-05 13:24:01 UTC
Created attachment 248941: Test files for reproducing the bug

I think there is a bug in the regular expression parser for character ranges: a character class whose range starts with an escaped character, such as [\]-a] (here the escaped character is \]), is not recognized by the libxml2 regular expression parser.

For example, this simple type does not work:

    <xs:simpleType name="LimitedString">
      <xs:restriction base="xs:string">
        <xs:pattern value="[\]-a]*"/>
      </xs:restriction>
    </xs:simpleType>

but this one does:

    <xs:simpleType name="LimitedString">
      <xs:restriction base="xs:string">
        <xs:pattern value="[Z-a]*"/>
      </xs:restriction>
    </xs:simpleType>

Looking at the ASCII table:

    !"#$%&'()*+,-./0123456789:;<=>?
    @ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_
    `abcdefghijklmnopqrstuvwxyz{|}~

one sees that Z is the last character before ] that does not have to be escaped in a character range definition. Since the range starting at Z works and the range starting at the escaped ] does not, this indicates that there is a bug.

I have prepared some example files to demonstrate the problem: test_not_validating.xsd uses the first simple type definition and test_validating.xsd uses the second one. You can try to validate test.xml with:

    xmllint --noout --schema test_not_validating.xsd ./test.xml

and

    xmllint --noout --schema test_validating.xsd ./test.xml

respectively.
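To make the expected semantics of the two character classes concrete, here is a minimal sketch in Python. It uses Python's own `re` module, which is neither the XSD regular expression dialect nor the libxml2 parser, so it only illustrates what the two ranges should cover; it does not reproduce the bug. The helper names `accepted` and `code_points` are made up for this illustration.

```python
import re

# The two patterns from the schemas above, compiled with Python's re module.
# Assumption: Python's reading of [\]-a] (a range from ']' to 'a') is the
# behaviour the report expects from the XSD pattern as well.
ESCAPED_RANGE = re.compile(r"[\]-a]*")
PLAIN_RANGE = re.compile(r"[Z-a]*")


def accepted(pattern, text):
    """Return True if the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None


def code_points(first, last):
    """List the characters from first to last, inclusive, in code point order."""
    return [chr(c) for c in range(ord(first), ord(last) + 1)]


if __name__ == "__main__":
    narrow = "".join(code_points("]", "a"))
    wide = "".join(code_points("Z", "a"))
    print("characters in ]..a:", narrow)
    print("characters in Z..a:", wide)

    # Every character of the narrow range is accepted by both patterns.
    print(accepted(ESCAPED_RANGE, narrow), accepted(PLAIN_RANGE, narrow))

    # 'Z', '[' and '\' lie below ']' and are only accepted by the wider range.
    for ch in ["Z", "[", "\\"]:
        print(repr(ch), accepted(ESCAPED_RANGE, ch), accepted(PLAIN_RANGE, ch))
```

Under these assumptions, any string that matches the escaped range also matches [Z-a], so a document that validates against test_validating.xsd but not against test_not_validating.xsd is only consistent with correct behaviour if it contains Z, [ or \.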
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/libxml2/-/issues/ Thank you for your understanding and your help.