GNOME Bugzilla – Bug 726093
Schema validation fails due to issue with pattern interpretation
Last modified: 2021-07-05 13:24:26 UTC
Created attachment 271506
Example files showing working (test1) and failing (test2) patterns

I am working with the GEIA 0007 XML standard, which includes the following pattern in the schema for one of the elements:

    <xsd:pattern value="[A-Z_0-9 `~!@#$%^&*()_+-=\[\]\\{}\|;'",./?><]{0,10}"/>

I hit some errors with this failing to match simple valid data such as "T850-47". On investigation it seems that this may be due to an error in the way libxml interprets patterns (the same file parses clean against the schema in the .NET and Xerces parsers). I have tested this with the 2.9.1 Win32 binaries and a clean build of 2.9.1 on CentOS.

I can get it to parse correctly by escaping the - and moving it closer to the front of the pattern:

    [A-Z_0-9\- `~!@#$%^&*()_+=\[\]\\{}\|;'",./?><]{0,10}
            ^^

Simply escaping it doesn't work.

I have attached some test files:

    libxml-test1.xml and xsd -- works on libxml, using the pattern changed as described
    libxml-test2.xml and xsd -- fails on libxml:

    <?xml version="1.0" encoding="utf-8"?>
    <RootElement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="file:///U:/Projects/lxml/lxml-test2.xsd">
      <test>AAA-AAA</test>
      <test>AAA-1111</test>
    </RootElement>

    Element 'test': [facet 'pattern'] The value 'AAA-AAA' is not accepted by the pattern '[A-Z_0-9 `~!@#$%^&*()_+-=\[\]\\{}\|;'",./?><]{0,10}'.
    Element 'test': 'AAA-AAA' is not a valid value of the atomic type 'end_item_acronym_code_Type'.
    Element 'test': [facet 'pattern'] The value 'AAA-1111' is not accepted by the pattern '[A-Z_0-9 `~!@#$%^&*()_+-=\[\]\\{}\|;'",./?><]{0,10}'.
    Element 'test': 'AAA-1111' is not a valid value of the atomic type 'end_item_acronym_code_Type'.
    lxml-test2.xml fails to validate

I will submit the change to the pattern to the GEIA standards people. I'm not sure if this error is a genuine issue with libxml, but given that the same data parses cleanly with .NET/Xerces I thought it would be worth reporting.
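A note on how the original pattern can be read, offered as a sketch and not as a diagnosis of libxml2: inside the character class, the sequence +-= can be parsed as a range running from '+' (U+002B) to '=' (U+003D), and '-' (U+002D) falls inside that range. An engine that reads it this way would accept the hyphenated test values without needing a separate literal hyphen. The snippet below checks that reading using Python's standard re module purely as a stand-in. Python's re is not an XSD regex engine and tells us nothing about libxml2's internals. The helper name in_plus_to_equals_range is hypothetical, introduced only for illustration.

```python
import re

# Character class copied from the GEIA 0007 schema as quoted in the report.
# ASSUMPTION: Python's `re` is only a stand-in here; it is not an XSD regex
# engine, and this does not model how libxml2 parses the pattern.
PATTERN = r"""[A-Z_0-9 `~!@#$%^&*()_+-=\[\]\\{}\|;'",./?><]{0,10}"""


def in_plus_to_equals_range(ch: str) -> bool:
    """Return True if ch lies in the code-point range written as +-= (hypothetical helper)."""
    return ord("+") <= ord(ch) <= ord("=")


# Values taken from the report: one from the description, two from the test file.
samples = ["T850-47", "AAA-AAA", "AAA-1111"]
results = {s: re.fullmatch(PATTERN, s) is not None for s in samples}

if __name__ == "__main__":
    print(f"'+' = U+{ord('+'):04X}, '-' = U+{ord('-'):04X}, '=' = U+{ord('='):04X}")
    print("hyphen inside the +-= range:", in_plus_to_equals_range("-"))
    for value, ok in results.items():
        print(f"{value!r}: {'matches' if ok else 'does not match'}")
```

If this reading is right, the workaround pattern (with the \- moved forward and +-= reduced to += ) changes which characters the class accepts, since the implicit '+' to '=' range disappears. That may be worth keeping in mind when proposing the change to the standard.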
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/libxml2/-/issues/ Thank you for your understanding and your help.