Bug 726093 - Schema validation fails due to issue with pattern interpretation
Status: RESOLVED OBSOLETE
Product: libxml2
Classification: Platform
Component: xmlschema
Version: git master
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2014-03-11 07:42 UTC by Nick Barron
Modified: 2021-07-05 13:24 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Example files showing working (test1) and failing (test2) patterns (931 bytes, application/x-zip-compressed)
2014-03-11 07:42 UTC, Nick Barron

Description Nick Barron 2014-03-11 07:42:11 UTC
Created attachment 271506
Example files showing working (test1) and failing (test2) patterns

I am working with the GEIA 0007 XML standard, which includes the following pattern in the schema for one of the elements:

<xsd:pattern value="[A-Z_0-9 `~!@#$%^&amp;*()_+-=\[\]\\{}\|;&apos;&quot;,./?&gt;&lt;]{0,10}"/>

This pattern fails to match simple valid data such as "T850-47". On investigation, this appears to be caused by the way libxml interprets patterns: the same file validates cleanly against the schema with the .NET and Xerces parsers. I have tested this with the 2.9.1 Win32 binaries and with a clean build of 2.9.1 on CentOS.
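As a rough cross-check, the sketch below decodes the XML entities in the pattern attribute and tries the decoded expression against the values mentioned in this report. It uses Python's standard `re` module as a stand-in, which is an assumption: `re` is not the XSD regular expression dialect, although as far as I can tell this particular character class is read the same way by both. The `xmlns:xsd` declaration is added only so the fragment parses on its own.

```python
import re
import xml.etree.ElementTree as ET

# The <xsd:pattern> element from the schema, as quoted in the report.
# The xmlns:xsd declaration is added here so the fragment is well-formed alone.
PATTERN_XML = (
    '<xsd:pattern xmlns:xsd="http://www.w3.org/2001/XMLSchema" '
    r'value="[A-Z_0-9 `~!@#$%^&amp;*()_+-=\[\]\\{}\|;&apos;&quot;,./?&gt;&lt;]{0,10}"/>'
)

# The XML parser resolves &amp;, &apos;, &quot;, &gt; and &lt; before any
# regex engine sees the value.
decoded = ET.fromstring(PATTERN_XML).get("value")
print(decoded)

# Assumption: Python's re is only an approximation of XSD pattern matching.
results = {
    value: bool(re.fullmatch(decoded, value))
    for value in ("T850-47", "AAA-AAA", "AAA-1111")
}
for value, matched in results.items():
    print(f"{value!r}: {'accepted' if matched else 'rejected'}")
```

The decoded string is the same pattern that libxml prints in its error messages, so the entity decoding itself does not appear to be where the two interpretations diverge.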

I can get it to parse correctly by escaping the - and moving it closer to the front of the pattern:

[A-Z_0-9\- `~!@#$%^&amp;*()_+=\[\]\\{}\|;&apos;&quot;,./?&gt;&lt;]{0,10}
        ^^

Simply escaping the - where it stands, without moving it, does not work.
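One way to see what the rewrite changes is to compare the sets of characters the two character classes accept. The sketch below does this with Python's `re` module, again only as an approximation of the XSD dialect. In that reading, `+-=` in the original pattern is a range from `+` to `=`, and the hyphen already lies inside that range. The helper name `accepted_ascii` is made up for this illustration.

```python
import re

# Character classes from the report, with XML entities already decoded
# and the {0,10} quantifier dropped so that single characters can be tested.
ORIGINAL = r"""[A-Z_0-9 `~!@#$%^&*()_+-=\[\]\\{}\|;'",./?><]"""
WORKAROUND = r"""[A-Z_0-9\- `~!@#$%^&*()_+=\[\]\\{}\|;'",./?><]"""


def accepted_ascii(char_class):
    """Return the printable ASCII characters matched by a character class."""
    rx = re.compile(char_class)
    return {chr(c) for c in range(32, 127) if rx.fullmatch(chr(c))}


orig_set = accepted_ascii(ORIGINAL)
work_set = accepted_ascii(WORKAROUND)

# Every character between '+' (0x2B) and '=' (0x3D), inclusive.
plus_to_equals = "".join(chr(c) for c in range(ord("+"), ord("=") + 1))

print("range + to =     :", plus_to_equals)
print("'-' in original  :", "-" in orig_set)
print("only in original :", sorted(orig_set - work_set))
print("only in rewrite  :", sorted(work_set - orig_set))
```

Under this reading the original pattern accepts the hyphen, which agrees with the .NET and Xerces results. The two classes are not quite equivalent, though: the `+` to `=` range also covers `:`, which the rewritten pattern no longer lists, so the rewrite is slightly narrower than the original.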

I have attached some test files:

libxml-test1.xml and xsd -- works on libxml, using the pattern changed as described
libxml-test2.xml and xsd -- fails on libxml:


<?xml version="1.0" encoding="utf-8"?>
<RootElement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="file:///U:/Projects/lxml/lxml-test2.xsd">
    <test>AAA-AAA</test>
    <test>AAA-1111</test>
</RootElement>
Element 'test': [facet 'pattern'] The value 'AAA-AAA' is not accepted by the pattern '[A-Z_0-9 `~!@#$%^&*()_+-=\[\]\\{}\|;'",./?><]{0,10}'.
Element 'test': 'AAA-AAA' is not a valid value of the atomic type 'end_item_acronym_code_Type'.
Element 'test': [facet 'pattern'] The value 'AAA-1111' is not accepted by the pattern '[A-Z_0-9 `~!@#$%^&*()_+-=\[\]\\{}\|;'",./?><]{0,10}'.
Element 'test': 'AAA-1111' is not a valid value of the atomic type 'end_item_acronym_code_Type'.
lxml-test2.xml fails to validate

I will submit the change to the pattern to the GEIA standards people. I'm not sure whether this is a genuine bug in libxml, but given that the same data validates cleanly with .NET and Xerces, I thought it would be worth reporting.
Comment 1 GNOME Infrastructure Team 2021-07-05 13:24:26 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version, then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/libxml2/-/issues/

Thank you for your understanding and your help.