GNOME Bugzilla – Bug 327167
2.6.23 - support for 'pattern' restriction broken
Last modified: 2006-04-20 09:38:44 UTC
Please describe the problem:
The support for most regular expressions in 'pattern' restriction is broken. The regexes fail to match.

Platform: Debian unstable.

Steps to reproduce:
1. define a type with 'pattern' restriction like the one below

   <simpleType name="street_t">
     <restriction base="string">
       <pattern value=".*[^\s]+.*"/>
       <maxLength value="50" />
     </restriction>
   </simpleType>

2. use that type in some element
3. try to validate some document

Actual results:
for example:

/tmp/n.xml:28: element street: Schemas validity error : Element '{http://www.cc.com.pl/ns/peimp/peimp-update-1.0}street': [facet 'pattern'] The value 'Rakowiecka' is not accepted by the pattern '.*[^\s]+.*'.
/tmp/n.xml:28: element street: Schemas validity error : Element '{http://www.cc.com.pl/ns/peimp/peimp-update-1.0}street': 'Rakowiecka' is not a valid value of the atomic type '{http://www.cc.com.pl/ns/peimp/peimp-update-1.0}street_t'.

Or (to exclude the possibility of a bad character class \s):

/tmp/n.xml:75: element city: Schemas validity error : Element '{http://www.cc.com.pl/ns/peimp/peimp-update-1.0}city': [facet 'pattern'] The value 'Lublin' is not accepted by the pattern '.*n'.
/tmp/n.xml:75: element city: Schemas validity error : Element '{http://www.cc.com.pl/ns/peimp/peimp-update-1.0}city': 'Lublin' is not a valid value of the atomic type '{http://www.cc.com.pl/ns/peimp/peimp-update-1.0}city_t'.

Expected results:
Under libxml2 the document validates.

Does this happen every time?
Yes.

Other information:
validated document: standalone, encoding ISO-8859-2, schema encoding: iso-8859-2
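As a rough sanity check that both patterns ought to accept the reported values, the sketch below runs them through Python's `re` module. This is only an approximation: Python's regex dialect is not the XSD one, and the `accepts` helper is a hypothetical name introduced here. For these two simple patterns the semantics should coincide, with `re.fullmatch` standing in for the whole-value matching that the 'pattern' facet requires.

```python
import re

# Patterns and values taken from the error messages in the report.
CASES = [
    (r".*[^\s]+.*", "Rakowiecka"),
    (r".*n", "Lublin"),
]


def accepts(pattern: str, value: str) -> bool:
    """Return True if the whole value matches the pattern.

    XSD 'pattern' facets are implicitly anchored at both ends, so
    re.fullmatch is the closest Python analogue (an assumption that
    holds for these simple patterns, not for XSD regexes in general).
    """
    return re.fullmatch(pattern, value) is not None


if __name__ == "__main__":
    for pattern, value in CASES:
        print(f"{pattern!r} vs {value!r}: {accepts(pattern, value)}")
```

A value made only of whitespace should still be rejected by the first pattern, since `[^\s]+` needs at least one non-space character.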
Certainly not a major bug; pumping up severity or priority doesn't help at all! Did you try other schema validators? Daniel
Bug confirmed.
(In reply to comment #1)
> Certainly not a major bug; pumping up severity or priority doesn't help at all!
> Did you try other schema validators?

Sorry for that status - I wasn't sure. The schema was validated for correctness using http://www.w3.org/2001/03/webdata/xsv and it produced no warnings.
Example: if the expression is ".*X", the regexp module consumes all input characters for ".*" and saves no state to roll back to, so the "X" in the expression can never be matched. This seems to apply to expressions with ".+" as well. For ".*X" an input of "aaaX" is rejected. For ".+X" an input of "aX" is rejected.
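The failure mode described above can be illustrated with a toy matcher. This is a minimal sketch, not libxml2's regexp implementation (which is automaton-based); the names `parse`, `match_no_rollback` and `match_backtracking` are hypothetical. It handles only literal characters and ".", each optionally followed by "*" or "+". The first matcher consumes greedily and never revisits a choice; the second retries shorter repetitions when the rest of the pattern fails.

```python
def parse(pattern):
    """Split a toy pattern into (char, quantifier) pairs; quantifier is '', '*' or '+'."""
    tokens, i = [], 0
    while i < len(pattern):
        quant = ""
        if i + 1 < len(pattern) and pattern[i + 1] in "*+":
            quant = pattern[i + 1]
        tokens.append((pattern[i], quant))
        i += 2 if quant else 1
    return tokens


def fits(ch, c):
    """'.' matches any character; anything else must match literally."""
    return ch == "." or ch == c


def match_no_rollback(tokens, text):
    """Greedy matching with no saved state: a repetition keeps everything it consumed."""
    pos = 0
    for ch, quant in tokens:
        if quant:
            start = pos
            while pos < len(text) and fits(ch, text[pos]):
                pos += 1
            if quant == "+" and pos == start:
                return False
        else:
            if pos >= len(text) or not fits(ch, text[pos]):
                return False
            pos += 1
    return pos == len(text)


def match_backtracking(tokens, text, ti=0, pos=0):
    """Greedy matching that gives characters back when the remaining tokens fail."""
    if ti == len(tokens):
        return pos == len(text)
    ch, quant = tokens[ti]
    if not quant:
        return (pos < len(text) and fits(ch, text[pos])
                and match_backtracking(tokens, text, ti + 1, pos + 1))
    end = pos
    while end < len(text) and fits(ch, text[end]):
        end += 1
    lowest = pos + 1 if quant == "+" else pos  # '+' needs at least one character
    for stop in range(end, lowest - 1, -1):    # try the longest repetition first
        if match_backtracking(tokens, text, ti + 1, stop):
            return True
    return False


if __name__ == "__main__":
    for pattern, text in [(".*X", "aaaX"), (".+X", "aX")]:
        tokens = parse(pattern)
        print(pattern, text,
              match_no_rollback(tokens, text),
              match_backtracking(tokens, text))
```

For both inputs from the comment, the matcher without rollback rejects the value while the backtracking one accepts it.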
I will try to look at this soonish, I guess I know where this was introduced, I just need to figure out why ! Daniel
Okay, that is fixed in CVS. When trying to fix bug 316338 I introduced some reductions and, unfortunately, new bugs too. Daniel
*** Bug 339121 has been marked as a duplicate of this bug. ***