GNOME Bugzilla – Bug 172215
XML Schema <any> with namespace "##other" is not working correctly
Last modified: 2009-08-15 18:40:50 UTC
Please describe the problem:
If a schema has a non-absent targetNamespace and uses an <any> wildcard with the namespace specifier "##other", elements bound to the targetNamespace are still incorrectly let through.

Such a negated wildcard is built via two transitions:

          (*|urn:some:ns) [Tr1]
         /                     \
  [start]         NULL          [sink-state]
         \   (*|*) [Tr2]
          \
           [final state]

The first transition [Tr1] tries to catch elements in the wrong namespace and lead them to a sink-state. The second [Tr2] lets through elements in all other namespaces. The regex does correctly identify the built automaton as a "sink-state" case.

The following seems to happen:
1. An invalid element is pushed.
2. The automaton reaches the "sink-state".
3. xmlRegExecPushString is called with NULL to finish the process.
4. xmlRegExecPushString returns 0, since it somehow rolls back out of the "sink-state" and finds a way to push the last invalid element through transition [Tr2], making it valid again.

Steps to reproduce:
I'll attach a test case.

Actual results:
xmllint --noout --schema any5_0.xsd any5_0.xml
any5_0.xml validates

Expected results:
I get the following with a workaround in xmlschemas.c:
xmllint --noout --schema any5_0.xsd any5_0.xml
Element 'boo': This element is not expected. Expected is one of ( {##other:urn:test:foo}* ).
any5_0.xml fails to validate

Does this happen every time?

Other information:
Created attachment 39490 [details] any5_0.xsd
Created attachment 39491 [details] any5_0.xml
The workaround is in CVS now (xmlschemas.c).
Removed the workaround, since it did not work in every case. Somehow the transitions [Tr1] and [Tr2] must not be handled as a "choice": if [Tr1] leads to the sink state, the automaton must not roll back, but stay captured in the sink state.
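The difference between the two behaviours can be sketched with a small stand-alone model. This is not libxml2 code: the Transition struct and the functions accepts_as_choice and accepts_committed are hypothetical names made up for illustration, and the model only covers a single pushed element with the two transitions from the report.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model, not libxml2 code: one transition out of the start state. */
typedef struct {
    const char *ns;   /* namespace to match, or NULL for the "*" wildcard */
    int to_sink;      /* 1: target is the sink state, 0: target is the final state */
} Transition;

/* [Tr1] then [Tr2], in the order described in the report. */
static const Transition transitions[] = {
    { "urn:some:ns", 1 },  /* [Tr1]: (*|urn:some:ns) -> sink state  */
    { NULL,          0 },  /* [Tr2]: (*|*)           -> final state */
};
enum { N_TRANSITIONS = sizeof transitions / sizeof transitions[0] };

static int matches(const Transition *t, const char *ns)
{
    return t->ns == NULL || strcmp(t->ns, ns) == 0;
}

/* Choice semantics: a dead end in the sink state is abandoned and the next
 * matching transition is tried. Returns 1 if the element is accepted. */
int accepts_as_choice(const char *ns)
{
    assert(ns != NULL);
    for (int i = 0; i < N_TRANSITIONS; i++) {
        if (!matches(&transitions[i], ns))
            continue;
        if (transitions[i].to_sink)
            continue;   /* roll back out of the sink and keep searching */
        return 1;       /* reached the final state */
    }
    return 0;
}

/* Committed semantics: the first matching transition decides the outcome. */
int accepts_committed(const char *ns)
{
    assert(ns != NULL);
    for (int i = 0; i < N_TRANSITIONS; i++) {
        if (matches(&transitions[i], ns))
            return !transitions[i].to_sink;  /* sink captures, final accepts */
    }
    return 0;
}
```

In this model accepts_as_choice("urn:some:ns") returns 1, which mirrors the reported bug, while accepts_committed("urn:some:ns") returns 0. Both functions return 1 for any other namespace.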
More work at the automaton level is apparently needed, then.
Daniel
As of xmlschemas.c revision 1.134, a temporary workaround exists for the case where minOccurs and maxOccurs of the <any> wildcard are both 1. For other min/maxOccurs values the bug still exists.
This is fixed in CVS. It required extending the automata to handle transitions for non-values, but unfortunately this is the only right way to fix it :-) I added a number of regression tests. Thanks,
Daniel
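As a rough illustration of what a transition for a non-value means, the wildcard can be modelled as one transition that matches every namespace except a given one, so no sink state and no second transition are needed. The sketch below is a hypothetical model only: NsTransition, ns_transition_matches and other_wildcard_accepts are invented names, the actual libxml2 change is not reproduced here, and elements without a namespace are left out of the model.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model, not the libxml2 data structure. */
typedef struct {
    const char *ns;  /* namespace value carried by the transition */
    int negated;     /* 1: match every namespace except ns */
} NsTransition;

/* Returns 1 if the transition can be taken for an element in namespace ns. */
int ns_transition_matches(const NsTransition *t, const char *ns)
{
    assert(t != NULL && t->ns != NULL && ns != NULL);
    int equal = strcmp(t->ns, ns) == 0;
    return t->negated ? !equal : equal;
}

/* "##other" for targetNamespace urn:some:ns as a single negated transition
 * from the start state to the final state; no sink state is involved. */
int other_wildcard_accepts(const char *ns)
{
    const NsTransition not_target = { "urn:some:ns", 1 };
    return ns_transition_matches(&not_target, ns);
}
```

Because only one transition leaves the start state, there is no alternative path to roll back to: an element in urn:some:ns simply has no transition to take and is rejected.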
This should be closed by the release of libxml2-2.6.21. Thanks,
Daniel