Bug 326031 - RELAXNG fails to allow valid grammar choices
Status: RESOLVED FIXED
Product: libxml2
Classification: Platform
Component: relaxng
Version: git master
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Veillard
QA Contact: libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2006-01-06 21:36 UTC by Chris Darroch
Modified: 2017-06-12 19:06 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
simple patch to bypass epsilon transition removals (428 bytes, patch)
2006-01-06 21:38 UTC, Chris Darroch
simple test case RNG file (684 bytes, text/xml)
2006-01-06 21:40 UTC, Chris Darroch
simple test case XML file (122 bytes, text/xml)
2006-01-06 21:40 UTC, Chris Darroch
simple test case XML file (122 bytes, text/xml)
2006-01-09 15:17 UTC, Chris Darroch
simple test case RNG file (864 bytes, application/xml)
2006-01-09 15:21 UTC, Chris Darroch
hack patch around epsilon transition removal (428 bytes, patch)
2006-01-09 15:52 UTC, Chris Darroch
same hack patch, for latest CVS version (540 bytes, patch)
2006-04-13 20:31 UTC, Chris Darroch

Description Chris Darroch 2006-01-06 21:36:45 UTC
Please describe the problem:
Valid choices in a RELAX NG grammar are ignored, possibly due to overly
aggressive optimizations of "epsilon transitions".  The following patch
seems to prevent the problem, but likely slows down processing and/or
generates other problems.  However, it works well for me at the moment.

================================ xmlregexp.c PATCH ============
--- xmlregexp.c.orig	2005-08-23 09:37:26.000000000 -0400
+++ xmlregexp.c	2006-01-06 15:34:48.000000000 -0500
@@ -1686,6 +1686,7 @@
 		printf("Found simple epsilon trans from start %d to %d\n",
 		       statenr, newto);
 #endif     
+#if 0
             } else {
 #ifdef DEBUG_REGEXP_GRAPH
 		printf("Found simple epsilon trans from %d to %d\n",
@@ -1725,6 +1726,7 @@
 		state->nbTrans = 0;
 
 
+#endif
 	    }
             
 	}
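
For background, here is a minimal sketch of how a choice between two sequences can be modelled as an NFA with epsilon transitions and simulated with epsilon closure. It is not libxml2's code: the state numbers and the names `TRANSITIONS`, `epsilon_closure` and `accepts` are made up for illustration. It models the content of <test> from the schema in this report as (title) | (title, content), and suggests that both branches stay reachable as long as the epsilon moves are followed faithfully.

```python
# Illustrative only: a tiny NFA for the content model of <test>,
#   (title) | (title, content)
# built with epsilon transitions from a shared start state.
# State numbers and helper names are hypothetical.

EPS = None  # label used for epsilon transitions

# TRANSITIONS[state] -> list of (label, target)
TRANSITIONS = {
    0: [(EPS, 1), (EPS, 3)],   # start: pick one branch of the choice
    1: [("title", 2)],         # branch 1: title
    2: [(EPS, 6)],
    3: [("title", 4)],         # branch 2: title, content
    4: [("content", 5)],
    5: [(EPS, 6)],
    6: [],                     # accepting state
}
ACCEPTING = {6}


def epsilon_closure(states):
    """Return every state reachable from `states` via epsilon moves only."""
    seen = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for label, target in TRANSITIONS[state]:
            if label is EPS and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


def accepts(symbols):
    """Simulate the NFA on a sequence of child element names."""
    current = epsilon_closure({0})
    for symbol in symbols:
        moved = {
            target
            for state in current
            for label, target in TRANSITIONS[state]
            if label == symbol
        }
        current = epsilon_closure(moved)
    return bool(current & ACCEPTING)
```

With this model, `accepts(["title", "content"])` and `accepts(["title"])` are both true, while `accepts(["content"])` is false. An optimization that removes epsilon transitions has to preserve exactly this set of accepted sequences.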


Steps to reproduce:
Attempt to validate the following XML with the following RELAX NG schema:

================================ RELAX NG schema ============
<?xml version="1.0" ?>
<!DOCTYPE grammar>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">

  <start>
    <element name="bug">
      <choice>
        <group>
          <element name="test">
            <element name="title">
              <text />
            </element>
          </element>
        </group>
        <group>
          <element name="test">
            <element name="title">
              <text />
            </element>
            <element name="content">
              <text />
            </element>
          </element>
        </group>
      </choice>
    </element>
  </start>

</grammar>
================================ XML sample file ============
<?xml version="1.0" ?>
<!DOCTYPE bug>
<bug>
  <test>
    <title>hey</title>
    <content>there</content>
  </test>
</bug>


Actual results:
The error message "Did not expect element content there" is generated and the
XML is (incorrectly) deemed invalid.

Expected results:
No error message should be generated and the XML should be deemed valid.  For
example, validating this with jing generates no error output.
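
The expected result can be illustrated with a rough sketch that uses only the Python standard library. It is not a RELAX NG validator: it only compares the ordered child element names of <test> against the two <group> alternatives from the schema above, and it ignores text content, attributes and the XML declaration. The names `ALTERNATIVES`, `child_names_of_test` and `matches_choice` are hypothetical.

```python
import xml.etree.ElementTree as ET

# The sample document from this report, without the XML declaration
# and DOCTYPE line.
SAMPLE = """<bug>
  <test>
    <title>hey</title>
    <content>there</content>
  </test>
</bug>"""

# The two <group> alternatives from the schema, written as the ordered
# child element names each one allows inside <test>.
ALTERNATIVES = [
    ["title"],
    ["title", "content"],
]


def child_names_of_test(xml_text):
    """Return the ordered child element names of the first <test> element."""
    root = ET.fromstring(xml_text)
    test = root.find("test")
    return [child.tag for child in test]


def matches_choice(children, alternatives):
    """A <choice> is satisfied when at least one alternative matches."""
    return any(children == alternative for alternative in alternatives)
```

Here `matches_choice(child_names_of_test(SAMPLE), ALTERNATIVES)` is true because the second alternative matches, which is the outcome the report expects from the validator.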

Does this happen every time?
Yes.

Other information:
This problem seems similar to the one reported in bug #302836.
Comment 1 Chris Darroch 2006-01-06 21:38:30 UTC
Created attachment 56886
simple patch to bypass epsilon transition removals
Comment 2 Chris Darroch 2006-01-06 21:40:12 UTC
Created attachment 56887
simple test case RNG file
Comment 3 Chris Darroch 2006-01-06 21:40:44 UTC
Created attachment 56888
simple test case XML file
Comment 4 Chris Darroch 2006-01-09 15:17:55 UTC
Created attachment 57038
simple test case XML file
Comment 5 Chris Darroch 2006-01-09 15:21:01 UTC
Created attachment 57039
simple test case RNG file
Comment 6 Daniel Veillard 2006-01-09 15:32:20 UTC
I can confirm the bug, even with 2.6.23, but I have doubts about the patch
though...

Daniel

Comment 7 Chris Darroch 2006-01-09 15:51:18 UTC
Thanks for looking at this!  As you noted, it's still a problem with 2.6.23, although I had to change the examples to demonstrate it.  In general, I happen to have a lot of test cases involving a <choice> between multiple variant definitions of an XML element.  I, too, am suspicious of my hack patch -- I'll attach another one for 2.6.23/CVS head, but it's probably not the right thing to do.

An interesting tangent: on my RHEL 3 system, I get the following from the 2.6.23 Regexp regression tests, *without* any patches to the sources.  I'm using libiconv 1.10 and zlib 1.2.3.  I'll try on an RHEL 4 system in a bit.

## Regexp regression tests
xpath result
7c7
< a/b/c: Ok
---
> a/b/c: Fail
9,11c9,11
< a:*/b:*/c:*: Ok
< child::a/child::b:*: Ok
< child::a/child::b:*|a/*/b|.//a:b: Ok
---
> a:*/b:*/c:*: Fail
> child::a/child::b:*: Fail
> child::a/child::b:*|a/*/b|.//a:b: Fail

Applying my "hack patch" adds one extra failure:

hard result
7c7
< b0aaa: Ok
---
> b0aaa: Fail
Comment 8 Chris Darroch 2006-01-09 15:52:20 UTC
Created attachment 57041
hack patch around epsilon transition removal
Comment 9 Chris Darroch 2006-01-09 16:02:31 UTC
Yup, same reports from the Regexp regression tests on RHEL 4 (although I don't see why it would make a difference), without any patches ... maybe not a concern, I can't tell offhand, as the overall test suite reports "Success!"
Comment 10 Daniel Veillard 2006-02-19 11:16:21 UTC
I worked on fixes in the regexps a couple of weeks ago, and this 
seems fixed in CVS as far as I can tell:

paphio:~/XML -> ./xmllint --noout --relaxng  tst.rng tst.xml
tst.xml validates

Daniel
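
For anyone who wants to script the check above, here is a small sketch that wraps the same xmllint invocation. The flags `--noout` and `--relaxng` are the ones used in this comment; the function names are hypothetical, and treating a zero exit status as "validates" is an assumption of this sketch rather than something stated in the thread.

```python
import shutil
import subprocess


def xmllint_command(schema_path, xml_path):
    """Build the argument list for the xmllint call shown in the comment."""
    return ["xmllint", "--noout", "--relaxng", schema_path, xml_path]


def validates(schema_path, xml_path):
    """Run xmllint and report whether it exited with status 0.

    Returns None when xmllint is not installed.
    Assumption: exit status 0 means the document validated.
    """
    if shutil.which("xmllint") is None:
        return None
    result = subprocess.run(
        xmllint_command(schema_path, xml_path),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

Calling `validates("tst.rng", "tst.xml")` with the attached test files would then reproduce the check without reading xmllint's output by eye.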
Comment 11 Chris Darroch 2006-04-13 20:15:11 UTC
I've tried both CVS head as of today, and 2.6.22 with the latest xmlregexp.c and relaxng.c from CVS head, and in both cases, I see a failure with the following RELAX NG file when validating the same sample XML file from the original bug report.  I'll attach both as files as well.  In general, my real-life test cases are simply more complex versions of these files, with multiple alternate definitions of an element, often with 5 or more variations in terms of which sub-elements are allowed and in which order.

================================ RELAX NG schema ============
<?xml version="1.0" ?>
<!DOCTYPE grammar>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">

  <start>
    <element name="bug">
      <choice>
        <group>
          <element name="test">
            <element name="title">
              <text />
            </element>
            <element name="content">
              <text />
            </element>
          </element>
        </group>
        <group>
          <element name="test">
            <element name="title">
              <text />
            </element>
            <element name="other">
              <text />
            </element>
            <element name="content">
              <text />
            </element>
          </element>
        </group>
      </choice>
    </element>
  </start>

</grammar>
================================ XML sample file ============
<?xml version="1.0" ?>
<!DOCTYPE bug>
<bug>
  <test>
    <title>hey</title>
    <content>there</content>
  </test>
</bug>
Comment 12 Chris Darroch 2006-04-13 20:31:34 UTC
Created attachment 63408
same hack patch, for latest CVS version

Actually, it looks like the XML and RNG test files I updated a while back are sufficient to show the problem; they're the same as the ones I quote in the previous comment.  This hack patch still makes them validate, too.
Comment 13 Daniel Veillard 2006-10-13 16:46:52 UTC
Seems I fixed this when fixing #302836 earlier:

the original test case validates as expected

paphio:~/XML -> cat tst2.rng
<?xml version="1.0" ?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">

  <start>
    <element name="bug">
      <choice>
        <group>
          <element name="test">
            <element name="title">
              <text />
            </element>
          </element>
        </group>
        <group>
          <element name="test">
            <element name="title">
              <text />
            </element>
            <element name="content">
              <text />
            </element>
          </element>
        </group>
      </choice>
    </element>
  </start>

</grammar>
paphio:~/XML -> xmllint --relaxng tst2.rng tst.xml
<?xml version="1.0"?>
<bug>
  <test>
    <title>hey</title>
    <content>there</content>
  </test>
</bug>
tst.xml validates
paphio:~/XML ->

and the one with the extra 'other' element correctly indicates failure:

paphio:~/XML -> cat tst.rng
<?xml version="1.0" ?>
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
    datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">

  <start>
    <element name="bug">
      <choice>
        <group>
          <element name="test">
            <element name="title">
              <text />
            </element>
            <element name="content">
              <text />
            </element>
          </element>
        </group>
        <group>
          <element name="test">
            <element name="title">
              <text />
            </element>
            <element name="other">
              <text />
            </element>
            <element name="content">
              <text />
            </element>
          </element>
        </group>
      </choice>
    </element>
  </start>

</grammar>
paphio:~/XML -> xmllint --relaxng tst.rng tst.xml
<?xml version="1.0"?>
<bug>
  <test>
    <title>hey</title>
    <content>there</content>
  </test>
</bug>
tst.xml:4: element content: Relax-NG validity error : Did not expect element content there
tst.xml fails to validate
paphio:~/XML ->

So it seems it was a dup of 302836;
at least it's fixed in CVS now,

  thanks,

Daniel

*** This bug has been marked as a duplicate of 302836 ***
Comment 14 Chris Darroch 2006-12-28 20:25:22 UTC
Sorry for the delay -- I finally had some time to take a look at this.
I tried my test cases using 2.6.27, which according to the ChangeLog includes
the fix for 302836.  As you note above, the test with the extra <other>
element fails with the message "Did not expect element content there".

Alas ... it should not fail, I think.  Testing it with jing, it doesn't
fail, and I don't see why the RelaxNG grammar would be invalid.  It's
just providing a choice between two definitions of the <test> element,
one with and one without an <other> sub-element.  So the test should
pass, not fail, I'm afraid.

Which means, I guess, that this can't be just a duplicate of 302836.
So, I'm re-opening the bug report -- my apologies!
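
The argument in this comment can be checked with a small derivative-style matcher, sketched below. It is neither jing's nor libxml2's implementation, and it covers only element names, sequences and choices. The tuple encoding and the names `elem`, `seq`, `choice`, `nullable`, `deriv`, `matches` and `TEST_CONTENT` are hypothetical.

```python
# Patterns are nested tuples; this encoding is made up for the sketch.
EMPTY = ("empty",)
NOT_ALLOWED = ("notAllowed",)


def elem(name):
    """Pattern matching one child element with the given name."""
    return ("elem", name)


def seq(*parts):
    """Right-nested sequence of patterns."""
    result = EMPTY
    for part in reversed(parts):
        result = ("seq", part, result)
    return result


def choice(left, right):
    return ("choice", left, right)


def nullable(p):
    """True if the pattern matches the empty sequence of children."""
    kind = p[0]
    if kind == "empty":
        return True
    if kind == "seq":
        return nullable(p[1]) and nullable(p[2])
    if kind == "choice":
        return nullable(p[1]) or nullable(p[2])
    return False  # "elem" and "notAllowed"


def deriv(p, name):
    """Pattern that remains after consuming one child element `name`."""
    kind = p[0]
    if kind == "elem":
        return EMPTY if p[1] == name else NOT_ALLOWED
    if kind == "choice":
        return choice(deriv(p[1], name), deriv(p[2], name))
    if kind == "seq":
        first = ("seq", deriv(p[1], name), p[2])
        if nullable(p[1]):
            return choice(first, deriv(p[2], name))
        return first
    return NOT_ALLOWED  # "empty" and "notAllowed"


def matches(p, names):
    """True if the sequence of child element names satisfies the pattern."""
    for name in names:
        p = deriv(p, name)
    return nullable(p)


# Content model of <test> in the schema from comment 11.
TEST_CONTENT = choice(
    seq(elem("title"), elem("content")),
    seq(elem("title"), elem("other"), elem("content")),
)
```

Under this model `matches(TEST_CONTENT, ["title", "content"])` is true, because the first alternative accepts the sample document even though the second one requires <other>. That is consistent with the claim that the test should pass.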
Comment 15 bje 2007-05-25 04:31:24 UTC
RELAXNG is now correctly validating this sort of situation, but is reporting very much the wrong error.

<?xml version="1.0" encoding="UTF-8"?>
<grammar ns="http://www.example.com/choice"
         xmlns="http://relaxng.org/ns/structure/1.0">
  <start> 
    <element name="doc">
      <choice><ref name="option_2"/><ref name="option_1"/></choice>
    </element>
  </start>
  <define name="option_1">
    <attribute name="type"><value>content</value></attribute>
    <zeroOrMore><element name="something"><empty/></element></zeroOrMore>
  </define>
  <define name="option_2">
    <attribute name="type"><value>no-content</value></attribute>
  </define>
</grammar>

This grammar should lead to <doc type="content"><something/></doc> being valid, and <doc type="no-content"/> being valid, and <doc type="content"><something>Hah, not empty!</something></doc> being invalid.  xmllint (compiled against libxml 20627) correctly reports these validity states, but for the third one:

$ xmllint --noout --relaxng sample.rng sample.xml
sample.xml:4: element something: Relax-NG validity error : Element doc has extra content: something
sample.xml fails to validate

This problem does not occur if the <choice> is collapsed down to remove the type attribute, or if the <zeroOrMore> is either removed or changed to <oneOrMore> -- in these cases, the correct error (unexpected text content in <something>) is reported.
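
To make the three expected outcomes concrete, here is a hand-written checker for this one grammar. It is a sketch, not a RELAX NG implementation: it ignores the grammar's namespace and any extra attributes, and the function names and message strings are hypothetical. It shows the kind of error one would hope to see for the third document.

```python
import xml.etree.ElementTree as ET


def _has_text(node):
    """True if the element has non-whitespace text directly inside it."""
    pieces = [node.text] + [child.tail for child in node]
    return any(piece and piece.strip() for piece in pieces)


def check_doc(xml_text):
    """Return (is_valid, message) for the grammar in the comment above."""
    doc = ET.fromstring(xml_text)
    if doc.tag != "doc":
        return False, "expected element doc"
    kind = doc.get("type")
    if kind == "no-content":
        # option_2: only the attribute, no children and no text.
        if len(doc) or _has_text(doc):
            return False, "element doc has extra content"
        return True, "ok"
    if kind == "content":
        # option_1: zero or more empty <something> elements.
        if _has_text(doc):
            return False, "unexpected text in element doc"
        for child in doc:
            if child.tag != "something":
                return False, "unexpected element " + child.tag
            if len(child) or _has_text(child):
                return False, "element something must be empty"
        return True, "ok"
    return False, "attribute type must be 'content' or 'no-content'"
```

For `<doc type="content"><something>Hah, not empty!</something></doc>` this returns `(False, "element something must be empty")`, which points at the text inside <something> rather than at <doc>.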
Comment 16 Daniel Veillard 2012-05-11 12:51:09 UTC
That's a very different issue, the error reporting is far from
perfect, but I think the original bug of failing to validate
(or not) the instances is correctly fixed,

Daniel