Bug 78729 - Validation bug for strange content model
Status: VERIFIED FIXED
Product: libxml2
Classification: Platform
Component: general
Version: 2.4.18
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Daniel Veillard
Daniel Veillard
Depends on:
Blocks: 78966
 
 
Reported: 2002-04-15 07:08 UTC by Morus Walter
Modified: 2009-08-15 18:40 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Morus Walter 2002-04-15 07:08:47 UTC
When trying to validate (e.g. using xmllint --valid) the following
invalid document, libxml seems to hang in an endless loop, eating
more and more memory, and finally crashes with
realloc failed !nSegmentation fault

(I had 0.5 GB physical plus 1 GB virtual memory, so I don't think this is
an effect of too little memory.)

---
<!DOCTYPE x [
<!ELEMENT x (a+ | ((b), (c?, d*)+)+)>
<!ELEMENT a EMPTY>
<!ELEMENT b EMPTY>
<!ELEMENT c EMPTY>
<!ELEMENT d EMPTY>
]>
<x><b/><a/></x>
---

Of course, the form of the content model is a bit strange and
can be replaced by the equivalent
<!ELEMENT x (a+ | ((b), (c | d )*)+)>
in which case xmllint correctly finds that the document is invalid.
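
The equivalence claimed above can be checked mechanically. The following is a minimal sketch, not libxml2 code: it assumes each child element name can be encoded as one letter, rewrites both content models as Python regular expressions (the names ORIGINAL, SIMPLIFIED, accepts and models_agree are made up for this illustration), and compares them on every child sequence up to a small length.

```python
import re
from itertools import product

# Hypothetical encoding: one letter per child element name.
# (a+ | ((b), (c?, d*)+)+)
ORIGINAL = re.compile(r"(?:a+|(?:b(?:c?d*)+)+)")
# (a+ | ((b), (c | d)*)+)
SIMPLIFIED = re.compile(r"(?:a+|(?:b(?:c|d)*)+)")


def accepts(pattern, children):
    """Return True if the whole child sequence matches the content model."""
    return pattern.fullmatch("".join(children)) is not None


def models_agree(max_len=5):
    """Compare both models on every sequence over {a, b, c, d} up to max_len."""
    for n in range(max_len + 1):
        for children in product("abcd", repeat=n):
            if accepts(ORIGINAL, children) != accepts(SIMPLIFIED, children):
                return False
    return True


if __name__ == "__main__":
    # The document from this report has the children <b/><a/>.
    print("original accepts b,a:  ", accepts(ORIGINAL, "ba"))
    print("simplified accepts b,a:", accepts(SIMPLIFIED, "ba"))
    print("models agree:          ", models_agree())
```

This only tests short sequences exhaustively, so it supports the equivalence rather than proving it.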

It seems that the realloc fails in valid.c:vstateVPush, since this is
the only error message missing the \ before the 'n' (so typos can be
helpful ;-). I didn't manage to track the problem down, though.

As I said, the content model is a bit strange, but it is perfectly
legal.
Actually, I came across this problem when playing around with
the xmlValidGetValidElements function on a TEI document using
the xteilite.dtd, which contains the element declaration
<!ELEMENT publicationStmt (p+ | ((publisher | distributor | authority) , 
                                 (pubPlace?, address?, idno*,
availability?, date?)+)+)>

I checked this with libxml 2.4.17 and 2.4.18.
Comment 1 Daniel Veillard 2002-04-15 10:18:55 UTC
Hum, that's serious! There are 2 things:
  1/ the unbounded use of memory, this has to be fixed ASAP
  2/ the problem in the validation engine

For the first, I just committed a fix to prevent the memory
usage explosion:

http://cvs.gnome.org/bonsai/cvsquery.cgi?module=gnome-xml&branch=HEAD&branchtype=match&dir=gnome-xml&file=&filetype=match&who=veillard&whotype=match&sortby=Date&hours=&date=explicit&mindate=04%2F15%2F02+06%3A14&maxdate=04%2F15%2F02+06%3A16&cvsroot=%2Fcvs%2Fgnome

For the second part, I think the right approach will be to
drop the current ad-hoc algorithm and use the regexp engine
which I'm developing for XML Schemas. That integration will
not make it into the next release, but I expect the following one
to have the fix based on a far more reliable core.

Daniel
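
To illustrate why a regular-expression style matcher is more robust here than ad-hoc backtracking, the following is a minimal sketch using Brzozowski derivatives. It is not the libxml2 regexp engine and makes no claim about how that engine works; every name in it (sym, seq, alt, star, opt, plus, nullable, deriv, valid) is hypothetical. The point is that each child element is consumed by one finite structural recursion over the content model, so a loop body that can match nothing, such as (c?, d*)+, cannot cause an endless loop.

```python
EPS = ("eps",)      # matches the empty sequence
EMPTY = ("empty",)  # matches nothing at all


def sym(name):
    return ("sym", name)


def seq(x, y):
    if x == EMPTY or y == EMPTY:
        return EMPTY
    if x == EPS:
        return y
    if y == EPS:
        return x
    return ("seq", x, y)


def alt(x, y):
    if x == EMPTY:
        return y
    if y == EMPTY or x == y:
        return x
    return ("alt", x, y)


def star(x):
    if x in (EPS, EMPTY):
        return EPS
    return x if x[0] == "star" else ("star", x)


def opt(x):   # x?
    return alt(x, EPS)


def plus(x):  # x+
    return seq(x, star(x))


def nullable(r):
    """True if the model accepts the empty sequence of children."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "sym"):
        return False
    if tag == "seq":
        return nullable(r[1]) and nullable(r[2])
    return nullable(r[1]) or nullable(r[2])  # alt


def deriv(r, name):
    """Model for what may still follow after consuming one child `name`."""
    tag = r[0]
    if tag in ("eps", "empty"):
        return EMPTY
    if tag == "sym":
        return EPS if r[1] == name else EMPTY
    if tag == "alt":
        return alt(deriv(r[1], name), deriv(r[2], name))
    if tag == "star":
        return seq(deriv(r[1], name), r)
    first = seq(deriv(r[1], name), r[2])  # seq
    return alt(first, deriv(r[2], name)) if nullable(r[1]) else first


def valid(model, children):
    for name in children:
        model = deriv(model, name)
    return nullable(model)


a, b, c, d = (sym(n) for n in "abcd")
# (a+ | ((b), (c?, d*)+)+) from the report
MODEL = alt(plus(a), plus(seq(b, plus(seq(opt(c), star(d))))))

if __name__ == "__main__":
    print(valid(MODEL, ["b", "a"]))  # the children of <x> in the test document
```

The sketch does not normalise the intermediate models, so they can grow with the input length; a production engine would typically compile the content model to an automaton once instead.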
Comment 2 Daniel Veillard 2002-09-19 21:48:32 UTC
Okay, this should be fixed for good now. The code in CVS now uses
the regexp implementation designed for XML Schemas to do the DTD
validation of the element content model. This is a large change
that you may get either from CVS or by waiting for the next release.
Thanks for the bug report and the example!

paphio:~/XML -> xmllint --valid tst.xml
<?xml version="1.0"?>
<!DOCTYPE x [
<!ELEMENT x (a+ | (b , c? , d*+)+)>
<!ELEMENT a EMPTY>
<!ELEMENT b EMPTY>
<!ELEMENT c EMPTY>
<!ELEMENT d EMPTY>
]>
<x><b/><a/></x>
paphio:~/XML -> cat .memdump 
      11:45:56 PM

      MEMORY ALLOCATED : 0, MAX was 23985
BLOCK  NUMBER   SIZE  TYPE
paphio:~/XML -> 

  fixed !

Daniel
Comment 3 Daniel Veillard 2002-09-26 21:39:07 UTC
Should be closed in the last release,

 thanks,

Daniel