After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 133787 - Overwriting _private field of xmlNode in relaxng.c crashes XML::LibXML
Overwriting _private field of xmlNode in relaxng.c crashes XML::LibXML
Status: VERIFIED FIXED
Product: libxml2
Classification: Platform
Component: relaxng
git master
Other All
: Normal normal
: ---
Assigned To: Daniel Veillard
libxml QA maintainers
Depends on:
Blocks:
 
 
Reported: 2004-02-08 13:32 UTC by Alexei Agafonov
Modified: 2017-06-12 19:06 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Example segaulting Perl script (1006 bytes, text/plain)
2004-02-08 13:38 UTC, Alexei Agafonov
  Details
One of the possible solutions (node data hash, see comments above) (3.88 KB, patch)
2004-02-19 20:27 UTC, Alexei Agafonov
none Details | Review
Alternate solution (save/restore original node data) (3.71 KB, patch)
2004-02-19 20:28 UTC, Alexei Agafonov
none Details | Review
Sorry, patch (id=24584) is incorrect, here is the corrected version (I thought data is never saved twice during RelaxNG validation) (4.59 KB, patch)
2004-02-20 15:20 UTC, Alexei Agafonov
none Details | Review

Description Alexei Agafonov 2004-02-08 13:32:50 UTC
Perl script segfaults attempting to XML::LibXML::Element after RelaxNG 
schema validation.
(reported at http://aspn.activestate.
com/ASPN/Mail/Message/perl-xml/1991286)
Investigating this problem, Petr Pajas found that the segfault caused by
loss of XML::LibXML-specific data stored in the mentioned _private field
as a result of overwriting of _private field of xmlNode structure in 
relaxng.c.
( http://aspn.activestate.com/ASPN/Mail/Message/perl-xml/1992454):

> while investigating why RelaxNG validation via Perl bindings for libxml2
> occasionally result in sigsegv (as reported by Alexei Agafonov on 
> perl-xml mail list),
> I found the cause to be at relax.c (CVS) around line 9755, where a value 
> is assigned to node->_private, while _private is used by Perl bindings 
> to store some critical
> binding-specific structures (reference counting etc.).  It seems this is 
> the only place where
> RelaxNG uses _private slot of a xmlNode of the document being validated, 
> and it
> seems it's only used for some kind of an optimalization. Do you think 
> this part could
> possibly be rewritten, so that xmlNode's _private slot is preserved 
> within RelaxNG validation?
> 
> The problematic lines are:
> relaxng.c:9755
> if (ret == 0) {
>   node->_private = define;
> }

This really like as an optimization. Maybe the quick solution is to comment 
out line:

	node->_private = define;

What about full solution, it may be to store optimization information 
in temporary hash with xmlNode addresses as entries.
(I suppose this to be safe as the document structure is presistent during 
validation.)

Sincerely,
Alexei Agafonov
Comment 1 Alexei Agafonov 2004-02-08 13:38:07 UTC
Created attachment 24201 [details]
Example segaulting Perl script
Comment 2 Alexei Agafonov 2004-02-19 20:25:31 UTC
After playing with code I've inclined to use node address / data map
(while there is another solution, see below).
The price is 25% performance overhead, instead there guaranteed
that the original document is not altered during schema validation.
(i.e. original _private field is not altered).
I've used functions from hash.c for managing node data map.
This is subject for optimization - as the functions from hash.c take 
zero-terminated strings as a keys, node address is converted
into hex string representation each time we need to access node data.

Proposed solution (for libxml2 2.6.5) is in the attached patch:

relaxng.c.patch

This solution seem to be not bad yet there may be others.
One of the possible solution is to save _primary value 
of each node before schema validation and restore saved
value after validation.

I've tried this solution as well.
It works much faster than the node hash method
(see statistics below), even though memory allocation 
is rather primitive (too many mallocs).

The alternative solution (for libxml2 2.6.5) is in the attached patch:
relaxng.c.alt.patch

For the moment, I prefer hash method as it seems more safe.

Here is the timing statistics for different solutions:

default (no solution at all, the _private field is corrupted):
  2.1 sec

node hash, use sprintf for hash key creation:
  3.1 sec

node hash, use fast inline code instead of sprintf for hash key 
creation:
  2.6 sec

save/restore original node data:
  2.2 sec

disable optimization (just return NULL on the each node data request):
  171.1 sec (It seems the we should not go that way :))
Comment 3 Alexei Agafonov 2004-02-19 20:27:05 UTC
Created attachment 24584 [details] [review]
One of the possible solutions (node data hash, see comments above)
Comment 4 Alexei Agafonov 2004-02-19 20:28:23 UTC
Created attachment 24585 [details] [review]
Alternate solution (save/restore original node data)
Comment 5 Alexei Agafonov 2004-02-20 15:20:58 UTC
Created attachment 24609 [details] [review]
Sorry, patch (id=24584) is incorrect, here is the corrected version (I thought data is never saved twice during RelaxNG validation)
Comment 6 Daniel Veillard 2004-02-21 12:03:43 UTC
I didn't had time yet to look at it. I think I will rather use
the new psvi field of the xmlNode structure.

Daniel
Comment 7 Daniel Veillard 2004-03-04 13:57:27 UTC
That should be fixed in CVS, can you check ?

Daniel
Comment 8 Daniel Veillard 2004-03-25 11:21:23 UTC
This should be closed by release of libxml2-2.6.8,
                                                                                
  thanks,
                                                                                
Daniel
Comment 9 Alexei Agafonov 2004-03-27 22:14:15 UTC
I've been tested the CVS version for a while under 
1. MS Windows XP, Perl 5.8.0
2. FreeBSD 4.8, Perl 5.6.1
All is is OK, thanks.