GNOME Bugzilla – Bug 751764
segfault in xsltproc on i386
Last modified: 2015-08-08 17:18:36 UTC
Created attachment 306461 [details] one of the input files
I get a segfault on Ubuntu 15.04 i386 with the following command:
    /usr/bin/xsltproc --nonet -o default/docs-xml/manpages/smb.conf.5 /home/ubuntu/autobuild/b22271/samba/docs-xml/xslt/man.xsl default/docs-xml/manpages/smb.conf.5.xml
    $ xsltproc -V
    Using libxml 20902, libxslt 10128 and libexslt 817
    xsltproc was compiled against libxml 20902, libxslt 10128 and libexslt 817
    libxslt 10128 was compiled against libxml 20902
    libexslt 817 was compiled against libxml 20902
I'll attach the files, and the last 1000 lines of the -v output. These are from the Samba master git repo, and work on x86_64.
Created attachment 306462 [details] the other input file
Created attachment 306463 [details] the last 1000 lines of --verbose output
Can you run that command under valgrind and provide the valgrind error output? I'm cloning the samba git repo (slowly, as I'm in China), but I don't have an i386 machine around. Unless I download and install a virtual i386 image (another slow download), I'm not sure I will be able to reproduce. Thanks, Daniel
Indeed, on x86_64 I can't reproduce at all:
    thinkpad:~/Upstream/samba -> valgrind ~/XSLT/xsltproc/xsltproc --nonet -o default/docs-xml/manpages/smb.conf.5 docs-xml/xslt/man.xsl docs-xml/manpages/smb.conf.5.xml
    Element smbconfsection in namespace '' encountered in programlisting, but no template matches.
    ...
    Note: Writing smb.conf.5
    thinkpad:~/Upstream/samba ->
So it runs to the end without valgrind detecting an issue, using the libxml2 and libxslt git code bases. Daniel
I rebuilt and tested with git head libxml2 and libxslt on an armv7l machine (the only 32-bit one I have around), and valgrind passed with the same message but without any error at runtime. Daniel
> Can you run that command under valgrind and provide the valgrind error output ?
No -- it runs fine under valgrind and gdb. (I've rarely seen valgrind say less!) Thanks for the quick response Daniel. Below is a bit of what gdb says about the core dump:
    (gdb) info args
    ctxt = 0xba25d3e8
    op = 0xb86bc818
    (gdb) bt
    (gdb) p *ctxt
    $1 = {cur = 0x0, base = 0x0, error = 0, context = 0xb99bd308, value = 0x0, valueNr = 0, valueMax = 10, valueTab = 0xb9f8ec78, comp = 0xb866a948, xptr = 0, ancestor = 0x0, valueFrame = 0}
    (gdb) p *op
    $2 = {op = XPATH_OP_ARG, ch1 = -1, ch2 = 1, value = 0, value2 = 0, value3 = 0, value4 = 0x0, value5 = 0x0, cache = 0x0, cacheURI = 0x0}
    (gdb) info locals
    equal = -1200896016
    ret = -1200896016
    (gdb) bt 10
+ Trace 235219
Two other things I perhaps should have said: This is an openstack instance. The xsltproc in question is an Ubuntu package and has the patches listed here: http://patches.osdyson.org/package/libxslt/1.1.28-2+dyson1 but none of those look relevant to me. I'll try compiling from upstream.
Bad news for you: if I rebuild the binary locally from git without optimization, statically linking libxml2/libxslt/libexslt, it doesn't crash and runs to the end:
    ubuntu@samba-build-i386-4-32bit:~/autobuild/b22271/samba/bin$ ~/XSLT/xsltproc/xsltproc --nonet -o default/docs-xml/manpages/smb.conf.5 \
    > /home/ubuntu/autobuild/b22271/samba/docs-xml/xslt/man.xsl \
    > default/docs-xml/manpages/smb.conf.5.xml
    Note: Writing smb.conf.5
    ubuntu@samba-build-i386-4-32bit:~/autobuild/b22271/samba/bin$
If I run the system xsltproc under gdb, it doesn't crash either, as you noticed. I also rebuilt xsltproc from sources, this time with -O2 optimization, again generating a binary with libxml2/libxslt/libexslt linked in, and it doesn't crash either. So I'm tempted to say that the binary was misgenerated or there is a problem in the environment. Also, the crash is actually in libxml2 XPath, not in libxslt, so the question is more about the version of libxml2 on the system than the libxslt one. It seems there are a couple of changes to the xpath.c module coming from upstream. Daniel
Thanks again Daniel. Recompiling the libxml Ubuntu package with -O0 fixes it (they use -O3 by default). I have opened a distro bug here: https://bugs.launchpad.net/ubuntu/+source/libxml2/+bug/1471029
It could be related somehow to running virtualized, but that looks unlikely. The stylesheet isn't very complex and it's not like we are exhausting the stack or something, so yes, it feels like a failure of the compiler in optimizing the libxml2 XPath code. The weird thing is that, if you look at the logs from libxslt, the same template led to other XPath evaluations that succeeded, which makes it look even weirder. But yes, at this point I don't see what I could do. Thanks, Daniel
Gah, actually I was wrong about -O0 working in comment #9. (I up-arrowed to ~/XSLT/xsltproc/xsltproc, not /usr/bin/xsltproc).
Looking at 3 crash dumps of the ubuntu packages recompiled with -O0, the segfault occurs in different places each time -- on a push, a call, and a "mov %eax,0x4(%esp)" (that last one in malloc.c). The stack depth differs with every run, somewhere in the range 3000 - 5000, and the stack pointer differs by less than 400000 between main() and the failure.
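As a rough sanity check on the figures in the previous comment, the arithmetic can be sketched as below. The frame counts and the 400000-byte bound are taken from that comment; the 8 MiB stack limit is only an assumption (a common Linux default), since the actual limit on the reporter's instance is not stated in this report.

```python
# Figures quoted in the comment above.
STACK_DELTA_BYTES = 400_000   # upper bound on stack pointer movement, main() to fault
FRAME_COUNTS = (3000, 5000)   # observed range of stack depth at the crash

# ASSUMPTION, not from the report: a typical Linux soft stack limit of 8 MiB.
ASSUMED_STACK_LIMIT_BYTES = 8 * 1024 * 1024


def bytes_per_frame(delta_bytes, frames):
    """Average stack bytes consumed per frame (upper bound, since delta is a bound)."""
    return delta_bytes / frames


def fraction_of_limit(delta_bytes, limit_bytes):
    """Share of the assumed stack limit that the observed stack movement represents."""
    return delta_bytes / limit_bytes


for frames in FRAME_COUNTS:
    avg = bytes_per_frame(STACK_DELTA_BYTES, frames)
    print(f"{frames} frames: at most about {avg:.0f} bytes per frame")

share = fraction_of_limit(STACK_DELTA_BYTES, ASSUMED_STACK_LIMIT_BYTES)
print(f"at most about {share:.1%} of an assumed 8 MiB stack")
```

Under that assumed limit, the observed stack movement would be a small share of the available stack, which may be why the crash looked so puzzling at this point in the thread.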
So something is corrupting the stack pointer. Usually valgrind is very good at finding that kind of error, and here it just works... this is mysterious. Daniel
This has been bugging me more than it should. As it really doesn't seem to have much to do with xsltproc itself, I will make any further reports over on the Ubuntu bug, here: https://bugs.launchpad.net/ubuntu/+source/libxml2/+bug/1471029 The upshot is it is a stack overflow, but the stack seems to be in the wrong place from the beginning.
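Since the conclusion above is a stack overflow, one thing that may be worth checking on an affected machine is the stack size limit that processes actually run with. The sketch below uses Python's standard `resource` module (Unix only) to read the limit for the current process; `describe_limit` is a hypothetical helper written for this illustration, and nothing here is taken from the reporter's own investigation.

```python
import resource


def describe_limit(value):
    """Render an rlimit value as text; RLIM_INFINITY means no limit is set."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value} bytes ({value / (1024 * 1024):.1f} MiB)"


# getrlimit returns a (soft, hard) pair for the calling process.
soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
print("soft stack limit:", describe_limit(soft))
print("hard stack limit:", describe_limit(hard))
```

Child processes normally inherit these limits, so running this from the same shell or build environment that launches xsltproc should reflect what xsltproc itself would get.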
This is caused by a stack overflow related to docbook-xsl's use of recursive templates. Also see https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=750593
*** This bug has been marked as a duplicate of bug 736077 ***