After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 559501 - SIGSEGV in xmlNanoHTTPMethodRedir()
SIGSEGV in xmlNanoHTTPMethodRedir()
Status: RESOLVED FIXED
Product: libxml2
Classification: Platform
Component: general
2.6.27
Other All
: Normal normal
: ---
Assigned To: Daniel Veillard
libxml QA maintainers
: 554243 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2008-11-05 20:10 UTC by Raphael Prevost
Modified: 2009-08-24 12:19 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
patch (3.59 KB, patch)
2009-08-10 20:52 UTC, Raphael Prevost
none Details | Review
fixed patch..... (3.59 KB, patch)
2009-08-11 20:37 UTC, Raphael Prevost
none Details | Review
nanoFTP fix (3.55 KB, patch)
2009-08-23 12:20 UTC, Raphael Prevost
none Details | Review

Description Raphael Prevost 2008-11-05 20:10:30 UTC
Steps to reproduce:
Code triggering the crash:

    while (! hostess_stop) {

        memset(online, 0, sizeof(online)); count = 0;

        /* parse the feed */
        if (! (feed = xmlIOHTTPOpen(url)) ) {
            debug("HOSTESS: could not download XML feed.\n");
            goto _hostess_continue;
        }

        r = xmlReaderForIO(xmlIOHTTPRead, xmlIOHTTPClose, feed, "", NULL, 0);
        if (! r) {
            debug("HOSTESS: could not parse XML feed.\n");
            xmlIOHTTPClose(feed);
            goto _hostess_continue;
        }

        pthread_mutex_lock(& hostess_lock);

        while (xmlTextReaderRead(r) == 1) {
          (various parsing)
        }
        xmlTextReaderClose(r); xmlFreeTextReader(r);

        (additional stuff)
        pthread_mutex_unlock(& hostess_lock);

_hostess_continue:
        /* sleep until next refresh */
        if (! hostess_stop) sleep(60);
    }


Stack trace:
Stack trace:
Valgrind output:

==11611== Syscall param select(writefds) points to uninitialised byte(s)
==11611==    at 0x458DCD7: select (in /lib/tls/libc-2.3.6.so)
==11611==    by 0x638F463: (within /usr/lib/libxml2.so.2.6.27)
==11611==    by 0x639061B: xmlNanoHTTPMethodRedir (in
/usr/lib/libxml2.so.2.6.27)
==11611==    by 0x6390B59: xmlNanoHTTPMethod (in /usr/lib/libxml2.so.2.6.27)
==11611==    by 0x6390B99: xmlNanoHTTPOpen (in /usr/lib/libxml2.so.2.6.27)
==11611==    by 0x6353D06: xmlIOHTTPOpen (in /usr/lib/libxml2.so.2.6.27)
==11611==    by 0x6A38FED: ???
==11611==    by 0x46040BC: start_thread (in /lib/tls/libpthread-2.3.6.so)
==11611==    by 0x459501D: clone (in /lib/tls/libc-2.3.6.so)
==11611==  Address 0x72A9DF4 is on thread 2's stack
==11611==
==11611== Jump to the invalid address stated on the next line
==11611==    at 0x0: ???
==11611==    by 0x639061B: xmlNanoHTTPMethodRedir (in
/usr/lib/libxml2.so.2.6.27)
==11611==    by 0x6390B59: xmlNanoHTTPMethod (in /usr/lib/libxml2.so.2.6.27)
==11611==    by 0x6390B99: xmlNanoHTTPOpen (in /usr/lib/libxml2.so.2.6.27)
==11611==    by 0x6353D06: xmlIOHTTPOpen (in /usr/lib/libxml2.so.2.6.27)
==11611==    by 0x6A38FED: ???
==11611==    by 0x46040BC: start_thread (in /lib/tls/libpthread-2.3.6.so)
==11611==    by 0x459501D: clone (in /lib/tls/libc-2.3.6.so)
==11611==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==11611==
==11611== Process terminating with default action of signal 11 (SIGSEGV):
dumping core
==11611==  Bad permissions for mapped region at address 0x0
==11611==    at 0x0: ???
==11611==    by 0x639061B: xmlNanoHTTPMethodRedir (in
/usr/lib/libxml2.so.2.6.27)
==11611==    by 0x6390B59: xmlNanoHTTPMethod (in /usr/lib/libxml2.so.2.6.27)
==11611==    by 0x6390B99: xmlNanoHTTPOpen (in /usr/lib/libxml2.so.2.6.27)
==11611==    by 0x6353D06: xmlIOHTTPOpen (in /usr/lib/libxml2.so.2.6.27)
==11611==    by 0x6A38FED: ???
==11611==    by 0x46040BC: start_thread (in /lib/tls/libpthread-2.3.6.so)
==11611==    by 0x459501D: clone (in /lib/tls/libc-2.3.6.so)

Other information:
libxml2 2.6.27.dfsg-2 etch32

Seems like a NULL dereference, it does not occur systematically, seems the xml
feed URL going down helps triggering the crash.
Comment 1 Raphael Prevost 2008-11-05 22:57:35 UTC
Stacktrace obtained with the coredump generated by valgrind (note: some URL details like the domain name and the passed IP parameter have been edited):

  • #0 ??
  • #1 xmlNanoHTTPConnectAttempt
    at nanohttp.c line 904
  • #2 xmlNanoHTTPMethodRedir__internal_alias
    at nanohttp.c line 718
  • #3 xmlNanoHTTPMethodRedir__internal_alias
    at nanohttp.c line 763
  • #4 xmlNanoHTTPMethodRedir__internal_alias
    at nanohttp.c line 750
  • #5 __xmlParserInputBufferCreateFilename
    at xmlIO.c line 2371
  • #6 _foo_hostess
    at plugins/foo/foo_hostess.c line 205
  • #7 start_thread
    from /lib/tls/libpthread.so.0
  • #8 clone
    from /lib/tls/libc.so.6

Comment 2 Raphael Prevost 2008-11-06 16:34:34 UTC
Actually I think this bug results from a memory corruption due to linking with a software using a modified FD_SETSIZE. My software bumps FD_SETSIZE to 32768, and the xmlIOHTTPOpen() starts to fail on load, because it uses select() and thus can not cope with socket descriptors above the system FD_SETSIZE. The crash occurs usually after a few of such failures (error : Operation in progress). You can reproduce easily by using http://cvs.apache.org/~jorton/bumpfd.c to artificially increase the fd numbers. Since a single socket is polled, may I suggest using the FIONREAD ioctl to avoid this issue altogether? It is quite portable (probably not on BeOS or VMS though).
Comment 3 Raphael Prevost 2008-11-06 17:56:49 UTC
http://securityvulns.com/Hdocument669.html

I think this supports the above hypothesis...
Comment 4 Raphael Prevost 2008-11-06 19:11:10 UTC
I recompiled libxml2 using this hack:

#ifdef __linux__
#include <bits/types.h>
#undef __FD_SETSIZE
#define __FD_SETSIZE 32767
#endif

at the beginning of nanohttp.c, and the crash does not happen anymore. This confirms the problem is due to a stack corruption triggered by using select(2) on descriptor numbers > default FD_SETSIZE.

As a conservative solution, I suggest to check for descriptors > to the default FD_SETSIZE, and avoid passing the offending sockets to select(2). A probably better fix would be to get rid of select(2) altogether (i.e. using poll(2) or ioctl(FIONREAD) if available).

Currently, using nanoHTTP is *dangerous* in any application having to handle more than 1024 simultaneous connections.
Comment 5 Raphael Prevost 2008-12-14 17:06:05 UTC
Hi, I just wanted to let you know that I'm now using a hand patched libxml2 using poll(2) instead of select(2) and that everything is working fine. It would be great however if this issue was patched upstream.
Comment 6 Daniel Veillard 2009-08-10 12:22:33 UTC
I discover this now, it's late but if you still have the patch to use
poll instead of select, please send to the list or here,

  thanks,

Daniel
Comment 7 Raphael Prevost 2009-08-10 20:52:29 UTC
Created attachment 140375 [details] [review]
patch
Comment 8 Raphael Prevost 2009-08-10 20:53:49 UTC
Sure, I didn't have the original patch but I rewrote something nicer. If poll() is not available, nanoHTTP will fallback to select but check the socket descriptor. Please review and tell me if it's OK.
Comment 9 Daniel Veillard 2009-08-11 16:12:22 UTC
Thanks !
That looks fine just out of reviewing the patch, but apparently it
just doesn't work :-\

paphio:~/XML -> ./xmllint http://xmlsoft.org/
error : Operation in progress
warning: failed to load external entity "http://xmlsoft.org/"
paphio:~/XML -> 

  Do you think you will have a chance to debug this ? Seems
on EAGAIN or something we need to loop on the operation instead of
failing, but I didn't try to gdb it ... If you can't I will try

Daniel
Comment 10 Raphael Prevost 2009-08-11 20:37:03 UTC
Created attachment 140481 [details] [review]
fixed patch.....
Comment 11 Raphael Prevost 2009-08-11 20:39:22 UTC
Ah, sorry, I did a stupid typo while rewriting the patch, wrote POLLERR instead of POLLOUT in a condition. This new patch must work fine.
Comment 12 Raphael Prevost 2009-08-16 17:16:56 UTC
Please tell me once this patch gets accepted upstream or whether I have to change some other things.
Comment 13 Daniel Veillard 2009-08-23 11:12:59 UTC
It looks fine now, of course that code is a bad mix of conditional
code due to portability issues, so I try to clean it up, but it
looks fine and works, so applied and commited to git,

  thanks a lot !

Daniel

P.S.: nanoftp, while less used has the same kind of problems you may
   want to fix them too if you have the courage :-)
Comment 14 Raphael Prevost 2009-08-23 12:20:34 UTC
Created attachment 141494 [details] [review]
nanoFTP fix

Same fix for nanoFTP
Comment 15 Raphael Prevost 2009-08-23 12:24:58 UTC
Hi, I fixed nanoFTP in the same way, you can test using this command:
./xmllint ftp://ftp.pramberger.at/systems/linux/contrib/rhel5/i386/repodata/repomd.xml                 

I agree it's quite ad-hoc fixes, the best solution would probably be to add a private _xmlWaitOnSocket() function or the like to avoid code duplication and the numerous #ifdef. I'll try to refactor both nanoFTP and HTTP that way when I have the time.
Comment 16 Daniel Veillard 2009-08-24 12:19:08 UTC
*** Bug 554243 has been marked as a duplicate of this bug. ***