GNOME Bugzilla – Bug 161503
Tru64 UNIX: Need not-yet-released libtool (all tests segfault)
Last modified: 2007-02-21 22:54:32 UTC
I'm building libsigc++ 2.0.6 on Tru64 UNIX 5.1b, with vendor C/C++ compilers. Note that it won't compile out of the box, because of the issue I reported in bug #161502. As a temporary workaround I made ``struct is_base_and_derived'' in sigc++-2.0.6/type_traits.h so that both ``big'' and ``test'' are public. I'm not suggesting that as a "fix" for #161502, I'm just using it as a workaround.

Once that was done, the rest of libsigc++ compiled just fine, and all the tests *except* test_lambda compile OK. I just skipped that test, for now. All 19 of the remaining tests run, and then apparently segfault in a destructor. The stack traces all look very similar:

$test_retype
foo(int 1) 1.5 foo(float 5) 25 bar(short 6) foo(int 1) 1.5 foo(float 5) 25 bar(short 6) foo(int 5) 7.5
Segmentation fault (core dumped)
$ladebug .libs/test_retype core
Welcome to the Ladebug Debugger Version 69 (built May 4 2003 for Tru64 UNIX)
------------------
object file name: .libs/test_retype
core file name: core
Reading symbolic information ...done
Core file produced from executable 'lt-test_retype'
Thread terminated at PC 0x3ff81f25a10 by signal SEGV
(ladebug) where
>0 0x3ff81f25a10 in flush(...) in /usr/lib/cmplrs/cxx/libcxxma.so
The version in CVS now builds without changes. I think the segfault only happens when it is built with CXXFLAGS="-std strict_ansi -model ansi".
It also segfaults when using CXXFLAGS="-timplicit_local -model ansi -D__USE_STD_IOSTREAM"
It does not segfault when using CXXFLAGS="-std".
I verified last night that it seems to be the `-model ansi' that's doing it. That's unfortunate, because `-model ansi' may be required by other libraries or later applications, and in that case all libraries need to be built with `-model ansi' (because of the change to name mangling, I suppose). It basically tells the compiler to completely conform to the ANSI/ISO C++ standard with regard to exceptions and name mangling. libsigc++ is the first package I've seen that has this problem with -model ansi. Note that two of the tests that I've tried (test_bind_return, test_slot) now generate unaligned accesses and then hang, rather than coredumping. This is pretty much always caused by a program making incorrect assumptions about data alignment or pointer size. All 17 of the other tests pass. Thanks for the work making this compile without any language/feature selection by the person doing the compile. That's definitely a big improvement.
> This is pretty much always caused by a program making incorrect assumptions about data alignment or pointer size.

Yes, that's my suspicion too. Hopefully it's something simple. However, it is unlikely that I'll have time to fix this, and I'm not familiar with debugging on this platform. Could you try? You can get a pre-release of 2.0.7 here that should build without changes: http://www.murrayc.com/temp/glibmm-2.5.4.tar.gz
Sorry, I meant: http://www.murrayc.com/temp/libsigc++-2.0.6.tar.gz
I have a CVS copy from last night, so I'll try with that. I have debugged problems like this in the past (in C source, not C++), so I'll give it a whirl. The trick is to use the `uac' command to tell the operating system to cause any unaligned access to force a sigbus, and then run the program in a debugger and wait for it to segv. If I do that for test_bind_return, the stack trace looks like:

$ladebug .libs/test_bind_return
Welcome to the Ladebug Debugger Version 69 (built May 4 2003 for Tru64 UNIX)
------------------
object file name: .libs/test_bind_return
Reading symbolic information ...done
(ladebug) run
Thread received signal SEGV stopped at [<opaque> operator <<(void) 0x3ff81f263a0]
Information: An <opaque> type was presented during execution of the previous command. For complete type information on this symbol, recompilation of the program will be necessary. Consult the compiler man pages for details on producing full symbol table information using the '-g' (and '-gall' for cxx) flags.
(ladebug) where
>0 0x3ff81f263a0 in operator <<(...) in /usr/lib/cmplrs/cxx/libcxxma.so
I must admit that I'm pretty weak in C++. It looks like operator << is getting something it doesn't like, in both cases coming through line 91 of adaptor_trait.h. I'll try to look at it more a little later this weekend.
Thanks. A different -g option might give you more information, because it might optimise less stuff away.
All of sigc++ was compiled with `-gall' so it has as much debugging info as it's going to get. The "Information: An <opaque> ..." message is coming because the core is happening in a system library (libcxxma.so), which was not compiled with full debugging. There's nothing I can do about that.
Works for me with:

CC=cc CFLAGS="-O2 -msym -readonly_strings"
CXX=cxx CXXFLAGS="-O2 -readonly_strings -timplicit_local \
-model ansi -D__USE_STD_IOSTREAM"

We have upgraded our version of libtool and made some Makefile.am patches but we didn't modify the source. Tim, if you want to test with our patch, ftp it from: ftp://support.thewrittenword.com/outgoing/libsigc++-2.0.6
BTW, we've built on 4.0D and 5.1.
Thanks Albert. I will try your patch. I generally rebuild the configure machinery and use a slightly-updated libtool (with one of your patches) when I build GNOME packages. I have some more to report about the failures and some of the testing I've done, but it will have to wait until I have some more time to transcribe it.
I promised I would report more on the build status, so here goes. I'm building on Tru64 5.1b, with all vendor patches and the latest updates to the build tools. I'm using cxx -V:

Compaq C++ V6.5-042 for Compaq Tru64 UNIX V5.1B (Rev. 2650)
Compiler Driver V6.5-042 (cxx) cxx Driver

If I build 2.0.6 or 2.0.7 CVS the way I was trying, with maximum ISO/ANSI C++ conformance (cxxflags include both `-std strict_ansi' and `-model ansi'), all tests coredump.

If I build 2.0.7 CVS with no special CXXFLAGS (so the cxx compiler is essentially as modern as it can be while still retaining full backwards compatibility for ARM), I get 17 of 19 tests pass, 2 fail with core dumps, as reported in my comment #7.

If I build 2.0.7 CVS with CXXFLAGS of `-std strict_ansi' (but *NOT* `-model ansi'), I likewise get 17 of 19 tests pass. The difference is that with -std strict_ansi, the sun_forte_workaround isn't used, so the stack trace looks slightly different. Also, I can't load `.libs/test_slot' in the debugger -- I get the following error:

$ladebug .libs/test_slot
Welcome to the Ladebug Debugger Version 69 (built May 4 2003 for Tru64 UNIX)
------------------
object file name: .libs/test_slot
Reading symbolic information ...done
92145:.libs/test_slot: /sbin/loader: Error: libsigc-2.0.so.0: symbol "__7__T_Q2_3std9bad_alloc" unresolved
92145:.libs/test_slot: /sbin/loader: Fatal Error: Load of ".libs/test_slot" failed: Unresolved symbol name
Process has exited with status 1
Error: could not start debuggee

Doing some reading about `bad_alloc', I see it's a type of exception class, and appears to require that `#include <new>' be in the source file. If I actually add #include <new> and SIGC_USING_STD(new) to test_slot.cc and rebuild, then it too passes a `make check', so we're now up to 18 of 19 passing, with only test_bind_return still failing.
Loading .libs/test_bind_return into the debugger shows the exact same problem as test_slot, and if I make the same change to test_bind_return.cc, it too passes its checks, so with the right CXXFLAGS and the inclusion of `<new>' in the two programs that might throw a bad_alloc, all tests pass.

Regarding Albert's comment #10: He and I have communicated about sigc++ in the past, and I tried his suggested CXXFLAGS previously, with no success. I wanted to try again, though, so I first tried just what he's using for CFLAGS and CXXFLAGS, without using his patch. For me, all tests *fail*. Murray's comment #2 basically mirrors my results when using just Albert's CXXFLAGS.

If I try both the CFLAGS and CXXFLAGS and the patch, then all but one of the tests *pass*. Albert's not patching any of the sigc++ C++ files, just the build machinery, so something in his patch (probably libtool related) allows sigc++ 2.0.6 to pass nearly all its tests, even when `-model ansi' is part of CXXFLAGS. I haven't tried his CXXFLAGS and patch against 2.0.7 CVS, since it will be difficult to get the patch to apply to the CVS version. Perhaps after the 2.0.7 version is released, he or I can generate an updated version of the patch against 2.0.7 and test that too.

Ideally, I would like to find out what the problem is with the test suite when I build sigc++ with maximum ISO/ANSI C++ conformance. From my point of view, the `-model ansi' flag is most important, though, because it affects everything before and after it in a library dependency chain. In the meantime, Murray, you may wish to add the #include <new> and SIGC_USING_STD(new) to the two tests I mentioned.
Why don't you compare the build log with and without my patch? Without my patch, are CXXFLAGS being passed through by libtool when a link is being performed (binary and library)?
That explains it. Without your patch, all CXXFLAGS are passed to the C++ compiler for each invocation of libtool `--mode=compile' *but* some of the CXXFLAGS are not passed when `--mode=link' and the link is the creation of the shared library. In particular, `-model ansi' is passed for each source file compilation, but it's not passed when the library is created. That's what's causing the problem.

If I build the software with your CXXFLAGS but leave out your patch, all the tests fail. If I then rebuild the shared library by copying what libtool did and I add the `-model ansi' to the command line, I can re-run the tests (without needing to relink any of the test programs) and 19 of 20 now pass. So, this whole time the problem has been that libtool isn't passing enough of the CXXFLAGS when it actually creates the library. If the object files are created with `-model ansi', then the shared library also needs to be created with that flag. Makes sense.

Note that with or without your patch, libtool still doesn't pass -timplicit_local to cxx when linking the shared library, but I'm not sure if it should. The absence of that from the shared library creation line doesn't seem to be causing the problems that the absence of `-model ansi' does. Albert, do you want to report the problem on the libtool list or should I?
The 2.0 and HEAD branch of libtool already pass -model through. We submitted the patch last year. I don't think passing -timplicit_local through matters.
Excellent. I'll use this patch on this platform in future. We probably can't actually use an unreleased libtool version regularly because the build files will not work with both old and new libtools, and we can't break the GNOME build for everyone who doesn't have the CVS version of libtool. I'll probably put the patch in our cvs, or at least a README.
The libtool developers plan another release of libtool-1.5. The 1.5 branch already has our patch so you can use the next released version of libtool-1.5.
GNU libtool 1.5.12 has been released. Someone will need to check if upgrading to it fixes this bug :)
Try this tarball, built from cvs with libtool 1.5.12: http://www.murrayc.com/temp/libsigc++-2.0.9.tar.gz
It won't build because of a different problem:

cxx -DHAVE_CONFIG_H -I.. -I.. -std strict_ansi -model ansi -O2 -g3 -readonly_strings -pthread -c -MD signal.cc -DPIC -o .libs/signal.o
cxx: Error: /usr/lib/cmplrs/cxx/V6.5-042/include/cxx/list.cc, line 205: "sigc::slot_base &sigc::slot_base::operator=(const sigc::slot_base &)" is inaccessible
detected during instantiation of "std::list<T, Allocator> &std::list<T, Allocator>::operator=(const std::list<T, Allocator> &) [with T=sigc::slot_base, Allocator=std::allocator<sigc::slot_base>]"
while (first1 != last1 && first2 != last2) *first1++ = *first2++;
-----------------------------------------------------------^
cxx: Info: 1 error detected in the compilation of "signal.cc".
What if you build without -std strict_ansi and use the default -std model? Also, libtool-1.5.12 is broken wrt -pthread. Another libtool release should be out by next week with the fix.
I tried it with what you've been using for CXXFLAGS too, and it didn't make a difference:

cxx -DHAVE_CONFIG_H -I.. -I.. -O2 -readonly_strings -model ansi -timplicit_local -D__USE_STD_IOSTREAM -pthread -c -MD signal.cc -DPIC -o .libs/signal.o
cxx: Error: /usr/lib/cmplrs/cxx/V6.5-042/include/cxx/list.cc, line 205: "sigc::slot_base &sigc::slot_base::operator=(const sigc::slot_base &)" is inaccessible
detected during instantiation of "std::list<T, Allocator> &std::list<T, Allocator>::operator=(const std::list<T, Allocator> &) [with T=sigc::slot_base, Allocator=std::allocator<sigc::slot_base>]"
while (first1 != last1 && first2 != last2) *first1++ = *first2++;
-----------------------------------------------------------^

Albert's right about libtool 1.5.12 having a problem with -pthread and another release being forthcoming. I'm not sure whether that's important for sigc++, but Murray might wish to keep that in mind, if it doesn't cause problems for the sigc++ schedule.
Tim, is that the whole error? It should point to a line number in the libsigc++ source as well. Attach it if it's huge. This might be due to the changes after libsigc++ 2.0.9.
Yeah, it is the whole error. I agree it's a little odd. The error is actually happening in /usr/lib/cmplrs/cxx/V6.5-042/include/cxx/list.cc at line 205. The code around line 205 of list.cc looks like:

template <class T, class Allocator>
list<T, Allocator>& list<T, Allocator>::operator= (const list<T, Allocator>& x)
{
    if (this != &x) {
        iterator first1 = begin();
        iterator last1 = end();
        const_iterator first2 = x.begin();
        const_iterator last2 = x.end();
        while (first1 != last1 && first2 != last2) *first1++ = *first2++;
        if (first2 == last2)
            erase(first1, last1);
        else
            insert(last1, first2, last2);
    }
    return *this;
}
slot_base::operator=() is indeed protected. You should try making it public, though I'd like to find out why it is protected.
Tim, does that fix it?
libtool-1.5.14 has been released. This has the -pthread fix.
If I take the 2.0.9 snapshot from your site, patch slot_base.h to make operator=() so that it's not protected, and then update the tarball so that it's using libtool 1.5.14, I can build the source and tests.

With Albert's CXXFLAGS='-O2 -readonly_strings -model ansi -timplicit_local -D__USE_STD_IOSTREAM', I get two test failures (test_disconnect & test_retype_return; test_retype_return hangs and must be killed).

With CXXFLAGS='-std strict_ansi -model ansi -O2 -g3 -readonly_strings -pthread', I get four test failures (test_slot, test_disconnect, test_bind_return, test_retype_return; the last two of which hang and must be killed). The two additional test failures would go away if the source was changed as I mentioned in comment #13.

If the operator=() problem for slot_base.h is fixed and the sources are upgraded to libtool-1.5.14, libsigc++ should build and most of the tests will pass, no matter which CXXFLAGS the person running configure chooses. We still have at least one new test failure that didn't happen with the earlier 2.0.7 CVS sources, but overall I think the build situation has improved. You're now no longer shooting yourself in the foot if you choose `-model ansi' as part of CXXFLAGS.

If you're planning to do a new release of libsigc++ soon, I'll wait for that release and then see if I can figure out what's breaking with test_disconnect and test_retype_return.
Re. these test failures, are you talking about compilation failures, or runtime failures? The additional include that you mention in comment 13 is unlikely to change runtime behaviour.

> We still have at least one new test failure that didn't happen with the earlier
> 2.0.7 CVS sources

Which one is that, with which CXXFLAGS?

> If you're planning to do a new release of libsigc++ soon

We'll do one in a couple of days.
I added the include <new> and USING_STD(new) to test_slot.cc and test_disconnect.cc.
Please try libsigc++ 2.0.10, and open a new bug if there are any remaining problems, even if they are problems with the tests. Well done.
BTW, Tim, 2.0.10 passes test_retype_return if you switch which version of visit_each() you use in sigc++/visit_each.h.
Fails to compile here against libsigc++-2.0.17 with your patch :-/

source='test_slot.cc' object='test_slot.o' libtool=no \
DEPDIR=.deps depmode=tru64 /bin/ksh ../depcomp \
cxx -I. -I. -I.. -I.. -I.. -I/usr/include -O2 -ieee -I/usr/include -ieee -c -o test_slot.o test_slot.cc
cxx: Error: test_slot.cc, line 13: expected an identifier
SIGC_USING_STD(new)
^
cxx: Info: 1 error detected in the compilation of "test_slot.cc".

> uname -a
OSF1 masso V5.1 2650 alpha alpha unknown Tru64
> /bin/cxx -V
Compaq C++ V6.3-008 for Compaq Tru64 UNIX V5.1B (Rev. 2650)
Compiler Driver V6.3-008 (cxx) cxx Driver
What patch?
I meant latest sources (2.0.17).