GNOME Bugzilla – Bug 649433
/usr/local/libexec/e-calendar-factory SIGSEGV while reading hash_table
Last modified: 2011-05-09 13:42:54 UTC
Clicking on 'calendar' in Evolution (or starting Evolution with -c calendar) crashes e-calendar-factory with SIGSEGV; this is fully reproducible. The gdb backtrace looks like this:

Program received signal SIGSEGV, Segmentation fault.
+ Trace 227012
Thread 29804300 (LWP 100085/initial thread)
Server is up and running...
DEBUG e2k-autoconfig: entered read_config()
DEBUG e2k-autoconfig: enterin g_hash_table_new(e2k_ascii_strcase_hash, e2k_ascii_strcase_equal): 29ed1f6b, 29ed1f39
DEBUG e2k-autoconfig: right after g_hash_table_new()
DEBUG e2k-autoconfig: read_config() config_options=2995e9b0: 00000008 00000007 00000007 00000000 00000000 2995f460 29ed1f6b 29ed1f39
DEBUG e2k-autoconfig: read_config() return, fd=-1
DEBUG e2k-autoconfig: read_config() config_options=2995e9b0: 00000008 00000007 00000007 00000000 00000000 2995f460 29ed1f6b 29ed1f39

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 29804300 (LWP 100134/initial thread)]
0x29ed1f6b in ?? ()
(gdb) p e2k_ascii_strcase_hash
$1 = {guint (gconstpointer)} 0x2a06d7db <e2k_ascii_strcase_hash>

Now I jump to e2k_ascii_strcase_hash(), which has an fprintf as well, and it gets printed after the jump:

(gdb) j e2k_ascii_strcase_hash
Continuing at 0x2a06d7e0.
DEBUG e2k-utils.c: e2k_ascii_strcase_hash

So the problem is that the symbol e2k_ascii_strcase_hash in read_config() has the broken value 0x29ed1f6b, while it should be 0x2a06d7db (gdb's jump continues at 0x2a06d7e0, presumably just past the function prologue).

This is with 2.32.1 and 2.32.3 on FreeBSD 9-CURRENT (HEAD).

Matthias
I checked in /proc/PID/map where the two addresses are:

(gdb) p *hash_table
$2 = {size = 8, mod = 7, mask = 7, nnodes = 0, noccupied = 0, nodes = 0x29972d60, hash_func = 0x29ed1f6b, key_equal_func = 0x29ed1f39, ref_count = 1, version = 0, key_destroy_func = 0, value_destroy_func = 0}
(gdb) p e2k_ascii_strcase_hash
$1 = {guint (gconstpointer)} 0x2a06d7db <e2k_ascii_strcase_hash>

The good address 0x2a06d7db is in the area of the shared lib:

0x2a02c000 0x2a09b000 87 272 0xc4b14908 r-x 2 1 0x0 COW NC vnode +/usr/local/lib/evolution-data-server-1.2/extensions/libecalbackendexchange.so NCH -1

The broken address 0x29ed1f6b is not in any area of /proc/PID/map.

Then I checked all shared libs in /usr/local/lib/evolution-data-server-1.2/extensions/ for the symbol e2k_ascii_strcase_hash():

libecalbackendweather.so
libecalbackendhttp.so
libecalbackendgroupwise.so
libecalbackendfile.so
libecalbackendcontacts.so
libecalbackendcaldav.so
libebookbackendwebdav.so
libebookbackendvcf.so
libebookbackendldap.so
libebookbackendgroupwise.so
libebookbackendgoogle.so
libebookbackendfile.so
libebookbackendexchange.so
0003bf6b T e2k_ascii_strcase_hash
     ^^^
libecalbackendexchange.so
000417db T e2k_ascii_strcase_hash
     ^^^

As you can see, e2k_ascii_strcase_hash() is in two shared libs, and the low bits of the two symbol offsets (f6b and 7db) match the low bits of the broken and the correct address respectively.

As a wild guess I simply renamed 'libebookbackendexchange.so' to get it out of the way; e-calendar-factory complains about it:

(e-calendar-factory:36266): e-data-server-WARNING **: Cannot open "/usr/local/lib/evolution-data-server-1.2/extensions/libebookbackendexchange.so"

but for the rest it works fine and can access my calendar data on the Exchange server.

Any comments about this clash in the shared lib mapping?
Created attachment 187358
Stop loading plugins with RTLD_GLOBAL.

There is no reason for plugins to export symbols to the rest of the process in which they are loaded, and this certainly seems to be a reason for them *not* to.

Although I do think this is a bug in the FreeBSD dynamic linker. First we load a DSO which provides the symbol. Then we *unload* it, then we load *another* DSO which provides the same symbol. And internal references within that second DSO get resolved to the address where the function *used* to reside in the original DSO. That *has* to be broken, surely?
Small correction: the two DSOs providing the same symbol get loaded one after another; then the first(!) one is unloaded again, and the symbol (in our case e2k_ascii_strcase_hash) still points into the now-unmapped address space of the unloaded DSO.
OK, in that case the behaviour of the dynamic linker is probably excusable. When we loaded a second DSO with the *same* symbols as the first, we probably got what we deserve. RTLD_LOCAL is almost certainly the answer. I cannot think of any situation in which we would *want* plugins to be able to export their symbols. Some kind of incestuous communication between two plugins might try that, perhaps, but quite frankly I'd rather it *failed*.
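To make the RTLD_LOCAL suggestion concrete, here is a minimal sketch using plain dlopen(); it is not the attached patch. The helper names plugin_dlopen_flags() and open_plugin_local() are made up for illustration. With GModule the rough equivalent would be passing G_MODULE_BIND_LOCAL to g_module_open().

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Flags for loading a plugin: bind symbols immediately (RTLD_NOW) and keep
 * the plugin's symbols out of the process-wide lookup scope (RTLD_LOCAL),
 * so a later-loaded DSO cannot have its references resolved against them. */
int plugin_dlopen_flags(void)
{
    return RTLD_NOW | RTLD_LOCAL;
}

/* Hypothetical helper: open a plugin by path.
 * Returns the dlopen() handle, or NULL if the file cannot be loaded
 * (dlerror() then describes the reason). */
void *open_plugin_local(const char *path)
{
    assert(path != NULL);
    return dlopen(path, plugin_dlopen_flags());
}
```

A caller would keep the returned handle for as long as the plugin is in use and pass it to dlclose() afterwards.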
(In reply to comment #2)
> First we load a DSO which provides the symbol. Then we *unload* it, then we
> load *another* DSO which provides the same symbol. And internal references
> within that second DSO get resolved to the address where the function *used* to
> reside in the original DSO. That *has* to be broken, surely?

This is fixed already in 3.0 by splitting the installed backend modules into separate addressbook and calendar directories. The same type of thing was also wreaking havoc with the GType system.
Removing blocker status.
The backends need to export certain symbols for initialization and shutdown:

eds_module_initialize()
eds_module_shutdown()
eds_module_list_types()

Sounds to me like RTLD_LOCAL could break that.
David's proposed patch works fine so far w/o any visible side effect
(In reply to comment #7)
> The backends need to export certain symbols for initialization and shutdown:
>
> eds_module_initialize()
> eds_module_shutdown()
> eds_module_list_types()
>
> Sounds to me like RTLD_LOCAL could break that.

No, it uses g_module_symbol() (aka dlsym()) for that. It *has* to, otherwise it could never manage to load more than one module.

I can't think of any valid case for wanting RTLD_GLOBAL.
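A rough sketch of why a per-handle lookup is unaffected by RTLD_LOCAL: dlsym() on a specific handle searches that module regardless of whether its symbols are globally visible. The function names find_entry_point() and load_entry_point() are hypothetical, and this uses raw dlopen()/dlsym() rather than the GModule wrappers the real loader goes through.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Look up a named entry point in one specific module handle.
 * Because the search is scoped to the handle, every module can define
 * an entry point with the same name without clashing. */
void *find_entry_point(void *handle, const char *name)
{
    if (handle == NULL || name == NULL)
        return NULL;
    return dlsym(handle, name);
}

/* Hypothetical loader: open a module with RTLD_LOCAL and fetch one entry
 * point from it. Returns NULL if the module or the symbol is missing.
 * On success the module is deliberately left loaded, since the returned
 * address is only valid while the module stays mapped. */
void *load_entry_point(const char *path, const char *name)
{
    void *handle;
    void *sym;

    if (path == NULL)
        return NULL;

    handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        return NULL;

    sym = find_entry_point(handle, name);
    if (sym == NULL)
        dlclose(handle);   /* nothing usable in this module, unload it */
    return sym;
}
```

Calling load_entry_point() once per backend with a name such as "eds_module_initialize" would yield a distinct address for each module, which is the behaviour the reply relies on.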
(In reply to comment #5)
> This is fixed already in 3.0 by splitting the installed backend modules into
> separate addressbook and calendar directories. Same type of thing was also
> wreaking havoc with the GType system.

It's only a partial fix. If multiple modules use the same name for a non-static function or variable, that'll still collide, with no warning at build time.
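One way a module can avoid this kind of collision on its own side, independent of how it is loaded, is to give internal helpers internal linkage. The sketch below is illustrative only: demo_ascii_strcase_hash() and demo_module_hash_key() are made-up names, and the hash is a simple djb2-style function, not the actual e2k implementation.

```c
#include <stddef.h>

/* 'static' gives this helper internal linkage: it does not appear in the
 * module's dynamic symbol table, so another module defining a function
 * with the same name cannot clash with it at load time. */
static unsigned int demo_ascii_strcase_hash(const char *key)
{
    unsigned int h = 5381u;
    const unsigned char *p;

    for (p = (const unsigned char *)key; *p != '\0'; p++) {
        unsigned int c = *p;
        if (c >= 'A' && c <= 'Z')
            c += 'a' - 'A';        /* fold ASCII upper case to lower case */
        h = h * 33u + c;
    }
    return h;
}

/* The only symbol this hypothetical module exports. It carries a
 * module-specific prefix to reduce the chance of a name clash. */
unsigned int demo_module_hash_key(const char *key)
{
    return key != NULL ? demo_ascii_strcase_hash(key) : 0u;
}
```

This only helps for helpers used within a single source file; symbols shared between several files of one module would need a different mechanism, such as compiler-specific visibility controls.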
To ssh://dwmw2@git.gnome.org/git/evolution-data-server
   bdc460e..6d97047  gnome-2-32 -> gnome-2-32
   8dbd88c..f1980b5  gnome-3-0 -> gnome-3-0
   7e186ad..671aac1  master -> master