Bug 649433 - /usr/local/libexec/e-calendar-factory SIGSEGV while reading hash_table
Status: RESOLVED FIXED
Product: Evolution Exchange
Classification: Deprecated
Component: Connector
Version: 2.32.x
Hardware: Other FreeBSD
Importance: Normal normal
Target Milestone: ---
Assigned To: Connector Maintainer
QA Contact: Ximian Connector QA
Depends on:
Blocks:
 
 
Reported: 2011-05-05 06:40 UTC by Matthias Apitz
Modified: 2011-05-09 13:42 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Stop loading plugins with RTLD_GLOBAL. (539 bytes, patch)
2011-05-06 14:00 UTC, David Woodhouse

Description Matthias Apitz 2011-05-05 06:40:07 UTC
Clicking 'Calendar' in Evolution (or starting Evolution with -c calendar) crashes e-calendar-factory with SIGSEGV; this is fully reproducible. The gdb backtrace looks like this:

Program received signal SIGSEGV, Segmentation fault.

Thread 29804300 (LWP 100085/initial thread)

  • #0 ??
  • #1 g_hash_table_lookup_node
    at ghash.c line 252
  • #2 g_hash_table_lookup
    at ghash.c line 252
  • #3 e2k_autoconfig_lookup_option
    at _ctype.h line 106
  • #4 e2k_autoconfig_new

Server is up and running...
DEBUG e2k-autoconfig: entered read_config()
DEBUG e2k-autoconfig: enterin g_hash_table_new(e2k_ascii_strcase_hash,
e2k_ascii_strcase_equal): 29ed1f6b, 29ed1f39
DEBUG e2k-autoconfig: right after g_hash_table_new()
DEBUG e2k-autoconfig: read_config() config_options=2995e9b0:
00000008 00000007 00000007 00000000 00000000 2995f460 29ed1f6b 29ed1f39
DEBUG e2k-autoconfig: read_config() return, fd=-1
DEBUG e2k-autoconfig: read_config() config_options=2995e9b0:
00000008 00000007 00000007 00000000 00000000 2995f460 29ed1f6b 29ed1f39

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 29804300 (LWP 100134/initial thread)]
0x29ed1f6b in ?? ()
(gdb) p e2k_ascii_strcase_hash
$1 = {guint (gconstpointer)} 0x2a06d7db <e2k_ascii_strcase_hash>

Now I jump to e2k_ascii_strcase_hash(), which also contains an fprintf,
and its message is printed after the jump:

(gdb) j e2k_ascii_strcase_hash
Continuing at 0x2a06d7e0.
DEBUG e2k-utils.c: e2k_ascii_strcase_hash

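So the problem is that the function pointer for e2k_ascii_strcase_hash() seen in
read_config() has the broken value 0x29ed1f6b, while it should be
0x2a06d7db (the address gdb prints above; the jump resumes at 0x2a06d7e0, 5 bytes past the function entry).

This happens with 2.32.1 and 2.32.3 on FreeBSD 9-CURRENT (HEAD).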

Matthias
Comment 1 Matthias Apitz 2011-05-05 12:18:48 UTC
I checked in /proc/PID/map where the two addresses lie:

(gdb) p *hash_table
$2 = {size = 8, mod = 7, mask = 7, nnodes = 0, noccupied = 0,
  nodes = 0x29972d60, hash_func = 0x29ed1f6b, key_equal_func = 0x29ed1f39,
  ref_count = 1, version = 0, key_destroy_func = 0, value_destroy_func = 0}


(gdb) p e2k_ascii_strcase_hash
$1 = {guint (gconstpointer)} 0x2a06d7db <e2k_ascii_strcase_hash>

the good addr 0x2a06d7db is in the area of the shared lib:
0x2a02c000 0x2a09b000 87 272 0xc4b14908 r-x 2 1 0x0 COW NC vnode /usr/local/lib/evolution-data-server-1.2/extensions/libecalbackendexchange.so NCH -1

the broken address 0x29ed1f6b is not in any region listed in /proc/PID/map
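
The check above amounts to a half-open range test against each mapping. A minimal sketch in C, using only the one mapping quoted above; the struct and function names are made up for illustration, the end address is treated as exclusive (an assumption), and a real check would walk every line of /proc/PID/map:

```c
#include <stdbool.h>
#include <stdint.h>

/* One mapping taken from /proc/PID/map. Hypothetical type for illustration. */
struct map_region {
    uintptr_t start;  /* first mapped address */
    uintptr_t end;    /* one past the last mapped address (assumed exclusive) */
};

/* Return true if addr lies inside the half-open range [start, end). */
bool region_contains(const struct map_region *r, uintptr_t addr)
{
    return addr >= r->start && addr < r->end;
}
```

With the libecalbackendexchange.so mapping (0x2a02c000 to 0x2a09b000), region_contains() is true for 0x2a06d7db and false for 0x29ed1f6b.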

then I checked all shared libs in
/usr/local/lib/evolution-data-server-1.2/extensions/ for the symbol e2k_ascii_strcase_hash():

libecalbackendweather.so
libecalbackendhttp.so
libecalbackendgroupwise.so
libecalbackendfile.so
libecalbackendcontacts.so
libecalbackendcaldav.so
libebookbackendwebdav.so
libebookbackendvcf.so
libebookbackendldap.so
libebookbackendgroupwise.so
libebookbackendgoogle.so
libebookbackendfile.so
libebookbackendexchange.so
0003bf6b T e2k_ascii_strcase_hash
     ^^^
libecalbackendexchange.so
000417db T e2k_ascii_strcase_hash
     ^^^

As you can see, e2k_ascii_strcase_hash() is defined in two shared libs, and the
last digits of its two offsets match the broken address (...f6b) and the correct
address (...7db) respectively. As a wild guess I simply renamed
'libebookbackendexchange.so' to get it out of the way; e-calendar-factory complains about it:

(e-calendar-factory:36266): e-data-server-WARNING **: Cannot open
"/usr/local/lib/evolution-data-server-1.2/extensions/libebookbackendexchange.so"

but otherwise it works fine and can access my calendar data on the
Exchange server.

any comments about this clash in the shared lib mapping?
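
One way to test the clash hypothesis is to subtract each nm offset from the corresponding run-time address: if the pointer really belongs to that library, the result should be a page-aligned load base. A sketch in C, assuming the 32-bit addresses seen in the report and 4 KiB pages; the function names are illustrative only:

```c
#include <stdbool.h>
#include <stdint.h>

/* Load base implied by a run-time address and the symbol's offset
 * inside the shared object (the value printed by nm). */
uint32_t implied_load_base(uint32_t runtime_addr, uint32_t sym_offset)
{
    return runtime_addr - sym_offset;
}

/* Shared objects are mapped on page boundaries; 4 KiB pages are assumed. */
bool is_page_aligned(uint32_t addr)
{
    return (addr & 0xfffu) == 0;
}
```

implied_load_base(0x2a06d7db, 0x000417db) gives 0x2a02c000, which is exactly the start of the libecalbackendexchange.so mapping quoted earlier. implied_load_base(0x29ed1f6b, 0x0003bf6b) gives 0x29e96000, a page-aligned value that is consistent with libebookbackendexchange.so having been mapped there at some point, even though no such mapping is listed any more.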
Comment 2 David Woodhouse 2011-05-06 14:00:51 UTC
Created attachment 187358
Stop loading plugins with RTLD_GLOBAL.

There is no reason for plugins to export symbols to the rest of the process in which they are loaded, and this certainly seems to be a reason for them *not* to.

Although I do think this is a bug in the FreeBSD dynamic linker.

First we load a DSO which provides the symbol. Then we *unload* it, then we load *another* DSO which provides the same symbol. And internal references within that second DSO get resolved to the address where the function *used* to reside in the original DSO. That *has* to be broken, surely?
Comment 3 Matthias Apitz 2011-05-06 14:37:12 UTC
small correction:

The two DSOs providing the same symbol get loaded one after another; then the first(!) one is unloaded again, and the symbol (in our case e2k_ascii_strcase_hash) still points into the now unmapped address space of the unloaded DSO.
Comment 4 David Woodhouse 2011-05-06 16:20:04 UTC
OK, in that case the behaviour of the dynamic linker is probably excusable. When we loaded a second DSO with the *same* symbols as the first, we probably got what we deserve. RTLD_LOCAL is almost certainly the answer. I cannot think of any situation in which we would *want* plugins to be able to export their symbols.

Some kind of incestuous communication between two plugins might try that, perhaps, but quite frankly I'd rather it *failed*.
Comment 5 Matthew Barnes 2011-05-06 17:39:07 UTC
(In reply to comment #2)
> First we load a DSO which provides the symbol. Then we *unload* it, then we
> load *another* DSO which provides the same symbol. And internal references
> within that second DSO get resolved to the address where the function *used* to
> reside in the original DSO. That *has* to be broken, surely?

This is fixed already in 3.0 by splitting the installed backend modules into separate addressbook and calendar directories.  Same type of thing was also wreaking havoc with the GType system.
Comment 6 Matthew Barnes 2011-05-06 17:39:25 UTC
Removing blocker status.
Comment 7 Matthew Barnes 2011-05-06 17:44:43 UTC
The backends need to export certain symbols for initialization and shutdown:

   eds_module_initialize()
   eds_module_shutdown()
   eds_module_list_types()

Sounds to me like RTLD_LOCAL could break that.
Comment 8 Matthias Apitz 2011-05-06 18:39:51 UTC
David's proposed patch works fine so far, without any visible side effects.
Comment 9 David Woodhouse 2011-05-07 12:06:36 UTC
(In reply to comment #7)
> The backends need to export certain symbols for initialization and shutdown:
> 
>    eds_module_initialize()
>    eds_module_shutdown()
>    eds_module_list_types()
> 
> Sounds to me like RTLD_LOCAL could break that.

No, it uses g_module_symbol() (aka dlsym()) for that. It *has* to, otherwise it could never manage to load more than one module.

I can't think of any valid case for wanting RTLD_GLOBAL.
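
The same holds for plain dlopen()/dlsym(): a symbol looked up through the module's own handle is found whether or not the module's symbols were made global. A minimal sketch, not the code from the attached patch; load_entry_point is a hypothetical helper, and with GModule the rough equivalent of RTLD_LOCAL is passing G_MODULE_BIND_LOCAL to g_module_open():

```c
#include <dlfcn.h>
#include <stddef.h>

/* Open a plugin without exporting its symbols to the rest of the
 * process, then look up one entry point by name through the handle.
 * Returns NULL if the plugin or the symbol cannot be found.
 * On success the handle is stored in *handle_out; the caller must
 * dlclose() it once the entry point is no longer needed. */
void *load_entry_point(const char *path, const char *symbol,
                       void **handle_out)
{
    *handle_out = NULL;

    void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        return NULL;

    void *sym = dlsym(handle, symbol);
    if (sym == NULL) {
        dlclose(handle);
        return NULL;
    }

    *handle_out = handle;
    return sym;
}
```

Calling load_entry_point(path, "eds_module_initialize", &handle) once per module would give each module's own initializer, because every lookup goes through that module's handle rather than the process-wide symbol namespace. On some systems the program has to be linked with -ldl.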
Comment 10 David Woodhouse 2011-05-07 12:17:04 UTC
(In reply to comment #5)
> This is fixed already in 3.0 by splitting the installed backend modules into
> separate addressbook and calendar directories.  Same type of thing was also
> wreaking havoc with the GType system.

It's only a partial fix. If multiple modules use the same name for a non-static function or variable, that'll still collide, with no warning at build time.
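
One defensive habit that avoids this class of collision, independent of how the module is loaded, is to give helpers internal linkage so that they never reach the dynamic symbol table. A sketch only: every name below is invented for illustration, and the hash is a generic one, not the e2k implementation:

```c
#include <ctype.h>
#include <stddef.h>

/* Internal helper: `static` gives it internal linkage, so another
 * module defining a function of the same name cannot clash with it. */
static unsigned int demo_ascii_strcase_hash(const char *s)
{
    unsigned int h = 5381u;

    for (; *s != '\0'; s++)
        h = h * 33u + (unsigned int)tolower((unsigned char)*s);
    return h;
}

/* The only externally visible function of this hypothetical module. */
unsigned int demo_plugin_hash_key(const char *key)
{
    return key != NULL ? demo_ascii_strcase_hash(key) : 0u;
}
```

Only demo_plugin_hash_key() would be visible to other modules here; a second module could define its own static demo_ascii_strcase_hash() without either copy being resolved to the other.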
Comment 11 David Woodhouse 2011-05-09 13:42:54 UTC
To ssh://dwmw2@git.gnome.org/git/evolution-data-server
   bdc460e..6d97047  gnome-2-32 -> gnome-2-32
   8dbd88c..f1980b5  gnome-3-0 -> gnome-3-0
   7e186ad..671aac1  master -> master