GNOME Bugzilla – Bug 746536
dlclose with atfork handlers core dumps
Last modified: 2018-05-24 17:38:16 UTC
This report is meant to bring attention to the issue mentioned in: https://trac.macports.org/ticket/45309 and http://comments.gmane.org/gmane.os.openbsd.bugs/21076 Brief summary of the issue: When a library used by a module adds a new dependency whose initializer registers a child atfork handler, then after the module is closed the system is left with a dangling pointer for that atfork handler. At best this crashes on the child side of fork(); at worst it leads to arbitrary code execution of whatever happens to be at that location in memory at a later time in the process. This issue can be reproduced on OSX and OpenBSD. On the latter, for example, WebKit-based browsers (with p11-kit as a dependency) dump core on any call to fork(). A more detailed analysis for OSX can be found in the ticket above, starting in the later comments: https://trac.macports.org/ticket/45309#comment:47 A more detailed analysis for OpenBSD is in this thread: http://comments.gmane.org/gmane.os.openbsd.bugs/21076 A possible workaround is to not call dlclose(), as done in the following patch: https://trac.macports.org/browser/trunk/dports/devel/glib2-devel/files/patch-gmodule-gmodule-dl.c.diff?rev=127768 This patch is confirmed to work on OpenBSD as well.
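For readers unfamiliar with the mechanism, here is a minimal sketch of how a child atfork handler gets invoked. It is not code from p11-kit or glib: the names on_fork_child, register_child_handler and fork_and_check are made up for illustration, and nothing is unloaded here, so it runs safely. In the failing case the handler would live in a shared object that has since been dlclose()d, and the call made inside fork() would jump to unmapped or reused memory.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Set by the child-side handler. After fork() each process has its own copy. */
static volatile int child_handler_ran = 0;

/* Hypothetical stand-in for the handler a library initializer registers.
 * In the reported bug this function lives in a .so that is later unloaded. */
static void on_fork_child(void)
{
    child_handler_ran = 1;
}

/* Register the child-side handler, the way a library initializer might.
 * Returns 0 on success, an error number otherwise. */
static int register_child_handler(void)
{
    return pthread_atfork(NULL, NULL, on_fork_child);
}

/* Fork and report whether the handler ran in the child.
 * Returns 1 if it ran, 0 if it did not, -1 on error. */
static int fork_and_check(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(child_handler_ran ? 1 : 0); /* child reports via exit status */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling register_child_handler() once and then fork_and_check() shows that the registered pointer is called from inside fork() itself, on the child side, with no further involvement from the library that registered it. That is why a stale registration only shows up at the next fork().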
Adding my +1 here. That issue is 100% reproducible on OpenBSD.
Patching g_module_close() to always be a no-op is definitely wrong. (FTR, it looks like the reason this doesn't affect Linux is because glibc arranges for any pthread_atfork() handlers added by a library to be removed if that library is unloaded. Which is probably a good idea.) I think the general theory is that plugins that can't deal with being unloaded should call g_module_make_resident() on themselves. (They can just re-g_module_open() themselves if they don't have a pointer to the caller's GModule. Or in the case of things using GTypeModule [like gegl], they can call g_type_module_use() on themselves to increase their refcount and ensure they never get unloaded.) I'm not sure what's up in the OpenBSD thread, since libgiognutls.so pulls exactly that trick (g_type_module_use) already. Where is the offending dlclose() coming from here?
> I'm not sure what's up in the OpenBSD thread, since libgiognutls.so pulls
> exactly that trick (g_type_module_use) already. Where is the offending
> dlclose() coming from here?
If I understand correctly and as mentioned in the last post on the thread, p11-kit seems to trigger this on OpenBSD (as well as on OSX!?): "The dlopen()ed gnome-keyring-pkcs11.so calls pthread_atfork() with a function pointer from itself. Later, it gets dlclose()d and unloaded, leaving this a dangling pointer." This seems to be directly related to this change in p11-kit: http://cgit.freedesktop.org/p11-glue/p11-kit/diff/?id=16e25b2890927108ec15297aabb1d86a49792741 Reverting this commit in p11-kit fixes the issue as well. For me (as an end user) the whole issue becomes visible through WebKit, meaning any WebKit-based browser (e.g. surf or vimb) dies whenever it calls fork().
There's additional analysis in this thread - https://marc.info/?t=142463168400001&r=1&w=2
(In reply to mail from comment #3)
> If I understand correctly and as mentioned in the last post on the thread,
> p11-kit seems to trigger this on OpenBSD (as well as on OSX!?):
> "The dlopen()ed gnome-keyring-pkcs11.so calls pthread_atfork() with a
> function pointer from itself. Later, it gets dlclose()d and unloaded,
> leaving this a dangling pointer."
Right, but *why* does it get dlclosed? What is the chain of events that leads to p11-kit being unloaded?
Unfortunately, my debug foo is not good enough to track down the chain of events leading to the offending dlclose(). I tried hard with gdb but failed, too many libs and concurrent threads/processes are involved. I can only say the following:
- WebKit-based browsers which call fork() are affected; I tested it with surf from suckless.org (but vimb and MiniBrowser are also affected)
- p11-kit has the atfork handler which causes the issue
- p11-kit is pulled in via libgiognutls: loading: libp11-kit.so.1.2 required by /usr/local/lib/gio/modules/libgiognutls.so
- while simply starting surf, libp11-kit.so.1.2 is loaded and unloaded multiple times
- also libgiognutls.so is loaded and unloaded multiple times
Ok the issue is about to be fixed on OpenBSD. Unloading a .so will unregister any atfork handlers from that .so. I think the bug can be closed as it is sort of OS-specific.
(In reply to mail from comment #6)
> Unfortunately, my debug foo is not good enough to track down the chain of
> events leading to the offending dlclose(). I tried hard with gdb but failed,
> too many libs and concurrent threads/processes are involved.
Well, I mostly meant "what is the backtrace from when dlclose is called", but I think you answered that:
> - also libgiognutls.so is loaded and unloaded multiple times
That really shouldn't be happening; libgiognutls.so tries to prevent itself from being unloaded...
(In reply to Antoine Jacoutot from comment #7)
> Ok the issue is about to be fixed on OpenBSD.
> Unloading a .so will unregister any atfork handlers from that .so.
> I think the bug can be closed as it is sort of OS-specific.
That doesn't fully fix this problem though, since it exists on OS X too, and who knows how long it will take a fix to arrive there (and for everyone to stop using the unfixed versions of OS X).
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/1012.