GNOME Bugzilla – Bug 684326
GdmSession crashes reaping workers
Last modified: 2012-09-18 21:19:44 UTC
I got GDM to crash testing bug 684320. And I got it to crash in the session-worker reaping the reauth worker and in the simple-slave reaping the dead session-worker. I think they're related and caused by faulty memory management of GdmSessionConversation, plus a number of race conditions in GLib. Backtraces and useful debug logs: ----- Crash in gdm-session-worker ----- dm-password][732]: DEBUG(+): GdmSessionWorkerJob: Stopping job pid:18276 gdm-password][732]: DEBUG(+): GdmCommon: sending signal 15 to process 18276 gdm-password][732]: DEBUG(+): GdmSessionWorkerJob: Waiting on process 18276 gdm-password][732]: DEBUG(+): GdmCommon: process (pid:18276) done (signal:15) gdm-password][732]: DEBUG(+): GdmSessionWorkerJob: SessionWorkerJob died gdm-password][732]: WARNING: Tried to look up non-existent conversation 0x00007f48a461cba7 in __libc_waitpid (pid=<optimized out>, stat_loc=stat_loc@entry=0x7ffff6c7ebac, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:40 40 int result = INLINE_SYSCALL (wait4, 4, pid, stat_loc, options, NULL);
+ Trace 230871
I have a patch for the first one, but I really don't understand how we can get worker_exited again if we clearly got it once.
Uhm, bugzilla collapsed both into one, but there are comments inside that trace, beyond the functions.
Created attachment 224668 [details] [review] GdmSession: check conversation before accessing it Due to the way methods are handled by GDBus, it's possible we get a method invocation after the worker has already exited and the conversation freed.
grr wish i would have noticed this 5 minutes ago.
Btw, the second crash could be a reentrancy issue, as GdmSession is getting closed (through session-died -> gdm_slave_stop -> gdm_session_close) before the conversation is removed from the hash table. I'm not sure it is the actual cause, though.
Created attachment 224671 [details] [review] Clear out session_conversation when the worker exits The conversation is freed at the end of that callback, so the pointer would be dangling. Among other things, this fixes a duplicate wtmp record, as session dispose would find that logout was already recorded.
Created attachment 224672 [details] [review] GdmSession: remove session from the hash table before emitting signals GdmSession::session-died and GdmSession::session-exited cause the simple slave to stop, closing thus also the session, which would then walk all conversations and free them. But the conversation is already freed at the end of the worker-exited / worker-died handlers.
Created attachment 224673 [details] Full slave and session worker debug output
alright, i'm going to push these and do a 92.1
Attachment 224668 [details] pushed as 013a769 - GdmSession: check conversation before accessing it Attachment 224671 [details] pushed as 69aa70a - Clear out session_conversation when the worker exits Attachment 224672 [details] pushed as cb57764 - GdmSession: remove session from the hash table before emitting signals