After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 684326 - GdmSession crashes reaping workers
GdmSession crashes reaping workers
Status: RESOLVED FIXED
Product: gdm
Classification: Core
Component: general
unspecified
Other Linux
: Normal normal
: ---
Assigned To: GDM maintainers
GDM maintainers
Depends on:
Blocks:
 
 
Reported: 2012-09-18 20:22 UTC by Giovanni Campagna
Modified: 2012-09-18 21:19 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
GdmSession: check conversation before accessing it (3.36 KB, patch)
2012-09-18 20:28 UTC, Giovanni Campagna
committed Details | Review
Clear out session_conversation when the worker exits (1.48 KB, patch)
2012-09-18 20:39 UTC, Giovanni Campagna
committed Details | Review
GdmSession: remove session from the hash table before emitting signals (2.32 KB, patch)
2012-09-18 20:43 UTC, Giovanni Campagna
committed Details | Review
Full slave and session worker debug output (151.54 KB, text/x-log)
2012-09-18 20:53 UTC, Giovanni Campagna
  Details

Description Giovanni Campagna 2012-09-18 20:22:44 UTC
I got GDM to crash testing bug 684320. And I got it to crash in the session-worker reaping the reauth worker and in the simple-slave reaping the dead session-worker.
I think they're related and caused by faulty memory management of GdmSessionConversation, plus a number of race conditions in GLib.

Backtraces and useful debug logs:

----- Crash in gdm-session-worker -----
dm-password][732]: DEBUG(+): GdmSessionWorkerJob: Stopping job pid:18276
gdm-password][732]: DEBUG(+): GdmCommon: sending signal 15 to process 18276
gdm-password][732]: DEBUG(+): GdmSessionWorkerJob: Waiting on process 18276
gdm-password][732]: DEBUG(+): GdmCommon: process (pid:18276) done (signal:15)
gdm-password][732]: DEBUG(+): GdmSessionWorkerJob: SessionWorkerJob died
gdm-password][732]: WARNING: Tried to look up non-existent conversation

0x00007f48a461cba7 in __libc_waitpid (pid=<optimized out>, stat_loc=stat_loc@entry=0x7ffff6c7ebac, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:40
40	  int result = INLINE_SYSCALL (wait4, 4, pid, stat_loc, options, NULL);
  • #0 __libc_waitpid
    at ../sysdeps/unix/sysv/linux/waitpid.c line 40
  • #1 crashlogger_get_backtrace
    at gdm-signal-handler.c line 196
  • #2 gdm_signal_handler_backtrace
    at gdm-signal-handler.c line 223
  • #3 signal_handler
    at gdm-signal-handler.c line 251
  • #4 signal_handler
    at gdm-signal-handler.c line 232
  • #5 <signal handler called>
  • #6 set_pending_query
    at gdm-session.c line 655
  • #7 gdm_session_handle_secret_info_query
    at gdm-session.c line 693
  • #8 ffi_call_unix64
    at ../src/x86/unix64.S line 75
  • #9 ffi_call
    at ../src/x86/ffi64.c line 486
  • #11 g_closure_invoke
    at gclosure.c line 777
  • #13 g_signal_emitv
    at gsignal.c line 3041
  • #15 g_dbus_interface_method_dispatch_helper
    at gdbusinterfaceskeleton.c line 615
  • #17 call_in_idle_cb
    at gdbusconnection.c line 4737
  • #18 g_main_dispatch
    at gmain.c line 2715
  • #19 g_main_context_dispatch
    at gmain.c line 3219
  • #20 g_main_context_iterate
    at gmain.c line 3290
  • #21 g_main_loop_run
    at gmain.c line 3484
  • #22 main
    at session-worker-main.c line 196
  • #0 __libc_waitpid
    at ../sysdeps/unix/sysv/linux/waitpid.c line 40
  • #1 crashlogger_get_backtrace
    at gdm-signal-handler.c line 196
  • #2 gdm_signal_handler_backtrace
    at gdm-signal-handler.c line 223
  • #3 signal_handler
    at gdm-signal-handler.c line 251
  • #4 signal_handler
    at gdm-signal-handler.c line 232
  • #5 <signal handler called>
  • #6 worker_exited
    at gdm-session.c line 1700
  • #8 _g_closure_invoke_va
    at gclosure.c line 840
  • #9 g_signal_emit_valist
    at gsignal.c line 3211
  • #10 g_signal_emit
    at gsignal.c line 3356
  • #11 g_child_watch_dispatch
    at gmain.c line 4603
  • #12 g_main_dispatch
    at gmain.c line 2715
  • #13 g_main_context_dispatch
    at gmain.c line 3219
  • #14 g_main_context_iterate
    at gmain.c line 3290
  • #15 g_main_loop_run
    at gmain.c line 3484
  • #16 main
    at simple-slave-main.c line 254

I have a patch for the first one, but I really don't understand how we can get worker_exited again if we clearly got it once.
Comment 1 Giovanni Campagna 2012-09-18 20:23:59 UTC
Uhm, bugzilla collapsed both into one, but there are comments inside that trace, beyond the functions.
Comment 2 Giovanni Campagna 2012-09-18 20:28:30 UTC
Created attachment 224668 [details] [review]
GdmSession: check conversation before accessing it

Due to the way methods are handled by GDBus, it's possible we get a
method invocation after the worker has already exited and the conversation
freed.
Comment 3 Ray Strode [halfline] 2012-09-18 20:30:24 UTC
grr wish i would have noticed this 5 minutes ago.
Comment 4 Giovanni Campagna 2012-09-18 20:38:45 UTC
Btw, the second crash could be a reentrancy issue, as GdmSession is getting closed (through session-died -> gdm_slave_stop -> gdm_session_close) before the conversation is removed from the hash table.
I'm not sure it is the actual cause, though.
Comment 5 Giovanni Campagna 2012-09-18 20:39:21 UTC
Created attachment 224671 [details] [review]
Clear out session_conversation when the worker exits

The conversation is freed at the end of that callback, so the pointer
would be dangling. Among other things, this fixes a duplicate wtmp record,
as session dispose would find that logout was already recorded.
Comment 6 Giovanni Campagna 2012-09-18 20:43:55 UTC
Created attachment 224672 [details] [review]
GdmSession: remove session from the hash table before emitting signals

GdmSession::session-died and GdmSession::session-exited cause the simple
slave to stop, closing thus also the session, which would then walk all
conversations and free them. But the conversation is already freed at
the end of the worker-exited / worker-died handlers.
Comment 7 Giovanni Campagna 2012-09-18 20:53:03 UTC
Created attachment 224673 [details]
Full slave and session worker debug output
Comment 8 Ray Strode [halfline] 2012-09-18 21:16:14 UTC
alright, i'm going to push these and do a 92.1
Comment 9 Ray Strode [halfline] 2012-09-18 21:19:37 UTC
Attachment 224668 [details] pushed as 013a769 - GdmSession: check conversation before accessing it
Attachment 224671 [details] pushed as 69aa70a - Clear out session_conversation when the worker exits
Attachment 224672 [details] pushed as cb57764 - GdmSession: remove session from the hash table before emitting signals