GNOME Bugzilla – Bug 752202
gnome-screensaver-dialog crashed with SIGSEGV in g_signal_handler_disconnect
Last modified: 2015-09-01 13:43:46 UTC
Created attachment 307191 [details] Backtrace In Ubuntu 15.10 development, GLib 2.45.3, gnome-screensaver 3.6.1, all be it rather heavily patched... The attached backtrace was obtained from a crash which occurs when attempting to unlock the screen. Ubuntu and specifically Unity, uses gnome-screensaver when either Orca or an on-screen keyboard are being used, due to issues with Unity's own lock screen. To reproduce: In Ubuntu 15.10, log into a Unity session, and enable Orca with alt + super + s. 2. Let the screen lock after the defined period of time. 3. Press a key to get the gnome-screensaver-dialog password prompt. 4. Once either pressing enter or clicking unlock, notice the screen go blank again. 5. Pressing a key causes gnome-screensaver to restart gnome-screensaver-dialog. I reverted back to Ubuntu's most recent GLib 2.45.2 package, and the problem was not present.
For me I don't see a crash, but the screensaver fails to unlock. I bisected this using gnome-screensaver in Debian unstable. 916297be799ee001b4a214cc52c3b960bb0b5deb is the first bad commit commit 916297be799ee001b4a214cc52c3b960bb0b5deb Author: Matthias Clasen <mclasen@redhat.com> Date: Sat Sep 20 01:08:32 2014 -0400 Add a global signal handler table Add a global lookup table for signal handlers. We already give them a unique ID, so there is no good reason to pay for non-constant lookups when disconnecting handlers. https://bugzilla.gnome.org/show_bug.cgi?id=737009 :040000 040000 348d2f3941e81ce58937cbdcfd12ce7dddd131c6 78141134246018c22b0bc0783c2157cdc1574f71 M gobject
(In reply to Iain Lane from comment #1) > For me I don't see a crash, but the screensaver fails to unlock. > > I bisected this using gnome-screensaver in Debian unstable. Are you getting some error messages in the session log? Is gnome-screensaver trying to disconnect signals that do not exist or a source id, instead of using g_source_remove()?
I just attached gdb to the dialog process and it's segfaulting here.
+ Trace 235258
$12 = (HandlerList *) 0x0 ...is it okay for handler_list_lookup to return NULL here? If so the code should check it, otherwise we need to find out why it is NULL.
so glib should warn rather than crashing, but: > keymap_handler = > g_signal_connect (keymap, ... > g_signal_handler_disconnect (plug, keymap_handler); s/plug/keymap/
Created attachment 307486 [details] [review] gsignal: Warn instead of crashing when attempting to remove a signal handler from the wrong instance
(In reply to Dan Winship from comment #4) > > > g_signal_handler_disconnect (plug, keymap_handler); > > s/plug/keymap/ Yes. I have this fixed locally, can't file a bug though since g-s is closed on bz. I've pinged someone. (In reply to Dan Winship from comment #4) > so glib should warn rather than crashing Like #5? Not sure if we should return here.
Review of attachment 307486 [details] [review]: Looks generally okay to me. ::: gobject/gsignal.c @@ +638,3 @@ + else + { + itype = G_TYPE_FROM_INSTANCE (instance); You could declare itype here. @@ +641,3 @@ + g_warning ("%s: signal id '%u' is invalid for instance '%p' of type '%s'", + G_STRLOC, signal_id, instance, + g_type_name (itype)); Shouldn't we bail out, instead of falling through?
(In reply to Emmanuele Bassi (:ebassi) from comment #7) > @@ +641,3 @@ > + g_warning ("%s: signal id '%u' is invalid for instance '%p' > of type '%s'", > + G_STRLOC, signal_id, instance, > + g_type_name (itype)); > > Shouldn't we bail out, instead of falling through? Don't know. We'd miss the unref and free at the end. We're already kind of committed to try and remove the handler in this case - if we want to change that then we might need to detect it earlier on (e.g. in g_signal_handler_disconnect)? (I know almost nothing about this internal implementation of signals FWIW.)
should add a test to gobject/tests/signals.c too
Created attachment 307557 [details] [review] gsignal: Warn instead of crashing when attempting to remove a signal handler from the wrong instance Without the second hunk here the testcase crashes when it unrefs 'object' due to an assertion failure in handler_unref_R. So I think we do actually need to bail out earlier and not actually disconnect the signal at all. I've kept the other check for safety.
Created attachment 307558 [details] [review] Test that disconnecting from the wrong type of object warns
Created attachment 307559 [details] [review] Move signal tests to gtester so that we can use g_test_expect_message
Comment on attachment 307558 [details] [review] Test that disconnecting from the wrong type of object warns > tests/gobject/signals.c | 39 ++++++++++++++++++++++++++++++++++++--- oh, oops, I said "gobject/tests/signals.c", not "tests/gobject/signals.c"... gobject/tests/ is new-style well-organized tests, while tests/gobject/ is old and frequently on crack (and we ought to be killing them off as we verify that they're redundant with other tests...)
(In reply to Dan Winship from comment #13) > Comment on attachment 307558 [details] [review] [review] > Test that disconnecting from the wrong type of object warns > > > tests/gobject/signals.c | 39 ++++++++++++++++++++++++++++++++++++--- > > oh, oops, I said "gobject/tests/signals.c", not "tests/gobject/signals.c"... > gobject/tests/ is new-style well-organized tests, while tests/gobject/ is > old and frequently on crack (and we ought to be killing them off as we > verify that they're redundant with other tests...) oh man, I didn't notice that one...
Created attachment 307561 [details] [review] Test that disconnecting from the wrong type of object warns How about this then?
Comment on attachment 307557 [details] [review] gsignal: Warn instead of crashing when attempting to remove a signal handler from the wrong instance >@@ -631,7 +631,15 @@ handler_unref_R (guint signal_id, >+ g_warning ("%s: signal id '%u' is invalid for instance '%p' of type '%s'", handler_unref_R() should only be getting called in circumstances where it knows that the handler came from the given instance. So this ought to be a "can't happen", and should at most have a "g_assert (hlist != NULL)". (But just crashing seems fine too.) >@@ -2586,7 +2594,7 @@ g_signal_handler_disconnect (gpointer instance, > > SIGNAL_LOCK (); > handler = handler_lookup (instance, handler_id, 0, 0); >- if (handler) >+ if (handler && handler_list_lookup (handler->signal_id, instance)) That looks weird; it seems like handler_lookup() should be returning NULL in this case. Indeed, the way it is now, you can pass the wrong instance to g_signal_handler_block() / g_signal_handler_unblock() / g_signal_handler_is_connected() too, and they'll just ignore the error rather than warning. So I think it's handler_lookup() that needs the additional check. I guess it could just call handler_list_lookup() itself? (Or inline the equivalent bits.) You'd have to run gobject/tests/signal-handler with "-m perf" before and after to make sure that doesn't destroy the performance advantages of the signal hash. (Additionally, handler_lookup() should probably be split into two different functions, one that takes instance and handler_id, and one that takes instance, closure, and signal_id_p.)
Comment on attachment 307561 [details] [review] Test that disconnecting from the wrong type of object warns this looks good to commit as soon as the other patch is ready
(In reply to Dan Winship from comment #16) > Comment on attachment 307557 [details] [review] [review] > gsignal: Warn instead of crashing when attempting to remove a signal handler > from the wrong instance > > >@@ -631,7 +631,15 @@ handler_unref_R (guint signal_id, > > >+ g_warning ("%s: signal id '%u' is invalid for instance '%p' of type '%s'", > > handler_unref_R() should only be getting called in circumstances where it > knows that the handler came from the given instance. So this ought to be a > "can't happen", and should at most have a "g_assert (hlist != NULL)". (But > just crashing seems fine too.) OK. > > >@@ -2586,7 +2594,7 @@ g_signal_handler_disconnect (gpointer instance, > > > > SIGNAL_LOCK (); > > handler = handler_lookup (instance, handler_id, 0, 0); > >- if (handler) > >+ if (handler && handler_list_lookup (handler->signal_id, instance)) > > That looks weird; it seems like handler_lookup() should be returning NULL in > this case. > > Indeed, the way it is now, you can pass the wrong instance to > g_signal_handler_block() / g_signal_handler_unblock() / > g_signal_handler_is_connected() too, and they'll just ignore the error > rather than warning. > > So I think it's handler_lookup() that needs the additional check. I guess it > could just call handler_list_lookup() itself? I think that adding this back would negate the improvements the hash was meant to confer: it'd amount to g_hash_table_lookup (g_handler_list_bsa_ht, instance) which is what we had before. What if I add the instance as a member of the Handler struct? We can then check this explicitly in handler_lookup, in the first if block there (or add it to the key hash function, but I'm less confident of getting that right).
Created attachment 307829 [details] [review] gsignal: Warn instead of crashing when attempting to remove a signal handler from the wrong instance Something like this is what I'm talking about.
Review of attachment 307829 [details] [review]: ::: gobject/gsignal.c @@ +251,3 @@ GQuark detail; guint signal_id; + gpointer instance; You should probably look with pahole that adding a pointer here does not create a hole in the structure. I'd probably push it to the end of the Handler structure. @@ +444,1 @@ key.sequential_number = handler_id; You could save the ret->instance == instance check later by changing the handler_equal() function to check for both the sequential number and the instance fields. @@ +444,2 @@ key.sequential_number = handler_id; + ret = (Handler *)g_hash_table_lookup (g_handlers, &key); No need to cast the returned pointer. @@ +445,3 @@ + ret = (Handler *)g_hash_table_lookup (g_handlers, &key); + + if (ret && (ret->instance != instance)) // we're connected to something else I would flip the if, and remove the else: if (ret != NULL && ret->instance == instance) return ret; return NULL; But if you change the handler_equal() function, this becomes: key.instance = instance; key.sequential_number = handler_id; return g_hash_table_lookup (g_handlers, &key); @@ +641,3 @@ { hlist = handler_list_lookup (signal_id, instance); + g_assert_nonnull (hlist); g_assert_nonnull() is GTest API, and has interesting side effects, so you should use g_assert() directly.
Created attachment 307836 [details] [review] gsignal: Warn instead of crashing when attempting to remove a signal handler from the wrong instance Cheers for the review - how about this? I checked with pahole and didn't see a hole there either way but having it at the end is fine. It looks like this: struct _Handler { gulong sequential_number; /* 0 8 */ Handler * next; /* 8 8 */ Handler * prev; /* 16 8 */ GQuark detail; /* 24 4 */ guint signal_id; /* 28 4 */ guint ref_count; /* 32 4 */ guint block_count:16; /* 36:16 4 */ guint after:1; /* 36:15 4 */ guint has_invalid_closure_notify:1; /* 36:14 4 */ /* XXX 14 bits hole, try to pack */ GClosure * closure; /* 40 8 */ gpointer instance; /* 48 8 */ /* size: 56, cachelines: 1, members: 11 */ /* bit holes: 1, sum bit holes: 14 bits */ /* last cacheline: 56 bytes */ };
Review of attachment 307836 [details] [review]: Commmit message seems misleading now - there's no warning added in this patch.
It'll now trip the already existing warning. But feel free to reword - "Don't crash when operating on signals on the wrong object" or something.
Any more comments on the code? Should I reword the commit still?
If you want to reword and push, that would be nice.
Pushed, thanks! commit 16721468e5410732f2575be35652ece538587b94 Author: Iain Lane <iain@orangesquash.org.uk> Date: Wed Jul 15 17:01:03 2015 +0100 gsignal: Don't crash when operating on signals on the wrong object commit 261250c46e3eab9b54c9cc59b405a69785a65b35 Author: Iain Lane <iain@orangesquash.org.uk> Date: Thu Jul 16 15:38:21 2015 +0100 Test that disconnecting from the wrong thing warns and doesn't crash This broke in 916297be799ee001b4a214cc52c3b960bb0b5deb (≥ 2.45.3)