After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 730296 - gsignal: Fix a potential NULL pointer dereference
gsignal: Fix a potential NULL pointer dereference
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gobject
unspecified
Other All
: Normal normal
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks:
 
 
Reported: 2014-05-17 11:31 UTC by Philip Withnall
Modified: 2017-11-15 12:52 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
gsignal: Fix a potential NULL pointer dereference (1.45 KB, patch)
2014-05-17 11:31 UTC, Philip Withnall
rejected Details | Review
gsignal: add assert on closure invalidate path (1.05 KB, patch)
2014-06-06 16:16 UTC, Allison Karlitskaya (desrt)
committed Details | Review

Description Philip Withnall 2014-05-17 11:31:31 UTC
Patch attached. I'm not certain if this combination of conditions can realistically be triggered in production code, but I think adding the NULL check is safe and should be fairly uncontroversial, given that every other call to handler_lookup() already has one.
Comment 1 Philip Withnall 2014-05-17 11:31:33 UTC
Created attachment 276702 [details] [review]
gsignal: Fix a potential NULL pointer dereference

If g_signal_handlers_destroy() is called on an instance, each Handler is
removed from the g_handler_list_bsa_ht table *before* it is unreffed.
When a Handler is finalised, it drops its reference to its GClosure,
which could then trigger invalid_closure_notify(). This function
previously assumed that the Handler definitely existed in
g_handler_list_bsa_ht, and hence could end up dereferencing a NULL
pointer.

Coverity issue: #1159477
Comment 2 Allison Karlitskaya (desrt) 2014-05-25 10:22:55 UTC
We've had some pretty tricky issues in this code before and saw quite some crashing here for a while (after we brought in the patch that introduced this in order to automatically remove handlers associated with invalidated closures).  I was _fairly_ convinced I finally got it right...

Are you sure this is possible?  A testcase that causes the crash would do a lot to help convince me that there is an issue here.
Comment 3 Philip Withnall 2014-05-26 07:45:44 UTC
(In reply to comment #2)
> Are you sure this is possible?  A testcase that causes the crash would do a lot
> to help convince me that there is an issue here.

As I said in my first comment, I don't know if this crash is possible. I've followed through the code by hand and think I found a path which could cause it, but I'm not really familiar enough with the code to be sure.

I think the patch should be safe in any case, but if you want a test case I'll try and come up with one. Might take a few days through.
Comment 4 Allison Karlitskaya (desrt) 2014-06-06 16:13:21 UTC
I start to believe that this may indeed be possible, but _extremely_ unlikely.

Essentially, we'd need to have (at least) two threads.  Start out with a signal connected to an object via g_signal_connect_object().

In the first thread, we'd try to disconnect the signal handler in the 'usual way'.  Meanwhile, in the second thread, we dispose the object.

Assuming the disposal starts first(ish) we'd notice that we have a closure on the object that we need to destroy (via the quark_closure_array).  No locks held yet.  destroy_closure_array() would call g_closure_invalidate().  Still no locks as we call the pending invalidation notifier invalid_closure_notify().

Meanwhile in the other thread, we're removing the signal handler explicitly.  While we do this we take the SIGNAL_LOCK().  By the time we release this lock, the signal handler is gone.

Back in the disposing thread, we hit SIGNAL_LOCK() inside invalid_closure_notify().  By the time we are able to acquire it, handler_lookup() will indeed be returning NULL.

We have to wonder if it's ever valid to use signals in this way at all: why are we disconnecting the signal handler for an object that is being disposed in another thread?  What would happen if a signal were to fire under this strange combination of circumstances?  

On the topic of the explicit disconnect: how are we doing this?  By id?  By pointer?  If by pointer, how did we get a ref to the almost-dead object in the other thread, and if we had such a ref, why is the object being disposed?What would happen if the object being disposed in the other thread were freed, its memory address reused, a new signal connection added, and then the signal disconnection proceeded in the first thread?

If by id, how do we know that the signal handler ID that we are using to do the connection in the first thread is even still valid if the object may be being destroyed in the other thread?  Disconnecting an invalid ID is a critical, after all...

I therefore think it's likely impossible to write a valid program that would actually hit this bug.  I suspect I'd even have a hard time writing a testcase that hammered signal disconnect vs. dispose from two threads in an to attempt to 'lose' this race without generating reams of (non-bug) criticals as a side effect of my attempts.

Would you be happy if we added an assert and a bug reference with a solicitation for further input if someone actually feels that they have a valid failure?  The assert that's already there will crash in any case, but maybe the explicit assert would help to pacify the analyzer?

There is also the possibility that it's possible to get NULL here from another combination of circumstances that could reasonably exist in a valid program and that I've just missed it...
Comment 5 Allison Karlitskaya (desrt) 2014-06-06 16:16:52 UTC
Created attachment 278046 [details] [review]
gsignal: add assert on closure invalidate path

It's theoretically possible that we could have a case where this would
actually return NULL, but it's difficult to imagine a valid program that
would contain such a case.

Add an explicit assert here to quiet up statis analysis.

See the bug for more discussion.
Comment 6 Philip Withnall 2014-06-15 22:25:56 UTC
Review of attachment 278046 [details] [review]:

(In reply to comment #4)
> Would you be happy if we added an assert and a bug reference with a
> solicitation for further input if someone actually feels that they have a valid
> failure?  The assert that's already there will crash in any case, but maybe the
> explicit assert would help to pacify the analyzer?

Sounds entirely reasonable to me, and I believe it should shut the static analyser up. Thanks for taking the time to think about this so thoroughly.

My only comment would be that it would be helpful to include the original Coverity issue number (#1159477) in the commit message, to make it easier to go back and re-examine the failure path. Also, s/statis/static/ in the commit message.
Comment 7 Philip Withnall 2017-11-15 12:51:57 UTC
Comment on attachment 276702 [details] [review]
gsignal: Fix a potential NULL pointer dereference

(Rejected in favour of the newer patch.)
Comment 8 Philip Withnall 2017-11-15 12:52:46 UTC
Pushed with an amended commit message.

Attachment 278046 [details] pushed as c64b6da - gsignal: add assert on closure invalidate path