After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 651942 - DispatchNotifier tries to dereference destroyed Dispatcher
DispatchNotifier tries to dereference destroyed Dispatcher
Status: RESOLVED FIXED
Product: glibmm
Classification: Bindings
Component: threads
unspecified
Other Linux
: Normal critical
: ---
Assigned To: gtkmm-forge
gtkmm-forge
: 676119 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2011-06-05 17:09 UTC by Petr Nohavica
Modified: 2012-05-31 14:05 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Test case (2.52 KB, text/x-c++src)
2011-08-08 17:10 UTC, Petr Nohavica
  Details
Test case 1 (2.69 KB, text/plain)
2012-03-12 16:01 UTC, Kjell Ahlstedt
  Details
Test case 2 (1.67 KB, application/x-compressed-tar)
2012-03-12 16:05 UTC, Kjell Ahlstedt
  Details
patch: Dispatcher: Don't send messages to a deleted Dispatcher. (no ABI break) (9.05 KB, patch)
2012-03-12 18:13 UTC, Kjell Ahlstedt
none Details | Review
patch: Dispatcher: Don't send messages to a deleted Dispatcher. (ABI break) (11.77 KB, patch)
2012-03-12 18:28 UTC, Kjell Ahlstedt
none Details | Review
Don't send a message, meant for a deleted Dispatcher, to a new Dispatcher. (11.62 KB, patch)
2012-04-04 14:41 UTC, Kjell Ahlstedt
none Details | Review

Description Petr Nohavica 2011-06-05 17:09:51 UTC
Consider the following scenario:

1) Thread A constructs (first) Dispatcher object, which leads to construction of per-thread-singleton DispatchNotifier
2) Thread A connects 1. dispatcher to a slot
3) Thread A constructs another Dispatcher, which gets pointer to the already existing instance of DispatchNotifier
4) Thread A connects 2. dispatcher to a slot
5) Thread B calls emit() few times on A's 1. dispatcher
6) Thread C calls emit() few times on A's 2. dispatcher
7) Thread A destroys one of its Dispatchers with the possibility of leaving undelivered messages in DispatchNotifier's pipe (and DispatchNotifier continues to live since there is another Dispatcher referencing it)
8) DispatchNotifier::pipe_io_handler() is called where message from now destroyed Dispatcher is read from the pipe; since the message contains pointer to that Dispatcher and DispatchNotifier dereferences it, SEGFAULT follows

Quick and dirty solution: 

Add std::set into the DispatchNotifier with pointers to valid Dispatchers. Registration / deregistration occurs in Dispatcher 'ctors via (un)reference_instance.
Comment 1 Petr Nohavica 2011-06-15 18:30:23 UTC
Bump.
Comment 2 Murray Cumming 2011-07-19 09:39:45 UTC
Daniel Elstner (CCed) might be interested, but even he probably needs a simple-as-possible test case.
Comment 3 Petr Nohavica 2011-08-08 17:10:56 UTC
Created attachment 193428 [details]
Test case
Comment 4 Petr Nohavica 2011-08-08 17:12:34 UTC
All right, I have created quite simple test case. I hope that it is, together
with the former bug report and the comments inside, comprehensible enough.
Comment 5 Petr Nohavica 2011-09-03 10:03:32 UTC
Ok, are we going to do something about this issue? I've googled for "pipe_io_handler crash" and looks that I'm not the only one having problems - for example see this 4 years old post: http://lists-archives.org/gtk/06593-how-to-clear-pending-events.html
Comment 6 Murray Cumming 2011-09-06 07:23:36 UTC
Daniel, are you interested in looking at Petr's test case in comment #3?
Comment 7 Petr Nohavica 2011-10-11 09:25:46 UTC
Fine, this is my last attempt. If nothing happens, I am going to give up on this and just use my patched version of Glib::Dispatcher.
Comment 8 Murray Cumming 2011-10-11 09:32:10 UTC
(In reply to comment #7)
> If nothing happens, I am going to give up on
> this and just use my patched version of Glib::Dispatcher.

You have a patch for Glib::Dispatcher to fix this problem?
Comment 9 Petr Nohavica 2011-10-11 09:45:32 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > If nothing happens, I am going to give up on
> > this and just use my patched version of Glib::Dispatcher.
> 
> You have a patch for Glib::Dispatcher to fix this problem?

Yes, see 
http://nohous.cz/gitweb/?p=bwfembed2/.git;a=blob;f=src/dispatcher2.h;h=70c70b782a67eec09649311d1189179940a089bf;hb=HEAD and 
http://nohous.cz/gitweb/?p=bwfembed2/.git;a=blob;f=src/dispatcher.cpp;h=33b9b583a5575d334e740ba6a01920297f1bfa6d;hb=HEAD
Comment 10 Petr Nohavica 2011-10-11 09:50:29 UTC
But as I stated in the former report, it's a quick and dirty solution.
If the memory gets reused for another dispatcher, you are delivering
messages of the old dispatcher to handlers of a new one.
Comment 11 Murray Cumming 2011-10-11 09:51:15 UTC
OK, but could you attach an actual patch, even if it's just for explanation.
Comment 12 Petr Nohavica 2011-10-11 09:53:31 UTC
Allright, I will post it later today.
Comment 13 Kjell Ahlstedt 2012-03-12 16:01:59 UTC
Created attachment 209501 [details]
Test case 1

An updated version of the test case in comment 3. Uses classes in namespace
Glib::Threads. I've also removed the dependence on Boost.
Comment 14 Kjell Ahlstedt 2012-03-12 16:05:40 UTC
Created attachment 209502 [details]
Test case 2

The Dispatcher2 class, linked to in comment 9, contains these lines in
DispatchNotifier::pipe_io_handler():

  // we don't emit directly since no more dispatcher events are processed
  // if nested main loop is called from signal handler (don't know why)
  context_->signal_idle().connect_once(
    sigc::mem_fun(data.dispatcher->signal_, &sigc::signal<void>::emit)); //emit

Why would the signal handler start a nested main loop? It's not as far-fetched
as you might think. It happens if the signal handler shows a modal dialog box,
as demonstrated by this test case. It's a highly modified version of the
dialogs/messagedialog example program in the gtkmm tutorial.

The reason dispatcher events are not processed while a nested main loop is
running, is that pipe_io_handler() is connected to an IOSource event source,
and event sources are created with Source::get_can_recurse() == false.
pipe_io_handler() will not be called again before the previous call has
returned.

The solution in Dispatcher2 is to create a new event source from which the
signal handler is called. An alternative, and in my opinion better, solution
is to call Source::set_can_recurse(true).
Comment 15 Kjell Ahlstedt 2012-03-12 18:13:07 UTC
Created attachment 209518 [details] [review]
patch: Dispatcher: Don't send messages to a deleted Dispatcher. (no ABI break)

This patch is similar to Dispatcher2 in comment 9. There are two differences.

1. Instead of a std::set with addresses of all existing Dispatchers there is a
   std::set with addresses of deleted Dispatchers to which there may be
   messages in the pipe. This set is cleared when the pipe is empty. Almost
   always this set will be empty, and a search for an address will be fast.

2. Glib::Source::set_can_recurse(true) is called in DispatchNotifier's
   constructor. (See comment 14.)

This solution suffers from the drawback mentioned in comment 10.
Copied from DispatchNotifier::reference_instance():

  // In the possible but unlikely case that a new dispatcher gets the same
  // address as a newly deleted one, if the pipe still contains messages to
  // the deleted dispatcher, those messages will be delivered to the new one.
  // Not ideal, but perhaps the best that can be done without breaking ABI.
  // The alternative would be to remove the following erase(), and risk not
  // delivering messages sent to the new dispatcher. See bug 651942.
  instance->deleted_dispatchers_.erase(dispatcher);

This patch does not break ABI or API. (I assume that Glib::DispatchNotifier is
not part of ABI.)

I have added two usage rules for Dispatcher in dispatcher.h. These rules are
not really new. They have just not been documented until now.
Comment 16 Kjell Ahlstedt 2012-03-12 18:28:33 UTC
Created attachment 209522 [details] [review]
patch: Dispatcher: Don't send messages to a deleted Dispatcher. (ABI break)

This patch breaks ABI, but not API. It does not suffer from the drawback
mentioned in comment 10 and comment 15. It's an alternative to the patch in
comment 15.

Each Dispatcher contains a pointer to a Dispatcher::Impl with all private data.
When a Dispatcher is deleted, the Dispatcher::Impl is deleted only if the pipe
is empty. Otherwise the pointer to the Dispatcher::Impl is copied to a list in
the DispatchNotifier and the deletion of Dispatcher::Impl is delayed until the
pipe is empty.
Comment 17 Kjell Ahlstedt 2012-03-20 17:57:28 UTC
Is the ABI-preserving patch in comment 15 good enough?
(Apart from some comments. I understand now that comments shall contain a URL
to a bug report instead of just a bug number. And I should add a TODO that a
better solution can be implemented in the next ABI-breaking release of glibmm.)
As I've said, it's not perfect, but it does avoid the segfault that this bug is
about.

Is an ABI-breaking release in sight? If it's not too far ahead, it's perhaps
better to wait for it, and skip the ABI-preserving patch.
Comment 18 Murray Cumming 2012-04-03 09:10:06 UTC
(In reply to comment #17)
> Is the ABI-preserving patch in comment 15 good enough?

At this point you are probably the best judge of that. If it improves things and doesn't make them worse, please push it.

> Is an ABI-breaking release in sight? If it's not too far ahead, it's perhaps
> better to wait for it, and skip the ABI-preserving patch.

I think we might do it around when GTK+ 4 happens. But that won't be particularly soon. I think we should not wait.
Comment 19 Kjell Ahlstedt 2012-04-04 12:27:40 UTC
Pushed the ABI-preserving patch with an added TODO comment, suggesting
implementation of the ABI-breaking solution when ABI can be broken.
http://git.gnome.org/browse/glibmm/commit/?id=0f64aef58ac9d8ad1604800d3ba74bb998f63987
Comment 20 Kjell Ahlstedt 2012-04-04 14:41:06 UTC
Created attachment 211301 [details] [review]
Don't send a message, meant for a deleted Dispatcher, to a new Dispatcher.

This patch replaces the ABI-preserving solution by the ABI-breaking solution.
Probably more useful than the patch in comment 16 now when the ABI-preserving
patch has been pushed. (ChangeLog is not updated. That can easily be added
if and when this patch or a similar one is pushed.)
Comment 21 Kjell Ahlstedt 2012-05-31 14:05:06 UTC
*** Bug 676119 has been marked as a duplicate of this bug. ***