GNOME Bugzilla – Bug 605728
Custom Gtk::Containers can't auto-remove when a C++ widget is deleted.
Last modified: 2016-07-20 07:20:54 UTC
In GTK+, you can just gtk_widget_destroy() a widget. GtkWidget's dispose handler will then just call gtk_container_remove() to make the container forget about it. This even works if you put a C++ Gtk::Widget in a C GtkContainer. However, if you put a C++ Gtk::Widget in a custom container (Derived from Gtk::Container, for instance), that doesn't happen. That's maybe because of this check: void Widget_Class::dispose_vfunc_callback(GObject* self) { ... if(obj && !obj->_cpp_destruction_is_in_progress()) However, I need to create a simple test case and check that theory.
Created attachment 229481 [details] Modified book/custom/custom_container example. The attached test case is a modified version of the custom_container example in the gtkmm tutorial. (In reply to comment #0) > void Widget_Class::dispose_vfunc_callback(GObject* self) > { > ... > if(obj && !obj->_cpp_destruction_is_in_progress()) This code stops gtk_container_remove() from being called directly from dispose_vfunc_callback(), but that's not a problem, because it's still called indirectly via if(base->dispose) (*base->dispose)(self); further down in dispose_vfunc_callback(). Gtk::Container_Class::remove_callback() is called. Condensed backtrace:
+ Trace 231199
The problem is that Gtk::Container_Class::remove_callback() does not call Gtk::Container_Class::remove_callback_normal(), which therefore does not call MyContainer::on_remove(). The call to remove_callback_normal() is skipped because gtkmm_child_already_deleted is true in remove_callback(). That's reasonable. The child (in this case a Gtk::Label) is being deleted. Gtk::Object::_release_c_instance() calls Gtk::Object::disconnect_cpp_wrapper() before it calls g_object_run_dispose(). But it is a problem. MyContainer is not informed about the deleted child widget, and keeps a pointer to it.
Created attachment 229482 [details] [review] patch: Modify the deletion of widgets. This patch suggests a solution. Gtk::Object::_release_c_instance() and Gtk::Window::_release_c_instance() call disconnect_cpp_wrapper() after they call g_object_run_dispose() or gtk_widget_destroy(). I've tested this patch on the test case in comment 1 and on some of the example programs in the gtkmm tutorial. I haven't found any problem with it, but I realize that the deletion of a Gtk::Widget is a delicate process that can easily go wrong in unusual situations.
See also bug 315874.
Pushed the patch. That fixes this bug. http://git.gnome.org/browse/gtkmm/commit/?id=4f6cc1f73484ae3ea8895bc9d713642b13fd59ef
This commit is causing a crash in Glom at shutdown. It's easy to reproduce by just starting Glom and then clicking the Quit button. I should break it down to find out exactly what triggers the crash, though I'm not sure when I'll find time. This is what valgrind says: ==10741== Invalid read of size 1 ==10741== at 0x5DDDA32: g_type_check_instance_cast (gtype.c:4033) ==10741== by 0x4F84989: Gtk::Window::_release_c_instance() (window.cc:103) ==10741== by 0x4F8493D: Gtk::Window::destroy_() (window.cc:86) ==10741== by 0x4F85ECB: Gtk::Window::~Window() (window.cc:544) ==10741== by 0x4F86018: Gtk::Window::~Window() (window.cc:545) ==10741== by 0x80E0D25: Glom::Application::on_window_hide(Gtk::Window*) (application.cc:79) ==10741== by 0x80E2C17: sigc::bound_mem_functor1<void, Glom::Application, Gtk::Window*>::operator()(Gtk::Window* const&) const (in /opt/gnome/bin/glom) ==10741== by 0x80E2B40: sigc::adaptor_functor<sigc::bound_mem_functor1<void, Glom::Application, Gtk::Window*> >::deduce_result_type<Gtk::Window*&, void, void, void, void, void, void>::type sigc::adaptor_functor<sigc::bound_mem_functor1<void, Glom::Application, Gtk::Window*> >::operator()<Gtk::Window*&>(Gtk::Window*&) const (adaptor_trait.h:89) ==10741== by 0x80E2A9B: sigc::bind_functor<-1, sigc::bound_mem_functor1<void, Glom::Application, Gtk::Window*>, Gtk::Window*, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil>::operator()() (bind.h:1119) ==10741== by 0x80E2958: sigc::internal::slot_call0<sigc::bind_functor<-1, sigc::bound_mem_functor1<void, Glom::Application, Gtk::Window*>, Gtk::Window*, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil>, void>::call_it(sigc::internal::slot_rep*) (slot.h:103) ==10741== by 0x4150EA1: sigc::slot0<void>::operator()() const (slot.h:440) ==10741== by 0x53DD9E4: Glib::SignalProxyNormal::slot0_void_callback(_GObject*, void*) (signalproxy.cc:95) ==10741== by 0x5DBC6C9: g_cclosure_marshal_VOID__VOID (gmarshal.c:85) ==10741== by 0x5DB9C11: g_closure_invoke (gclosure.c:777) ==10741== by 0x5DD5458: signal_emit_unlocked_R (gsignal.c:3654) ==10741== by 0x5DD47B7: g_signal_emit_valist (gsignal.c:3328) ==10741== by 0x5DD4AE0: g_signal_emit (gsignal.c:3384) ==10741== by 0x57CBE65: gtk_widget_hide (gtkwidget.c:4265) ==10741== by 0x57D86F4: gtk_widget_dispose (gtkwidget.c:10774) ==10741== by 0x57E7BE2: gtk_window_dispose (gtkwindow.c:2536) ==10741== by 0x4F84AC7: Gtk::Window_Class::dispose_vfunc_callback(_GObject*) (window.cc:186) ==10741== by 0x5DC01A3: g_object_run_dispose (gobject.c:1062) ==10741== by 0x57CB97A: gtk_widget_destroy (gtkwidget.c:4099) ==10741== by 0x4F849AB: Gtk::Window::_release_c_instance() (window.cc:111) ==10741== by 0x4F8493D: Gtk::Window::destroy_() (window.cc:86) ==10741== by 0x4F85D64: Gtk::Window::~Window() (window.cc:544) ==10741== by 0x4E6C5FC: Gtk::ApplicationWindow::~ApplicationWindow() (applicationwindow.cc:118) ==10741== by 0x8150CC5: GlomBakery::AppWindow_WithDoc_Gtk::~AppWindow_WithDoc_Gtk() (appwindow_withdoc_gtk.cc:63) ==10741== by 0x80E6576: Glom::AppWindow::~AppWindow() (appwindow.cc:120) ==10741== by 0x80E6878: Glom::AppWindow::~AppWindow() (appwindow.cc:139) ==10741== by 0x80E0D25: Glom::Application::on_window_hide(Gtk::Window*) (application.cc:79) ==10741== by 0x80E2C17: sigc::bound_mem_functor1<void, Glom::Application, Gtk::Window*>::operator()(Gtk::Window* const&) const (in /opt/gnome/bin/glom) ==10741== by 0x80E2B40: sigc::adaptor_functor<sigc::bound_mem_functor1<void, Glom::Application, Gtk::Window*> >::deduce_result_type<Gtk::Window*&, void, void, void, void, void, void>::type sigc::adaptor_functor<sigc::bound_mem_functor1<void, Glom::Application, Gtk::Window*> >::operator()<Gtk::Window*&>(Gtk::Window*&) const (adaptor_trait.h:89) ==10741== by 0x80E2A9B: sigc::bind_functor<-1, sigc::bound_mem_functor1<void, Glom::Application, Gtk::Window*>, Gtk::Window*, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil>::operator()() (bind.h:1119) ==10741== by 0x80E2958: sigc::internal::slot_call0<sigc::bind_functor<-1, sigc::bound_mem_functor1<void, Glom::Application, Gtk::Window*>, Gtk::Window*, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil, sigc::nil>, void>::call_it(sigc::internal::slot_rep*) (slot.h:103) ==10741== by 0x4150EA1: sigc::slot0<void>::operator()() const (slot.h:440) ==10741== by 0x53DD9E4: Glib::SignalProxyNormal::slot0_void_callback(_GObject*, void*) (signalproxy.cc:95) ==10741== by 0x5DBC6C9: g_cclosure_marshal_VOID__VOID (gmarshal.c:85) ==10741== by 0x5DB9C11: g_closure_invoke (gclosure.c:777) ==10741== by 0x5DD5458: signal_emit_unlocked_R (gsignal.c:3654) ==10741== by 0x5DD47B7: g_signal_emit_valist (gsignal.c:3328) ==10741== by 0x5DD4AE0: g_signal_emit (gsignal.c:3384) ==10741== by 0x57CBE65: gtk_widget_hide (gtkwidget.c:4265) ==10741== by 0x4F7A21E: Gtk::Widget::hide() (widget.cc:5889) ==10741== by 0x80E0CA5: Glom::Application::create_window(Glib::RefPtr<Gio::File> const&) (application.cc:73) ==10741== by 0x80E0D4B: Glom::Application::on_activate() (application.cc:87) ==10741== by 0x524EADF: Gio::Application_Class::activate_callback(_GApplication*) (application.cc:653) ==10741== by 0x5DBC731: g_cclosure_marshal_VOID__VOIDv (gmarshal.c:115) ==10741== by 0x5DBA2BF: g_type_class_meta_marshalv (gclosure.c:997) ==10741== by 0x5DB9E81: _g_closure_invoke_va (gclosure.c:840) ==10741== Address 0x625c260a is not stack'd, malloc'd or (recently) free'd
I'll have a look at it. It's probably my fault.
Created attachment 242606 [details] [review] glom patch: AppWindow_WithDoc_Gtk::on_hide(): Call base class. This glom patch fixes the problem. I realize that you prefer a gtkmm patch. I'll see what I can do. I'd also like to understand why it has worked before without calling gtk_window_hide().
Created attachment 242778 [details] [review] patch: gmmproc: Fix _WRAP_SIGNAL(custom_c_callback) for void func(). (glibmm) To fix the crash in Glom with a modification of gtkmm, I need to make custom C callbacks for Gtk::Widget's hide signal. That requires a change in gmmproc. _WRAP_SIGNAL(..., custom_c_callback) does not work properly for signal handlers that take no arguments and return void. This patch fixes that. There is a small risk with this patch: Some mm package might depend on the present bug in _WRAP_SIGNAL(..., custom_c_callback), and consider it correct behaviour. I have checked some packages (atkmm, glibmm, goocanvasmm, grilomm, gtkmm, gtksourceviewmm, libgdamm, libnotifymm, pangomm). Of those, only gtkmm contains _WRAP_SIGNAL directives with a custom_c_callback argument.
Created attachment 242780 [details] [review] patch: Gtk::Widget: Don't call signal_hide handlers on a widget being deleted. This patch fixes the crash in Glom. It depends on the previous glibmm patch. The original problem in this bug report was that the connection between a C object and its C++ wrapper was broken too early during deletion of a C++ object. My commit in comment 4 fixed that, but at the same time it created the opposite problem: If a signal handler deletes the object that emits the signal, and the deletion triggers another emission of that signal, the signal handler can be called recursively. Gtk::Widget's hide signal is at risk. It can be triggered by deletion, and it can be reasonable to 'delete widget' in a signal_hide handler. Shall I push these two patches, and if so, in which branches? Or would it be better to revert the commit in comment 4? I need to do something with gtkmm. Even if Glom is fixed with the patch in comment 7, other programs may be affected by this bug. (In reply to comment #7) > I'd also like to understand why it has worked before without calling > gtk_window_hide(). It worked because gtk_window_hide() _is_ called both with and without the commit in comment 4. I've checked with gdb.
Review of attachment 242778 [details] [review]: That seems safe to commit then. Thanks.
Please go ahead and push the gtkmm patch to master (We have not yet branched for gtkmm-3-6 branch). I think that's the only branch that has this problem. Then we can see how it goes for a while before doing a gtkmm 3.6.1 release. Many many thanks for all this hard work.
I've pushed the glibmm patch and the gtkmm patch https://git.gnome.org/browse/glibmm/commit/?id=cd18cbff8b3679ae6fe3429fb7119251241521fe https://git.gnome.org/browse/gtkmm/commit/?id=d51d12e6084ad70f89bcc68b970426b7a13ec899 I added a sentence to the commit message of the gtkmm patch, saying that it requires the fix of _WRAP_SIGNAL(..., custom_c_callback). I suppose you mean gtkmm 3.8 and 3.8.1 in comment 11.
> I suppose you mean gtkmm 3.8 and 3.8.1 in comment 11. Yes, sorry.
Kjell, your fix is in the recent glibmm and gtkmm releases, so I'll close this again. Thanks.
Since 2013-05-02 there's a gtkmm 3.8.1 release at http://ftp.gnome.org/pub/GNOME/sources/gtkmm/3.8/, but there's no trace of it at https://git.gnome.org/browse/gtkmm/.
The git repository has now been updated with the gtkmm 3.8.1 release.
Would this need any change in gtkmm-2.4 (most recently gtkmm 2.24) which is probably being used with newer glibmm versions?
I guess you ask because of John Emmas's problems with the simplest possible gtkmm application, https://mail.gnome.org/archives/gtkmm-list/2013-November/msg00000.html I doubt that this bug, or its fixes, are related to those problems. (The fixes are not even included in gtkmm2.) Like I say in https://mail.gnome.org/archives/gtkmm-list/2013-November/msg00017.html, I can't reproduce John's crash, or anything similar, on my Linux system.
I can confirm that the above problem isn't directly related to this bug. Yesterday however I found a variation on my original report which I believe IS being caused by this bug. The problem doesn't occur with a regular gtkmm window - but it occurs if I DERIVE a window from a gtkmm window. I've given an example in this bug report:- https://mail.gnome.org/archives/gtkmm-list/2013-November/msg00038.html This problem was discovered in gtkmm v2.24.4 (sorry, not 2.24.2 as I mistakenly reported yesterday) - so I believe Murray is right in believing that this should also get fixed in gtkmm 2.24. Hope that helps. John
Oops - I should have mentioned that in addition to Gtk::Widget::on_hide() this bug can also affect Gtk::Widget::on_unrealize()
I don't see any particular reason to think that this is the problem you are having. I still have general doubt in your *mm libraries that you have apparently built on Windows directly from git rather than from using the tarballs. I have repeatedly said that this is not a good idea. However, if you are convinced that this fix would help you, you could just try applying it yourself to test your theory.
Hi Murray. My belief is based solely on Kjell's remark (Comment #9) that during widget deletion the link between a C object and it's C++ wrapper was getting broken to early. I can't say if it's definitely the same reason - but that's definitely the same *problem* that I've reported on the mailing list:- https://mail.gnome.org/archives/gtkmm-list/2013-November/msg00042.html BTW, I have no particular objection to building from tarballs but I've already reported on the mailing list that the same issues persist, even if I build from the tarball sources. These are problems in the actual code - not in the way I'm building it. My specific point (here) is that in addition to 'Gtk::Widget::on_hide()' the problem also affects 'Gtk::Widget::on_realize()'. Can anyone confirm if the patch would fix both situations? If not, there's little point in me applying it.
You are the only one able to reproduce this problem. Therefore you are the only one who could test if this fixes it. Please try.
Okay, leave it with me. I'm busy for the next few hours but I'll try to apply the patches today if possible. Next question... which patches are we talking about? Do the patches in Comment #12 cover everything or is it more convoluted than that?
I think you'll want to try adding these two gtkmm-3.0 commits to gtkmm-2.4: commit 4f6cc1f73484ae3ea8895bc9d713642b13fd59ef Author: Kjell Ahlstedt <kjell.ahlstedt@bredband.net> Date: Wed Feb 27 16:44:13 2013 +0100 Modify the deletion of widgets. * gtk/gtkmm/object.cc: _release_c_instance(): Call disconnect_cpp_wrapper() after g_object_unref() and g_object_run_dispose(). * gtk/src/window.ccg: _release_c_instance(): Call disconnect_cpp_wrapper() after gtk_widget_destroy(). Bug #605728. Bug #315874. commit d51d12e6084ad70f89bcc68b970426b7a13ec899 Author: Kjell Ahlstedt <kjell.ahlstedt@bredband.net> Date: Mon Apr 29 13:12:35 2013 +0200 Gtk::Widget: Don't call signal_hide handlers on a widget being deleted. * gtk/src/widget.[hg|ccg]: Add custom C callbacks for signal_hide. Don't call signal_hide handlers or an on_hide() override on a widget while it is being deleted. This requires the latest glibmm from git master, where a bug in _WRAP_SIGNAL(..., custom_c_callback) has been fixed. Bug #605728. You'll probably have to adjust them slightly to get them to apply.
I applied the above 2 patches and I'm pleased to confirm that they do seem to fix the crashing. There was one small snag though... The first patch makes some changes to gtk/gtkmm/object.cc. In particular, it marks the function Object::callback_weak_notify_() as not being needed any more. In v2.24.4 that function used to be called Object::callback_destroy_(). I wasn't sure whether to leave it unchanged, leave it but rename it to the updated name or just remove it altogether. In the end, I left it unchanged. I'm not sure if that was the right decision or even if it matters.
Created attachment 261388 [details] [review] patch: Modify the deletion of widgets (gtkmm-2-24) Here's a patch for the gtkmm-2-24 branch. It contains modified versions of the patches in comment 2 and comment 9. There is a complication, if this patch shall be pushed to the gtkmm-2-24 branch: It depends on the glibmm patch in comment 8. That patch is included in glibmm 2.37.4 and later. BUT, as we've seen in bug 697835 and bug 700495, many applications will break if a gtkmm 2 release is built with a new version of gmmproc. I think the best solution, if we have to apply this patch to gtkmm 2.24, will be to apply the glibmm patch in comment 8 to the version of gmmproc that will be used for building gtkmm.
John's problems were not related to this bug. See https://mail.gnome.org/archives/gtkmm-list/2013-December/msg00002.html I see no good reason to apply the patch in comment 27 to gtkmm 2.24. If you want to discuss and/or fix problems with __declspec(dllimport) in Windows, use bug 719847.
Hello Kjell. Let me check that you aren't getting mixed up between two different deletion problems. I reported 2 x deletion issues in close succession. The most recent of them is the one you referred to in Comment #28 above. I discovered it while deleting a Gtk::SpinButton control and it was specifically a problem with __declspec(dllimport) (ironically, I'd created that problem myself). While tracking it down though, I discovered other inconsistencies with __declspec() and I suggested that the various devs should liaise to devise a more consistent scheme. I do think that needs to be done as soon as possible and I'd be willing to contribute with suggestions / patches if needed BUT that particular problem has no connection with any patches in this bug report. HOWEVER.... Prior to that deletion crash I reported a completely separate deletion crash. This one occurred when deriving an object from either Gtk::Window or Gtk::Dialog. It occured when attempting to delete the derived object. That deletion crash was fixed by applying the patches (from Comment #27) to v2.24 Hope that clarifies the difference between them.
You have shown several very simple programs that have crashed when you run them, but have worked normally when I run them. See e.g. https://mail.gnome.org/archives/gtkmm-list/2013-November/msg00000.html https://mail.gnome.org/archives/gtkmm-list/2013-November/msg00038.html At least in one case your problem has been with g_object_steal_qdata((GObject*)gobj(), Glib::quark_); not removing Glib::quark_, just as in your SpinButton example. See e.g. https://mail.gnome.org/archives/gtkmm-list/2013-November/msg00022.html I think this has been your real problem all the time. You have just hidden that problem by applying patches that have made some of your test cases skip the call to g_object_steal_qdata(). What happens if you fix your problem with __declspec(dllimport) in glibmm, and use gtkmm 2.24.4 from a tarball without any patches? Won't all your test cases then run normally?
(In reply to comment #30) > > I think this has been your real problem all the time. You have just hidden > that problem by applying patches that have made some of your test cases skip > the call to g_object_steal_qdata(). > > What happens if you fix your problem with __declspec(dllimport) in glibmm, > and use gtkmm 2.24.4 from a tarball without any patches? Won't all your > test cases then run normally? I'll give that a try in the next day or two.
Hello Kjell. As you'll recall, I had 2 x different crashing issues, namely (1) a crash related to Gtk::SpinButton and (2) a crash when deriving a dialog from Gtk::Dialog. During the past couple of days I've carried out some comprehensive testing of these problems. I've tested the code in four specific states, namely:- a) My local code (as it was before I applied the patches) b) My local code (immediately after applying the patches) c) The code as it exists in the 2.24 tarball d) The code as it exists in current git (branch gtkmm-2-24) In all four cases I started by re-introducing the declspec() problem that eventually fixed my SpinButton issue - i.e. to make sure the problems came back (which they did). I then corrected the declspec() bug. With the declspec() issue fixed I can't reproduce the crash in any of my four test cases. So it looks like your diagnosis must be right. The declspec() problem was the culprit all along! It was just a coincidence that applying those patches happened to fix the original Gtk::Dialog issue. This raises an obvious question though... Given that the purpose of the patches is to ensure that an underlying 'C' object can't get deleted before it's C++ wrapper gets deleted, why would version 3 need to get patched but not version 2? Is there some reason why that scenario can't happen in version 2?
The patches in this bug report are not needed for your simple test cases to work, neither in gtkmm 2 nor in gtkmm 3. If you read the original description and the first few comments in this bug report, you'll see that they discuss a more unusual case with a custom container, derived directly from Gtk::Container. Such programs probably don't work perfectly in gtkmm 2, since the patches have not been applied there (except in our local copies).
Am I right to assume that there are no plans to backport these patches to gtkmm 2.4? Ardour is now using a custom container derived directly from Gtk::Container, and as indicated in comment 23, things don't work correctly without explicit tracking of child destruction, which is cumbersome.
FWIW, I fixed the issues in Ardour by using sigc::trackable::add_destroy_notify() to notify the (custom) container that the child is being destroyed.
I generally avoid touching gtkmm-2.4, partly because it's awkward even to build it, but maybe if you made the patch and tried it with your code base.
Paul - are you sure you meant comment #23 (it's just a request for me to test the patch)