GNOME Bugzilla – Bug 672465
Ownership of gtk_action_create_tool_item return value change
Last modified: 2014-08-30 05:07:58 UTC
The change in http://git.gnome.org/browse/gtk+/commit/?id=06307dd774da77a2d74f5529feee2ca120804e2f to have GtkAction ref the proxy widget means that the return value from e.g. gtk_action_create_tool_item () changed from a floating ref to a unowned reference to an object with refcount 1. This causes problems in e.g. the EggEditableToolbar code. It creates a widget with gtk_action_create_tool_item, which it then parents and then unparents (via gtk_toolbar_set_drop_highlight_item), expecting it to then go away. However, it now doesn't go away because the GtkAction still lives, and then when the action sensitivity changes it gets a callback to: action_sensitive_cb (GtkAction *action, GParamSpec *pspec, GtkToolItem *item) { EggEditableToolbar *etoolbar = EGG_EDITABLE_TOOLBAR (gtk_widget_get_ancestor (GTK_WIDGET (item), EGG_TYPE_EDITABLE_TOOLBAR)); This later crashes due to accessing the NULL ancestor. This signal is connected to using g_signal_connect_object, which used to disconnect it, but now that doesn't happen due to the ref from the action.
*** Bug 672263 has been marked as a duplicate of this bug. ***
So...we have three ways to go on this: 1) Change EggEditableToolbar 2) Revert, leave _get_proxies() as an evil bomb for bindings 3) Try a more involved fix where we only convert floating state when _get_proxies() is called Do we know if multiple programs have broken from this? Hmm...it seems like the fix is conceptually wrong because it's quite reasonable to expect that a *new* widget returned from a method like gtk_action_create_tool_item() is floating, but we've changed it from floating to not-floating.
gtranslator is one app that includes eggeditabletoolbar. I'm sure there are other ones that include it (its libegg after all). Don't know if there is some other code that have similar problems, and its kinda hard to tell because the more common side effect of the change is just a "leak", rather than the elaborate series of events that led to the eggtoolbar crash. Its nasty to break bindings, but as you say its also unexpected to have a widget creation function that doesn't return a floating object. The typical use of gtk_action_create_tool_item() would be to immediately add the proxy in the widget tree and thus sink it, expecting it to die when the widget hierarchy is destroyed. This generally means it'll be sinked before _get_proxies() is called. The problem is when this does not happen. We could sink when _get_proxies() was called the first time, but this makes for very weird lifetime semantics. What exactly is the problem with floating for bindings? Is it the problem only the initial state (non-owned floating), or is it also a problem to have a floating object with refcount > 1?
(In reply to comment #3) > gtranslator is one app that includes eggeditabletoolbar. I'm sure there are > other ones that include it (its libegg after all). Don't know if there is some > other code that have similar problems, and its kinda hard to tell because the > more common side effect of the change is just a "leak", rather than the > elaborate series of events that led to the eggtoolbar crash. True. > Its nasty to break bindings, but as you say its also unexpected to have a > widget creation function that doesn't return a floating object. Yes. > The typical use of gtk_action_create_tool_item() would be to immediately add > the proxy in the widget tree and thus sink it, expecting it to die when the > widget hierarchy is destroyed. This generally means it'll be sinked before > _get_proxies() is called. The problem is when this does not happen. We could > sink when _get_proxies() was called the first time, but this makes for very > weird lifetime semantics. Yeah, doing it in _get_proxies() would be a gross hack. > What exactly is the problem with floating for bindings? Is it the problem only > the initial state (non-owned floating), or is it also a problem to have a > floating object with refcount > 1? I think the issue here is that as a general rule, what bindings do is immediately sink any floating values. This means they take ownership of the floating reference. And that's what creates really confusing lifecycle issues for a binding, because calling _get_proxies() could actually end up destroying an object if the proxy object gets GC'd. Think: for widget in action.get_proxies(): print widget
Created attachment 210252 [details] [review] Revert "GtkAction: Hold a reference to proxy widgets" This reverts commit 06307dd774da77a2d74f5529feee2ca120804e2f. The original change here broke working code in Eggtoolbar. For now we'll just leave gtk_action_get_proxies() as a trap for bindings (and to a lesser extent C authors). I've documented the problem.
(In reply to comment #3) > What exactly is the problem with floating for bindings? Is it the problem only > the initial state (non-owned floating), or is it also a problem to have a > floating object with refcount > 1? (non-owned floating) is the problematic case. Assume following scenario: 1) _get_proxies() returns widget in (floating refc=0) state 2) bindings reaction to (transfer-none) is ref_sink() on the object, because the reference is held by the binding's proxy, so that widget has (refc=1, nonfloating) state 3) binding does not use the object any more, so it releases its reference held by the proxy => (refc=0), widget is destroyed 4) proxy keeps dangling pointer to the destroyed object -> crash is pending See bug #657202 for further details.
One issue with my patch is that I was motivated to fix it by Torsten Schönfeld pointing out the inconsistency problem. It *is* a problem, but on the other hand I don't know what *real world* applications are affected by this.
Jürg Billeter (In reply to comment #6) > > 3) binding does not use the object any more, so it releases its reference held > by the proxy => (refc=0), widget is destroyed > > 4) proxy keeps dangling pointer to the destroyed object -> crash is pending Yes, this actually reveals an interesting issue - the widget can be destroyed by multiple means. This floating issue with bindings is one, but if you simply do: var a = new Gtk.Action(...); var tool = a.create_tool_item(); mywindow.add(tool); // Sinks the ref mywindow.destroy(); // Destroys window and its children, including tool In C, "tool" is now an invalid pointer (and the action "a" still has it on its proxy list). > See bug #657202 for further details. Yeah, thanks for linking to this.
I meant to type above: Jürg Billeter suggests: <juergbi> walters: what i meant is to call g_object_ref, not g_object_ref_sink <juergbi> and in remove_proxy, if the object is still floating after g_object_unref, remove the floating reference as well <juergbi> a bit strange but it should avoid abi issues This would be an interesting approach.
I think though at this point I'd like to just apply my patch to revert; if we want to try the approach in e.g. comment 9 we can do it at the start of the 3.6 cycle.
> var a = new Gtk.Action(...); > var tool = a.create_tool_item(); > mywindow.add(tool); // Sinks the ref > mywindow.destroy(); // Destroys window and its children, including toolIn C, > "tool" is now an invalid pointer (and the action "a" still has it on its proxy list). In this example, tool being invalid is not any different from what would happen if it was created with e.g. gtk_tool_button_new(). Its just a fact of life of how widget lifetimes work in Gtk+. However, the fact that a is still on the action proxy list is a real bug. One solution that doesn't change the lifetime behaviour would be to have a weak ref on the proxy widget so we can catch this case and remove it from the list. That helps the C case, and avoids crashes in the binding case (although we still get weird things in the binding case like calling get_bindings destroying floating proxies).
Jürgs suggestion will avoid crashes, as the widget will not be finalized. But it will still "leak" in the sense that it will live forever if the action doesn't go away, even if the widget is destroyed. This would still cause the eggtoolbar case to crash, because what happened there is that the widget was unparented (but not finalized due to the action ref) and later a GtkAction notify::sensitive handler looked up the proxy widgets hierarchy for a container it expected to be there, then crashing when it got NULL.
Alex, so what should we do for 3.4? Go with the revert?
I'd do the revert + add a weak ref to make things internally consisten (if a bit weird in some edge cases).
Can either of you work out a patch ?
Created attachment 210656 [details] [review] Don't ref proxy from GtkAction, instead use a weak ref The ref was added in 06307dd774da77a2d74f5529feee2ca120804e2f and caused problems with the lifetime changes for floating refs (see bug 672465). This change goes back to the old lifetime behaviour, but additionally makes sure the proxy is removed from the action when it dies.
Created attachment 210657 [details] [review] Disconnect signal before (possibly) finalizing object The eggtoolbar editor got a warning from this code, because the unparenting of the toolbar item finalized it before the disconnect was called.
Review of attachment 210656 [details] [review]: I'm always wary of using weak references with GInitiallyUnowned implementations like GtkWidget; wouldn't it suffice to just use GtkWidget::destroy instead? after all, you want to remove the proxy when the proxy is destroyed.
True, we could use the destroy on the proxy too.
Created attachment 210924 [details] [review] Don't ref proxy from GtkAction, instead disconnect when destroyed The ref was added in 06307dd774da77a2d74f5529feee2ca120804e2f and caused problems with the lifetime changes for floating refs (see bug 672465). This change goes back to the old lifetime behaviour, but additionally makes sure the proxy is removed from the action when it dies.
Here is an alternative version using destroy
GtkAction has been deprecated