Bug 745678 – GObject doesn't support removing a weak reference in a GWeakNotify for the same object

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 745678 - GObject doesn't support removing a weak reference in a GWeakNotify for the same object


Summary:	GObject doesn't support removing a weak reference in a GWeakNotify for the sa...


Status:	RESOLVED OBSOLETE

Product:	glib
Classification:	Platform
Component:	gobject
Version:	2.43.x
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	gtkdev
QA Contact:	gtkdev

URL:
Whiteboard:

Duplicates:	749659 749660 (view as bug list)
Depends on:
Blocks:

Reported:	2015-03-05 14:16 UTC by Sébastien Wilmet
Modified:	2018-05-24 17:35 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Test: remove weak reference in a GWeakNotify (for the same object) (2.48 KB, patch) 2015-03-05 14:18 UTC, Sébastien Wilmet	none	Details \| Review
gobject: remove all weak references in a safer manner (5.82 KB, patch) 2015-03-07 15:32 UTC, Sébastien Wilmet	none	Details \| Review

Description Sébastien Wilmet 2015-03-05 14:16:43 UTC

GObject doesn't support removing a weak reference in a GWeakNotify for the same object.

Take for example:
- Three objects: A, B and C.
- Both objects B and C have a weak reference to object A.
- Object B has a strong reference to object C.
- When object A is destroyed, the GWeakNotify of object B is called, which unrefs and destroys object C.
- Object C's dispose() is called, which removes the weak reference to object A (if object C's GWeakNotify hasn't been called).

The latter point isn't supported by GObject, I'll attach a unit test. The backtrace looks like:

(process:1479): GLib-GObject-WARNING **: g_object_weak_unref: couldn't find weak ref 0x400dc1(0x2a)

+ Trace 234786

#0 _g_log_abort
at gmessages.c line 315
#1 g_logv
#2 g_log
#3 g_object_weak_unref
at gobject.c line 2722
#4 remove_other_weak_ref
at references.c line 128
#5 weak_refs_notify
at gobject.c line 2630
#6 g_data_set_internal
at gdataset.c line 407
#7 g_datalist_id_set_data_full
at gdataset.c line 670
#8 g_object_real_dispose
at gobject.c line 1021
#9 g_object_unref
at gobject.c line 3138
#10 main
at references.c line 235

Comment 1 Sébastien Wilmet 2015-03-05 14:18:31 UTC

Created attachment 298642 [details] [review]
Test: remove weak reference in a GWeakNotify (for the same object)

Comment 2 Sébastien Wilmet 2015-03-05 14:24:43 UTC

And there is a real use case.

To make the example a bit more concrete, in the GtkSourceView library: object A is a GtkSourceBuffer, object B is a GtkSourceSearchContext and object C is a GtkTextRegion. When the buffer is destroyed, the search is cleared so it destroys the GtkTextRegion. Both the SearchContext and TextRegion have a weak reference to the buffer.

Comment 3 Sébastien Wilmet 2015-03-07 15:32:50 UTC

Created attachment 298781 [details] [review]
gobject: remove all weak references in a safer manner

To support removing a weak reference in a GWeakNotify callback, for the
same object.

This is of course less efficient, since there are more locks/unlocks,
but correctness is more important than performance.

Comment 4 Allison Karlitskaya (desrt) 2015-03-13 02:19:44 UTC

A few notes:

First, I suspect that the locking is not actually required here -- if the object is in the middle of dying then it's not possible for someone to be holding extra references on it from another thread.

But that sort of brings us to the next point: what you're doing is at least a little bit invalid to begin with.  You're getting your weakref notifies because the object is dead already.  We call that parameter "where_the_object_was" for a reason.  Through a series of steps, you respond to that by making a method call on the object (g_object_weak_unref).  I'm not entirely sure that we want to declare that this is valid behaviour.

We even document this in the reference for GWeakNotify:

"""

Since the object is already being finalized when the GWeakNotify is called, there's not much you could do with the object, apart from e.g. using its address as hash-index or the like.


"""


Perhaps there is some way for you to restructure your code to avoid this problem?

Comment 5 Sébastien Wilmet 2015-03-14 11:00:03 UTC

Ah, indeed the unit test is too simplistic. We need at least three objects to make the code valid, so that the g_object_weak_unref() is called in a dispose() function of one of the objects.

I've found a workaround in GtkSourceView:
https://git.gnome.org/browse/gtksourceview/commit/?id=48aa3ee7f516a785e039af62de42079309ee58a4

What doesn't work is this commit:
https://git.gnome.org/browse/gtksourceview/commit/?h=wip/search-weak-ref&id=1cab23a3029336fea2b27d459325059735248ff8

The clear_search() function is defined here:
https://git.gnome.org/browse/gtksourceview/tree/gtksourceview/gtksourcesearchcontext.c?id=48aa3ee7f516a785e039af62de42079309ee58a4#n558

It calls gtk_text_region_destroy() (GtkTextRegion was meant to be included in GtkTextView but is still a private file in GtkSourceView… and it's not a GObject, but for the purpose of this bug it doesn't matter):
https://git.gnome.org/browse/gtksourceview/tree/gtksourceview/gtktextregion.c?id=48aa3ee7f516a785e039af62de42079309ee58a4#n131

So you see in gtk_text_region_destroy() that g_object_remove_weak_pointer() is called, which itself calls g_object_weak_unref().

The backtrace in GtkSourceView is:

/Search/destroy-buffer-during-search: 
(./tests/test-search-context:902): GLib-GObject-WARNING **: g_object_weak_unref: couldn't find weak ref 0x7fb802c96d6b(0x21be540)

+ Trace 234844

#0 _g_log_abort
at gmessages.c line 315
#1 g_logv
at gmessages.c line 1041
#2 g_log
#3 g_object_weak_unref
at gobject.c line 2722
#4 g_object_remove_weak_pointer
at gobject.c line 2768
#5 gtk_text_region_destroy
at gtktextregion.c line 149
#6 clear_search
at gtksourcesearchcontext.c line 563
#7 buffer_destroyed_cb
at gtksourcesearchcontext.c line 2535
#8 weak_refs_notify
at gobject.c line 2630
#9 g_data_set_internal
at gdataset.c line 407
#10 g_datalist_id_set_data_full
at gdataset.c line 670
#11 g_object_real_dispose
at gobject.c line 1021
#12 gtk_source_buffer_dispose
at gtksourcebuffer.c line 595
#13 g_object_unref
at gobject.c line 3138
#14 test_destroy_buffer_during_search
at test-search-context.c line 1074
#15 test_case_run
at gtestutils.c line 2124
#16 g_test_run_suite_internal
at gtestutils.c line 2185
#17 g_test_run_suite_internal
at gtestutils.c line 2196
#18 g_test_run_suite
at gtestutils.c line 2249
#19 g_test_run
at gtestutils.c line 1553
#20 main
at test-search-context.c line 1113

Comment 6 Allison Karlitskaya (desrt) 2015-05-21 14:38:48 UTC

*** Bug 749659 has been marked as a duplicate of this bug. ***

Comment 7 Allison Karlitskaya (desrt) 2015-05-21 14:42:08 UTC

*** Bug 749660 has been marked as a duplicate of this bug. ***

Comment 8 Christian Hergert 2015-05-21 19:35:09 UTC

I know that technically the object should be considered destroyed, but the patches in bug 749659 and bug 749660 both look like very reasonable implementations.

For that reason alone I think we should consider fixing them. But being able to simplify application code, which is the entire purpose of glib, is a great reason too. With how much this pattern is getting used in the wild, I'd argue it is desirable.

Like signal hanlders, there are certain things you can and cannot do. The patches show can support this, with minimal changes, and a more readable implementation, so why not do it? Saying that from a weak notify you are only allowed to 1) remove weak references 2) remove signal handlers, seems easy enough to follow.

Comment 9 Christian Hergert 2015-05-21 20:20:33 UTC

I forgot to mention why I think fixing this simplifies application code ...

We do a lot of weak references in Builder. There is a lot of complex, moving machinery, various compilers, completion data, etc, and we are very aggressive about getting that memory released as soon as possible.

Having this fixed will allow us to avoid having duplicate release paths to track whether or not we are releasing via weak notifications. So for us, its duplicate paths that are repeated many, many times throughout the codebase.

Additionally, it also makes EggSignalGroup[1] and EggBindingSet[2] significantly easier to wrap your head around. And having those both reliable is essential for us to keep our codebase a reasonable size when dealing with GtkTextView/GtkTextBuffer splits. (We have tons of indirect signal connections watching for changes on gproperties of gproperties).

[1] https://git.gnome.org/browse/gnome-builder/tree/contrib/egg/egg-signal-group.c
[2] https://git.gnome.org/browse/gnome-builder/tree/contrib/egg/egg-binding-set.c

Comment 10 Christian Hergert 2015-05-21 22:38:39 UTC

Based on our IRC discussion, I wanted to add another note here.

One thing we can't do today, is deal with weak references that might cross the boundary of two objects.

 - Object 1 holds a full reference to Object 2 and a weak reference to Object 3
 - Object 2 holds a weak reference to Object 3
 - Object 3 weak notifies Object 1 before Object 2
 - Object 1 releases Object 2
 - Object 2 doesn't know it shouldn't ignore cleanup on Object 3 weak reference

There may of course be alternate design approaches that could be taken here (like plumbing "remove operations" from the indirect objects to steal the weak reference back). Doing that doesn't seem entirely different than this, so maybe worth while to just fix it here.

Using LibIDE requires a "Context" object, and everything is part of that context. So we can regularly have a situation like the aforementioned with objects that depend on shared resources.

Comment 11 Christian Hergert 2015-06-01 20:05:46 UTC

Anyone have a chance to take a closer look at this?

Comment 12 Alexander Larsson 2015-10-22 18:10:09 UTC

I agree with ryan that this is currently invalid, but i also agree with christian that it is something reasonable to request. An object calling weak_unref(FOO) in its finalizer may very well have *no* idea that the object is itself being finalized indirectly because of FOO dying.

If you control all the code in motion you can sort of work around this with manual plumbing, but there is no general solution for arbitrary objects.

Comment 13 Owen Taylor 2015-10-22 19:00:36 UTC

(In reply to Ryan Lortie (desrt) from comment #4)
\ 
> But that sort of brings us to the next point: what you're doing is at least
> a little bit invalid to begin with.  You're getting your weakref notifies
> because the object is dead already.  We call that parameter
> "where_the_object_was" for a reason.  Through a series of steps, you respond
> to that by making a method call on the object (g_object_weak_unref).  I'm
> not entirely sure that we want to declare that this is valid behaviour.
> 
> We even document this in the reference for GWeakNotify:
> 
> """
> 
> Since the object is already being finalized when the GWeakNotify is called,
> there's not much you could do with the object, apart from e.g. using its
> address as hash-index or the like.

I think it's clear that *as documented* this and the duplicate are not allowed. There used to be even stronger attempts to try to reduce what could be done in weak refs like temporarily setting the reference count to zero. On the other hand, I can't think of any reason that fundamentally we can't make the 

 g_object_add_weak_ref() / g_object_run_dispose()

A generic version of:

 g_signal_connect(widget, "destroy"); gtk_widget_destroy()

with similar constraints. The main difference I know of is that if you connect to "destroy" during dispose() it's silently not called unless the object is resurrected, while if you add a weak_reference during dispose() it may be called before finalization even without resurrection. But these are subtle semantic differences. In general what you can do is similar.

If these patches were to land, documentation updates would definitely be required to more precisely say what you can do in a weak reference callback.

I think some of the idea of trying to restrict what weak references could do was the idea that we'd make them thread safe and if arbitrary code was running in the weak reference, that would be impossible; in the end GWeakRef was added instead.

Comment 14 Christian Hergert 2016-05-16 09:53:33 UTC

It appears to me that not improving this is also causing issues. For example, in GBinding, it does not disconnect the source closure in the weak unref case because it is not guaranteed safe to do so.

This can cause an issue if the on_source_notify() closure does actually fire in the mean time. This might happen when the object had it's dispose called but has not yet finalized. (And I'm seeing this in practice fairly regularly).

Still hoping the patches in bug 749659 and bug 749660 will be looked over too.

Comment 15 Christian Hergert 2016-06-15 19:17:34 UTC

Ping for the hackfest, since it would be good to see if we can resolve this here.

Comment 16 Christian Hergert 2016-06-16 16:13:06 UTC

We discussed this a bit at the hackfest, so I'm going to try to summarize some of the difficulties that the current patches don't resolve, but that we'd like to address.

In particular, being able to remove your weak ref (and perhaps GSignals) is not enough. Indirect objects will not be able to know what they can and cannot do during the cleanup phase (since they won't know they are in a post-finalize phase). For example, imagine an object that needs some other accessor function called. It is not clear whether or not that could be called. Plumbing API to check whether you are post-finalize (but pre-free) is not a particularly clean solution either, since it would leak into implementation everywhere.

Our current idea to resolve this (assuming I follow the plan correctly), is the combination of both a weak pointer and a weak callback, which will be specific to GObjects. If we first invalidate the weak pointers, the indirect objects will know that they no longer need to do their cleanup phase when the weak callback is called. If that object has been finalized between the weak pointer clearing and the weak callback phase, the weak callback can be ignored.

Comment 17 Ernestas Kulik 2018-05-22 06:41:46 UTC

Seems that I’ve run into this while porting Nautilus to use gestures for input events (in GTK+ 3 they hold a weak ref to the associated widget and aren’t managed by them, unlike in GTK+ 4), since the lifetime of a certain struct is tied to that of a widget and the gesture is unreffed in the callback. In the interim, adding an idle source just to unref the gesture seems to be working.

Comment 18 GNOME Infrastructure Team 2018-05-24 17:35:33 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/1002.