After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 642787 - MetaWindowActor has a dangling reference to its MetaWindow
MetaWindowActor has a dangling reference to its MetaWindow
Status: RESOLVED FIXED
Product: mutter
Classification: Core
Component: general
unspecified
Other Linux
: Normal normal
: ---
Assigned To: mutter-maint
mutter-maint
: 637045 644753 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2011-02-19 23:54 UTC by Joe Barnett
Modified: 2011-04-05 12:28 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
MetaWindowActor: Use a weak pointer for corresponding MetaWindow (1.88 KB, patch)
2011-03-02 22:58 UTC, Colin Walters
rejected Details | Review
MetaWindowActor: keep a reference to the MetaWindow (1.60 KB, patch)
2011-03-08 03:44 UTC, Owen Taylor
committed Details | Review

Description Joe Barnett 2011-02-19 23:54:30 UTC
Occasionally get a crash when exiting the overview; here's backtrace output from one instance of a crash.

running on ubuntu 10.10 + xorg-edgers packages for latest open source ati/radeon driver for a "VGA compatible controller: ATI Technologies Inc Broadway PRO [Mobility Radeon HD 5800 Series]"


[Thread 0x7fffc4555700 (LWP 31140) exited]
Window manager warning: Log level 8: _gather_pid_callback: assertion `data != NULL' failed
[New Thread 0x7fffc4555700 (LWP 31186)]
(gnome-control-center:31185): user-accounts-cc-panel-DEBUG: calling 'ListCachedUsers'

(gnome-control-center:31185): user-accounts-cc-panel-WARNING **: Failed to run apg: Failed to execute child process "apg" (No such file or directory)

(gnome-control-center:31185): user-accounts-cc-panel-WARNING **: Failed to load faces: Error opening directory '/home/jbarnett/gnome-shell/install/share/pixmaps/faces': No such file or directory
(gnome-control-center:31185): user-accounts-cc-panel-DEBUG: Got 0 users

[Thread 0x7fffc4555700 (LWP 31186) exited]
Window manager warning: Log level 16: invalid uninstantiatable type `(null)' in cast to `GObject'
Window manager warning: Log level 8: g_object_get_data: assertion `G_IS_OBJECT (object)' failed
Window manager warning: Log level 8: g_ascii_strncasecmp: assertion `s2 != NULL' failed

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff1b78ae4 in ?? () from /lib/libc.so.6
(gdb) t a a bt

Thread 1 (Thread 0x7ffff7fba960 (LWP 30578))

  • #0 ??
    from /lib/libc.so.6
  • #1 g_typelib_get_dir_entry_by_gtype
    at girepository/gitypelib.c line 246
  • #2 find_by_gtype_foreach
    at girepository/girepository.c line 576
  • #3 g_hash_table_foreach
    at ghash.c line 1328
  • #4 g_irepository_find_by_gtype
    at girepository/girepository.c line 616
  • #5 gjs_define_object_class
    at gi/object.c line 1261
  • #6 gjs_object_from_g_object
    at gi/object.c line 1531
  • #7 gjs_value_from_g_argument
    at gi/arg.c line 1840
  • #8 gjs_invoke_c_function
    at gi/function.c line 651
  • #9 js_Invoke
    from /usr/lib/xulrunner-1.9.2.13/libmozjs.so
  • #10 ??
    from /usr/lib/xulrunner-1.9.2.13/libmozjs.so
  • #11 js_Invoke
    from /usr/lib/xulrunner-1.9.2.13/libmozjs.so
  • #12 ??
    from /usr/lib/xulrunner-1.9.2.13/libmozjs.so
  • #13 ??
    from /usr/lib/xulrunner-1.9.2.13/libmozjs.so
  • #14 js_Invoke
    from /usr/lib/xulrunner-1.9.2.13/libmozjs.so
  • #15 ??
    from /usr/lib/xulrunner-1.9.2.13/libmozjs.so
  • #16 ??
    from /usr/lib/xulrunner-1.9.2.13/libmozjs.so
  • #17 js_Invoke
    from /usr/lib/xulrunner-1.9.2.13/libmozjs.so
  • #18 ??
    from /usr/lib/xulrunner-1.9.2.13/libmozjs.so
  • #19 ??
    from /usr/lib/xulrunner-1.9.2.13/libmozjs.so
  • #20 js_Invoke
    from /usr/lib/xulrunner-1.9.2.13/libmozjs.so
  • #21 ??
    from /usr/lib/xulrunner-1.9.2.13/libmozjs.so
  • #22 JS_CallFunctionValue
    from /usr/lib/xulrunner-1.9.2.13/libmozjs.so
  • #23 gjs_call_function_value
    at gjs/jsapi-util.c line 1151
  • #24 gjs_closure_invoke
    at gi/closure.c line 267
  • #25 closure_marshal
    at gi/value.c line 128
  • #26 g_closure_invoke
    at gclosure.c line 767
  • #27 signal_emit_unlocked_R
    at gsignal.c line 3252
  • #28 g_signal_emit_valist
    at gsignal.c line 2993
  • #29 g_signal_emit
    at gsignal.c line 3040
  • #30 clutter_actor_event
    at ./clutter-actor.c line 8087
  • #31 emit_event
    at ./clutter-main.c line 2201
  • #32 emit_pointer_event
    at ./clutter-main.c line 2228
  • #33 _clutter_process_event_details
    at ./clutter-main.c line 2525
  • #34 _clutter_process_event
    at ./clutter-main.c line 2564
  • #35 _clutter_input_device_set_actor
    at ./clutter-input-device.c line 546
  • #36 _clutter_input_device_update
    at ./clutter-input-device.c line 732
  • #37 _clutter_process_event_details
    at ./clutter-main.c line 2491
  • #38 _clutter_process_event
    at ./clutter-main.c line 2564
  • #39 _clutter_stage_process_queued_events
    at ./clutter-stage.c line 722
  • #40 clutter_clock_dispatch
    at ./clutter-master-clock.c line 360
  • #41 g_main_dispatch
    at gmain.c line 2440
  • #42 g_main_context_dispatch
    at gmain.c line 3013
  • #43 g_main_context_iterate
    at gmain.c line 3091
  • #44 g_main_loop_run
    at gmain.c line 3299
  • #45 main
    at core/main.c line 707
(gdb) call gjs_dumpstack()

== Stack trace for context 0xd73450 ==
0 [native frame]
1 anonymous(animate = true, actor = [0xd497c0 MetaWindowActor]) ["/home/jbarnett/gnome-shell/source/gnome-shell/js/ui/windowManager.js":272]
2 anonymous([object Object]) ["/home/jbarnett/gnome-shell/source/gnome-shell/js/ui/windowManager.js":123]
3 anonymous([object Object]) ["/home/jbarnett/gnome-shell/install/share/gjs-1.0/lang.js":110]
4 _emit(name = "showing") ["/home/jbarnett/gnome-shell/install/share/gjs-1.0/signals.js":124]
5 anonymous() ["/home/jbarnett/gnome-shell/source/gnome-shell/js/ui/overview.js":584]
6 anonymous() ["/home/jbarnett/gnome-shell/source/gnome-shell/js/ui/overview.js":513]
7 anonymous() ["/home/jbarnett/gnome-shell/source/gnome-shell/js/ui/overview.js":638]
8 anonymous([0x10c6830 ClutterRectangle], [object _private_Clutter_Event]) ["/home/jbarnett/gnome-shell/source/gnome-shell/js/ui/panel.js":863]
9 anonymous([0x10c6830 ClutterRectangle], [object _private_Clutter_Event]) ["/home/jbarnett/gnome-shell/install/share/gjs-1.0/lang.js":110]


(gdb) q
Comment 1 Giovanni Campagna 2011-02-20 12:54:09 UTC
It seems that actor.get_meta_window() is returning something that is not a GObject, probably a finalized one, thus making GJS believe it is a new GType (G_TYPE_INVALID, most likely), and looking up introspection info fails with a segfault.

Currently MetaWindowActor does not reference its MetaWindow. It is ever possible that outside code finalizes MetaWindow without notifying MetaWindowActor? As I read the code, priv->window is only set when setting the meta-window GObject property, so it is indeed possible, if the lifetime of MetaWindowActor is not predictable enough.
If that changed, would it create a reference cycle? Should it add a weak pointer / weak reference?
Comment 2 Dan Winship 2011-02-21 15:00:00 UTC
Joe: does this happen after using the (X) icon to close windows from the overview maybe?

(In reply to comment #1)
> Currently MetaWindowActor does not reference its MetaWindow. It is ever
> possible that outside code finalizes MetaWindow without notifying
> MetaWindowActor?

It seems like if you have a non-instantaneous window-destroy animation (which we do), then during the animation, the MetaWindow will have been finalized, but the MetaWindowActor's priv->window will still point to the finalized memory.

> If that changed, would it create a reference cycle? Should it add a weak
> pointer / weak reference?

I don't think we have to worry about reference cycles here, since the object lifetime is controlled by external events; as long as MetaWindowActor unrefs its priv->window when its destroy effect completes (rather than from dispose/finalize), then there shouldn't be a leak.

(Alternatively, we could just set priv->window to NULL from meta_window_actor_destroy(), but that would probably break code elsewhere (like in the stack trace above) that assumes that actor.get_meta_window() will never return NULL.)
Comment 3 Joe Barnett 2011-02-21 16:05:21 UTC
haven't been able to discern a pattern yet, but possible the (X) is related... will see if I can figure anything out...
Comment 4 Owen Taylor 2011-02-21 16:22:19 UTC
(In reply to comment #2)
> (Alternatively, we could just set priv->window to NULL from
> meta_window_actor_destroy(), but that would probably break code elsewhere (like
> in the stack trace above) that assumes that actor.get_meta_window() will never
> return NULL.)

Though we'd still have the issue that code in MetaWindow is likely not expecting to be called on a destroyed/unmanaged X window... e.g., if you tried to move the window it might go ahead and move a new window that the X id was recycled for.

My initial instinct is that it's better to deal with that rather than have priv->window NULL:

 - you might want to know what window the actor *was* rather than have it be an empty shell
 - References to a metaWindow can be saved in JS anyways
Comment 5 Giovanni Campagna 2011-02-21 18:02:57 UTC
(In reply to comment #4)
> (In reply to comment #2)
> > (Alternatively, we could just set priv->window to NULL from
> > meta_window_actor_destroy(), but that would probably break code elsewhere (like
> > in the stack trace above) that assumes that actor.get_meta_window() will never
> > return NULL.)
> 
> Though we'd still have the issue that code in MetaWindow is likely not
> expecting to be called on a destroyed/unmanaged X window... e.g., if you tried
> to move the window it might go ahead and move a new window that the X id was
> recycled for.
> 
> My initial instinct is that it's better to deal with that rather than have
> priv->window NULL:
> 
>  - you might want to know what window the actor *was* rather than have it be an
> empty shell
>  - References to a metaWindow can be saved in JS anyways

meta_window_actor_destroy calls clutter_actor_destroy (at some point), which does g_object_run_dispose. Unless you want to unref the MetaWindow in finalize, which is the opposite of what Dan said, you need to deal with NULL windows at the JS level, or avoid accessing disposed MetaWindowActors.
Comment 6 Dan Winship 2011-02-21 18:42:09 UTC
Another possibility would be to get rid of the "window is destroyed but MetaWindowActor still exists" state, by actually destroying the MetaWindowActor when its MetaWindow is destroyed, but reparenting its contents out into another container first, and letting the destroy animation act on that actor (which it would destroy itself when it was done). This would involve non-MetaWindowActors in the window_group though, so it could create new kinds of lossage.
Comment 7 Owen Taylor 2011-02-28 22:20:17 UTC
(In reply to comment #5)
> >  - you might want to know what window the actor *was* rather than have it be an
> > empty shell
> >  - References to a metaWindow can be saved in JS anyways
> 
> meta_window_actor_destroy calls clutter_actor_destroy (at some point), which
> does g_object_run_dispose. Unless you want to unref the MetaWindow in finalize,
> which is the opposite of what Dan said, you need to deal with NULL windows at
> the JS level, or avoid accessing disposed MetaWindowActors.

Well, depends on whether the MetaWindowActor has been destroyed yet or nor. My assumption was that it hadn't. After the MetaWindowActor has been destroyed, I don't think there is any interest in knowing what it pointed to.
Comment 8 Owen Taylor 2011-02-28 22:21:55 UTC
(In reply to comment #6)
> Another possibility would be to get rid of the "window is destroyed but
> MetaWindowActor still exists" state, by actually destroying the MetaWindowActor
> when its MetaWindow is destroyed, but reparenting its contents out into another
> container first, and letting the destroy animation act on that actor (which it
> would destroy itself when it was done). This would involve non-MetaWindowActors
> in the window_group though, so it could create new kinds of lossage.

I'd eventually like to get rid of MetaWindowActor being a group with contents in favor of it just painting itself. Already, the shadow isn't an actor - it's a non-actor object painted from meta_window_actor_paint().
Comment 9 Giovanni Campagna 2011-03-01 17:46:42 UTC
(In reply to comment #7)
> (In reply to comment #5)
> > >  - you might want to know what window the actor *was* rather than have it be an
> > > empty shell
> > >  - References to a metaWindow can be saved in JS anyways
> > 
> > meta_window_actor_destroy calls clutter_actor_destroy (at some point), which
> > does g_object_run_dispose. Unless you want to unref the MetaWindow in finalize,
> > which is the opposite of what Dan said, you need to deal with NULL windows at
> > the JS level, or avoid accessing disposed MetaWindowActors.
> 
> Well, depends on whether the MetaWindowActor has been destroyed yet or nor. My
> assumption was that it hadn't. After the MetaWindowActor has been destroyed, I
> don't think there is any interest in knowing what it pointed to.

If you don't destroy the actor, it will continue to show.
OTOH, if you don't drop all references to the actor (which is difficult to predict, in a GC'ed runtime), you either have NULL, a dangling pointer, or a pointer to an invalid MetaWindow (corresponding to a recycled XID).
Comment 10 Colin Walters 2011-03-02 18:53:01 UTC
I think given the current state of things, allowing the corresponding MetaWindow to be NULL is the least invasive fix here.  This only matters in theory during the destroy animation, and I think it's fine if the MetaWindow is no longer available there.
Comment 11 Colin Walters 2011-03-02 22:58:19 UTC
Created attachment 182312 [details] [review]
MetaWindowActor: Use a weak pointer for corresponding MetaWindow

If the MetaWindow goes away, we'll be pointing to free'd
memory if something (e.g. gnome-shell JS code) calls
meta_window_actor_get_window().
Comment 12 Owen Taylor 2011-03-08 03:19:20 UTC
Review of attachment 182312 [details] [review]:

Weak reference seems the worst of both worlds ... it means that actor.meta_window works until the Javascript garbage collector runs and collects the proxy object for the window, which sounds like a recipe for unpredictable bugs. I don't think this patch would fix the crash either - it would just turn the crash into a series of Javascript backtraces that probably would keep the user from entering and leaving the overview.

I attached a fix for the problem that was resulting in references destroyed actors to bug 644167 (I think, I couldn't easily reproduce this bug - it was hard to figure out the logic to figure out the exact circumstances we could end up with a destroyed window and easier top just rewrite it.)

Let's see how it works to just NULL out priv-.window field when the window is unmanaged, will attach a patch to that effect, but not going to apply it immediately before the release tonight.
Comment 13 Owen Taylor 2011-03-08 03:44:39 UTC
Created attachment 182796 [details] [review]
MetaWindowActor: keep a reference to the MetaWindow

In testing the NULL-out approach didn't even come close to working for
windows with a destroy animation since we accessed the window to find
out how to draw the shadow. We'd probably need to cache how we last drew
the shadow and audit all other uses of priv->window. Here's the simple
approach of keeping a strong reference that we break when the actor is
destroyed, which it will reliably be at some point.

===
Until the actor is destroyed, we need to have access to the
MetaWindow to access some fields used for painting. Keep a strong
reference to the window rather than just hoping the window will
not be freed.
Comment 14 Giovanni Campagna 2011-03-08 12:32:58 UTC
Review of attachment 182796 [details] [review]:

This seems the best approach to me, except for the part about the MetaWindow kept alive while holding a possibly invalid XID (but it's not a regression, its lifetime was dependent on GC and thus unpredictable anyway).
Comment 15 Colin Walters 2011-03-09 19:07:32 UTC
(In reply to comment #12)
> Review of attachment 182312 [details] [review]:
> 
> Weak reference seems the worst of both worlds ... it means that
> actor.meta_window works until the Javascript garbage collector runs and
> collects the proxy object for the window, which sounds like a recipe for
> unpredictable bugs. I don't think this patch would fix the crash either - it
> would just turn the crash into a series of Javascript backtraces that probably
> would keep the user from entering and leaving the overview.

(Just for future reference, my assumption was that we would start testing for null inside the JS code)
Comment 16 Colin Walters 2011-03-09 19:10:23 UTC
Review of attachment 182796 [details] [review]:

::: src/compositor/meta-window-actor.c
@@ +542,3 @@
+          g_object_ref (priv->window);
+        if (old_window)
+          g_object_unref (old_window);

Why not write this as:

if (priv->window)
  g_object_unref (priv->window);
priv->window = g_value_dup_object (value);
Comment 17 Joe Barnett 2011-03-15 17:25:21 UTC
bug 644753 may be the same thing as this?
Comment 18 Giovanni Campagna 2011-03-15 17:30:11 UTC
*** Bug 644753 has been marked as a duplicate of this bug. ***
Comment 19 Owen Taylor 2011-03-16 19:27:59 UTC
(In reply to comment #16)
> Review of attachment 182796 [details] [review]:
> 
> ::: src/compositor/meta-window-actor.c
> @@ +542,3 @@
> +          g_object_ref (priv->window);
> +        if (old_window)
> +          g_object_unref (old_window);
> 
> Why not write this as:
> 
> if (priv->window)
>   g_object_unref (priv->window);
> priv->window = g_value_dup_object (value);

Think it's good habit to be in not to write that because of refcounting problems when set back to the same value.

Either something like I did or write:

 MetaWindow *new_window = g_value_get_object (value);
 if (new_window != priv->window)
   {
      if (priv->window)
         g_object_unref (priv->window);
      priv->window = g_object_ref (new_window);
   }

are correct idioms. It doesn't matter here, of course.
Comment 20 Owen Taylor 2011-03-16 19:28:48 UTC
Hmm, actually, your variant is fine because the gvalue holds a reference in that case.
Comment 21 Owen Taylor 2011-03-16 19:38:25 UTC
Pushed (with suggested shortening). Though more complicated changes to
NULL out the field might make some sense, I think this is good enough
to close the bug. If problems come up from having a reference to an
unmanaged MetaWindow we can fix later.

Attachment 182796 [details] pushed as 349fb7c - MetaWindowActor: keep a reference to the MetaWindow
Comment 22 Dan Winship 2011-04-05 12:28:49 UTC
*** Bug 637045 has been marked as a duplicate of this bug. ***