After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 727115 - Segfault when running as X11 compositor (process_damage on dead window)
Segfault when running as X11 compositor (process_damage on dead window)
Status: RESOLVED FIXED
Product: mutter
Classification: Core
Component: wayland
unspecified
Other Linux
: Normal normal
: ---
Assigned To: mutter-maint
mutter-maint
Depends on:
Blocks:
 
 
Reported: 2014-03-26 22:34 UTC by drago01
Modified: 2014-03-27 13:06 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
compositor: Don't call process_damage if the window is gone (1.08 KB, patch)
2014-03-27 12:59 UTC, drago01
committed Details | Review

Description drago01 2014-03-26 22:34:22 UTC
drago01> Jasper: ok http://www.fpaste.org/88990/13958701/
<drago01> Jasper: why do we process damage on a non existent window?
* API (~api@92.12.165.83.dynamic.mundo-r.com) hat #gnome-shell betreten
<drago01> Jasper: 
<drago01> master does
<drago01>   if (window_actor == NULL)
<drago01>     return;
<drago01> that code is removed in wayland
* mclasen_afk (~mclasen@pool-108-20-138-235.bstnma.fios.verizon.net) hat #gnome-shell betreten
* rib (~bob@cpc26-heme10-2-0-cust305.9-1.cable.virginm.net) hat #gnome-shell betreten
<Jasper> drago01, where?
<drago01> Jasper: https://git.gnome.org/browse/mutter/tree/src/compositor/compositor.c#n154 vs. https://git.gnome.org/browse/mutter/tree/src/compositor/compositor.c?h=wayland#n139
<Jasper> drago01, I removed it intentionally
<Jasper> drago01, why is MetaWindowActor NULL there?
<drago01> Jasper: got destroyed ?
<Jasper> #1  0x00007fe1cf2779e6 in process_damage (compositor=0x2678ab0, window=0x0, event=0x7fffe7137b90) at compositor/compositor.c:146
<Jasper> that seems utterly broken
* mclasen_afk ist jetzt bekannt als mclasen
<Jasper> we should not be calling process_damage with a NULL window ever
<Jasper> that's why I removed it -- because we need to figure out what's giving us NULL windows rather than just fizzling these requests out
<Jasper> meta_window_get_compositor_private does
<Jasper>   if (!window)
<Jasper>     return NULL;
<Jasper> which is just broken
<Jasper> drago01, can you figure out how to reproduce it?
<drago01> ok lets see
<yoseforb> Jasper: Hey
<Jasper> yoseforb, hey
<yoseforb> Jasper: I just want to say something is of some strange for me - already one week and there isn't even one crashes in my gnome-shell!
<Jasper> your ccache is broken
<yoseforb> I use gnome-shell from git for many time, and this almost never not happen
<Jasper> there is no way that is actually true
<Jasper> yoseforb, oh, you're probably using x11
<Jasper> yeah, we moved all the crashes to the wayland branch
* shanttu hat die Verbindung getrennt (Leaving)
<Jasper> use wayland. that will make things behave as normal again
<drago01> Jasper: trying to raise the pidgin window by clicking on the tray icon
<yoseforb> what this mean yoe move all the crashed to the wayland branch? the crashes is gone? or there is a leak instead?
* aday_ (~aday@5d605f47.skybroadband.com) hat #gnome-shell betreten
<drago01> yoseforb: should I commit some breakages? ;)
<Jasper> according to the internet, GNOME can't possibly work
<Jasper> so go for it drago01
<Jasper> drago01, I assume what happens then is that the window unmanages, and then we damage events for it?
<Jasper> So the meta_display_lookup_x_window returns NULL?
<Jasper> Can you add some print statements there?
<yoseforb> Jasper: but the memory usage is groun up
<yoseforb> now it on 740MB
<Jasper> Are you using any extensions?
<yoseforb> no
<Jasper> Try hitting the trash button in the looking glass.
<Jasper> That forces a GC
<drago01> Jasper: yes that would make sense
<Jasper> drago01, though I have no idea why clicking the tray icon would cause a window to unmanage
<drago01> Jasper: it maps / unmaps it
<Jasper> ugh
<Jasper> instead of using _NET_ACTIVATE_WINDOW?
<yoseforb> Jasper: It still there (just 3MB it gone from the memory usage)
<Jasper> Probably a memory fragmentation issue then
* aday hat die Verbindung getrennt (Ping timeout: 600 seconds)
<drago01> Jasper: not sure about that but it is definitly that window that is null and cannot be found by meta_display_lookup_x_window
<Jasper> drago01, so why are we getting damage events on it
<drago01> Jasper: hmm maybe a timing thing
<drago01> Jasper: i.e window sends a damage event gets destroyed and we handle get the damage event for from before the destruction
<Jasper> drago01, right. Can you add some print statements to meta_window_unmanage / meta_compositor_process_event to figure out if that's the case?
* yoseforb hat die Verbindung getrennt (yoseforb)
* stefano hat die Verbindung getrennt (Leaving)
* owen hat die Verbindung getrennt (Read error: 148 (No route to host))
<drago01> Jasper: it unmanages the pidgin window
<drago01> Jasper: but the damage event seems to be for a different window
<drago01> Jasper: no idea which one the id does not show up in xwininfo -tree -root
* API hat die Verbindung getrennt (Remote closed the connection)
<drago01> Jasper: also
<drago01>  MetaWindow *window = modified != None ? meta_display_lookup_x_window (display, modified) : NULL;
<drago01>       if (meta_compositor_process_event (display->compositor, event, window))
<drago01> there is a case where it always passes NULL
<drago01> which might have been the reason for the null guard in the first place
<drago01> but it makes no sense to do meta_display_lookup_x_window in the calle and in the caller
Comment 1 drago01 2014-03-26 22:34:51 UTC
Program received signal SIGSEGV, Segmentation fault.
meta_window_actor_process_x11_damage (self=0x0, event=0x7fffe7137b90) at compositor/meta-window-actor.c:1559
1559	  if (priv->surface)
(gdb) bt
  • #0 meta_window_actor_process_x11_damage
    at compositor/meta-window-actor.c line 1559
  • #1 process_damage
    at compositor/compositor.c line 146
  • #2 meta_compositor_process_event
    at compositor/compositor.c line 970
  • #3 meta_display_handle_xevent
    at core/events.c line 1943
  • #4 xevent_callback
    at core/events.c line 2231
  • #5 filter_func
    at ui/ui.c line 248
  • #6 gdk_event_apply_filters
    at gdkeventsource.c line 81
  • #7 gdk_event_source_translate_event
    at gdkeventsource.c line 195
  • #8 _gdk_x11_display_queue_events
    at gdkeventsource.c line 338
  • #9 gdk_display_get_event
    at gdkdisplay.c line 320
  • #10 gdk_event_source_dispatch
    at gdkeventsource.c line 360
  • #11 g_main_dispatch
    at gmain.c line 3066
  • #12 g_main_context_dispatch
    at gmain.c line 3641
  • #13 g_main_context_iterate
    at gmain.c line 3712
  • #14 g_main_loop_run
    at gmain.c line 3906
  • #15 meta_run
    at core/main.c line 601
  • #16 main
    at main.c line 437

Comment 2 Jasper St. Pierre (not reading bugmail) 2014-03-26 22:53:32 UTC
(In reply to comment #0)
> <drago01>  MetaWindow *window = modified != None ? meta_display_lookup_x_window
> (display, modified) : NULL;
> <drago01>       if (meta_compositor_process_event (display->compositor, event,
> window))
> <drago01> there is a case where it always passes NULL
> <drago01> which might have been the reason for the null guard in the first
> place
> <drago01> but it makes no sense to do meta_display_lookup_x_window in the calle
> and in the caller

https://git.gnome.org/browse/mutter/tree/src/compositor/compositor.c?h=wayland#n924

We actually always look up the window here for damage events, rather than in display.c. I imagine that at one point, display.c didn't have the damage event base, so it needed to be here.

We really should only be getting damage events on windows we select for. The weird case is that there's a damage in the event queue for a window after it's already unmanaged, but I'm not 100% sure that's what's happening here.
Comment 3 drago01 2014-03-27 09:30:08 UTC
Back in ancient times we had this: https://git.gnome.org/browse/mutter/tree/src/compositor/compositor-clutter.c?id=9b3a0d1a#n1362 and https://git.gnome.org/browse/mutter/tree/src/compositor/compositor-clutter.c?id=9b3a0d1a#n1370

This got removed at some point and we ended up with the NULL check.
Comment 4 drago01 2014-03-27 09:44:18 UTC
Seems like Owen added that in https://git.gnome.org/browse/mutter/commit/?id=c60d4c2bc43d374269f3148bc752a48b6033376a 

Removing the NULL guard in process_damage looks wrong given that history ... we should probably just re add it (it has been there for a reason).
Comment 5 Jasper St. Pierre (not reading bugmail) 2014-03-27 12:24:51 UTC
Yeah, I'm fine with adding back the NULL check if "we got damage for an unmanaged window" is what's going on.

I was concerned with it being generic, though. For stuff like maximize / minimize, we should *never* have a NULL MetaWindowActor. I'm probably also going to remove the NULL check in meta_window_get_compositor_private.
Comment 6 drago01 2014-03-27 12:59:32 UTC
(In reply to comment #5)
> Yeah, I'm fine with adding back the NULL check if "we got damage for an
> unmanaged window" is what's going on.
> 
> I was concerned with it being generic, though. For stuff like maximize /
> minimize, we should *never* have a NULL MetaWindowActor. I'm probably also
> going to remove the NULL check in meta_window_get_compositor_private.

Well the thing here is not a MetaWindow that has not MetaWindowActor but a X window that has no MetaWindow actor in the first place.
Comment 7 drago01 2014-03-27 12:59:42 UTC
Created attachment 273084 [details] [review]
compositor: Don't call process_damage if the window is gone

We might get a damage event for an already unmanaged window calling
process_damage is pointless and causes a crash so simply skip that case.
Comment 8 Jasper St. Pierre (not reading bugmail) 2014-03-27 13:03:34 UTC
Review of attachment 273084 [details] [review]:

If you've verified that this is what's going on (and since it happens when Pidgin unmaps/remaps the window, I'm quite sure it is), LGTM.
Comment 9 drago01 2014-03-27 13:06:48 UTC
Attachment 273084 [details] pushed as 6eeaf09 - compositor: Don't call process_damage if the window is gone