After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 751833 - metacity crashes with SIGSEGV in meta_frame_get_frame_bounds
metacity crashes with SIGSEGV in meta_frame_get_frame_bounds
Status: RESOLVED FIXED
Product: metacity
Classification: Other
Component: general
3.17.x
Other Linux
: Normal major
: ---
Assigned To: Metacity maintainers list
Metacity maintainers list
Depends on:
Blocks:
 
 
Reported: 2015-07-02 12:57 UTC by Antonio Ospite
Modified: 2015-07-09 12:57 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
metacity crashes with SIGSEGV in meta_frame_get_frame_bounds (25.22 KB, text/plain)
2015-07-02 12:57 UTC, Antonio Ospite
  Details
window: don't return frame bounds when destroying window (774 bytes, patch)
2015-07-02 13:01 UTC, Alberts Muktupāvels
none Details | Review
compositor: fix possible crash closing/destroying window (3.20 KB, patch)
2015-07-03 18:53 UTC, Alberts Muktupāvels
none Details | Review
compositor: fix possible crash closing/destroying window (3.71 KB, patch)
2015-07-07 10:55 UTC, Alberts Muktupāvels
committed Details | Review

Description Antonio Ospite 2015-07-02 12:57:26 UTC
Created attachment 306613 [details]
metacity crashes with SIGSEGV in meta_frame_get_frame_bounds

Hi,

since I upgraded to metacity 3.17 I get these sporadic crashes.

The first thing I looked at was in dmesg and this is what I got:

traps: metacity[2515] general protection ip:7f1153f2dc00 sp:7fffac774eb8 error:0 in libmetacity-private.so.3.0.0[7f1153ef4000+a0000]

I then investigated further with gdb and got a backtrace of the crash,
please see the attached file.

I can reproduce the bug by opening and _closing_ a dialog multiple times, for example with a sequence like the following:
 1. Execute "zenity --error"
 2. Click OK
 3. If not crashed, goto 1

The number of iterations of the sequence above before the problem occurs is not consistent, but eventually metacity crashes.

I am using the binary compiled by debian, version 3.17.2-3+b1
with these dependencies:
  libglib2.0-0   2.44.1-1.1
  libgtk-3-0     3.16.4-2

The original debian bug report is here:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=790615

Thanks,
   Antonio
Comment 1 Alberts Muktupāvels 2015-07-02 13:01:51 UTC
Created attachment 306614 [details] [review]
window: don't return frame bounds when destroying window
Comment 2 Alberts Muktupāvels 2015-07-02 13:02:36 UTC
Can you test this patch?
Comment 3 Antonio Ospite 2015-07-03 10:07:47 UTC
(In reply to Alberts Muktupāvels from comment #2)
> Can you test this patch?

Hi, the patch is not enough to avoid crashes.

However, something changed in the logs: now the frame address looks like an invalid one when the crash happens:

BEFORE the patch:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff3433b81 in meta_frame_get_frame_bounds (frame=0x76a630) at core/frame.c:436
436	  return meta_ui_get_frame_bounds (frame->window->screen->ui,
  • #0 meta_frame_get_frame_bounds
    at core/frame.c line 436
  • #0 meta_frame_get_frame_bounds
    at core/frame.c line 436


frame=0x3ff0000000000000 looks suspicious, some other times I get frame=0x4000000000000000, always with the patch applied.

Any idea?

Thanks,
   Antonio
Comment 4 Antonio Ospite 2015-07-03 10:55:42 UTC
No, sorry, my last comment was not accurate, frame=0x3ff000000000000 was showing also in my very first backtrace from the debian report.
Comment 5 Alberts Muktupāvels 2015-07-03 10:57:57 UTC
Thanks for testing! I have other idea how to fix this, but right now I have no time to write patch - will do it later.
Comment 6 Alberts Muktupāvels 2015-07-03 18:53:49 UTC
Created attachment 306763 [details] [review]
compositor: fix possible crash closing/destroying window

Can you test this patch?
Comment 7 Antonio Ospite 2015-07-04 20:15:52 UTC
Hi Alberts,

I tested the patch only for half a few hours after applying this fixup:

diff --git a/src/compositor/compositor-xrender.c b/src/compositor/compositor-xrender.c
index 36e1523..f67a40a 100644
--- a/src/compositor/compositor-xrender.c
+++ b/src/compositor/compositor-xrender.c
@@ -2816,7 +2816,7 @@ xrender_free_window (MetaCompositor *compositor,
   if (frame)
     xwindow = meta_frame_get_xwindow (frame);
   else
-    meta_window_get_xwindow (window);
+    xwindow = meta_window_get_xwindow (window);

   destroy_win (xrc->display, xwindow, FALSE);
 #endif


and metacity did not crash.

I will be testing some more Sunday evening and a full Monday, and then I'll confirm that the fix is robust.

BTW, when submitting the final patch, could you please elaborate more about what you think the cause of the problem was? In the commit message it would be great.
Is it some concurrency issue? I would like to learn more.

Thanks,
   Antonio
Comment 8 Antonio Ospite 2015-07-04 20:16:57 UTC
(In reply to Antonio Ospite from comment #7)
> Hi Alberts,
> 
> I tested the patch only for half a few hours after applying this fixup:
> 

I meant "only for a few hours" :)
Comment 9 Alberts Muktupāvels 2015-07-04 20:27:05 UTC
(In reply to Antonio Ospite from comment #7)
> Hi Alberts,
> 
> I tested the patch only for half a few hours after applying this fixup:
> 
> diff --git a/src/compositor/compositor-xrender.c
> b/src/compositor/compositor-xrender.c
> index 36e1523..f67a40a 100644
> --- a/src/compositor/compositor-xrender.c
> +++ b/src/compositor/compositor-xrender.c
> @@ -2816,7 +2816,7 @@ xrender_free_window (MetaCompositor *compositor,
>    if (frame)
>      xwindow = meta_frame_get_xwindow (frame);
>    else
> -    meta_window_get_xwindow (window);
> +    xwindow = meta_window_get_xwindow (window);
> 
>    destroy_win (xrc->display, xwindow, FALSE);
>  #endif

:( That is why I could not reproduce bug in removed comment... I guess we really want to use only xwindow from frame. I will re-add comment. 

> and metacity did not crash.
> 
> I will be testing some more Sunday evening and a full Monday, and then I'll
> confirm that the fix is robust.

I am running with patch and metacity does not crash. Also I did create app that created/destroyed many GtkDialogs. Without patch it crashed fast.

> BTW, when submitting the final patch, could you please elaborate more about
> what you think the cause of the problem was? In the commit message it would
> be great.
> Is it some concurrency issue? I would like to learn more.

I think that compositor just called that function when windows was destroying. I could get different outputs in gdb with my test app.

> Thanks,
>    Antonio
Comment 10 Antonio Ospite 2015-07-05 08:17:27 UTC
(In reply to Alberts Muktupāvels from comment #9)
> (In reply to Antonio Ospite from comment #7)
> > Hi Alberts,
> > 
> > I tested the patch only for half a few hours after applying this fixup:
> > 
> > diff --git a/src/compositor/compositor-xrender.c
> > b/src/compositor/compositor-xrender.c
> > index 36e1523..f67a40a 100644
> > --- a/src/compositor/compositor-xrender.c
> > +++ b/src/compositor/compositor-xrender.c
> > @@ -2816,7 +2816,7 @@ xrender_free_window (MetaCompositor *compositor,
> >    if (frame)
> >      xwindow = meta_frame_get_xwindow (frame);
> >    else
> > -    meta_window_get_xwindow (window);
> > +    xwindow = meta_window_get_xwindow (window);
> > 
> >    destroy_win (xrc->display, xwindow, FALSE);
> >  #endif
> 
> :( That is why I could not reproduce bug in removed comment... I guess we
> really want to use only xwindow from frame. I will re-add comment. 
> 

I am not sure I follow the reasoning here.

The fixup is  top of your patch in order to avoid a warning about uninitialized xwindow:

  CC       libmetacity_private_la-compositor-xrender.lo
compositor/compositor-xrender.c: In function 'xrender_free_window':
compositor/compositor-xrender.c:2821:3: warning: 'xwindow' may be used uninitialized in this function [-Wmaybe-uninitialized]
   destroy_win (xrc->display, xwindow, FALSE);
   ^

> > Is it some concurrency issue? I would like to learn more.
>
> I think that compositor just called that function when windows was
> destroying. I could get different outputs in gdb with my test app.

I can see from the code that meta_compositor_free_window() was called but it was not implemented, but I still don't understand why this caused the crash.

Please, in the commit message, try to explain the behavior which caused the crash _before_ your patch, and why _after_ your fix the crash does not happen anymore.

Sorry, but I am new to X stuff so I still need some contextual info to get the picture.

Ciao,
   Antonio
Comment 11 Alberts Muktupāvels 2015-07-05 09:25:11 UTC
(In reply to Antonio Ospite from comment #10)
> (In reply to Alberts Muktupāvels from comment #9)
> > (In reply to Antonio Ospite from comment #7)
> > > Hi Alberts,
> > > 
> > > I tested the patch only for half a few hours after applying this fixup:
> > > 
> > > diff --git a/src/compositor/compositor-xrender.c
> > > b/src/compositor/compositor-xrender.c
> > > index 36e1523..f67a40a 100644
> > > --- a/src/compositor/compositor-xrender.c
> > > +++ b/src/compositor/compositor-xrender.c
> > > @@ -2816,7 +2816,7 @@ xrender_free_window (MetaCompositor *compositor,
> > >    if (frame)
> > >      xwindow = meta_frame_get_xwindow (frame);
> > >    else
> > > -    meta_window_get_xwindow (window);
> > > +    xwindow = meta_window_get_xwindow (window);
> > > 
> > >    destroy_win (xrc->display, xwindow, FALSE);
> > >  #endif
> > 
> > :( That is why I could not reproduce bug in removed comment... I guess we
> > really want to use only xwindow from frame. I will re-add comment. 
> > 
> 
> I am not sure I follow the reasoning here.
> 
> The fixup is  top of your patch in order to avoid a warning about
> uninitialized xwindow:
> 
>   CC       libmetacity_private_la-compositor-xrender.lo
> compositor/compositor-xrender.c: In function 'xrender_free_window':
> compositor/compositor-xrender.c:2821:3: warning: 'xwindow' may be used
> uninitialized in this function [-Wmaybe-uninitialized]
>    destroy_win (xrc->display, xwindow, FALSE);
>    ^

I understand, but in this case this will introduce regression. Read comment:
https://git.gnome.org/browse/metacity/tree/src/compositor/compositor-xrender.c#n2804

I am going to restore partially destroy_win call - only for windows with frame, because only these windows cause problem.

> > > Is it some concurrency issue? I would like to learn more.
> >
> > I think that compositor just called that function when windows was
> > destroying. I could get different outputs in gdb with my test app.
> 
> I can see from the code that meta_compositor_free_window() was called but it
> was not implemented, but I still don't understand why this caused the crash.

It was used in past. This did not cause the crash, it just fixes it.

> Please, in the commit message, try to explain the behavior which caused the
> crash _before_ your patch, and why _after_ your fix the crash does not
> happen anymore.

Compositor has compositor_idle_cb that repairs display. Crash was caused by calling meta_window_get_frame_bounds in wrong time - when window was closing / destroying.

destroy_win in xrender_free_window removes this window from windows list. This ensures that compositor does not try to call meta_window_get_frame_bounds on window that is destroying.

> Sorry, but I am new to X stuff so I still need some contextual info to get
> the picture.

I am new to X stuff too...
Comment 12 Antonio Ospite 2015-07-07 07:49:12 UTC
OK, I was forgetting about compositor_idle_cb, thanks.

Anyway, I had zero crashes since I applied the patch, so the basic idea is good.

Just to be clear, I _only_ used the patch with the fixup from above applied, since your original one gave the compilation warning.

Let me know if you want me to test a final version of the fix before you push it.

Thanks,
   Antonio
Comment 13 Alberts Muktupāvels 2015-07-07 10:55:10 UTC
Created attachment 306994 [details] [review]
compositor: fix possible crash closing/destroying window
Comment 14 Alberts Muktupāvels 2015-07-07 10:57:13 UTC
(In reply to Antonio Ospite from comment #12)
> OK, I was forgetting about compositor_idle_cb, thanks.
> 
> Anyway, I had zero crashes since I applied the patch, so the basic idea is
> good.
> 
> Just to be clear, I _only_ used the patch with the fixup from above applied,
> since your original one gave the compilation warning.
> 
> Let me know if you want me to test a final version of the fix before you
> push it.

Ok, final version is attached.
Comment 15 Alberts Muktupāvels 2015-07-09 12:01:07 UTC
This problem has been fixed in the unstable development version. The fix will be available in the next major software release. You may need to upgrade your Linux distribution to obtain that newer version.
Comment 16 Antonio Ospite 2015-07-09 12:57:33 UTC
Thanks Alberts.