GNOME Bugzilla – Bug 788908
gnome-shell crashed with SIGSEGV in clutter_actor_get_allocation_box (from _st_create_shadow_pipeline_from_actor)
Last modified: 2018-01-04 00:36:28 UTC
Created attachment 361476 [details]
Stacktrace

I have a VM with gnome-shell running. Resizing the guest window multiple times, or changing the monitor settings from g-c-c, leads after a while to a crash in clutter_actor_get_allocation_box, where the actor is an invalid object (as per a quick gdb check).
+ Trace 238050
Full stacktrace attached
Do you also have a "backtrace full" you can attach? Also the output of gjs_dumpstack() could be useful.
(In reply to Jonas Ådahl from comment #1)
> Do you also have a "backtrace full" you can attach? Also the output of
> gjs_dumpstack() could be useful.

As for gjs_dumpstack(): unfortunately, if I attach to gnome-shell it continues execution when it segfaults, and I don't get why :o.

This is the BT full:
+ Trace 238051
Similar trace could also be:
+ Trace 238054
(In reply to Marco Trevisan (Treviño) from comment #3)
> Similar trace could also be

That one is a duplicate of bug 788627.
Mh, yeah... I think they actually have the same root cause: the actor has been destroyed, so it either fails immediately when trying to get the allocation box, or it gets some corrupted data and then fails when trying to generate an insanely big texture.
Created attachment 361796 [details] [review]
StIcon: only compute shadow pipeline when the texture is properly allocated

Creating the shadow pipeline requires the actor to be allocated in order to get its dimensions; however, in the current state we compute it even when this is not the case. This causes _st_create_shadow_pipeline_from_actor (when getting the allocation box) to trigger an allocation cycle, which might lead to a re-entrant call to st_icon_finish_update, corrupting the data we rely on as soon as we return from it.

Waiting for the texture size to change before trying to update the shadow pipeline is one way of avoiding this.

Another option is to simply do as we do with other actors: initialize the pipeline at paint if we don't already have one for the current data, which is probably the better thing to do. So let me know if you prefer that way.
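The second option (building the pipeline lazily at paint) can be sketched as a toy model in plain C. This is only an illustration under stated assumptions: ToyActor, ToyIcon and the toy_* functions are hypothetical stand-ins, not St or Clutter API, and the "shadow pipeline" is reduced to the size it was built for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in St or Clutter. */
typedef struct {
  float width, height;
  bool  has_allocation;   /* false until a layout pass assigned a size */
} ToyActor;

typedef struct {
  ToyActor texture;
  float    shadow_width, shadow_height; /* size the shadow was built for */
  bool     shadow_valid;
} ToyIcon;

/* Swapping the texture only invalidates the cached shadow; nothing is
 * queried from the (possibly unallocated) actor at this point. */
static void
toy_icon_set_texture (ToyIcon *icon, ToyActor texture)
{
  assert (icon != NULL);
  icon->texture = texture;
  icon->shadow_valid = false;
}

/* Layout pass: the texture gets a size, so any cached shadow is stale. */
static void
toy_icon_allocate (ToyIcon *icon, float width, float height)
{
  assert (icon != NULL);
  icon->texture.width = width;
  icon->texture.height = height;
  icon->texture.has_allocation = true;
  icon->shadow_valid = false;
}

/* Paint: build the shadow on demand, and only from an allocated texture.
 * Returns true when a shadow is available for drawing. */
static bool
toy_icon_paint (ToyIcon *icon)
{
  assert (icon != NULL);
  if (!icon->shadow_valid)
    {
      if (!icon->texture.has_allocation)
        return false;   /* nothing sensible to measure yet */
      icon->shadow_width = icon->texture.width;
      icon->shadow_height = icon->texture.height;
      icon->shadow_valid = true;
    }
  return true;
}
```

The point of the design is that setters never read the allocation; they only mark the cached shadow as stale, and the one place that reads sizes runs after layout has settled.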
Some debugging on this, highlighting the issue:

st_icon_set_icon_name (0x555558ad8af0)
st_icon_style_changed (0x555556ceb2a0)
st_icon_update (0x555556ceb2a0): Object pending texture 0x555557cd7bc0, is floating: 1, refs 1
st_icon_update (0x555556ceb2a0): FINISH NOW 0x555557cd7bc0!
st_icon_finish_update (0x555556ceb2a0): destroying icon texture 0x555557cdd7e0
st_icon_finish_update (0x555556ceb2a0): Object icon texture 0x555557cd7bc0, refs 3
priv = 0x555556ceae10, instance priv is 0x555556ceae10
st_icon_update_shadow_pipeline (0x555556ceb2a0) : icon texture 0x555557cd7bc0
_st_create_shadow_pipeline_from_actor: actor 0x555557cd7bc0 | thread 128328
st_icon_style_changed (0x555556ceb2a0)
st_icon_update (0x555556ceb2a0): Object pending texture 0x5555557d34d0, is floating: 1, refs 1
st_icon_update (0x555556ceb2a0): FINISH NOW 0x5555557d34d0!
st_icon_finish_update (0x555556ceb2a0): destroying icon texture 0x555557cd7bc0
st_icon_finish_update (0x555556ceb2a0): Object icon texture 0x5555557d34d0, refs 3
priv = 0x555556ceae10, instance priv is 0x555556ceae10
st_icon_update_shadow_pipeline (0x555556ceb2a0) : icon texture 0x5555557d34d0

Thread 1 "gnome-shell" received signal SIGSEGV, Segmentation fault.
clutter_actor_get_allocation_box (self=self@entry=0x555557cd7bc0, box=box@entry=0x7fffffff3240) at /media/M2/GNOME/mutter/clutter/clutter/clutter-actor.c:9800
9800 *box = self->priv->allocation;
(gdb) bt
+ Trace 238073
$2 = (ClutterActor *) 0x5555557d34d0
(gdb)
(gdb) call gjs_dumpstack()
== Stack trace for context 0x555555969000 ==
Panel<._removeStyleClassName@resource:///org/gnome/shell/ui/panel.js:1166:9
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
Transparency<._getAlphas@/usr/share/gnome-shell/extensions/ubuntu-dock@ubuntu.com/theming.js:592:17
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
Transparency<._updateStyles@/usr/share/gnome-shell/extensions/ubuntu-dock@ubuntu.com/theming.js:537:9
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
Transparency<._init@/usr/share/gnome-shell/extensions/ubuntu-dock@ubuntu.com/theming.js:353:9
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
_Base.prototype._construct@resource:///org/gnome/gjs/modules/lang.js:110:5
Class.prototype._construct/newClassConstructor@resource:///org/gnome/gjs/modules/lang.js:213:20
ThemeManager<._init@/usr/share/gnome-shell/extensions/ubuntu-dock@ubuntu.com/theming.js:59:30
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
_Base.prototype._construct@resource:///org/gnome/gjs/modules/lang.js:110:5
Class.prototype._construct/newClassConstructor@resource:///org/gnome/gjs/modules/lang.js:213:20
DockedDash<._init@/usr/share/gnome-shell/extensions/ubuntu-dock@ubuntu.com/docking.js:363:30
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
_Base.prototype._construct@resource:///org/gnome/gjs/modules/lang.js:110:5
Class.prototype._construct/newClassConstructor@resource:///org/gnome/gjs/modules/lang.js:213:20
DockManager<._createDocks@/usr/share/gnome-shell/extensions/ubuntu-dock@ubuntu.com/docking.js:1744:20
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
DockManager<._toggle@/usr/share/gnome-shell/extensions/ubuntu-dock@ubuntu.com/docking.js:1690:9
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22

As you can see, basically the same function gets called on an object that has just been destroyed...
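The pattern in that log (a nested update replaces the texture while the outer call still holds the old one) can be modelled without any real objects. The sketch below is a toy under stated assumptions: textures are plain integer ids so nothing is actually freed, and ToyIcon, toy_icon_update and toy_create_shadow are hypothetical names, not the St functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: textures are ids, so "destroying" one is harmless. */
typedef struct {
  int texture_id;        /* id of the current icon texture, 0 = none */
  int destroyed_id;      /* id of the texture destroyed most recently */
  int next_id;           /* source of fresh texture ids */
  int pending_restyles;  /* nested style changes left to simulate */
} ToyIcon;

static bool toy_icon_update (ToyIcon *icon);

/* Stands in for the shadow creation step. Measuring the actor may trigger
 * a style change, which re-enters toy_icon_update() before we return.
 * Returns true if actor_id is still the icon's current texture. */
static bool
toy_create_shadow (ToyIcon *icon, int actor_id)
{
  assert (icon != NULL);
  if (icon->pending_restyles > 0)
    {
      icon->pending_restyles--;
      (void) toy_icon_update (icon);  /* nested update swaps the texture */
    }
  /* The outer frame resumes here with the id it captured earlier. */
  return actor_id == icon->texture_id;
}

/* Stands in for the update step: drop the old texture, install a new
 * one, then build the shadow from it. */
static bool
toy_icon_update (ToyIcon *icon)
{
  assert (icon != NULL);
  icon->destroyed_id = icon->texture_id;
  icon->texture_id = ++icon->next_id;
  return toy_create_shadow (icon, icon->texture_id);
}
```

With pending_restyles set to 1, the outer toy_icon_update() returns false: the id it passed down is the one recorded in destroyed_id by the nested call, which is the situation the backtrace shows with real pointers.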
Created attachment 361838 [details] [review] StIcon: only compute shadow pipeline when the texture is properly allocated Some cleanups
Review of attachment 361838 [details] [review]:

::: src/st/st-icon.c
@@ +277,3 @@
 if (priv->shadow_spec)
 priv->shadow_pipeline = _st_create_shadow_pipeline_from_actor (priv->shadow_spec, priv->icon_texture);
+

Stray newline.

@@ +290,3 @@
+
+static void
+st_icon_update_shadow_pipeline_on_allocation (StIcon *icon)

Shouldn't you be able to do this by setting the ClutterActorClass::allocate vfunc, calling the parent's allocate(), then maybe updating the shadow pipeline? Thus no need for signals. You'd only have to make sure not to eagerly replace valid pipelines.
Created attachment 361841 [details] [review] StIcon: only compute shadow pipeline when the texture is allocated (at paint) This version is just creating the pipeline at paint, when we're sure about the allocation box of the texture icon actor.
Created attachment 362437 [details] [review]
StIcon: only compute shadow pipeline when the texture is properly allocated

Patch updated to fix the glitches I found while testing the previous version I posted here, which generated the texture at allocation. You can see them here: https://usercontent.irccloud-cdn.com/file/Mpe1Nqgi/out.mp4 (the other version, which does it only at paint, wasn't affected).

Now, the problem here is that if the actor size and the texture size are different we can't use the fast path (as per commit 7015bb2ca975, which is probably also the reason why we get these crashes now), so we fall back to an offscreen mode. But that mode seems to behave badly when we're in allocation (while it's fine at paint).

So I've added a StPrivateShadowCreateFlags enum where we can define the modes we allow to be used in shadow creation, and for st-icon we allow only the texture mode at allocation.

PS: Working around this in st-icon alone, with some more checks at allocation, was still possible, but I thought it was less elegant and portable.
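As a rough illustration of the flag idea, here is a toy mode selector in plain C. It is a sketch under stated assumptions: the comment above names the StPrivateShadowCreateFlags enum but not its members, so every identifier below (ToyShadowCreateFlags, ToyShadowMode, toy_pick_shadow_mode) is hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags: which creation modes the caller permits. */
typedef enum {
  TOY_SHADOW_ALLOW_NONE      = 0,
  TOY_SHADOW_ALLOW_TEXTURE   = 1 << 0,  /* fast path, sizes must match */
  TOY_SHADOW_ALLOW_OFFSCREEN = 1 << 1   /* fallback, renders offscreen */
} ToyShadowCreateFlags;

typedef enum {
  TOY_SHADOW_MODE_NONE,       /* no permitted mode applies: create nothing */
  TOY_SHADOW_MODE_TEXTURE,
  TOY_SHADOW_MODE_OFFSCREEN
} ToyShadowMode;

/* Pick the creation mode for an actor and its texture, honouring the
 * caller's flags. The fast path needs equal sizes; otherwise only the
 * offscreen fallback remains, if it is allowed at all. */
static ToyShadowMode
toy_pick_shadow_mode (int actor_w, int actor_h,
                      int tex_w, int tex_h,
                      unsigned flags)
{
  bool same_size;

  assert (actor_w >= 0 && actor_h >= 0 && tex_w >= 0 && tex_h >= 0);
  same_size = (actor_w == tex_w && actor_h == tex_h);

  if (same_size && (flags & TOY_SHADOW_ALLOW_TEXTURE))
    return TOY_SHADOW_MODE_TEXTURE;
  if (flags & TOY_SHADOW_ALLOW_OFFSCREEN)
    return TOY_SHADOW_MODE_OFFSCREEN;
  return TOY_SHADOW_MODE_NONE;
}
```

A caller running during allocation would pass only TOY_SHADOW_ALLOW_TEXTURE and so get TOY_SHADOW_MODE_NONE when the sizes differ, while a caller at paint could pass both flags and accept the offscreen fallback.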
We have four downstream (Fedora) reports that look very similar to this and #788627 , with the "failed to allocate 18446744072098939136" bytes error message; https://bugzilla.redhat.com/show_bug.cgi?id=1526164 has some debugging work done by the reporter. The others (which I've closed as dupes of that bug) are https://bugzilla.redhat.com/show_bug.cgi?id=1508398 , https://bugzilla.redhat.com/show_bug.cgi?id=1506325 and https://bugzilla.redhat.com/show_bug.cgi?id=1502183 . So it'd be great to have a fix for this (these?) bug (bugs?) that we could port downstream. Thanks.
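A note on the number in that error message: 18446744072098939136 equals 2^64 - 1610612480, which is what the negative value -1610612480 looks like once converted to an unsigned 64-bit size. The arithmetic is certain; whether a negative size computed from a bogus allocation box is really what reaches the allocator is an assumption. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret an unsigned 64-bit size as a two's-complement signed value,
 * without relying on implementation-defined conversions. */
static int64_t
toy_as_signed64 (uint64_t size)
{
  if (size <= (uint64_t) INT64_MAX)
    return (int64_t) size;
  return -(int64_t) (UINT64_MAX - size) - 1;
}

/* Widen a signed 32-bit value to an unsigned 64-bit size, as happens when
 * a negative int ends up where a 64-bit size is expected. */
static uint64_t
toy_widen_to_size (int32_t value)
{
  return (uint64_t) (int64_t) value;
}
```

Feeding the reported byte count to toy_as_signed64() gives -1610612480, a value that fits in a signed 32-bit int, and toy_widen_to_size() maps it back to the reported figure.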
I think all these bugs are the same (and I'm also pretty sure bug 788627 is a duplicate of this). The patch fixes the problem, Jonas already did a first-pass review on IRC, but we were waiting for a final review by Florian. Cheers
I had asked whether there was a clear reproducer a while ago - is there? (The thing is, I'm not a fan of either patch, and the commit message suggests that there may be an easier fix, so I'd like to check that first)
Reporter of #1508398 downstream says "This happens occasionally, I am not able to reproduce it intentionally."
Reporter of #1506325 says "When continously maximize and minimize the window of Visual Studio Code using key bindings. (Shell Extension Hide Top Bar enabled)", so you could try that, I guess.
Reporter of #1514850 wrote "Right after booting the system, I started Thunderbird and Firefox. Gnome Shell crashed during Firefox startup. This is the second time this happend with Fedora 27, though I don't remember it ever happening with Fedora 26."
#1515926: "Opening gnucash". Seems too simple to be a reliable reproducer, but hey, could try it.
#1516253: "Error after boot".
#1516633: "Just after log in".
#1517234, #1525979, #1502183: nothing.
Reporter of #1526164 wrote "Running dual monitors. I have a TRENDnet KVM switch that I use to switch my primary monitor between this workstation and another. This crash happened when I switched _to_ this workstation. Then my session crashed and went to the login screen" - I've asked if it's reliably reproducible, but sounds like his case may depend on that hardware KVM switch.
(In reply to Florian Müllner from comment #14)
> I had asked whether there was a clear reproducer a while ago - is there?

In my case, to reproduce this I only have to run g-s in a virtual machine, open a window, then another one that gets maximized, and then resize the VM to request a resolution change. Not sure it works 100% in all setups, but here it always happens.

> (The thing is, I'm not a fan of either patch, and the commit message
> suggests that there may be an easier fix, so I'd like to check that first)

Attachment 361841 [details] was the simplest that worked: basically, allocating the shadow at paint. I would have preferred to do it earlier when possible, hence the other version. If you want it structured differently, let me know; I'm open to refactoring things, just tell me what your ideal way is.
If it is indeed the same thing, I was able to trigger crashes simply by opening and closing my laptop lid quickly. See bug 788627 and last comments there. Not tried that for a while (at the time I was experimenting to see what would make GNOME Shell get my monitor set-up correct).
*** Bug 788627 has been marked as a duplicate of this bug. ***
Note I've been running with attachment 362437 [details] [review] for a few weeks now and the issue appears to be resolved for me (came here from https://bugzilla.redhat.com/show_bug.cgi?id=1526164)