After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 788908 - gnome-shell crashed with SIGSEGV in clutter_actor_get_allocation_box (from _st_create_shadow_pipeline_from_actor)
gnome-shell crashed with SIGSEGV in clutter_actor_get_allocation_box (from _s...
Status: RESOLVED FIXED
Product: gnome-shell
Classification: Core
Component: st
unspecified
Other Linux
: Normal normal
: ---
Assigned To: gnome-shell-maint
gnome-shell-maint
Depends on:
Blocks:
 
 
Reported: 2017-10-13 03:16 UTC by Marco Trevisan (Treviño)
Modified: 2019-10-10 15:21 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Stacktrace (10.18 KB, text/plain)
2017-10-13 03:16 UTC, Marco Trevisan (Treviño)
  Details
StIcon: only compute shadow pipeline when the texture is properly allocated (2.16 KB, patch)
2017-10-18 10:34 UTC, Marco Trevisan (Treviño)
none Details | Review
StIcon: only compute shadow pipeline when the texture is properly allocated (2.57 KB, patch)
2017-10-19 06:24 UTC, Marco Trevisan (Treviño)
none Details | Review
StIcon: only compute shadow pipeline when the texture is allocated (at paint) (4.67 KB, patch)
2017-10-19 07:22 UTC, Marco Trevisan (Treviño)
none Details | Review
StIcon: only compute shadow pipeline when the texture is properly allocated (10.83 KB, patch)
2017-10-28 05:09 UTC, Marco Trevisan (Treviño)
reviewed Details | Review
Stack trace of gnome-shell crashing, from 2018-04-02 (37.97 KB, text/plain)
2018-04-02 22:56 UTC, eomanis
  Details

Description Marco Trevisan (Treviño) 2017-10-13 03:16:28 UTC
Created attachment 361476 [details]
Stacktrace

I've a VM with gnome-shell running, resizing the guest window multiple times, or changing the monitor settings from g-c-c after a while leads to a crash in:

clutter_actor_get_allocation_box where the actor there is an invalid object (as per quick gdb check)

  • #0 clutter_actor_get_allocation_box
    at mutter/clutter/clutter/clutter-actor.c line 9800
  • #1 _st_create_shadow_pipeline_from_actor
    at gnome-shell/src/st/st-private.c line 420
  • #2 st_icon_update_shadow_pipeline
    at gnome-shell/src/st/st-icon.c line 278
  • #3 st_icon_finish_update
    at gnome-shell/src/st/st-icon.c line 310
  • #4 st_icon_update
    at gnome-shell/src/st/st-icon.c line 381

Full stacktrace attached
Comment 1 Jonas Ådahl 2017-10-13 03:42:35 UTC
Do you also have a "backtrace full" you can attach? Also the output of gjs_dumpstack() could be useful.
Comment 2 Marco Trevisan (Treviño) 2017-10-13 05:27:58 UTC
(In reply to Jonas Ådahl from comment #1)
> Do you also have a "backtrace full" you can attach? Also the output of
> gjs_dumpstack() could be useful.

This is BT full:

As per gjs_dumpstack() unfortunately, if I attach to gnome-shell it continues the execution when it segfaults, I don't get why :o.

  • #0 clutter_actor_get_allocation_box
    at /media/M2/GNOME/mutter/clutter/clutter/clutter-actor.c line 9800
  • #1 _st_create_shadow_pipeline_from_actor
    at ../../gnome-shell/src/st/st-private.c line 420
  • #2 st_icon_update_shadow_pipeline
    at ../../gnome-shell/src/st/st-icon.c line 278
  • #3 st_icon_finish_update
    at ../../gnome-shell/src/st/st-icon.c line 310
  • #4 st_icon_update
    at ../../gnome-shell/src/st/st-icon.c line 381
  • #5 _g_closure_invoke_va
    at /media/M2/GNOME/glib/gobject/gclosure.c line 867
  • #6 g_signal_emit_valist
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3300
  • #7 g_signal_emit
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3447
  • #8 st_widget_recompute_style
    at ../../gnome-shell/src/st/st-widget.c line 1659
  • #9 st_widget_style_changed
    at ../../gnome-shell/src/st/st-widget.c line 553
  • #10 notify_children_of_style_change
    at ../../gnome-shell/src/st/st-widget.c line 525
  • #11 _g_closure_invoke_va
    at /media/M2/GNOME/glib/gobject/gclosure.c line 867
  • #12 g_signal_emit_valist
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3300
  • #13 g_signal_emit
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3447
  • #14 st_widget_recompute_style
    at ../../gnome-shell/src/st/st-widget.c line 1659
  • #15 st_widget_style_changed
    at ../../gnome-shell/src/st/st-widget.c line 553
  • #16 notify_children_of_style_change
    at ../../gnome-shell/src/st/st-widget.c line 525
  • #17 _g_closure_invoke_va
    at /media/M2/GNOME/glib/gobject/gclosure.c line 867
  • #18 g_signal_emit_valist
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3300
  • #19 g_signal_emit
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3447
  • #20 st_widget_recompute_style
    at ../../gnome-shell/src/st/st-widget.c line 1659
  • #21 st_widget_style_changed
    at ../../gnome-shell/src/st/st-widget.c line 553
  • #22 notify_children_of_style_change
    at ../../gnome-shell/src/st/st-widget.c line 525
  • #23 g_closure_invoke
    at /media/M2/GNOME/glib/gobject/gclosure.c line 804
  • #24 signal_emit_unlocked_R
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3673
  • #25 g_signal_emit_valist
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3391
  • #26 g_signal_emit
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3447
  • #27 st_widget_recompute_style
    at ../../gnome-shell/src/st/st-widget.c line 1659
  • #28 st_widget_style_changed
    at ../../gnome-shell/src/st/st-widget.c line 553
  • #29 notify_children_of_style_change
    at ../../gnome-shell/src/st/st-widget.c line 525
  • #30 g_closure_invoke
    at /media/M2/GNOME/glib/gobject/gclosure.c line 804
  • #31 signal_emit_unlocked_R
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3673
  • #32 g_signal_emit_valist
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3391
  • #33 g_signal_emit
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3447
  • #34 st_widget_recompute_style
    at ../../gnome-shell/src/st/st-widget.c line 1659
  • #35 st_widget_style_changed
    at ../../gnome-shell/src/st/st-widget.c line 553
  • #36 notify_children_of_style_change
    at ../../gnome-shell/src/st/st-widget.c line 525
  • #37 _g_closure_invoke_va
    at /media/M2/GNOME/glib/gobject/gclosure.c line 867
  • #38 g_signal_emit_valist
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3300
  • #39 g_signal_emit
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3447
  • #40 st_widget_recompute_style
    at ../../gnome-shell/src/st/st-widget.c line 1659
  • #41 st_widget_style_changed
    at ../../gnome-shell/src/st/st-widget.c line 553
  • #42 notify_children_of_style_change
    at ../../gnome-shell/src/st/st-widget.c line 525
  • #43 _g_closure_invoke_va
    at /media/M2/GNOME/glib/gobject/gclosure.c line 867
  • #44 g_signal_emit_valist
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3300
  • #45 g_signal_emit
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3447
  • #46 st_widget_recompute_style
    at ../../gnome-shell/src/st/st-widget.c line 1659
  • #47 st_widget_style_changed
    at ../../gnome-shell/src/st/st-widget.c line 553
  • #48 notify_children_of_style_change
    at ../../gnome-shell/src/st/st-widget.c line 525
  • #49 _g_closure_invoke_va
    at /media/M2/GNOME/glib/gobject/gclosure.c line 867
  • #50 g_signal_emit_valist
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3300
  • #51 g_signal_emit
    at /media/M2/GNOME/glib/gobject/gsignal.c line 3447
  • #52 st_widget_recompute_style
    at ../../gnome-shell/src/st/st-widget.c line 1659
  • #53 st_widget_style_changed
    at ../../gnome-shell/src/st/st-widget.c line 553
  • #54 st_widget_remove_style_class_name
    at ../../gnome-shell/src/st/st-widget.c line 1262
  • #55 ffi_call_unix64
    from /usr/lib/x86_64-linux-gnu/libffi.so.6
  • #56 ffi_call
    from /usr/lib/x86_64-linux-gnu/libffi.so.6
  • #57 gjs_invoke_c_function
    at /media/M2/GNOME/gjs/gi/function.cpp line 1030
  • #58 function_call
    at /media/M2/GNOME/gjs/gi/function.cpp line 1349

Comment 3 Marco Trevisan (Treviño) 2017-10-13 23:28:57 UTC
Similar trace could also be:

  • #0 _g_log_abort
    at ../../../../glib/gmessages.c line 554
  • #1 g_log_default_handler
    at ../../../../glib/gmessages.c line 3051
  • #2 default_log_handler
    at ../src/main.c line 311
  • #3 g_logv
    at ../../../../glib/gmessages.c line 1341
  • #4 g_log
    at ../../../../glib/gmessages.c line 1403
  • #5 g_malloc0
    at ../../../../glib/gmem.c line 129
  • #6 blur_pixels
    at ../src/st/st-private.c line 280
  • #7 _st_create_shadow_pipeline
    at ../src/st/st-private.c line 372
  • #8 _st_create_shadow_pipeline_from_actor
    at ../src/st/st-private.c line 434
  • #9 st_icon_update_shadow_pipeline
    at ../src/st/st-icon.c line 278
  • #10 st_icon_finish_update
    at ../src/st/st-icon.c line 310
  • #11 st_icon_update
    at ../src/st/st-icon.c line 381
  • #12 _g_closure_invoke_va
    at ../../../../gobject/gclosure.c line 867
  • #13 g_signal_emit_valist
    at ../../../../gobject/gsignal.c line 3300
  • #14 g_signal_emit
    at ../../../../gobject/gsignal.c line 3447
  • #15 st_widget_recompute_style
    at ../src/st/st-widget.c line 1659
  • #16 st_widget_style_changed
    at ../src/st/st-widget.c line 553
  • #17 notify_children_of_style_change
    at ../src/st/st-widget.c line 525
  • #18 _g_closure_invoke_va
    at ../../../../gobject/gclosure.c line 867
  • #19 g_signal_emit_valist
    at ../../../../gobject/gsignal.c line 3300
  • #20 g_signal_emit
    at ../../../../gobject/gsignal.c line 3447
  • #21 st_widget_recompute_style
    at ../src/st/st-widget.c line 1659
  • #22 st_widget_style_changed
    at ../src/st/st-widget.c line 553
  • #23 notify_children_of_style_change
    at ../src/st/st-widget.c line 525
  • #24 _g_closure_invoke_va
    at ../../../../gobject/gclosure.c line 867
  • #25 g_signal_emit_valist
    at ../../../../gobject/gsignal.c line 3300
  • #26 g_signal_emit
    at ../../../../gobject/gsignal.c line 3447
  • #27 st_widget_recompute_style
    at ../src/st/st-widget.c line 1659
  • #28 st_widget_style_changed
    at ../src/st/st-widget.c line 553
  • #29 notify_children_of_style_change
    at ../src/st/st-widget.c line 525
  • #30 g_closure_invoke
    at ../../../../gobject/gclosure.c line 804
  • #31 signal_emit_unlocked_R
    at ../../../../gobject/gsignal.c line 3673
  • #32 g_signal_emit_valist
    at ../../../../gobject/gsignal.c line 3391
  • #33 g_signal_emit
    at ../../../../gobject/gsignal.c line 3447
  • #34 st_widget_recompute_style
    at ../src/st/st-widget.c line 1659
  • #35 st_widget_style_changed
    at ../src/st/st-widget.c line 553
  • #36 notify_children_of_style_change
    at ../src/st/st-widget.c line 525
  • #37 g_closure_invoke
    at ../../../../gobject/gclosure.c line 804
  • #38 signal_emit_unlocked_R
    at ../../../../gobject/gsignal.c line 3673
  • #39 g_signal_emit_valist
    at ../../../../gobject/gsignal.c line 3391
  • #40 g_signal_emit
    at ../../../../gobject/gsignal.c line 3447
  • #41 st_widget_recompute_style
    at ../src/st/st-widget.c line 1659
  • #42 st_widget_style_changed
    at ../src/st/st-widget.c line 553
  • #43 notify_children_of_style_change
    at ../src/st/st-widget.c line 525
  • #44 _g_closure_invoke_va
    at ../../../../gobject/gclosure.c line 867
  • #45 g_signal_emit_valist
    at ../../../../gobject/gsignal.c line 3300
  • #46 g_signal_emit
    at ../../../../gobject/gsignal.c line 3447
  • #47 st_widget_recompute_style
    at ../src/st/st-widget.c line 1659
  • #48 st_widget_style_changed
    at ../src/st/st-widget.c line 553
  • #49 notify_children_of_style_change
    at ../src/st/st-widget.c line 525
  • #50 _g_closure_invoke_va
    at ../../../../gobject/gclosure.c line 867
  • #51 g_signal_emit_valist
    at ../../../../gobject/gsignal.c line 3300
  • #52 g_signal_emit
    at ../../../../gobject/gsignal.c line 3447
  • #53 st_widget_recompute_style
    at ../src/st/st-widget.c line 1659
  • #54 st_widget_style_changed
    at ../src/st/st-widget.c line 553
  • #55 notify_children_of_style_change
    at ../src/st/st-widget.c line 525
  • #56 g_closure_invoke
    at ../../../../gobject/gclosure.c line 804
  • #57 signal_emit_unlocked_R
    at ../../../../gobject/gsignal.c line 3673
  • #58 g_signal_emit_valist
    at ../../../../gobject/gsignal.c line 3391
  • #59 g_signal_emit
    at ../../../../gobject/gsignal.c line 3447
  • #60 st_widget_recompute_style
    at ../src/st/st-widget.c line 1659
  • #61 st_widget_style_changed
    at ../src/st/st-widget.c line 553
  • #62 st_widget_add_style_class_name
    at ../src/st/st-widget.c line 1236
  • #63 ffi_call_unix64
    at ../src/x86/unix64.S line 76
  • #64 ffi_call
    at ../src/x86/ffi64.c line 525
  • #65 gjs_invoke_c_function
    at gi/function.cpp line 1033
  • #66 function_call
    at gi/function.cpp line 1351
  • #67 ??
  • #68 ??
  • #69 ??
  • #70 ??
  • #71 ??

Comment 4 Florian Müllner 2017-10-13 23:47:58 UTC
(In reply to Marco Trevisan (Treviño) from comment #3)
> Similar trace could also be

That one is a duplicate of bug 788627.
Comment 5 Marco Trevisan (Treviño) 2017-10-14 00:42:38 UTC
Mh, yeah... I think they're actually they've the same root cause, so the actor has been destroyed, thus it fails immediately when trying to get data allocation box or it just get some compromised data and then it fails when trying to generate an insanely big texture.
Comment 6 Marco Trevisan (Treviño) 2017-10-18 10:34:58 UTC
Created attachment 361796 [details] [review]
StIcon: only compute shadow pipeline when the texture is properly allocated

Creating the shadow pipeline requires the actor to be allocated in order
to get its dimensions, however in the current state we just compute it
even if this is not the case.
This causes _st_create_shadow_pipeline_from_actor (when getting the allocation
box) to trigger an allocation cycle, which might lead to a convolution to
st_icon_finish_update causing breakage on data as soon as we return from it.

Waiting for the texture size change before trying to update the shadow pipeline
is a way for avoiding this.

Another option is to do simply as we do with other actors, so initializing
the pipeline at paint if we don't already have one for current data,
which is probably a better thing to do.

So let me know if you prefer that way.
Comment 7 Marco Trevisan (Treviño) 2017-10-18 10:35:16 UTC
Some debugging on this, highlighting the issue:

st_icon_set_icon_name (0x555558ad8af0)
st_icon_style_changed (0x555556ceb2a0)
st_icon_update (0x555556ceb2a0): Object pending texture 0x555557cd7bc0, is floating: 1, refs 1
st_icon_update (0x555556ceb2a0): FINISH NOW 0x555557cd7bc0!
st_icon_finish_update (0x555556ceb2a0): destroying icon texture 0x555557cdd7e0
st_icon_finish_update (0x555556ceb2a0): Object icon texture 0x555557cd7bc0, refs 3
priv = 0x555556ceae10, instance priv is 0x555556ceae10
st_icon_update_shadow_pipeline (0x555556ceb2a0) : icon texture 0x555557cd7bc0
_st_create_shadow_pipeline_from_actor: actor 0x555557cd7bc0 | thread 128328
st_icon_style_changed (0x555556ceb2a0)
st_icon_update (0x555556ceb2a0): Object pending texture 0x5555557d34d0, is floating: 1, refs 1
st_icon_update (0x555556ceb2a0): FINISH NOW 0x5555557d34d0!
st_icon_finish_update (0x555556ceb2a0): destroying icon texture 0x555557cd7bc0
st_icon_finish_update (0x555556ceb2a0): Object icon texture 0x5555557d34d0, refs 3
priv = 0x555556ceae10, instance priv is 0x555556ceae10
st_icon_update_shadow_pipeline (0x555556ceb2a0) : icon texture 0x5555557d34d0

Thread 1 "gnome-shell" received signal SIGSEGV, Segmentation fault.
clutter_actor_get_allocation_box (self=self@entry=0x555557cd7bc0, box=box@entry=0x7fffffff3240)
    at /media/M2/GNOME/mutter/clutter/clutter/clutter-actor.c:9800
9800      *box = self->priv->allocation;
(gdb) bt
  • #0 clutter_actor_get_allocation_box
    at /media/M2/GNOME/mutter/clutter/clutter/clutter-actor.c line 9800
  • #1 _st_create_shadow_pipeline_from_actor
    at ../../gnome-shell/src/st/st-private.c line 423
  • #2 st_icon_update_shadow_pipeline
    at ../../gnome-shell/src/st/st-icon.c line 282
  • #3 st_icon_finish_update
    at ../../gnome-shell/src/st/st-icon.c line 318
  • #4 st_icon_update
    at ../../gnome-shell/src/st/st-icon.c line 395
  • #2 st_icon_update_shadow_pipeline
    at ../../gnome-shell/src/st/st-icon.c line 282
$2 = (ClutterActor *) 0x5555557d34d0                            
(gdb)

(gdb) call gjs_dumpstack()
== Stack trace for context 0x555555969000 ==
Panel<._removeStyleClassName@resource:///org/gnome/shell/ui/panel.js:1166:9
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
Transparency<._getAlphas@/usr/share/gnome-shell/extensions/ubuntu-dock@ubuntu.com/theming.js:592:17
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
Transparency<._updateStyles@/usr/share/gnome-shell/extensions/ubuntu-dock@ubuntu.com/theming.js:537:9
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
Transparency<._init@/usr/share/gnome-shell/extensions/ubuntu-dock@ubuntu.com/theming.js:353:9
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
_Base.prototype._construct@resource:///org/gnome/gjs/modules/lang.js:110:5
Class.prototype._construct/newClassConstructor@resource:///org/gnome/gjs/modules/lang.js:213:20
ThemeManager<._init@/usr/share/gnome-shell/extensions/ubuntu-dock@ubuntu.com/theming.js:59:30
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
_Base.prototype._construct@resource:///org/gnome/gjs/modules/lang.js:110:5
Class.prototype._construct/newClassConstructor@resource:///org/gnome/gjs/modules/lang.js:213:20
DockedDash<._init@/usr/share/gnome-shell/extensions/ubuntu-dock@ubuntu.com/docking.js:363:30
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
_Base.prototype._construct@resource:///org/gnome/gjs/modules/lang.js:110:5
Class.prototype._construct/newClassConstructor@resource:///org/gnome/gjs/modules/lang.js:213:20
DockManager<._createDocks@/usr/share/gnome-shell/extensions/ubuntu-dock@ubuntu.com/docking.js:1744:20
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22
DockManager<._toggle@/usr/share/gnome-shell/extensions/ubuntu-dock@ubuntu.com/docking.js:1690:9
wrapper@resource:///org/gnome/gjs/modules/lang.js:178:22


As you can see, basically the same function gets called on an object that has been just destroyed...
Comment 8 Marco Trevisan (Treviño) 2017-10-19 06:24:38 UTC
Created attachment 361838 [details] [review]
StIcon: only compute shadow pipeline when the texture is properly allocated

Some cleanups
Comment 9 Jonas Ådahl 2017-10-19 06:43:05 UTC
Review of attachment 361838 [details] [review]:

::: src/st/st-icon.c
@@ +277,3 @@
   if (priv->shadow_spec)
    priv->shadow_pipeline = _st_create_shadow_pipeline_from_actor (priv->shadow_spec, priv->icon_texture);
+

Stray newline.

@@ +290,3 @@
+
+static void
+st_icon_update_shadow_pipeline_on_allocation (StIcon *icon)

Shouldn't you be able to do this by setting the ClutterActorClass::allocate vfunc, calling the parents allocate(), then maybe update the shadow pipeline? Thus no need for signals. You'd have to make sure not to eagerly replace valid pipelines only.
Comment 10 Marco Trevisan (Treviño) 2017-10-19 07:22:13 UTC
Created attachment 361841 [details] [review]
StIcon: only compute shadow pipeline when the texture is allocated (at paint)

This version is just creating the pipeline at paint, when we're sure
about the allocation box of the texture icon actor.
Comment 11 Marco Trevisan (Treviño) 2017-10-28 05:09:31 UTC
Created attachment 362437 [details] [review]
StIcon: only compute shadow pipeline when the texture is properly allocated

Patch updated to fix the glitches I found during some testing of the previous
version I posted here that was generating the texture at allocation.
As you can see here: https://usercontent.irccloud-cdn.com/file/Mpe1Nqgi/out.mp4
(The other version that do it only at paint wasn't affected).

Now, the problem here is that if the actor size and the texture size are different
we can't use a fast path (as per commit 7015bb2ca975, which is probably also the
reason why we get these crashes now), thus we use an offscreen mode.
But that mode seem to behave badly when we're in allocation (while it's fine
at paint).

So, i've added a StPrivateShadowCreateFlags enum where we can define the mode
we allow to be used in shadow creation, and thus for st-icon we allow only
the texture mode at allocation point.

PS: avoiding to do this only for st-icon by using some more checks at
    allocation was still possible, but I thought it was less elegant
    and portable
Comment 12 Adam Williamson 2017-12-14 23:28:56 UTC
We have four downstream (Fedora) reports that look very similar to this and #788627 , with the "failed to allocate 18446744072098939136" bytes error message; https://bugzilla.redhat.com/show_bug.cgi?id=1526164 has some debugging work done by the reporter. The others (which I've closed as dupes of that bug) are https://bugzilla.redhat.com/show_bug.cgi?id=1508398 , https://bugzilla.redhat.com/show_bug.cgi?id=1506325 and https://bugzilla.redhat.com/show_bug.cgi?id=1502183 . So it'd be great to have a fix for this (these?) bug (bugs?) that we could port downstream. Thanks.
Comment 13 Marco Trevisan (Treviño) 2017-12-14 23:56:49 UTC
I think all these bugs are the same (and I'm also pretty sure bug 788627 is a duplicate of this).

The patch fixes the problem, Jonas already did a first-pass review on IRC, but we were waiting for a final review by Florian.

Cheers
Comment 14 Florian Müllner 2017-12-15 12:29:50 UTC
I had asked whether there was a clear reproducer a while ago - is there?

(The thing is, I'm not a fan of either patch, and the commit message suggests that there may be an easier fix, so I'd like to check that first)
Comment 15 Adam Williamson 2017-12-15 19:01:23 UTC
Reporter of #1508398 downstream says "This happens occasionally, I am not able to reproduce it intentionally."

Reporter of #1506325 says "When continously maximize and minimize the window of Visual Studio Code using key bindings. (Shell Extension Hide Top Bar enabled)", so you could try that, I guess.

Reporter of #1514850 wrote "Right after booting the system, I started Thunderbird and Firefox. Gnome Shell crashed during Firefox startup. This is the second time this happend with Fedora 27, though I don't remember it ever happening with Fedora 26."

#1515926: "Opening gnucash". Seems too simple to be a reliable reproducer, but hey, could try it.

#1516253: "Error after boot".

#1516633: "Just after log in".

#1517234, #1525979, #1502183: nothing.

Reporter of #1526164 wrote "Running dual monitors. I have a TRENDnet KVM switch that I use to switch my primary monitor between this workstation and another. This crash happened when I switched _to_ this workstation. Then my session crashed and went to the login screen" - I've asked if it's reliably reproducible, but sounds like his case may depend on that hardware KVM switch.
Comment 16 Marco Trevisan (Treviño) 2017-12-15 19:05:59 UTC
(In reply to Florian Müllner from comment #14)
> I had asked whether there was a clear reproducer a while ago - is there?

In my case, to reproduce this I only have to run g-s in a virtual machine, open a window, then another one who's going to be maximized. Then resize the VM to request a resolution change.

Not sure if that works 100% in all the setups, but here is always happening.

> (The thing is, I'm not a fan of either patch, and the commit message
> suggests that there may be an easier fix, so I'd like to check that first)

Attachment 361841 [details] was the simplest that worked.
So basically allocating the shadow at paint, but I would have preferred to do it earlier when possible, thus the other version.

If you want that to be structured differently, let me know... I'm open to refactor things, just let me know what's your ideal way.
Comment 17 Michael Thayer 2017-12-15 19:12:08 UTC
If it is indeed the same thing, I was able to trigger crashes simply by opening and closing my laptop lid quickly.  See bug 788627 and last comments there.  Not tried that for a while (at the time I was experimenting to see what would make GNOME Shell get my monitor set-up correct).
Comment 18 Marco Trevisan (Treviño) 2017-12-15 21:51:31 UTC
*** Bug 788627 has been marked as a duplicate of this bug. ***
Comment 19 kevin 2018-01-04 00:36:28 UTC
Note I've been running with attachment 362437 [details] [review] for a few weeks now and the issue appears to be resolved for me (came here from https://bugzilla.redhat.com/show_bug.cgi?id=1526164)
Comment 20 Jonas Ådahl 2018-01-24 05:44:18 UTC
Review of attachment 362437 [details] [review]:

::: src/st/st-icon.c
@@ +64,3 @@
 static void st_icon_update               (StIcon *icon);
 static gboolean st_icon_update_icon_size (StIcon *icon);
+static void st_icon_update_shadow_pipeline (StIcon *icon, StPrivateShadowCreateFlags flags);

nit: newline after St *icon, and parameter variable names should be aligned.

@@ +169,3 @@
+
+  st_icon_update_shadow_pipeline (icon, ST_SHADOW_TEXTURE_MODE);
+}

Looks a bit of a missuse of the allocate vfunc. Shouldn't it be enough to update the pipeline on paint (especially if we potentially update the pipeline immediately when the pixbuf is ready).

@@ +181,3 @@
   if (priv->icon_texture)
     {
+      st_icon_update_shadow_pipeline (icon, ST_SHADOW_ANY_MODE);

Why would you want to use ANY_MODE here? Drawing nothing (in case no texture is available) offscreen should still result in nothing I'd assume?

@@ +295,3 @@
+static void
+st_icon_update_shadow_pipeline (StIcon *icon,
+                                StPrivateShadowCreateFlags shadow_flags)

nit: parameter variable names should be aligned

@@ +316,3 @@
+            _st_create_shadow_pipeline_from_actor (priv->shadow_spec,
+                                                   priv->icon_texture,
+                                                   shadow_flags);

I'd rather see we only call _st_create_shadow_pipeline_from_actor() if we know it wont fail (i.e. the texture actor has a texture). Then the create_function will actually work, and not unexpectedly return NULL.

That'd mean we don't need to add a flag the shadow_flag to the create_shadow_pipeline() function.

@@ +328,3 @@
                    StIcon         *icon)
 {
+  st_icon_clear_shadow_pipeline (icon);

Shouldn't it be possible to clear() and then update() immediately? I suppose we'd need to add a check for clutter_actor_has_allocation() in update() (and also rename to maybe_update() I guess.

::: src/st/st-private.c
@@ +413,3 @@
 _st_create_shadow_pipeline_from_actor (StShadow     *shadow_spec,
+                                       ClutterActor *actor,
+                                       StPrivateShadowCreateFlags flags)

nit: parameter variable names should stay aligned (also in the header)
Comment 21 Marco Trevisan (Treviño) 2018-01-24 18:38:50 UTC
Review of attachment 362437 [details] [review]:

Let me know if I should keep the enum modes around or not, and see the comments.

::: src/st/st-icon.c
@@ +64,3 @@
 static void st_icon_update               (StIcon *icon);
 static gboolean st_icon_update_icon_size (StIcon *icon);
+static void st_icon_update_shadow_pipeline (StIcon *icon, StPrivateShadowCreateFlags flags);

So you want me to move the lines above too?

@@ +169,3 @@
+
+  st_icon_update_shadow_pipeline (icon, ST_SHADOW_TEXTURE_MODE);
+}

Indeed we can, I just tried to replicate what we did before, where the pipeline was updated as soon as possible. But since that leads to the crash we know because of missing allocation infos, we try now doing it at allocation time.

But if we prefer just doing it at paint it's ok... It also seem that the glitches I mentioned in comment #11 aren't visible anymore (although sometimes I notice some tiny flickering).

In that case I can get rid of the StPrivateShadowCreateFlags completely, as it was the only reason for being there.

@@ +181,3 @@
   if (priv->icon_texture)
     {
+      st_icon_update_shadow_pipeline (icon, ST_SHADOW_ANY_MODE);

Yeah, I was guessing the same... But I just kept using the same codepath we were using before: try with the texture, if it's not there, just use the offscreen paint.

In fact, in this case we could probably be just more restrictive, but...

There's just a problem, and we still have with current codebase (just not underlined because we've the function in such way): when using the TEXTURE_MODE for some icons the shadow pipeline is not generated (see https://i.imgur.com/rvBrjjK.png, while this is when using the ANY_MODE: https://i.imgur.com/B7gGzTL.png). Reason for this is that (texture_width == width && texture_height == height) fails. The status is this one:

Actor size is 16.000000x16.000000, scaled size is 16.000000x16.000000, texture size is 48x48, shadow pipeline: (nil).

Explaination why that mode is used is explained in commit 7015bb2ca (as per fixing Bug 788039).

@@ +316,3 @@
+            _st_create_shadow_pipeline_from_actor (priv->shadow_spec,
+                                                   priv->icon_texture,
+                                                   shadow_flags);

Well, true, but it's even true that I didn't want to duplicate the same checks in both places, plus even _st_create_shadow_pipeline might fail (basically never, but still, we don't have anything ensuring it's not NULL), so I'd prefer to leave such more specific checks at lower levels than here.

It's even true that the flag might not needed anymore, so...

@@ +328,3 @@
                    StIcon         *icon)
 {
+  st_icon_clear_shadow_pipeline (icon);

So in case it's not allocated, you'd still queue a relayout/redraw?
Comment 22 Jonas Ådahl 2018-01-25 01:55:35 UTC
(In reply to Marco Trevisan (Treviño) from comment #21)
> Review of attachment 362437 [details] [review] [review]:
> 
> Let me know if I should keep the enum modes around or not, and see the
> comments.
> 
> ::: src/st/st-icon.c
> @@ +64,3 @@
>  static void st_icon_update               (StIcon *icon);
>  static gboolean st_icon_update_icon_size (StIcon *icon);
> +static void st_icon_update_shadow_pipeline (StIcon *icon,
> StPrivateShadowCreateFlags flags);
> 
> So you want me to move the lines above too?

Personally, I don't think that's needed.

> 
> @@ +169,3 @@
> +
> +  st_icon_update_shadow_pipeline (icon, ST_SHADOW_TEXTURE_MODE);
> +}
> 
> Indeed we can, I just tried to replicate what we did before, where the
> pipeline was updated as soon as possible. But since that leads to the crash
> we know because of missing allocation infos, we try now doing it at
> allocation time.
> 
> But if we prefer just doing it at paint it's ok... It also seem that the
> glitches I mentioned in comment #11 aren't visible anymore (although
> sometimes I notice some tiny flickering).

Could the flickering be icons loading slowly?

Anyway, I think using vfuncs only for what they are intended to be used for seems best. I guess we could schedule an g_idle (that we cancel on paint if not triggered) that'll load things, but might be too much work for too little gain.

> 
> In that case I can get rid of the StPrivateShadowCreateFlags completely, as
> it was the only reason for being there.
> 
> @@ +181,3 @@
>    if (priv->icon_texture)
>      {
> +      st_icon_update_shadow_pipeline (icon, ST_SHADOW_ANY_MODE);
> 
> Yeah, I was guessing the same... But I just kept using the same codepath we
> were using before: try with the texture, if it's not there, just use the
> offscreen paint.
> 
> In fact, in this case we could probably be just more restrictive, but...
> 
> There's just a problem, and we still have with current codebase (just not
> underlined because we've the function in such way): when using the
> TEXTURE_MODE for some icons the shadow pipeline is not generated (see
> https://i.imgur.com/rvBrjjK.png, while this is when using the ANY_MODE:
> https://i.imgur.com/B7gGzTL.png). Reason for this is that (texture_width ==
> width && texture_height == height) fails. The status is this one:
> 
> Actor size is 16.000000x16.000000, scaled size is 16.000000x16.000000,
> texture size is 48x48, shadow pipeline: (nil).
> 
> Explaination why that mode is used is explained in commit 7015bb2ca (as per
> fixing Bug 788039).

Could it then be that if we create the pipeline too early (e.g. it's allocated but the pixbuf is still pending), we'll create the shadow pipeline by using the offscreen, which happens to be empty -> no shadow?

Only ever trying to create the pipeline when all requirements are met, that only st-icon.c knows about (cogl texture available (no matter if it'll be used directly or via an offscreen), icon actor allocated) would avoid that issue.

> 
> @@ +316,3 @@
> +            _st_create_shadow_pipeline_from_actor (priv->shadow_spec,
> +                                                   priv->icon_texture,
> +                                                   shadow_flags);
> 
> Well, true, but it's even true that I didn't want to duplicate the same
> checks in both places, plus even _st_create_shadow_pipeline might fail
> (basically never, but still, we don't have anything ensuring it's not NULL),
> so I'd prefer to leave such more specific checks at lower levels than here.
> 
> It's even true that the flag might not needed anymore, so...
> 
> @@ +328,3 @@
>                     StIcon         *icon)
>  {
> +  st_icon_clear_shadow_pipeline (icon);
> 
> So in case it's not allocated, you'd still queue a relayout/redraw?

Yes, we need to redraw anyway since the pixbuf changed.
Comment 23 Marco Trevisan (Treviño) 2018-01-25 14:14:42 UTC
(In reply to Jonas Ådahl from comment #22)
> > Indeed we can, I just tried to replicate what we did before, where the
> > pipeline was updated as soon as possible. But since that leads to the crash
> > we know because of missing allocation infos, we try now doing it at
> > allocation time.
> > 
> > But if we prefer just doing it at paint it's ok... It also seem that the
> > glitches I mentioned in comment #11 aren't visible anymore (although
> > sometimes I notice some tiny flickering).
> 
> Could the flickering be icons loading slowly?

Mh, might be, but maybe it was just an impression... I can't replicate it right now.
So I guess we're all set.

> Anyway, I think using vfuncs only for what they are intended to be used for
> seems best. I guess we could schedule an g_idle (that we cancel on paint if
> not triggered) that'll load things, but might be too much work for too
> little gain.

Yeah, ok... I generally prefer to load before paint, especially if it's something involving some cycles like blurring, but if that's ok with you, we can just do it at paint.

> > Explaination why that mode is used is explained in commit 7015bb2ca (as per
> > fixing Bug 788039).
> 
> Could it then be that if we create the pipeline too early (e.g. it's
> allocated but the pixbuf is still pending), we'll create the shadow pipeline
> by using the offscreen, which happens to be empty -> no shadow?

No I think it's not the case. At that stage the priv->pending_texture is NULL while priv->icon_texture is set, so I think that's not the case. It's just that the icon itself than the actor is in, it seems, as it's centered on that.
Comment 24 Marco Trevisan (Treviño) 2018-01-25 14:39:52 UTC
Let's continue in https://gitlab.gnome.org/GNOME/gnome-shell/merge_requests/3
Comment 25 Florian Müllner 2018-01-25 16:29:47 UTC
Reopening for discussion not covered by the merge request
Comment 26 Florian Müllner 2018-01-25 16:35:03 UTC
(In reply to Florian Müllner from comment #14)
> (The thing is, I'm not a fan of either patch, and the commit message
> suggests that there may be an easier fix, so I'd like to check that first)

What I was referring to is that _st_create_shadow_pipeline_from_actor() has undefined behavior likely leading to a crash under certain (not quite clear) conditions. Right now this seems to be only triggered by the use in st-icon, so fixing that isn't wrong. However it would be better to fix the function to always have well defined behavior - that is, return a valid shadow pipeline or NULL - to make sure that the next code that uses the function doesn't run into the same trap.
Comment 27 Marco Trevisan (Treviño) 2018-01-25 16:40:32 UTC
(In reply to Florian Müllner from comment #26)
> (In reply to Florian Müllner from comment #14)
> > (The thing is, I'm not a fan of either patch, and the commit message
> > suggests that there may be an easier fix, so I'd like to check that first)
> 
> What I was referring to is that _st_create_shadow_pipeline_from_actor() has
> undefined behavior likely leading to a crash under certain (not quite clear)
> conditions.

Agree, however as I explained in comments or commits above, what I've noticed is that this is happening so far (at least in st-icon), when trying to allocate an actor in order to get its allocation box, as this is leading to some convolution.

It might not be the only cause, but at least adding the allocation check on that function seems to be enough here (included in the 1st patch).
Comment 28 Florian Müllner 2018-01-25 16:41:55 UTC
(In reply to Florian Müllner from comment #26)
> certain (not quite clear) conditions

I say they conditions are not quite clear, because I cannot reproduce the crash, however with an early check for clutter_actor_has_allocation(), all icon shadows disappear - so not having a valid allocation when calling the function doesn't trigger the crash by itself.

Also according to https://gitlab.gnome.org/GNOME/mutter/blob/master/clutter/clutter/clutter-actor.c#L9787, the call to get_allocation_box() itself should trigger the allocation ...
Comment 29 Florian Müllner 2018-01-25 16:45:19 UTC
(In reply to Marco Trevisan (Treviño) from comment #27)
> It might not be the only cause, but at least adding the allocation check on
> that function seems to be enough here (included in the 1st patch).

Sure, but I'm still curious - maybe it's not the needs_allocation flag that is the problem, but one of the other checks in clutter_actor_has_allocation()? For instance, does clutter_actor_is_mapped() work?
Comment 30 Marco Trevisan (Treviño) 2018-01-25 16:49:03 UTC
(In reply to Florian Müllner from comment #28)
> (In reply to Florian Müllner from comment #26)
> > certain (not quite clear) conditions
> 
> I say they conditions are not quite clear, because I cannot reproduce the
> crash, however with an early check for clutter_actor_has_allocation(), all
> icon shadows disappear - so not having a valid allocation when calling the
> function doesn't trigger the crash by itself.

True, only when this trigger allocation, and subsequent signals then cause to recall the very same function reusing invalid data.

> Also according to
> https://gitlab.gnome.org/GNOME/mutter/blob/master/clutter/clutter/clutter-
> actor.c#L9787, the call to get_allocation_box() itself should trigger the
> allocation ...

Correct, and that's what I've explained on comment #6.
After so long time, I actually can't recall all the details I went through by that, but I can try to remember if this is unclear.
Comment 31 Cédric Bellegarde 2018-01-30 13:13:09 UTC
>because I cannot reproduce the crash

Here happen really often when:
- Going to overview
- Close a Gnome Terminal window with many tabs
- When Gnome Terminal popups about closing multiple tabs, Shell crash with this error.

Only step I'm not sure is: Do I click on Gnome Terminal window just after clicking on the "Shell overview close button". Not sure...
Comment 32 Jonas Ådahl 2018-01-31 07:17:00 UTC
What is the content of StShadow (priv->shadow_spec) for when this crashes? For the blur_.. g_malloc0(), the only explanation for the crash is that the shadow spec from the theme node is invalid, thus not related to the actor being allocated or not. The reason being that in various backtraces I've seen for that crash, both the width and height values are all sane (16x16) but the blur value, coming directly from the theme node is always "optimized out".
Comment 33 Cédric Bellegarde 2018-01-31 08:59:56 UTC
>What is the content of StShadow (priv->shadow_spec) for when this crashes?

How can I give you this information?
Comment 34 Jonas Ådahl 2018-01-31 09:45:20 UTC
(In reply to Cédric Bellegarde from comment #33)
> >What is the content of StShadow (priv->shadow_spec) for when this crashes?
> 
> How can I give you this information?

Build gnome-shell with "-O0 -g" (might not be needed), trigger the crash to get a core dump, then open it with gdb.

In gdb, navigate to the _st_create_shadow_pipeline_from_actor frame
(run "bt", then "frame N" where N is the frame number of said function).

Then type

print *shadow_spec
Comment 35 Cédric Bellegarde 2018-01-31 10:17:27 UTC
(gdb) print *shadow_spec
$1 = {color = {red = 0 '\000', green = 0 '\000', blue = 0 '\000', alpha = 0 '\000'}, 
  xoffset = 0, yoffset = 0, blur = 33554432.000000253, spread = 33554440.1875, 
  inset = 1098907648, ref_count = 0}
Comment 36 Marco Trevisan (Treviño) 2018-02-05 16:04:12 UTC
(In reply to Cédric Bellegarde from comment #35)
> (gdb) print *shadow_spec
> $1 = {color = {red = 0 '\000', green = 0 '\000', blue = 0 '\000', alpha = 0
> '\000'}, 
>   xoffset = 0, yoffset = 0, blur = 33554432.000000253, spread =
> 33554440.1875, 
>   inset = 1098907648, ref_count = 0}

Can you reproduce the issue when using patch at https://gitlab.gnome.org/GNOME/gnome-shell/merge_requests/3 ?

As iirc this was happening when the actor sizes where invalid.
Comment 37 Jonas Ådahl 2018-02-06 04:11:33 UTC
(In reply to Marco Trevisan (Treviño) from comment #36)
> (In reply to Cédric Bellegarde from comment #35)
> > (gdb) print *shadow_spec
> > $1 = {color = {red = 0 '\000', green = 0 '\000', blue = 0 '\000', alpha = 0
> > '\000'}, 
> >   xoffset = 0, yoffset = 0, blur = 33554432.000000253, spread =
> > 33554440.1875, 
> >   inset = 1098907648, ref_count = 0}
> 
> Can you reproduce the issue when using patch at
> https://gitlab.gnome.org/GNOME/gnome-shell/merge_requests/3 ?
> 
> As iirc this was happening when the actor sizes where invalid.

As far as I can tell, the content of shadow_spec is not affected by the allocation state of the actor. The flow goes something like this:

1. We create a theme node from the StIcon

This doesn't use the actor state, just the type name and place in the stage tree.

2. We use said theme node to fetch the shadow spec.

This also doesn't use the allocation, just the theme node and a entry key string ("icon-shadow").

3. The shadow spec we got is messed up causing invalid allocations due to the "blur" and "spread" being something very wrong.

So, the issue here seems to rather be about a) corruption of the theme node data structure, making the us fetch corrupted data, or b) a bug in the theme code that makes us fetch uninitialized data, or c) something else.
Comment 38 Mario Sánchez Prada 2018-02-09 16:28:25 UTC
Just reporting that I can reproduce this reliably on Endless' fork of GNOME Shell 3.26, I actually reported it in https://gitlab.gnome.org/GNOME/gnome-shell/issues/36 before being pointed out here by Florian!

I've been debugging this for a while today and I ended up exactly in the same situations described here: the contents of `shadow_spec` show a ref_count of 0 at the time of the failure, and somehow it seems (directly or indirectly) related to the call to clutter_actor_get_allocation_box() inside _st_create_shadow_pipeline_from_actor(), which somehow invalidates the priv->shadow_spec of the icon, causing the crash (plus some warnings, see https://gitlab.gnome.org/GNOME/gnome-shell/merge_requests/24#note_54952).

My early (and naive) experiments show that early returning from _st_create_shadow_pipeline_from_actor() to prevent a new reallocation from happening (see [1]) "fix" the issue, confirming Florian's comment #c28 here, but that seems to break icons shadows, so not a valid fix I guess.

Last, I can confirm as well that applying the commits from https://gitlab.gnome.org/GNOME/gnome-shell/merge_requests/3 seem to fix the problem as well for me, so it's probably better to close the gitlab issue and continue here?

[1] https://gitlab.gnome.org/GNOME/gnome-shell/merge_requests/24/diffs?commit_id=735ce62e8b63ac2ec6950eff114c655f4f1c1e84
Comment 39 Mario Sánchez Prada 2018-02-09 20:06:31 UTC
Another short heads-up: I haven't inspected in detail the code in https://gitlab.gnome.org/GNOME/gnome-shell/merge_requests/3, but I tested it carefully and confirmed that it doesn't present the issue of my naive early return, and icon shadows do still work with that MR.

At this point, and considering that MR in there is fairly more complex than what I was proposing, I guess I should close the new issue and associated MR I opened, to keep everything in one place.

@Marco Are you planning to keep working on this? If so, considering I have a reproducer 100% reliable, I'd like to coordinate with you to help getting this nasty thing fixed upstream, let me know how I can help. Thanks.
Comment 40 Florian Müllner 2018-02-20 16:50:48 UTC
(In reply to Mario Sánchez Prada from comment #39)
> I have a reproducer 100% reliable, I'd like to coordinate with you to 
> help getting this nasty thing fixed upstream

I may have come up with a simpler approach, but as I don't see the issue, I'm essentially guessing - could you test wip/fmuellner/guard-shadow-pipeline-from-actor-call?

I would prefer to not change the API to expose the different code paths to callers, and put the burden of picking the right one on them (and crash in case they picked the wrong one) ...
Comment 41 Mario Sánchez Prada 2018-02-21 12:53:12 UTC
@Florian Can't test it right now, but I will as soon as possible, likely tomorrow or Friday
Comment 42 Mario Sánchez Prada 2018-02-23 12:16:48 UTC
I've tested your patch now (needed a couple of small fixes) and, while it fixes the crash, it looks like it's still not solving the root issue, since I still see lots of CRITICALs due to _st_paint_shadow_with_opacity() receiving a NULL shadow.

More details in: https://gitlab.gnome.org/GNOME/gnome-shell/commit/9d133d8ebbf9d99db0a02eaa6843c7e43cdec480#note_62721
Comment 43 Florian Müllner 2018-02-23 20:31:59 UTC
Thanks for testing!

(In reply to Mario Sánchez Prada from comment #42)
> (needed a couple of small fixes)

Meh, I should have compile-tested at least :-(


> it looks like it's still not solving the root issue, since
> I still see lots of CRITICALs due to _st_paint_shadow_with_opacity()
> receiving a NULL shadow.

But it does support my hypothesis:

 - icon calls create_shadow_pipeline with spec
 - method triggers a style change that results in g_clear_pointer being called
   (but with the patch the refcount doesn't drop to 0 until the end of the
    call, so no crash)
 - method returns a pipeline that supposedly matches the spec (but doesn't,
   as spec has been nulled in the meantime)

I'll propose a merge request that I belief makes create_shadow_pipeline() safe to call in all cases. With regard to StIcon, that should turn the crash into a waste of computing cycles (by creating pipelines that are thrown away immediately) - I'll review Marco's patches to address that issue separately.
Comment 44 ejhuff 2018-03-01 08:12:37 UTC
Did anyone happen to notice that the magic number 18446744072098939136 is
the result of a signed multiply overflow?

(gdb) printf "%x\n",18446744072098939136
a0000100
(gdb) printf "%lx\n",18446744072098939136
ffffffffa0000100
(gdb) printf "%lx\n",0x5000010*0x5000010
ffffffffa0000100
(gdb) printf "%lu\n",0x5000010*0x5000010
18446744072098939136
(gdb) printf "%lu\n",0x5000010u*0x5000010u
2684354816

The large number arises because the arguments to 
blur_pixels in src/st/st-private.c

are declares as gint when the are actually sizes, and so
should be gsize.

Then *rowstride_out * *height_out would not overflow, and the
allocation would request a mere 2.6 gigabytes instead of
2^63 bytes.
Comment 45 ejhuff 2018-03-01 09:03:05 UTC
That was not quite right.  gdb seems to have 32 bit longs.

(gdb) printf "%lx\n",0x5000010ll*0x5000010ll
190000a0000100
(gdb) printf "%lx\n",0x5000010l*0x5000010l
ffffffffa0000100
(gdb) printf "%lu\n",0x5000010ll*0x5000010ll
7036877102121216
(gdb) printf "%lu\n",0x5000010l*0x5000010l
18446744072098939136

The allocation would be for 7 x 10^15 instead of 1.8 x 10^19

But still,

pixels_out = g_malloc0 (*rowstride_out * *height_out);

should be 

pixels_out = g_malloc0 ((gsize)*rowstride_out * (gsize)*height_out);

to allow for the case where the factors are both (say) 65536.
Comment 46 Mario Sánchez Prada 2018-03-19 11:05:37 UTC
(In reply to Florian Müllner from comment #43)
> [...]
> I'll propose a merge request that I belief makes create_shadow_pipeline()
> safe to call in all cases. With regard to StIcon, that should turn the crash
> into a waste of computing cycles (by creating pipelines that are thrown away
> immediately) - I'll review Marco's patches to address that issue separately.

Hi again. Sorry for the (loooong) delayed response, but I did manage to test your patch from https://gitlab.gnome.org/GNOME/gnome-shell/merge_requests/49/commits today and I can confirm they also fix the issue entirely, as I'm not seeing neither crashes nor CRITICAL messages at all now on the console.

And to be clear, I did make sure first that the crash was still there by reverting first the 3 commits from https://gitlab.gnome.org/GNOME/gnome-shell/merge_requests/3/commits that I had temporarily applied, which led me to the crash in nearly no time. Then I reverted those + applied yours and everything was fine again.

Thanks!
Comment 47 eomanis 2018-04-02 22:56:26 UTC
Created attachment 370462 [details]
Stack trace of gnome-shell crashing, from 2018-04-02

This appears to be hitting me on Arch Linux on amd64 with gnome-shell 3.28.0-1 (Wayland session).

It happened twice when I had mpv --opengl-backend=wayland running in fullscreen and was leaving fullscreen.
Comment 48 Adam Williamson 2018-04-11 19:11:52 UTC
Can someone please review Florian's PR? This has been open for several months, meanwhile people just keep hitting this crash in released distributions (which means, on Wayland, they lose their desktop session). I've found two more dupes of https://bugzilla.redhat.com/show_bug.cgi?id=1526164 just this morning and I am not in any way being methodical about this.
Comment 49 Florian Müllner 2018-04-11 19:43:55 UTC
Pushed my fix in commit f6a08472a0fef65f, so closing.

Changing StIcon to follow the pattern used in StEntry and StLabel still makes sense IMHO, but we already have a merge request[0] for that, so no need to keep this open.

[0] https://gitlab.gnome.org/GNOME/gnome-shell/merge_requests/3"
Comment 50 Pace 2019-10-10 07:22:49 UTC
Hi everyone, I came across this thread after searching for a solution to constant logfile errors and occasional reboots. 

So I'm running Debian 10,
Linux debian 4.19.0-6-amd64 #1 SMP Debian 4.19.67-2+deb10u1 (2019-09-20) x86_64 GNU/Linux

And my logfile comes up constantly with the following error:-
gnome-shell[1381]: _st_paint_shadow_with_opacity: assertion 'shadow_pipeline != NULL' failed

I see from the thread here that the issue is marked as resolved. Can someone let me know how I can fix this on my installation.

King regards
Comment 51 Adam Williamson 2019-10-10 15:21:28 UTC
Well, according to packages.debian.org, buster has gnome-shell 3.30.2, and the commit mentioned in #c49 was included in that release, so *this* bug ought to be fixed. It's possible you have a different bug with a similar symptom; best thing to do would be to file a new issue (on GitLab, not here) with a full traceback.