GNOME Bugzilla – Bug 608800
alt-dragging gimp windows crashes gnome-shell
Last modified: 2010-02-10 19:38:12 UTC
To reproduce on my end with the gnome shell jhbuild version: 1. start gnome-shell 2. right-click an image in nautilus, open with GIMP 2.6 3. press and hold Alt, drag the image window gnome-shell then crashes on my end. To reproduce the crash again, you need to start gimp *after* gnome-shell, it seems. I see this in the terminal: mutter:ERROR:core/boxes.c:1161:meta_rectangle_edge_aligns: code should not be reached Shell killed with signal 6
I have the same problem. Shell even crashes without dragging the window but only open an image in gimp 2.6. When doing nothing gnome-shell crashes regularly. I don't know how to get more detailed information what is going wrong. But if you could tell me how to do it, I will try it.
This seems to be fixed with today's updates...
(In reply to comment #2) > This seems to be fixed with today's updates... Think you just got lucky today, no changes have been made that would have fixed this. (I've seen this crash too a few times.)
Most likely a Mutter problem, reassigning
I traced this error back to line 726 in edge_resistance.c. It is happening during the g_free called for each key of edges_to_be_freed as specified in g_hash_table_new_full. This may not be the fix, but right now it looks like the simplest extraneous piece to remove. diff --git a/src/core/edge-resistance.c b/src/core/edge-resistance.c index 8409ff8..30a45d4 100644 --- a/src/core/edge-resistance.c +++ b/src/core/edge-resistance.c @@ -713,7 +713,7 @@ meta_display_cleanup_edges (MetaDisplay *display) * array that it is in. So store it in a hash table for later * freeing. Could also do this in a simple linked list. */ - g_hash_table_insert (edges_to_be_freed, edge, edge); + g_hash_table_insert (edges_to_be_freed, edge, NULL); } } }
After looking at this further the code is failing in two places but on the same structure. So somehow MetaEdge Edge is getting corrupted. The one place where I see a MetaEdge->side_type getting set to something other than META_SIDE_* is in boxes.c where overlap->side_type is set to -1 because the comment is "We don't know what else to set it to"
(In reply to comment #5) > I traced this error back to line 726 in edge_resistance.c. It is happening > during the g_free called for each key of edges_to_be_freed as specified in > g_hash_table_new_full. What do you mean by this? The assertion failure definitely isn't happening at line 726, which is, in my copy: g_array_free (edge_data->right_edges, TRUE); > This may not be the fix, but right now it looks like the simplest extraneous > piece to remove. > > > diff --git a/src/core/edge-resistance.c b/src/core/edge-resistance.c > index 8409ff8..30a45d4 100644 > --- a/src/core/edge-resistance.c > +++ b/src/core/edge-resistance.c > @@ -713,7 +713,7 @@ meta_display_cleanup_edges (MetaDisplay *display) > * array that it is in. So store it in a hash table for later > * freeing. Could also do this in a simple linked list. > */ > - g_hash_table_insert (edges_to_be_freed, edge, edge); > + g_hash_table_insert (edges_to_be_freed, edge, NULL); > } > } > } I can't see how this change makes a difference. The value inserted into the hash table should be irrelevant. (This code would make considerably more sense and be more efficient if the writer had taken their own advice and used "a simple linked list")
Tracing backwards. The assertion failure means that: gboolean meta_rectangle_edge_aligns (const MetaRectangle *rect, const MetaEdge *edge) was called with edge->side_type not one of TOP/BOTTOM/LEFT/RIGHT. This function is called from exactly 3 places in edge-resistance.c: from find_nearest_position() - here edge is one of the edges array passed in from apply_edge_snapping() (unless there is something wrong with the bsearch, but I don't see anything wrong there), and that will be one of edge_data->left_edges/right_edges/top_edges/bottom_edges. from apply_edge_resistance() - here edge is one of the edges array passed in from apply_edge_resistance_to_each_side() (unless there is something wrong with the iteration - but it seems pretty safe), and that will be one of edge_data->left_edges/right_edges/top_edges/bottom_edges. This arrays are created in edge-resistance.c:cache_edges() and never modified. In cache_edges(), they are created with the code: while (tmp) { MetaEdge *edge = tmp->data; switch (edge->side_type) { case META_SIDE_LEFT: case META_SIDE_RIGHT: g_array_append_val (edge_data->left_edges, edge); g_array_append_val (edge_data->right_edges, edge); break; case META_SIDE_TOP: case META_SIDE_BOTTOM: g_array_append_val (edge_data->top_edges, edge); g_array_append_val (edge_data->bottom_edges, edge); break; default: g_assert_not_reached (); } tmp = tmp->next; } [ side note. It seems that the left/right and top/bottom edges arrays are always absolutely identical! ] So, at the point that the edges were created, there cannot be anything in the edge array with side_type != -1. So, my only guess from that analysis is that we have memory scribbling going on somewhere. There could also be potentially be a problem if the cached edges got invalidated while we were inside a grab operation, but that should trigger the assertion failure in apply_edge_resistance_to_each_side: g_assert (display->grab_edge_resistance_data != NULL); and not a problem like this. I don't see anything that could happen *while* apply_edge_resistance_to_each_side() is running that could cause invalidation - all that code sticks to edge-resistance.c and boxes.c.
Don't have much of an idea of what further to do here without being able to reliably reproduce. If someone finds a good recipe for that, please add it here.
Program received signal SIGABRT, Aborted. 0x00000034802326c5 in raise () from /lib64/libc.so.6 (gdb) bt
+ Trace 220462
I have been able to reproduce this crash by opening gscan2pdf and while scanning something grab the modal progress dialog and quickly move it top to bottom over the existing preferences modal box.
Jon managed to create a reliable reproducer and got this very informative Valgrind trace: ==10584== Invalid read of size 4 ==10584== at 0x43F1E6: find_index_of_edge_near_position (edge-resistance.c:108) ==10584== by 0x43F8B3: apply_edge_resistance (edge-resistance.c:371) ==10584== by 0x43FF87: apply_edge_resistance_to_each_side (edge-resistance.c:639) ==10584== by 0x440F26: meta_window_edge_resistance_for_move (edge-resistance.c:1153) ==10584== by 0x472E5E: update_move (window.c:7607) ==10584== by 0x47389A: meta_window_handle_mouse_grab_op_event (window.c:8000) ==10584== by 0x4382B6: event_callback (display.c:1935) ==10584== by 0x49A853: filter_func (ui.c:84) ==10584== by 0x348865D505: ??? (in /usr/lib64/libgdk-x11-2.0.so.0.1800.6) ==10584== by 0x348865F557: ??? (in /usr/lib64/libgdk-x11-2.0.so.0.1800.6) ==10584== by 0x348865FA8D: ??? (in /usr/lib64/libgdk-x11-2.0.so.0.1800.6) ==10584== by 0x3481A3920D: g_main_context_dispatch (in /lib64/libglib-2.0.so.0.2200.4) ==10584== Address 0xe86c114 is 4 bytes inside a block of size 24 free'd ==10584== at 0x4A04D72: free (vg_replace_malloc.c:325) ==10584== by 0x3481A36BFB: g_list_foreach (in /lib64/libglib-2.0.so.0.2200.4) ==10584== by 0x421A25: meta_rectangle_free_list_and_elements (boxes.c:770) ==10584== by 0x47667E: meta_workspace_invalidate_work_area (workspace.c:725) ==10584== by 0x477127: meta_workspace_set_builtin_struts (workspace.c:969) ==10584== by 0x3480E05DBB: ffi_call_unix64 (in /usr/lib64/libffi.so.5.0.6) ==10584== by 0x3480E05B43: ffi_call (in /usr/lib64/libffi.so.5.0.6) ==10584== by 0x51756D1: g_function_info_invoke (ginvoke.c:240) ==10584== by 0xD18EE75: gjs_invoke_c_function (function.c:676) ==10584== by 0xD18F62D: function_call (function.c:860) ==10584== by 0x348B650300: js_Invoke (in /usr/lib64/xulrunner-1.9.1/libmozjs.so) ==10584== by 0x348B641582: ??? (in /usr/lib64/xulrunner-1.9.1/libmozjs.so) What it shows is that the problem is that when the work area is invalidated, the edges that are referenced from the edge resistance data are freed without recomputing the edge resistance data. Will work on a patch.
Created attachment 153368 [details] [review] Fix crash when struts change during grab operation Since meta_workspace_invalidate_work_area() frees the edges workspace->screen_edges and workspace->monitor_edges, we must clean up our cached edge resistance data when the invalidate_work_area() is called on the active workspace, or when the workspace changes. Make the computation of the edge resistance data lazy so that it will be recomputed the next time we try to access it. meta_display_compute_resistance_and_snapping_edges() is made private to edge-resistance.c Invaliding the data when active workspace changes also will improve correctness for edge resistance when the current workspace changes during a grab operation. (Even with this fix we still don't try to handle window positions changing during a grab operation; that can't cause a crash since, unlike screen and monitor edges, the window edges are freshly allocated, it will just cause slight oddness in that corner case.) Root cause tracked down due to much effort by Jon Nettleton.
Review of attachment 153368 [details] [review]: ::: src/core/workspace.c @@ +482,3 @@ + meta_display_cleanup_edges (workspace->screen->display); + * a current resize or move operation */ + /* Free any cached pointers to the workspaces's edges from Why does this one need a call? For the move window case? I guess it doesn't hurt too much to invalidate in all, but maybe if that's the case this call could be done under if (move_window != NULL)?
Review of attachment 153368 [details] [review]: ::: src/core/workspace.c @@ +482,3 @@ + * a current resize or move operation */ + meta_display_cleanup_edges (workspace->screen->display); + I'm not 100% sure if Metacity/Mutter allows the old Enlightenment-style of switching workspaces while moving (if you have switch workspaces bound to Alt-F1/F2 this can be pretty handy), but I would like the code to be correct if that is allowed or is allowed in the future. The workspace could also be switch due to arbitrary reentrancy from plugin code. meta_display_cleanup_edges() is a no-op if you don't have edges cached, so there's no practical performance gain from moving it around. If the workspaces switches without moving the window being moved or resized along with then I'm sure we have bigger problems, but conceptually we'd still want to invalidate the edges, so I don't see it belonging in the move_window != NULL case. Considering that cleanup_edges() is a no-op if we have no grab operation or the grab operation is something like a switch-workspaces grab, do you see any problem with leaving it like this?
This is a good explanation, thanks! Looks good to go as-is then.
Attachment 153368 [details] pushed as 2a823ef - Fix crash when struts change during grab operation