GNOME Bugzilla – Bug 642957
gs crashes while move the only one window from initial workspace to the second
Last modified: 2011-07-13 14:40:50 UTC
If there is only one windows in the desktop session, then new workspace can not handle while drag it from the first workspace to the second one. how to reproduce this issue: 1, login the new session 2, open a gnome-terminal window, it will appear in the initial workspace and the second will be created automatically 3, switch to overview by press [Windows] key 4, move pointer to the right edge to active workspace bar 5, drag the terminal window from the first workspace to the second one 6, gnome-shell will hang up few seconds and then restart automatically
This isn't immediately reproducible for me. (I dragged windows around a *lot* when developing the animation code :-) Does it happen reliably for you?
Yes, it happens all the time. If there is more than 1 window, this issue never happen. My os is fedora15 + gnome-shell git build from jhbuild. The following is the output from terminal: -------------------------------------------------------------------- [jcome@horn ~]$ ~/gnome-shell/bin/gnome-shell --replace Gtk-Message: Failed to load module "pk-gtk-module" Gtk-Message: Failed to load module "atk-bridge" Window manager warning: Log level 16: AT-SPI: Accessibility bus not found - Using session bus. (mutter:6167): Clutter-DEBUG: Signal type minimize not supported (mutter:6167): Clutter-DEBUG: Signal type maximize not supported (mutter:6167): Clutter-DEBUG: Signal type restore not supported JS LOG: Cannot create "Network" item, .desktop file not found or corrupt. [1298354106,000,xklavier_config.c:xkl_config_registry_load_helper/] Missing registry file /usr/share/xmodmap/base.xml JS LOG: GNOME Shell started at Tue Feb 22 2011 13:55:06 GMT+0800 (HKT) Window manager warning: CurrentTime used to choose focus window; focus window may not be correct. Window manager warning: Got a request to focus 0x4406c4a (jcome@horn) with a timestamp of 0. This shouldn't happen! Shell killed with signal 11 [jcome@horn ~]$ ** (metacity:6194): WARNING **: AT-SPI: Accessibility bus not found - Using session bus. [jcome@horn ~]$ --------------------------------------------------------------------
If you have a second computer you can use to ssh into the affected system to get a backtrace, that would be very useful. - ssh in - run $ ~/gnome-shell/bin/gnome-shell --replace (will automatically connect to the current X sesion) (gdb) run [ reproduce crash ] (gdb) backtrace [ attach results here ] If you don't have a second computer, this may be possible to do from a different virtual terminal, but it's trickier.
Created attachment 182508 [details] crash message from abrt
Comment on attachment 182508 [details] crash message from abrt seems like a mutter related issue. should I file it in mutter?
reproduced in other computer with nouveau driver (fedora 15)
After update my fedora 15 today, the issue is gone. Thanks you gnome-shell team for your great work.
Downstream report (?): https://bugzilla.redhat.com/show_bug.cgi?id=680207 Still seeing this happening with today's Fedora 15 packages.
Created attachment 184625 [details] [review] util: fix a reentrancy problem with meta_later Calling meta_later_add() or meta_later_remove() from within a META_LATER_BEFORE_REDRAW callback ended up being a no-op, because of how run_repaint_laters() was fiddling with the laters list. (This resulted in a crash in window.c:idle_calc_repaint(), which assumed it would only be called when a certain queue was non-empty, but was getting called anyway because of a failed meta_later_remove() call.) Fix this by having run_repaint_laters() work on a copy of the laters list instead, and add refcounting to MetaLater so that removing a later that run_repaint_laters() hasn't gotten to yet won't cause problems.
Created attachment 184626 [details] [review] window: hack around a crash If idle_calc_showing() gets called when its queue is empty (which shouldn't happen), just return rather than crashing. ======= Alternative one-line hack to prevent the crash, without worrying about bugs in the larger patch (or the existence of other code somewhere else that depends on the buggy behavior...)
Oh, and with either patch, you'll notice a bug in the workspace-fixing-up animation. This is caused by the lostWorkspaces handling at the end up WorkspacesView._updateWorkspaceActors, which tries to tween workspace.actor.y to "workspace.x" which, besides being the wrong coordinate, doesn't exist because it got removed in Alex's "don't make the workspace actors be the same aspect ratio as the screen" patch. I'm not sure what it should be replaced with...
I think for this release, the short patch is much less likely to reveal unexpected other problems. In fact, I might go even further and make it a silent return, since the other queues just silently ignore empty lists. But I'm OK with the g_return_if_fail() since it does indicate a logic bug. Subsequently we need to fix the real issue. I think the workspace.x bug might be filed elsewhere? Know it was mentioned by alexl in some context.
Review of attachment 184626 [details] [review]: I'd like it if the subject was more specific window: don't crash on spurious calls to idle_calc_showing() Say. Other than that, I'm OK with this pending r-t approval.
Comment on attachment 184626 [details] [review] window: hack around a crash pushed the hacky patch, with the better commit message
poke
Oh interesting; this patch probably explains a weird crash I saw (somewhere, I forget) where _pendingAppIds was null in appDisplay.js, and I couldn't see where that was happening. But it would make sense if the meta_later_remove call was a no-op.
Review of attachment 184625 [details] [review]: Generally, definitely good to fix this up ::: src/core/util.c @@ +719,3 @@ + { + later->notify (later->data); + later->notify = NULL; Don't think this should happen until the refcount goes to zero - freeing the data of a callback currently being executed is dangerous; but you need to clear ->func here or otherwise mark the Later as destroyed since otherwise laters from laters_copy could be invoked after they are destroyed. @@ +774,3 @@ } else + meta_later_remove (later->id); don't love the o(n^2) for accumulating a lot of one-shot laters, but simplicity is worth a lot, so I think the copy approach is probably better than something more complex along the lines of GHookList where we are careful to not mess up our iteration. (I was already taking an easy way out by just removing everything and adding it back in - since that ended up being appends, it was also O(n^2), just O(n^2) in the number of non-one-shots rather than the number of one-shots) @@ +811,2 @@ destroy_later (later); + unref_later (later); rather than adding an unref_later after every destroy_later, shouldn't destroy_later remove the initial reference itself? And that implies that unref_later() shouldn't call destroy_later() - the only way we should ever get to a zero reference count is if destroy_later() has already been called.
(In reply to comment #17) > don't love the o(n^2) for accumulating a lot of one-shot laters, but simplicity > is worth a lot, so I think the copy approach is probably better than something > more complex along the lines of GHookList where we are careful to not mess up > our iteration. So given that we're not using META_LATER_IDLE or META_LATER_RESIZE anywhere, I was going to suggest just getting rid of the later machinery altogether and just using clutter_threads_add_repaint_func() directly. Except that the repaint_func methods have exactly the same problem with callbacks-modifying-the-list as meta_later does/did. Maybe glib needs a GModifiableWhileIteratingList type... I know evolution and libsoup both define one as well... Also, why do we need to start a ClutterTimeline? If all we care about is that the BEFORE_REDRAW later runs before the next redraw, then if the clock isn't running, can't we just wait until it starts again before running? Or if we want BEFORE_REDRAW to mean "asap, and in particular before a redraw" (which I guess we probably do), then we could handle it the way the code currently handles META_LATER_RESIZE.
Created attachment 191828 [details] [review] util: fix a reentrancy problem with meta_later fixes from review
(In reply to comment #18) > (In reply to comment #17) > > don't love the o(n^2) for accumulating a lot of one-shot laters, but simplicity > > is worth a lot, so I think the copy approach is probably better than something > > more complex along the lines of GHookList where we are careful to not mess up > > our iteration. > > So given that we're not using META_LATER_IDLE or META_LATER_RESIZE anywhere, I > was going to suggest just getting rid of the later machinery altogether and > just using clutter_threads_add_repaint_func() directly. Except that the > repaint_func methods have exactly the same problem with > callbacks-modifying-the-list as meta_later does/did. > > Maybe glib needs a GModifiableWhileIteratingList type... I know evolution and > libsoup both define one as well... GHookList was meant to be an abstraction in that area. Complete API disaster. With a lot of hands-on help from Tim Janik I was able to figure out how to use it from GMainLoop, but we shortly after dumped it and wrote the GMainLoop code from scratch. Not sure if a better abstraction would be possible. > Also, why do we need to start a ClutterTimeline? If all we care about is that > the BEFORE_REDRAW later runs before the next redraw, then if the clock isn't > running, can't we just wait until it starts again before running? Or if we want > BEFORE_REDRAW to mean "asap, and in particular before a redraw" (which I guess > we probably do), then we could handle it the way the code currently handles > META_LATER_RESIZE. BEFORE_REDRAW means "after finishing all the other stuff (like event processing) that you would do before redraw". This is basically universally what you want for any sort of batching. Just occasionally (like the code to bind texture pixmaps) you might mean "before the next redraw, whenever that is", but that's rare. So that's the big thing that we're getting over clutter_threads_add_repaint_func() - we have something that acts like g_idle_add() and gets run deterministically, rather than an awkward abstraction that not only isn't reentrant, but you also have to force to run. (The reason to use a timeline over clutter_actor_queue_redraw(stage) or whatever, is that that clutter_actor_queue_redraw(stage) would invalidate the whole stage, while we should draw only the actual damaged area, and redraw nothing if nothing gets damaged.) META_LATER_RESIZE is actually used in window.c. META_LATER_IDLE seemed useful, but I later figured out that "at idle" means "however much later you want" - especially before the clipped and obscured-stacking optimizations it was very possible for a redrawing client to keep mutter 100% occupied and never to go fully idle. - Owen
Review of attachment 191828 [details] [review]: Looks good to me
Attachment 191828 [details] pushed as cd7a968 - util: fix a reentrancy problem with meta_later