GNOME Bugzilla – Bug 332493
self-transient windows get tasklist stuck in infinite loop
Last modified: 2006-04-29 20:29:30 UTC
If you have a window that is marked as being TRANSIENT_FOR itself, the tasklist will eventually get stuck in an infinite loop in wnck_tasklist_active_window_changed() trying to chase TRANSIENT_FOR pointers until it finds a non-transient window. (Yes, you're not supposed to mark a window TRANSIENT_FOR itself, but the X API doesn't actually prevent you from doing it and some apps apparently do, notably x11-ssh-askpass.) This patch makes it just ignore WM_TRANSIENT_FOR if it points to itself, which seems to work fine.
Created attachment 60080 [details] [review] patch
Good catch, thanks. The patch will fix part of the problem, but it's not just self-transiency that we need to worry about. We need to check for transient cycles of length greater than 1 as well.
*** Bug 336264 has been marked as a duplicate of this bug. ***
Bug 336264 has a good testcase too, though only for the self-transiency problem instead of mututally transient or longer transient cycles; bug 336264 claims that the Mathematica bug (bug 316096) may also be a dupe.
Created attachment 62246 [details] [review] Catch the transient cycles Here's a patch that catches transient cycles. It works much the way other code works. It'd probably be better to do it when updating transients, but I'll probably need to go through and redo all the transient code anyway when handling group modal dialogs and I can do that then. So this is good enough for now. Dan's patch is still fine, so I'll commit both.
Thanks Dan! 2006-03-28 Elijah Newren <newren gmail com> Fix transient cycles causing infinite loops, #332493. Portion of patch providing robustness against self-transiency (i.e. cycle length of 1) provided by Dan Winship. * libwnck/tasklist.c (wnck_tasklist_active_window_changed): check for transient cycles * libwnck/window.c (update_transient_for): disallow self-transiency entirely
A few points to keep in mind: 1. wnck_task_state_changed() also chases transiency. There are others (e.g. in window.c) that check for self-transiency only. They can be simplified now that that isn't allowed. (AFAIK none need to check for larger cycles.) 2. Windows in a transiency cycle can have their own transients, so the patch is not robust. For example, the transient_for chain could go 1 -> (2->3->4->2->...). I'm wondering why there needs to be a loop at all, rather than a way for the win_hash value to get updated when a window sets a new TRANSIENT_FOR. I'm guessing a bulletproof temporary fix would be messy... the shortest fix I can see is making the loop terminate after [# of items in win_hash] iterations. I'm new to all this stuff, so I don't know whether this bug is exploitable. (Can you set WM hints from something like an unprivileged java applet that's supposed to be safe to run?) For that matter, I don't know what exactly a group modal dialog is, or whether any of this matters once they're supported. I'd like to give whatever help I can though. If you're interested I can put together a more complex testcase and a patch.
Good catches. Also, I think the other ones in window.c do need to check for larger cycles and have the same problems you point out in your second point. I agree with you that the code for checking for transiency cycles is ugly, expensive, and error prone and that it should be thrown out at some point. It's going to be even uglier with group modal dialogs if we don't do that. :) The fix I'd like to see is to make a tree structure containing the transiency relationships (or one tree per application, or something like that). Then, when we add nodes we just make sure the tree structure remains a tree instead of becoming a directed graph of some sort. A quick explanation for a group modal dialog: First, note that this has nothing to do with libwnck task grouping (that's used for saving space by sticking app windows that are often unrelated together). A group modal dialog is one which is considered both transient and modal to all other windows in the application, rather than just being trasient to one other window. Most apps only have one or two windows open at a time, so the distinction is irrelevant for them, but it does matter for apps which have multiple top-level windows and can have a dialog that should prevent input to all of them while it is open. Technically, according to the EWMH, group modal dialogs are created by setting the transient_for hint to the root window or to None for modal dialogs. (Also, to tie this in to what I said above, group modal dialogs are why I mention a tree structure instead of just a list structure for the transiency handling) It's also worth noting that Metacity needs a very similiar code overhaul, and it's causing the-application-is-frozen kinds of bugs. So, you'd be my hero for fixing this, as I could reuse the code there and I haven't found the time yet to handle this problem myself. :) We should probably file a separate bug for this and then close out bug 310381 since the remainder of it is just part of this issue. Let me know if my explanation isn't very clear; my brain is kind of fried right now.
If you're going to do all that work, you should probably have libwnck or metacity spew a warning when it finds a transient-for cycle, because that's broken by design. (The self-transient-for case seems to be a pre-EWMH trick for trying to get a window to appear above all other windows.)
*** Bug 316096 has been marked as a duplicate of this bug. ***
Created attachment 62832 [details] [review] Patch to eliminate transiency-related infinite loops and support group-transient windows This patch amost completely solves the transiency problems in wnck. It's not quite EWMH compliant since setting WM_TRANSIENT_FOR to None will result in a window which isn't transient. (_wnck_atom_get doesn't really provide a way to tell whether the hint is set to None or unset.) There might be one other exception I haven't thought about much: an application closing a transient window's parent without changing the WM_TRANSIENT_FOR hint. I actually haven't coaxed a program into doing that... and I doubt it'd cause a crash. Central idea of the patch: window.c will do the necessary bookkeeping to ensure that parent-chasing loops behave correctly. Rather than returning whatever nonsense an application reports, wnck_window_get_transient() will return the parent app window if its argument has one and NULL otherwise. The return value is guaranteed not to cycle. If an application sets up a WM_TRANSIENT_FOR cycle, of length one or otherwise, all windows in the cycle will be considered group transient. (A transient for a group-transient window, however, isn't automatically group-transient.) This patch will change wnck's behavior wrt transients. For completeness (and to help you decide whether the patch is acceptable): 1. wnck_window_get_transient() will return NULL if argument is group-modal. Reason: it's not actually transient for an *app* window. In addition, setting TRANSIENT_FOR to root or None should give the same group-modal behavior (even if, as I mentioned, that's not yet true). 2. wnck_window_or_transient_needs_attention() will stick to the old behavior: if a window in the last_transient chain needs attention, it'll return true. If some other transient needs attention, it will be ignored. IMO this behavior is wrong and /any/ (non-group) transient with the NEEDS_ATTENTION hint should count, but that'd be a different patch anyway. Reason group transients shouldn't count: the main use I've seen for the function is to ensure that a tasklist/selector makes the right items flash. Since wnck doesn't mark group transient windows as SKIP_TASKBAR, we'd want group transient windows themselves to flash if they need attention. 3. (Random) wnck_tasklist_activate_task_window() will now call wnck_window_unminimize() instead of wnck_window_activate_transient() if it's activating a minimized window. AFAICT that captures the intent more correctly and ultimately does the same thing. I probably shouldn't have done that, but I was trying to cut down on explicit mentions of transiency anyway. Overall, it seems that libwnck's best bet when faced with group transiency is to ignore it. However, these code changes should make it good at *identifying* it. I imagine the corresponding changes in metacity would be more interesting. Dan: I did not tell wnck to emit a warning, but it's easy to insert one at any point in update_transient_for() when ...->priv->transient_for_cycle is set to TRUE.
I've not really followed this bug, but since we have a nice patch for a bug that is marked as FIXED, should we open a new bug, reopen this bug or mention this patch somewhere else (in bug 310381)?
Yeah, let's make a new bug for this, and mark bug 310381 as depending on it. Justin: Can you file a new bug, and attach your patch and paste your comment 11 into the new bug? I'll try to get around to reviewing it soon, and it'll probably help if it's in an open bug. :)
Done; it's bug 338013. I cannot mark bug 310381 as depending on it.
*** Bug 338264 has been marked as a duplicate of this bug. ***
*** Bug 340146 has been marked as a duplicate of this bug. ***