GNOME Bugzilla – Bug 736527
app-switcher not working when certain windows (jdownloader2 notifications) are shown
Last modified: 2014-12-29 01:14:09 UTC
The app-switcher stops working whenever jdownloader2 shows one of its custom notification windows. After the notification window has been closed it starts working again. This is the output from gnome-shell when this happens: (gnome-shell:1715): Gjs-WARNING **: JS ERROR: Error: JDownloader appears to be running, but doesn't have any windows AppSwitcher<._init@resource:///org/gnome/shell/ui/altTab.js:475 wrapper@resource:///org/gnome/gjs/modules/lang.js:169 _Base.prototype._construct@resource:///org/gnome/gjs/modules/lang.js:110 Class.prototype._construct/newClass@resource:///org/gnome/gjs/modules/lang.js:204 AppSwitcherPopup<._createSwitcher@resource:///org/gnome/shell/ui/altTab.js:108 wrapper@resource:///org/gnome/gjs/modules/lang.js:169 SwitcherPopup<.show@resource:///org/gnome/shell/ui/switcherPopup.js:114 wrapper@resource:///org/gnome/gjs/modules/lang.js:169 WindowManager<._startAppSwitcher@resource:///org/gnome/shell/ui/windowManager.js:1388 wrapper@resource:///org/gnome/gjs/modules/lang.js:169 (gnome-shell:1715): Gjs-WARNING **: JS ERROR: TypeError: this._switcherList is null SwitcherPopup<._allocate@resource:///org/gnome/shell/ui/switcherPopup.js:96 wrapper@resource:///org/gnome/gjs/modules/lang.js:169 _parent@resource:///org/gnome/gjs/modules/lang.js:131 AppSwitcherPopup<._allocate@resource:///org/gnome/shell/ui/altTab.js:64 wrapper@resource:///org/gnome/gjs/modules/lang.js:169 LayoutManager<._init/<@resource:///org/gnome/shell/ui/layout.js:174 --------------------------------- xprop output from one of the notification windows: WM_STATE(WM_STATE): window state: Normal icon window: 0x0 _NET_WM_DESKTOP(CARDINAL) = 0 _NET_WM_ALLOWED_ACTIONS(ATOM) = _NET_WM_ACTION_RESIZE, _NET_WM_ACTION_CHANGE_DESKTOP, _NET_WM_ACTION_ABOVE, _NET_WM_ACTION_BELOW _NET_WM_WINDOW_TYPE(ATOM) = _NET_WM_WINDOW_TYPE_DIALOG _NET_WM_STATE(ATOM) = _NET_WM_STATE_ABOVE WM_HINTS(WM_HINTS): Client accepts input or input focus: False window id # of group leader: 0x2a0890f WM_TRANSIENT_FOR(WINDOW): window id # 0x2a0890f _NET_WM_PID(CARDINAL) = 17573 WM_CLIENT_MACHINE(STRING) = "t400" WM_PROTOCOLS(ATOM): protocols _MOTIF_WM_HINTS(_MOTIF_WM_HINTS) = 0x3, 0x0, 0x0, 0x0, 0x0 WM_NORMAL_HINTS(WM_SIZE_HINTS): program specified location: 1157, 37 program specified size: 273 by 179 window gravity: NorthWest WM_CLASS(STRING) = "sun-awt-X11-XWindowPeer", "JDownloader" WM_CLIENT_LEADER(WINDOW): window id # 0x2a00008 _NET_WM_NAME(UTF8_STRING) = "win0" WM_NAME(STRING) = "win0"
I've tracked this issue down to a bug in the pid based app tracking. What happens is that this is a java app, so no WM_CLASS based window -> app mapping, and the notification window for some reason gets marked as transient to itself rather than the main application window so, the only way get_app_for_window could map the notification window to the already existing jdownloader app is via pid. But since this is broken get_app_for_window creates a new app for the notification window. Now when pressing alt-tab, the AppSwitcher goes through all apps and tries to associate all normal windows from the tab list of the current workspace with each app. Since the notification window is a dialog window it is not part of the tab list and the app that was created for by get_app_for_window ends up having 0 windows which results in the error and causes the initialization of the AppSwitcher to fail. So now to why pid based app tracking is broken: get_app_from_window_pid uses a lookup in launched_pid_to_app, which however only contains apps launched via shell_app_launch, not apps that have been started otherwise or have been running before gnome-shell has been (re-)started. Also the pid used by this does not seem to match the final pid of the application that is actually displaying the window, probably due to some wrapper shell scripts. There is an easy fix for this though, since there already is a shell_window_tracker_get_app_from_pid function, get_app_from_window_pid could use this rather than the lookup in launched_pid_to_app. Another more involved solution would be to properly initialize launched_pid_to_app and ensure it is tracking the right pids. But given how few windows there are tracked only by their pid, I would go with the first variant. Bonus bug found when going through the source): _shell_window_tracker_add_child_process_app uses the address of the lookup key pointer rather than the key pointer itself. Not sure if that lookup (even if done properly) would ever return true though.
Created attachment 292318 [details] [review] window-tracker: Fix pid based window/app association The lookup table used by get_app_from_window_pid contained only pids of apps launched by gnome-shell itself, but not pids of apps running before gnome-shell was (re-)started. Also the pids in that table might not even be the pid of the process that is actually showing the window if wrapper scripts are used. Instead use shell_window_tracker_get_app_from_pid which uses the pids from the windows themselves. This removes the only use of launched_pid_to_app which will be removed in the following commit. This fixes:
Created attachment 292319 [details] [review] window-tracker: remove now unused launched_pid_to_app
Review of attachment 292318 [details] [review]: Thanks for doing this.
Review of attachment 292319 [details] [review]: ::: src/shell-app.c @@ +1361,1 @@ error); You should be able to stop using launch_as_manager now.
If you have wrapper shell scripts, I strongly recommend changing them to end in "exec java ..." or whatever. See: https://bugzilla.mozilla.org/show_bug.cgi?id=477724
Review of attachment 292318 [details] [review]: At a high level, obviously this whole system is fragile, but it's all we have unless we have total control over all aspects of application launching. Which we don't today on a regular Unix system where a user can just run /usr/bin/someapp from a terminal for example. Now, it looks like you're trying to fix two (apparently) unrelated problems in one commit. It *might* be the case that they're the same, but I'm not convinced yet. Let's focus on the second part - "Also the pids in that table might not even be the pid of the process that is actually showing the window if wrapper scripts are used." We need to *somehow* associate the window with an app. Hmm...so because the wrapper shell script is still alive, gnome-shell would think the app is running, but would not have any windows associated. Then your change would have us iterate over all apps, then iterate over all windows, then look at the pid for each. Ok, the more I think about this, the more I think we need to maintain a pid -> app mapping somewhere (e.g. in ShellAppSystem), and not just track launched apps. It's still not entirely clear to me why your patch works, if it does. But if we do want to support this, I don't think we should be iterating over all apps and all windows on every lookup.
(In reply to comment #7) > Review of attachment 292318 [details] [review]: > Then your change would have us iterate over all apps, then iterate over > all windows, then look at the pid for each. > > Ok, the more I think about this, the more I think we need to maintain > a pid -> app mapping somewhere (e.g. in ShellAppSystem), and not just > track launched apps. > > It's still not entirely clear to me why your patch works, if it does. It works because if no app is found for a window a new window backed app is created. Then for the next window with the same pid in _NET_WM_PID this app would be found when iterating over all apps and then the window would be associated with that app as well. This means that compared to the previous approach, we lose the association with the .desktop file in those cases where the app has been started through gnome-shell, gnome-shell has not been restarted since, there have been no broken wrapper-scripts involved and there are no other means of associating the window with the .desktop file. I think this an acceptable trade-off though. > But if we do want to support this, I don't think we should be iterating > over all apps and all windows on every lookup. The only caller of get_app_from_window_wmclass() is track_window() which only gets called when new windows appear or the wm_class/gtk_application_id of an existing window has changed (which is probably not the case for windows that tend to hit this code path). Also it only gets called in case all the previous methods have failed, which should be pretty rare. The resulting window/app association is cached in the window_to_app lookup table. So I don't think there is much to gain from caching the pid/window association, at least not in the code this patch touches.
I reread this bug and I think the patch makes more sense to me now, it's about two windows from a "window based" app, not about the .desktop -> app assocation. The other key point is: > and the notification window for some reason gets marked as transient to itself Should really fix the app or the toolkit or whatever is doing that. But I'm more dubious about removing the launched_pid_to_app case, pretty sure you're going to break something. Annotate takes us to this bug: https://bugzilla.gnome.org/show_bug.cgi?id=637745 And it looks like I didn't have a sample case there, but my memory is that it's "runtime based" (java/python/etc) apps that *do* exec the interpreter, but don't have a correct wmclass.
Created attachment 292418 [details] [review] window-tracker: Fix pid based window/app association [alternative] This is an alternative version of the patch that keeps the original pid tracking and only uses shell_window_tracker_get_app_from_pid if the launched_pid_to_app lookup failed. But I'm still not sure if it is worth keeping the launched_pid_to_app stuff, since this basically results in seemingly random behavior changes after a gnome-shell crash: 1. App has a "New window" context menu item 2. gnome-shell crashes 3. App has no more "New window" context menu item (Something similar could happen with the application name shown in the top panel, which then might be different after a gnome-shell crash; I've seen that happening with chrome beta versions for example) Also if we are going with this variant, we need to fix the lookup in _shell_window_tracker_add_child_process_app. I can't think of anything that would result in a lookup that would return anything other than NULL, even when correctly using "pid_ptr" instead of "&pid_ptr".
For the "loses state after crashing" issue, see https://bugzilla.gnome.org/show_bug.cgi?id=691987
*** Bug 741145 has been marked as a duplicate of this bug. ***
Created attachment 292485 [details] [review] window-tracker: remove now unused launched_pid_to_app [updated] Updated version of the cleanup patch to go with the original version of the fix which Jasper seems to prefer. The original cleanup patch still had G_SPAWN_DO_NOT_REAP_CHILD specified but I removed the call to g_child_watch_add which resulted in hordes of zombies. We still need to use as_manager to get nicer logging to the journal. The app_child_setup that sets up proper logging for the journal needs to be run after fork and before exec, which is something I don't see a way to do using just g_app_info_launch. It would be possible when using g_spawn, but to use that would basically mean to reimplement most of as_manager in gnome-shell. Could you push the patches if they get accepted, because I don't have a git account.
Attachment 292318 [details] pushed as 186f9b2 - window-tracker: Fix pid based window/app association