GNOME Bugzilla – Bug 668667
Kill syndaemon instance on exit or crash
Last modified: 2012-03-14 11:03:54 UTC
Using GNOME 3.2 on Ubuntu Quite some users reported issues with trackpads which stop working during their session (i.e https://launchpad.net/bugs/868400) Once cause seems to be that if g-s-d hits a bug, segfaults and is respawned it starts a new syndaemon where the first g-s-d run one didn't go away, both can be conflicting and leading to bugs
A mere "pidof" style check would be too coarse, though, as syndaemon is specific to a $DISPLAY. I see the following options here: * Use a per-display stamp file, like /run/lock/gsd-syndaemon.$DISPLAY, containing the pid. This would replace the manager->priv->syndaemon_spawned / manager->priv->syndaemon_pid variables. * Iterate over all running syndaemon instances and check their /proc/pid/environ file for a matching DISPLAY. This is an evil hack, though. * Get rid of syndaemon altogether. It's rather horrible code, either using the broken and unsupporting XRecord extension, or using polling (5 times a second usually), which is a sad thing for battery life. It would be more elegant to subscribe to keyboard XEvents in the mouse plugin and enable/disable the touchpad based on those instead of polling. Presumably the third option would be preferred for upstream, and the first at least acceptable for improving the current code?
For an event based reimplementation of the syndaemon functionality we would need to link to libwnck, or reimplement parts of it (we need to enable the KEY event mask on all current windows, plus set up listeners for new/removed windows to update the mask there as well). Would that be acceptable to you? I don't see how we can get notifications about all key events otherwise. XSync only defines an idle timer (used in gnome-session), but that doesn't help us here as we need only key events, not any other kind. If you don't like the overhead of wnck, we can still go with the per-display stamp file, and then try to fix syndaemon to not poll all the time.
Before I'll start working on actual code, I'll wait on an answer from the g-s-d maintainers which solution they prefer. Fixing syndaemon requires writing the key event listening machine in the X11 API, and then we'd use the stamp file approach in g-s-d. When g-s-d wants to drop syndaemon and reimplement the functionality (but event-based, not polling), this would be written using the GDK API, not X.
it is not that trivial - you don't get to listen for arbitrary key events; grabs will get in the way. Probably best to consult the X guys for the best way to implement this functionality. For really correct results, you only want to listen for key events from the keyboard that has the touchpad, not other random key-event producing devices...
Ah, good point about the keyboard grabs, thanks. So for now, let's use a file based flag instead of an in-process one.
(In reply to comment #5) > Ah, good point about the keyboard grabs, thanks. So for now, let's use a file > based flag instead of an in-process one. Actually, we just need to make sure the process gets reaped when the parent dies, so that a gnome-settings-daemon crashing will take the syndaemon process with it. In the long term, I'd rather replace syndaemon with a small gnome-settings-daemon piece of code talking to a timer on the X side (similar to XIDLE) where we could ask to be poked when a particular device gets used. Adding Peter to CC: for opinion.
my first thought was to have per-device idletimers in addition to the current one. I've played around with this and the preliminary patches appear to work but require finishing. From g-s-d's side, the process would be to register idle handlers for the keyboard and toggle the synaptics driver property accordingly. Bonus: this way you we can disable the touchpad only if the keyboard above the touchpad is used for typing, not for any other keyboards. Provided you know which keyboard that is, of course.
(In reply to comment #6) > Actually, we just need to make sure the process gets reaped when the parent > dies, so that a gnome-settings-daemon crashing will take the syndaemon process > with it. Sounds good, as an unintrusive fix for 3.4. Peter's per-device idle timers sound perfect for this, but that's something for the next GNOME release then. So I guess this needs a signal handler for SEGV/INT/ILL/FPE/ABRT/PIPE/BUS/TERM (which should be the most common causes for g-s-d dieing), which kills the syndaemon and then re-raises the signal. Thanks!
Created attachment 209513 [details] [review] Support explicitly setting G_MESSAGES_DEBUG As a side issue, I noticed that it's currently impossible to only get debug messages for one plugin, as g-s-d does if (debug) g_setenv ("G_MESSAGES_DEBUG", "all", TRUE); Attached patch fixes this. Can I push this, please? It makes plugin debugging a lot easier.
(In reply to comment #9) > As a side issue, I noticed that it's currently impossible to only get debug > messages for one plugin, as g-s-d does As it's a side issue, would be great if it could get its own bug :)
Comment on attachment 209513 [details] [review] Support explicitly setting G_MESSAGES_DEBUG Moved debugging patch to bug 671928. It didn't seem worth the overhead of its own bug, but there it is. :-)
Created attachment 209523 [details] [review] mouse: Stop syndaemon when daemon dies This works fine for crashes, Control-C, etc. Tested with and without the "disable touchpad while typing" option. It's a bit ugly because one plugin claims the signal handlers. No other plugin or the daemon currently does, so as soon as we need signal handlers in other plugins we need to implement some chaining instead of just resetting it to SIG_DFL, i. e. remember the previous handlers in more global variables (or make GsdMouseManagerPrivate a global variable, which would be nicer in that case). I'll wait for a first comment before these refinements.
Review of attachment 209523 [details] [review]: A bit frightened by the sigaction calls, tbh. Isn't there a way to spawn it without detaching from the parent process? That wouldn't require sigaction, and very minimal changes to the existing code. ::: plugins/mouse/gsd-mouse-manager.c @@ +510,3 @@ + if (syndaemon_pid > 0) { + kill (syndaemon_pid, SIGHUP); + g_spawn_close_pid (syndaemon_pid); Remove the g_spawn_close_pid() please. It's useless on Unix systems. @@ +574,3 @@ + + /* ensure that it is stopped together with g-s-d */ + sigaction (SIGTERM, SIG_DFL, NULL); Don't we have something in glib to deal with Unix signals properly? This would remove any other handler we might have.
(In reply to comment #13) > A bit frightened by the sigaction calls, tbh. Isn't there a way to spawn it > without detaching from the parent process? That wouldn't require sigaction, and > very minimal changes to the existing code. I'm happy to be educated about how to do this. I'm not aware of a way to tell a child process to die along with the parent, g_spawn_* does not have any magic flag for it, and 20 minutes of googling didn't bring up anything. I very well might have missed something, of course. > Remove the g_spawn_close_pid() please. It's useless on Unix systems. It was there before, and I left it because the documentation recommends so, but I'm happy to drop it. > > @@ +574,3 @@ > + > + /* ensure that it is stopped together with g-s-d */ > + sigaction (SIGTERM, SIG_DFL, NULL); > > Don't we have something in glib to deal with Unix signals properly? g_unix_signal_* only supports SIGINT/HUP/TERM, so not sufficient. atexit() only supports the "normal" exit, so no signals at all. > This would remove any other handler we might have. Inded, that's what I meant in comment 12. Instead of resetting to SIG_DFL, we could remember the original value and set it back to that one, but that would require more global variables or having a global pointer to the GsdMouseManagerPrivate instance, or (which is my preference) a single global struct which bundles all the old signal handlers and the syndaemon pid. I'll work on that.
Created attachment 209582 [details] [review] mouse: Stop syndaemon when daemon dies This now provides proper chaining of signal handlers, i. e. instead of SA_RESETHAND we restore them to their previous value. So should any other plugin also install a signal handler using the same approach, both would work.
(i haven't really read/been following this bug, but saw this fly by in my bug folder: > I'm not aware of a way to tell a > child process to die along with the parent, g_spawn_* does not have any magic > flag for it, and 20 minutes of googling didn't bring up anything. I very well > might have missed something, of course. prctl(PR_SET_PDEATHSIG, SIGTERM) after fork() should do it.
Many thanks Ray, I've learned something new. Unfortunately this doesn't seem to work reliably in the GChildSetupFunc, i. e. between fork() and exec(). Most of the time syndaemon does not launch at all, and if it does it does not get killed. It seems to work fine under strace, it slows down quite a bit; that makes debugging a joy, of course. prctrl() is Linux specific, but that can be done with a configure.ac check.
Created attachment 209677 [details] [review] [broken] attempt to use PR_SET_PDEATHSIG For the record, this is the patch using prctl(). When running g-s-d under strace it works perfectly, but when running it normally it never actually runs syndaemon. With some instrumentation (also in glib) I ensured that the g_spawn_async() call succeeds, the child setup function is called, prctrl(PR_SET_PDEATHSIG) succeeds, after that PR_GET_PDEATHSIG reports "1" (i. e. SIGHUP), it finds the binary, and execve() on it succeeds, but still the process is not actually running. When I try to run syndaemon under strace, the output is empty. When I modify the child setup function to do prctl (PR_SET_PDEATHSIG, 0) instead of SIGHUP, it also works fine. So it looks like the syndaemon process gets killed immediately, or there is a Linux bug with using PR_SET_PDEATHSIG. At this point I'm lost, I'm afraid. The previous patch using signal handlers was intrusive, but rock solid, and also portable to non-Linux systems, but I do appreciate that it is quite ugly.
*headdesk* I totally missed that g_spawn_async() double-forks by default, so the syndaemon grandchild races with the intermediate child. So of course syndaemon gets killed at the instant the intermediate child dies, and strace just changed the timing so that the execve() overtook the intermediate child. New patch coming..
Created attachment 209687 [details] [review] mouse: Stop syndaemon when daemon dies Third time is the charm!
Looks great. Why the additional g_child_watch_add()? Would be useful if it reset manager->priv->syndaemon_pid to 0, or used it to restart the daemon if the death wasn't requested explicitely.
(In reply to comment #21) > Looks great. Why the additional g_child_watch_add()? The patch uses G_SPAWN_DO_NOT_REAP_CHILD to avoid double-forking. That causes syndaemon to become a zombie when it dies, until g-s-d lets it die gracefully by waitpid(). Adding the child watch function has this effect. > Would be useful if it reset manager->priv->syndaemon_pid to 0 It's not really needed, but indeed would be cleaner. I'll fix that. > or used it to restart the daemon if the death wasn't requested explicitely. That's a nice idea indeed. If syndaemon segfaults after 10 minutes, that would make sense. But we need to rate-limit it in case syndaemon keeps segfaulting right at startup, we don't want a tight start/crash/restart loop. I think this should become a separate patch, as it's not a regression of this one (right now, g-s-d doesn't even notice if syndaemon goes away).
Comment on attachment 209687 [details] [review] mouse: Stop syndaemon when daemon dies Fair enough. Looks good!
Created attachment 209691 [details] [review] mouse: Stop syndaemon when daemon dies Reset "syndaemon_spawned" flag in syndaemon_died() callback. Killing syndaemon manually and disabling/enabling the option in control-center still works fine, except that now with this cleanup the mouse plugin doesn't try to kill an already dead syndaemon any more (that didn't harm, but was unclean). Thanks for catching this!
As the previous patch already got approved, I pushed the most recent one with the cleanup now. Thanks for reviewing!