GNOME Bugzilla – Bug 687061
g_unix_catchall_child_source_new: New API to do waitpid(-1, ...)
Last modified: 2013-01-19 17:33:19 UTC
With the goal of modifying gnome-session to use the new Linux PR_SET_CHILD_SUBREAPER, it needs to be able to reap arbitary children that may be reparented to it. But if gnome-session is going to continue to work with GLib and GIO, even if we converted all child watches in gnome-session to use this new API, we'd still have to contend with the possibility of it having a process spawned indirectly (stuff like the GDBus dbus-launch invocation). There is the separate but related issue of continuing to support spawning processes in multiple threads. gnome-session should be able to call g_spawn_sync() from a helper thread, without its waitpid() invocation racing with the waitpid(-1, ...) from the GLib worker thread. In order to make that case work, explicitly block the waitpid(-1, ) while a g_spawn_sync() is in progress. This isn't ideal, but it's better than being racy.
Created attachment 227474 [details] [review] g_unix_catchall_child_source_new: New API to do waitpid(-1, ...)
Created attachment 227475 [details] demo/test program With PR_SET_SUBCHILD_REAPER, you correctly see the "sleep" process as a child of "gsubinit".
Created attachment 227476 [details] [review] g_unix_catchall_child_source_new: New API to do waitpid(-1, ...) Actually, I realized the previous patch had a race condition - if you had both a regular and catchall source, there was a window in which waitpid(pid, ) wouldn't catch it, but waitpid(-1) would. Fix this by explicitly separating the "no catchall source" and "have catchall source" cases. If we have a catchall, just use waitpid(-1, ), and look at the pid, dispatching it to the correct sources. Added a test case for this too.
Also please see GNOME bug 687075, which attempts to fix a documentation problem in this area. It's not clear to me how the two proposed fixes interact.
I see 'gnome-session' and I think 'violation of Colin's two-modules rule'. I also wonder if we can't just use the existing API with a pid of 0 or 1...
Comment on attachment 227476 [details] [review] g_unix_catchall_child_source_new: New API to do waitpid(-1, ...) >+ /* We only expect there to be one or two at most, so >+ * it's not worth refcounting the UnixWaitStatus. >+ */ It would not make sense to create a catchall source from a library (for the same reason it doesn't make sense to call waitpid(-1)), so the only code creating a catchall source should be the app itself, and in that case it ought to be able to put all of its catchall-child-handling code in one place. So I think you ought to say there can be only one catchall source. >+ for (iter = catchall_source->pending_waits; iter; iter = iter->next) >+ { >+ UnixWaitStatus *status = iter->data; >+ >+ (child_callback) (status->pid, status->estatus, user_data); >+ >+ g_slice_free (UnixWaitStatus, status); >+ } >+ >+ g_clear_pointer (&catchall_source->pending_waits, (GDestroyNotify) g_slist_free); There's a race condition here; the worker thread may prepend additional UnixWaitStatuses to catchall_source->pending_waits while the main thread is iterating it, and then you'll free them without having dispatched them. You need a G_LOCK(unix_signal_lock) and a local pending_waits variable, etc. >+ * g_unix_catchall_child_source_new: >+ * >+ * Normally, g_child_watch_add() is used to monitor child processes. >+ * This function exists to support very unusual circumstances, such as You should say what it does before getting into the rationale. >+ * In other words, this function provides the equivalent of >+ * <literal>waitpid(-1, ...)</literal>. However, the source is only >+ * triggered for child PIDs which do not already have an active child >+ * watch. Why? That doesn't seem like the semantic you'd want; if there is something that you want to do for "all child processes", then why would you want to *not* do it just because you (or someone else) wanted to do something else for one particular child process? Likewise for not getting notified of the SIGCHLD inside g_spawn_sync().
(In reply to comment #6) > So I think you ought to say there can be only one catchall source. Ok. I guess we can relax the restriction later if anyone complains. > There's a race condition here; Fixed. > You should say what it does before getting into the rationale. Fixed. > >+ * In other words, this function provides the equivalent of > >+ * <literal>waitpid(-1, ...)</literal>. However, the source is only > >+ * triggered for child PIDs which do not already have an active child > >+ * watch. > > Why? That doesn't seem like the semantic you'd want; if there is something that > you want to do for "all child processes", then why would you want to *not* do > it just because you (or someone else) wanted to do something else for one > particular child process? Basically for gnome-session, I want to warn if a random child that got reparented (think the a11y bus daemon) exits with an error. Ideally, we don't "double error" if GLib launches a helper process which exits unsuccessfully. > Likewise for not getting notified of the SIGCHLD > inside g_spawn_sync(). We do get notified of SIGCHLD - we just fall back to the "no catchall source" case. In other words, while you have a g_spawn_sync() active, just the catchall source won't catch things. But...I could probably make g_spawn_sync() defer to the GLib worker thread for its waitpid() call. That would avoid the race condition.
Created attachment 227545 [details] [review] g_unix_catchall_child_source_new: New API to do waitpid(-1, ...) Some comments addressed, but doesn't yet unify g_spawn_sync()'s waitpid() call
(In reply to comment #5) > I see 'gnome-session' and I think 'violation of Colin's two-modules rule'. True...but one thing that makes this case different from some others is there's just no clean and reasonable way to do it outside of GLib. > I also wonder if we can't just use the existing API with a pid of 0 or 1... Mmm...that's a perennial debate in kernel land. Using -1 would be clearly analogous to waitpid(), which would be nice. But on the other hand, sometimes major new functionality should be new API, even if you could in theory wedge it into a flag value for an existing one. I'm a bit on the fence; do you think -1 is better?
(In reply to comment #9) > I'm a bit on the fence; do you think -1 is better? Mostly because I don't like the new name and because waitpid(-1) is a very well-established idea. On the other hand, we should probably consider what this means to Windows, where a GPid is not an int....
(In reply to comment #7) > Basically for gnome-session, I want to warn if a random child that got > reparented (think the a11y bus daemon) exits with an error. Ideally, we don't > "double error" if GLib launches a helper process which exits unsuccessfully. Ah, so basically you're *only* interested in SIGCHLDs that no one else is interested in. (Making this API even more single-purpose...) But wait... in order for this to be useful, you have to be able to map the returned pid to the right process, so you know what service to restart, right? Which implies you have to already know the pids of all of the child/grandchild/etc processes that you're watching. So can't you just create ordinary GChildWatchSources for them?
(In reply to comment #11) > (In reply to comment #7) > > Basically for gnome-session, I want to warn if a random child that got > > reparented (think the a11y bus daemon) exits with an error. Ideally, we don't > > "double error" if GLib launches a helper process which exits unsuccessfully. > > Ah, so basically you're *only* interested in SIGCHLDs that no one else is > interested in. (Making this API even more single-purpose...) Yes. Rather than "SIGCHLDs", let's say "zombie children". It's more correct, and also *way* more fun to use in a technical conversation. > But wait... in order for this to be useful, you have to be able to map the > returned pid to the right process, so you know what service to restart, right? I'm not interested in service restarting - yet. It's basically just about a pretty "ps auxwf", which is actually fairly useful for debugging and such. In theory we could parse /proc/<pid>/exe and use heuristics on that to determine what service it is. But long term it will be better to just fix things to not double fork. In the meantime though, this helps.
OK. So, random non-glib-modifying idea: waitid() (no 'p') has a WNOWAIT option to make it return information but not actually reap the child. (No, I had never heard of this before today :). So you could have your catchall SIGCHLD handler call waitid(P_ALL, 0, &info, WNOWAIT|WNOHANG) to see if there is at least one pending zombie, and if so, save info.si_pid, and schedule a timeout for 1 second later. In the timeout callback, call waitpid(saved_pid, &status, WNOHANG), which will either reap the child or return ECHILD if someone else already reaped it for you. Then call the SIGCHLD handler again to do another waitid() to see if there's another zombie waiting. The "catchall" may end up never seeing some zombies, but if your goal is just to ensure that every zombie gets reaped, then that doesn't matter. The "1 second" timeout can be adjusted to find the right mix between "doesn't accidentally reap other people's zombies" and "doesn't leave unsightly zombies cluttering up your ps output for too long" (And this works if some library you're using uses non-glib APIs to spawn/wait too.)
Created attachment 227581 [details] [review] gmain: Handle EINTR from waitpid() Just on general principle.
Created attachment 227582 [details] [review] gspawn: Defer waitpid() to GLib worker thread This is preparatory work for a future commit which will add a "catchall" waitpid API. If we don't synchronize here with the worker thread, race conditions are possible. It's also just a nice code cleanup - gmain.c has all the gunk to handle EINTR for waitpid(), so let's just reuse that.
Created attachment 227583 [details] [review] GChildWatchSource: Allow passing -1 for pid With the goal of modifying gnome-session to use the new Linux PR_SET_CHILD_SUBREAPER, it needs to be able to reap arbitary children that may be reparented to it. But if gnome-session is going to continue to work with GLib and GIO, even if we converted all child watches in gnome-session to use this new API, we'd still have to contend with the possibility of it having a process spawned indirectly (stuff like the GDBus dbus-launch invocation). Thus, we don't give the -1 child watch *every* waitpid, only the ones which don't have a specific watcher.
(In reply to comment #13) > > waitid() (no 'p') has a WNOWAIT option to make it return information but not > actually reap the child. (No, I had never heard of this before today :). You probably had actually in https://bugzilla.gnome.org/show_bug.cgi?id=672102#c27 > So you > could have your catchall SIGCHLD handler How can I install a handler for SIGCHLD in gnome-session without stomping on glib's? > call waitid(P_ALL, 0, &info, > WNOWAIT|WNOHANG) to see if there is at least one pending zombie, and if so, > save info.si_pid, and schedule a timeout for 1 second later. These sorts of arbitrary delays are lame. > The "1 > second" timeout can be adjusted to find the right mix between "doesn't > accidentally reap other people's zombies" "How much race condition do you want?"
Created attachment 227587 [details] [review] gmain: Merge warning about incompatible SIGCHLD from gspawn This misconfiguration warning got lost when shuffling the code over to gmain.
(In reply to comment #17) > > actually reap the child. (No, I had never heard of this before today :). > > You probably had actually in > https://bugzilla.gnome.org/show_bug.cgi?id=672102#c27 Huh. > > So you > > could have your catchall SIGCHLD handler > > How can I install a handler for SIGCHLD in gnome-session without stomping on > glib's? When I wrote that I was thinking you could both just register your own SIGCHLD handlers... apparently I've managed to avoid dealing with UNIX signals for long enough that my brain is now confusing them with gobject signals... But anyway, sigaction returns the previous handler, so you *could* chain handlers, although no one actually does... > These sorts of arbitrary delays are lame. Yeah well, we're building on top of some pretty lame APIs. There's a limit to how awesome we can be without more help from the kernel.
Review of attachment 227582 [details] [review]: Blah, not quite good enough - when spawning a child process in a thread, there's still a race condition where the worker's waitpid(-1) might eat it before we install a child watch. This patch gets us to a better place though - we just need to lock the worker thread out until we get a child watch source set up.
A couple of random notes: 1: glib's SIGCHLD handler only gets registered if you actually start watching for child exits. If you don't do that, you're safe to deal with it yourself. Since gnome-session is a program (and not a library) you are capable of doing that. 2: for the same reason, it's possible to use signalfd() from gnome-session sanely. Just set the signal mask for ignoring SIGCHLD before the first thread is spawned and signalfd() is now reliable. #1 is your solution, I think. #2 could simplify the implementation of #1, but only until you realise that gnome-session should probably continue to work on non-Linux, so you'll have to write the code "the hard way" anyway...
(In reply to comment #21) > A couple of random notes: > > 1: glib's SIGCHLD handler only gets registered if you actually start watching > for child exits. If you don't do that, you're safe to deal with it yourself. > Since gnome-session is a program (and not a library) you are capable of doing > that. Except GLib can create child watches internally for helper programs.
(In reply to comment #22) > Except GLib can create child watches internally for helper programs. This is an interesting point that I didn't consider. Do you have a list?
(In reply to comment #23) > (In reply to comment #22) > > Except GLib can create child watches internally for helper programs. > > This is an interesting point that I didn't consider. Do you have a list? I already gave one example in the commit message of the patch in https://bugzilla.gnome.org/show_bug.cgi?id=687061#c16
Created attachment 227681 [details] [review] gmain: Handle EINTR from waitpid() Rebased.
Created attachment 227682 [details] [review] gspawn: Defer waitpid() to GLib worker thread Now merged with gmain: Merge warning about incompatible SIGCHLD from gspawn
Created attachment 227683 [details] [review] GChildWatchSource: Allow passing -1 for pid Fixed g_spawn_sync() race more cleanly.
Comment on attachment 227681 [details] [review] gmain: Handle EINTR from waitpid() sure
Comment on attachment 227682 [details] [review] gspawn: Defer waitpid() to GLib worker thread >- int res; >+ pid_t pid; hm... parts of this patch should really be squashed back into the previous one other than that, this seems to be right, though given the number of tries it took, i'm wary that there's still some problem we're not thinking of (so if anyone else wants to double-check...)
Comment on attachment 227683 [details] [review] GChildWatchSource: Allow passing -1 for pid >+ /* Normal EINTR loop */ >+ do >+ pid = waitpid (-1, &estatus, WNOHANG); >+ while (pid == -1 && errno == EINTR); slightly annoying that in the catchall case you do do waitpid while EINTR normal processing and in the invidividually case you do do waitpid normal processing while EINTR >+GSource * >+g_unix_catchall_child_source_new (void) This should have gone away, and the test program should use g_child_watch_source_new(-1) >+ * Since GLib 2.36, -1 may be given as @pid. In this mode, the source Since GLib 2.36, on UNIX, ... (or else add FIXMEs for a Windows implementation) >+ * GLib-using process to perform a function equivalent to calling >+ * <literal>waitpid(-1, ...)</literal>. There can at present be at "similar to", not "equivalent to", since you don't see children that have their own watches. >+ /* Assumes we have the unix signal lock */ >+ source = _g_main_child_watch_source_new_unix_unlocked (pid); > g_source_set_callback (source, (GSourceFunc)on_sync_waitpid, &waitpid_data, NULL); > g_source_attach (source, context); > g_source_unref (source); >+ >+ /* Now we safely have our waitpid() queued up for the GLib worker >+ * thread to process. Unlock the mutex so it can continue. >+ */ >+ g_assert (have_unix_signal_lock); >+ _g_main_unix_signal_unlock (); >+ have_unix_signal_lock = FALSE; This is pretty tricky... And actually, g_spawn_async() has the same problem anyway; if you pass G_SPAWN_DO_NOT_REAP_CHILD, then you need to arrange for the catchall reaper to not reap that child, even if no child watch has been created for it yet when the child exits. (I guess it's not really "not reap", but "remember the exit status and return it immediately when the child watch for that pid does get created".) And if the API was something like "GPid g_main_unix_fork_and_do_not_reap_child (void);", then you wouldn't need to also expose lock/unlock ops.
(In reply to comment #30) [everything not replied to is fixed] > And actually, g_spawn_async() has the same problem anyway; Right. I need to think about this. I had been setting it aside temporarily some under the (kind of invalid) assumption that glib itself wouldn't be using g_spawn_async() *internally*, but that's not really valid. And even if it was, we still want a way for gnome-session to use g_spawn_async() and use individual watches. > if you pass > G_SPAWN_DO_NOT_REAP_CHILD, then you need to arrange for the catchall reaper to > not reap that child, even if no child watch has been created for it yet when > the child exits. (I guess it's not really "not reap", but "remember the exit > status and return it immediately when the child watch for that pid does get > created".) The problem is that we have no way to atomically fork() *and* ensure a child watch is created. Even discarding threads in the app, we still have to contend with the fact that the GLib helper thread and the main thread will race. > And if the API was something like "GPid g_main_unix_fork_and_do_not_reap_child > (void);", then you wouldn't need to also expose lock/unlock ops. Super ugly to implement, as far as I can see - I'd have to move the first 1/3 of fork_exec_with_pipes() into gmain.c. Maybe that's right, but let me consider some more alternatives.
(In reply to comment #31) > > And if the API was something like "GPid g_main_unix_fork_and_do_not_reap_child > > (void);", then you wouldn't need to also expose lock/unlock ops. > > Super ugly to implement, as far as I can see - I'd have to move the first 1/3 > of fork_exec_with_pipes() into gmain.c. Maybe that's right, but let me > consider some more alternatives. what I was imagining: GPid g_main_unix_fork_and_do_not_reap_child (void) { pid_t pid; G_LOCK (unix_signal_lock); pid = fork (); if (pid > 0) { UnixWaitStatus *waitstatus = g_slice_new (UnixWaitStatus); waitstatus->pid = pid; waitstatus->status = FIXME something that's not a valid wait() result; /* or add a gboolean to UnixWaitStatus */ do_not_reap_pids = g_slist_prepend (do_not_reap_pids, waitstatus); } G_UNLOCK (unix_signal_lock); return pid; } (oh, probably want to preserve errno around the G_UNLOCK if pid < 0) fork_and_exec_with_pipes() would just replace its call to fork() with a call to g_main_unix_fork_and_do_not_reap_child() (if DO_NOT_REAP). dispatch_sigchld_with_catchall() would have a loop over do_not_reap_pids in addition to its loop over unix_child_watches, which would fill it in and unset give_to_catchall if it found a match. g_child_watch_source_new() would check if its pid was in do_not_reap_pids, and if so, remove it, and if it already had a valid status, then it would set child_watch_source->child_status from that and set ->child_exited so the source would trigger right away.
Created attachment 228110 [details] [review] GChildWatchSource: Allow passing -1 for pid With the goal of modifying gnome-session to use the new Linux PR_SET_CHILD_SUBREAPER, it needs to be able to reap arbitary children that may be reparented to it. But if gnome-session is going to continue to work with GLib and GIO, even if we converted all child watches in gnome-session to use this new API, we'd still have to contend with the possibility of it having a process spawned indirectly (stuff like the GDBus dbus-launch invocation). Thus, we don't give the -1 child watch *every* waitpid, only the ones which don't have a specific watcher. Also, fix up g_spawn_sync() to hold the unix signal lock when forking. This way we ensure that GLib itself gets the waitpid() result for the synchronously spawned child, instead of racing with the worker thread.
That actually worked out quite well, Dan! Thanks a lot for the concept. The minor tweak I did here to avoid adding a gboolean to UnixWaitStatus is to set pid = -1 until it's exited. Also, there's now a test case that actually uses PR_SET_SUBCHILD_REAPER (if available). One thing occurs to me - I don't actually have test coverage of the g_spawn_async() + catchall case. I'll add one.
I love the mutex that gets locked once and unlocked twice.... Some scary code there, Dan. :)
On a more serious, note though: the case that this code tries to catch sort of illustrates the conceptual dirtiness of what we're trying to do here... We have this hash table that contains a list of PIDs not to reap. If I read correctly, it never gets cleaned out except when doing a child watch. Problem 1: I do a lot of this: setup_catchall(); for (big number) { pid = g_spawn(); waitpid(pid); } and the table gets bigger and bigger..... Problem 2: Imagine this situation (in gnome-session): setup_catchall(); pid = g_spawn(); waitpid(pid); ... time passes, 'pid' gets recycled ... ... 'pid' quits and gets punted back up to gnome-session ... ... it is never reaped, because it's still in the don't-reap table ... Another problem (or a different angle on the same problem?) is that if glib ever calls waitpid(-1) and sees something that it isn't supposed to see then no matter how hard we try, we can never put that child's information back into the kernel, do-not-reap-list or otherwise. In that sense, it's not so much do-not-reap as it is reap-then-ignore. This means that it's completely unsafe to call any variant of wait() from outside of GLib in any process that is using the catch-all. We might have an idea that we are able to impose this restriction for something like gnome-session but the argument falls apart when it comes to libraries (along much the same lines as goes your argument for why we need this patch in the first place).
Created attachment 228118 [details] [review] GChildWatchSource: Allow passing -1 for pid Now with more tests, and also an updated commit message
(In reply to comment #36) > setup_catchall(); > > for (big number) > { > pid = g_spawn(); > waitpid(pid); > } Such an application can not exist currently because it can't be using the catchall watch yet - and the hash table is only used in the case where there is one. > Problem 2: Imagine this situation (in gnome-session): > > setup_catchall(); > > pid = g_spawn(); > waitpid(pid); Likewise here. I control gnome-session's codebase, it doesn't invoke waitpid() directly. > Another problem (or a different angle on the same problem?) This means that it's completely unsafe > to call any variant of wait() from outside of GLib in any process that is using > the catch-all. Right, so by "using the catch-all" we mean "gnome-session". > We might have an idea that we are able to impose this restriction for something > like gnome-session but the argument falls apart when it comes to libraries > (along much the same lines as goes your argument for why we need this patch in > the first place). You do have an interesting point here - what about libraries outside of GNOME like say libcanberra? It is entirely possible it tries to auto-activate pulseaudio and that involves finding some helper binary. I don't have a great answer for that. We'd have to have non-GLib libraries export an API where we can "plug in" process helper APIs, just like we expect these libraries to export mainloop integration. Actually libdbus-1 is probably the best example here. Which in fact does spawn dbus-launch. And gnome-session does link to it =/ systemd also links to libdbus-1 too of course, and as far as I can see they simply don't handle this at all. In practice, it's not really a problem since the autolaunching just doesn't happen for systemd, and that's really the only case where we launch helper binaries. So we do have a general path for solving the "non-GLib helper library" case: 1) Change library to export hooks for integrating external launching 2) Drop dependency on library, reimplement with GLib-using library (e.g. port from dbus-glib to GDBus) 3) Split library into separate helper process I can iterate on this for gnome-session, and come back to this patch when it's ready. But do you agree that after having done so, this patch is mergable and would make sense to use in gnome-session?
(In reply to comment #38) > > We might have an idea that we are able to impose this restriction for something > > like gnome-session but the argument falls apart when it comes to libraries > > (along much the same lines as goes your argument for why we need this patch in > > the first place). > > You do have an interesting point here - what about libraries outside of GNOME > like say libcanberra? Actually, it is not just libraries outside of GNOME; the g_spawn_async_with_pipes() docs allow you to use waitpid() directly (on unix). > So we do have a general path for solving the "non-GLib helper library" case: > > 1) Change library to export hooks for integrating external launching Ugh > 2) Drop dependency on library, reimplement with GLib-using library (e.g. port > from dbus-glib to GDBus) Sure, but glib-using libraries can still cause problems, and even after you fix them they could easily get broken again since their maintainers aren't going to be considering this use case which only matters to one program. > 3) Split library into separate helper process might make sense in some cases, probably doesn't in other cases > I can iterate on this for gnome-session, and come back to this patch when it's > ready. But do you agree that after having done so, this patch is mergable and > would make sense to use in gnome-session? Every time I think we've figured this out another problem comes up. Given that PR_SET_CHILD_SUBREAPER is linux-specific anyway, wouldn't having a linux-specific sane new kernel API really be a much better solution?
(In reply to comment #39) > Actually, it is not just libraries outside of GNOME; the > g_spawn_async_with_pipes() docs allow you to use waitpid() directly (on unix). Yes, but I have the source code to every dependency of gnome-session, and the ability to fix it, and: > Sure, but glib-using libraries can still cause problems, and even after you fix > them they could easily get broken again since their maintainers aren't going to > be considering this use case which only matters to one program. But would they really go and change code back from using g_child_watch_add() to raw waitpid()? I can't imagine that happening in practice. > Every time I think we've figured this out another problem comes up. Given that > PR_SET_CHILD_SUBREAPER is linux-specific anyway, wouldn't having a > linux-specific sane new kernel API really be a much better solution? Like some new waitpid (-1, &estatus, WNOHANG | LINUX_ONLY_REPARENTED) ? The cost for the kernel would be tracking processes which were reparented - probably not too hard. But I do think this problem is a bit overblown...we haven't yet identified any *in actual practice* case where gnome-session will call into a library that will spawn a process *and* the library calls waitpid() directly.
(In reply to comment #40) > But would they really go and change code back from using g_child_watch_add() to > raw waitpid()? I can't imagine that happening in practice. No, but they may add new code that needs to spawn something, and if they want to do a blocking wait, using a child watch would be annoying and seemingly-unnecessary. > Like some new waitpid (-1, &estatus, WNOHANG | LINUX_ONLY_REPARENTED) ? I meant the more general problem of "SIGCHLD/wait sucks as an API".
(In reply to comment #39) > Every time I think we've figured this out another problem comes up. Given that > PR_SET_CHILD_SUBREAPER is linux-specific anyway, wouldn't having a > linux-specific sane new kernel API really be a much better solution? This pretty much sums up my opinion of this bug. Add to that the fact that this would only be for one process in the entire session (even though I understand that GLib has the unique position here in terms of being able to implement it in a way that stays 'friendly' with other things that may get spawned in that same process). I think that, as is, this is a WONTFIX.