After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 136871 - New g_child_watch implementation.
New g_child_watch implementation.
Status: RESOLVED OBSOLETE
Product: glib
Classification: Platform
Component: mainloop
2.3.x
Other All
: Normal normal
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks: 157195
 
 
Reported: 2004-03-11 10:26 UTC by Sebastian Wilhelmi
Modified: 2011-06-04 03:57 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
new g_child_watch implementation (16.03 KB, patch)
2004-03-11 10:27 UTC, Sebastian Wilhelmi
needs-work Details | Review

Description Sebastian Wilhelmi 2004-03-11 10:26:26 UTC
I know, this isn't the time to propose such changes, but after noticing
that  the current g_child_watch implementation does not work for
multithreaded programs for at least linux < 2.6 I decided to reimplement it
to make it work there too.

This is done by calling waitpid in the signal handler (This is safe
according to the docs, it is in fact recommended for SIGCHLD signal
handlers). However locking mutexes is not allowed, so how to get the pids
to check for (we shouldn't call waitpid (-1) and interfere with the
application) into the signal handler. Here I found quite a nice solution,
which, I think, is legally working. I use a list of pids, which is read by
the signal handler without a lock. This list can only grow and never shrink
or otherwise lose elements. Also proper memory barriers are used. Thus the
signal handler always interates a valid list of pids. The pids can occur
twice or they can be stale, but both is not a problem for calling waitpid.

An added advantage is, that the extra helper thread isn't needed anymore
(This could also have been done with the current solution)

Also sigaction(2) is used instead of signal(2), as the latter isn't
MT-safe. However we use signal at other places also. sigaction should be
pretty portable by now. We might want to use it at the other places also.
Comment 1 Sebastian Wilhelmi 2004-03-11 10:27:08 UTC
Created attachment 25511 [details] [review]
new g_child_watch implementation
Comment 2 Owen Taylor 2004-03-14 15:40:55 UTC
I don't see how this sort of implementation can possible be 100%
reliable - we can't use a blocking write to the pipe, and 
once the pipe is full there is no place for the status to
be stored.

(Even using a helper thread to remove data from the pipe
doesn't help because the interrupted thread might be holding
a lock on malloc data structures)

Of course, it would be *hard* to trigger this problem ...
you would need thousands of children exiting at one time,
but it does seem fundemental.
Comment 3 Owen Taylor 2004-03-14 17:17:29 UTC
Another problem is that you can have races where signal
handlers in multiple threads wait for the same PID,
though there you mostly have harmless ECHILD as long
as PIDs take a while to get reused.

Thinking about the memory allocation problem, maybe it's possible to
make child_watch_pids actually have
structures in it, not just pids - so you have:

 PID 
 FLAG
 STATUS

With transitions

 g_child_watch_add()
 
  FLAG = 0
  <memory barrier>
  PID = PID

 signal handler

  if (PID != -1 && FLAG != 0) {
    if (waitpid (PID) != 0/-1) {
      STATUS = status;
      <memory barrier>
      FLAG = 1
    }
  }

 check

  if (FLAG) {
    status = STATUS
    pid = PID
    <memory barrier>
    PID = -1
    notify (pid, status);
  }

Though gettting all the memory barriers portably right is
hard... 
Comment 4 Matthias Clasen 2004-11-08 15:53:56 UTC
Probably not a 2.6 issue at this point. Moving to 2.8, for further consideration
Comment 5 Matthias Clasen 2011-06-04 03:57:04 UTC
The childwatch implementation has been refactored in the meantime, and will perhaps soon be supplanted by a signalfd-based implementation.