Bug 145597 - child-test failure on HP-UX 11.x, Solaris 2.5.1-9, AIX 4.3.x, 5.x, IRIX 6.5
Status: RESOLVED DUPLICATE of bug 136867
Product: glib
Classification: Platform
Component: general
Version: 2.4.x
Hardware: Other Solaris
Importance: Normal normal
Target Milestone: ---
Assigned To: gtkdev
QA Contact: gtkdev
Depends on:
Blocks: 157195
 
 
Reported: 2004-07-08 02:15 UTC by The Written Word
Modified: 2004-12-22 21:47 UTC
See Also:
GNOME target: ---
GNOME version: Unversioned Enhancement


Attachments
Patch to determine signal behavior (1.76 KB, patch)
2004-07-20 21:58 UTC, The Written Word
Updated patch (1.75 KB, patch)
2004-08-10 17:55 UTC, The Written Word

Description The Written Word 2004-07-08 02:15:00 UTC
$ cd tests
$ ./child-test
child 19773 (ttl 10) exited, status 0
[hang]
Comment 1 The Written Word 2004-07-08 02:16:11 UTC
I ran a quick test on Solaris 9/SPARC and this bug does not occur if I disable
threads.

The following platforms do pass child-test successfully:
  HP-UX 10.20 (doesn't have threads)
  Tru64 UNIX 4.0D, 5.1
  Redhat Linux 7.1, 9
  Redhat Enterprise Linux 2.1, 3.0
Comment 2 The Written Word 2004-07-08 02:52:28 UTC
The ps output while the process is hung:
  $ ps -fu china
  ...
     china 19293   464  0 21:48:51 pts/3    0:00
/opt/build/glib-2.4.2/tests/.libs/child-test
     china 19309 19293  0                   0:00 <defunct>
  ...

So, it looks like someone is not catching SIGCHLD.
Comment 3 Thomas Thorberger 2004-07-11 17:12:16 UTC
On Solaris, a handler installed with signal() is reset to the default action after it has been called once.
The error does not occur if gmain.c uses sigset() for the SIGCHLD handling.
What goes wrong in child-test:
  1) The first child is created and the SIGCHLD handler is installed.
  2) The second child is created and the SIGCHLD handler is installed a second time (obviously this has no effect).
  ...
  3) The first child exits, the handler is called, and the SIGCHLD handler is reset to the default.
  4) The second child exits, but the handler is no longer installed for SIGCHLD, so child-test hangs
      waiting for a notification that never arrives.
So, if you want a handler to stay installed for more than one signal, you need to use sigset() on Solaris.
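
The reset behaviour described above can be reproduced on any POSIX system with a small sketch. It does not use the glib code: it uses sigaction(), which is not what this comment proposes, and emulates the System V behaviour with the SA_RESETHAND flag. The names on_signal, deliveries and handler_survives are hypothetical and exist only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

volatile sig_atomic_t deliveries = 0;

void on_signal(int signo)
{
  (void) signo;
  deliveries++;
}

/* Hypothetical helper: installs on_signal for SIGUSR1 with the given
 * sigaction flags, raises SIGUSR1 once, and reports whether on_signal
 * is still the installed handler afterwards (1) or not (0).
 * Returns -1 if sigaction() fails. */
int handler_survives(int flags)
{
  struct sigaction sa, current;
  int survived;

  memset(&sa, 0, sizeof sa);
  sa.sa_handler = on_signal;
  sa.sa_flags = flags;
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGUSR1, &sa, NULL) != 0)
    return -1;

  /* The handler runs before raise() returns. */
  raise(SIGUSR1);

  if (sigaction(SIGUSR1, NULL, &current) != 0)
    return -1;
  survived = (current.sa_handler == on_signal);

  /* Leave SIGUSR1 at its default disposition for the caller. */
  signal(SIGUSR1, SIG_DFL);
  return survived;
}
```

Calling handler_survives(0) models a handler that stays installed, while handler_survives(SA_RESETHAND) models the one-shot behaviour that makes the second SIGCHLD go unnoticed in child-test.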

  Regards
Comment 4 The Written Word 2004-07-11 22:57:17 UTC
BTW, the hang doesn't occur only on Solaris. HP-UX 11.x hangs as well. Should we
just replace:
  signal (SIGCHLD, g_child_watch_signal_handler);
with:
  sigset (SIGCHLD, g_child_watch_signal_handler);
Comment 5 Thomas Thorberger 2004-07-14 14:27:01 UTC
Yes, this happens on systems whose signal() implementation follows
System V semantics rather than BSD semantics.
From Linux signal(3) manual:

PORTABILITY
  The original Unix signal() would reset the handler to SIG_DFL, and
  System V (and the Linux kernel and libc4,5) does the same. On the other
  hand, BSD does not reset the handler, but blocks new instances of this
  signal from occurring during a call of the handler. The glibc2 library
  follows the BSD behaviour.

The best way to solve this problem is to add a test to "configure"
which attempts to get the behaviour of signal():

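/* Check whether the signal handler is reset to SIG_DFL after a signal. */
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>

static volatile sig_atomic_t sig = 0;

static void handler(int x)
{
  (void) x;
  sig++;
}

int main(void)
{
  signal(SIGUSR1, handler);
  kill(getpid(), SIGUSR1);
  sleep(1);  /* let the first signal be handled before sending the second */
  kill(getpid(), SIGUSR1);
  exit(sig); /* 2 if the handler stayed installed for both signals */
}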
If the check does not return "2", sigset() must be used.

Another solution may be to test for the existence of sigset(),
always use sigset() if it is defined in libc, and use signal()
if not (Linux for example does not have sigset).
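
A minimal sketch of that second approach, assuming configure defines a macro when sigset() is found. HAVE_SIGSET, install_persistent_handler, on_usr1 and usr1_count are hypothetical names, not taken from glib or from the attached patches.

```c
/* HAVE_SIGSET is a hypothetical macro that a configure check would define. */
#ifdef HAVE_SIGSET
#define _XOPEN_SOURCE 500  /* some libcs declare sigset() only under X/Open */
#endif
#include <signal.h>

typedef void (*handler_fn)(int);

volatile sig_atomic_t usr1_count = 0;

void on_usr1(int signo)
{
  (void) signo;
  usr1_count++;
}

/* Hypothetical wrapper: prefer sigset() when configure found it,
 * otherwise fall back to signal(). Returns the previous disposition,
 * or SIG_ERR on failure, as both underlying calls do. */
handler_fn install_persistent_handler(int signo, handler_fn fn)
{
#ifdef HAVE_SIGSET
  return sigset(signo, fn);
#else
  return signal(signo, fn);
#endif
}
```

This only selects the call at compile time. On a platform without sigset() whose signal() still has System V semantics, the fallback branch would keep the reset behaviour, so the first approach (probing the behaviour itself) covers more cases.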
Comment 6 The Written Word 2004-07-14 14:45:43 UTC
Ok, I'll work on a patch based on this. Thanks. BTW, why the call to sleep(1)?
I'd rather not introduce a 1s sleep in the autoconf script.
Comment 7 Thomas Thorberger 2004-07-14 15:14:25 UTC
You may be right, I just wanted to be sure that the signals are delivered
one after another and that the signal handler has already run when the
second signal is generated. I tested the check without the sleep() on a Solaris
and a Linux box and it works fine.
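
For illustration, a sleep-free variant of the check might look like the sketch below. It is not the attached patch. It assumes raise() can stand in for kill(getpid(), ...), since raise() does not return until the handler has run, and it runs the probe in a forked child so that a reset handler kills only the child. The names probe_handler, hits and signal_handler_persists are hypothetical.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

volatile sig_atomic_t hits = 0;

void probe_handler(int signo)
{
  (void) signo;
  hits++;
}

/* Hypothetical probe: returns 1 if a handler installed with signal()
 * survives its first invocation, 0 if it does not, -1 on error. */
int signal_handler_persists(void)
{
  int status;
  pid_t pid = fork();

  if (pid < 0)
    return -1;
  if (pid == 0) {
    /* Child: if the handler was reset, the second raise() takes the
     * default action for SIGUSR1 and terminates the child. */
    signal(SIGUSR1, probe_handler);
    raise(SIGUSR1);
    raise(SIGUSR1);
    _exit((int) hits);
  }
  if (waitpid(pid, &status, 0) != pid)
    return -1;

  /* Exit status 2 means the handler ran for both signals. */
  return WIFEXITED(status) && WEXITSTATUS(status) == 2;
}
```

The result depends on the platform and on the feature-test macros in effect when signal() is declared, which is exactly what a configure check would want to find out.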
Comment 8 The Written Word 2004-07-20 21:58:30 UTC
Created attachment 29715
Patch to determine signal behavior
Comment 9 The Written Word 2004-08-10 17:55:59 UTC
Created attachment 30402
Updated patch 

The configure.in patch in attachment 29715 wouldn't work.
Comment 10 Jonas Jonsson 2004-08-30 14:59:08 UTC
Take a look at bug 136867
Comment 11 The Written Word 2004-08-31 05:45:33 UTC
Ok, looks like the same bug. So, what solution should we use?
Comment 12 Matthias Clasen 2004-11-08 15:47:26 UTC

*** This bug has been marked as a duplicate of 136867 ***