GNOME Bugzilla – Bug 155546
Startup race?
Last modified: 2018-07-01 08:47:26 UTC
Followup to http://mail.gnome.org/archives/gamin-list/2004-October/msg00040.html
Playing a bit with the very stupid command line client called monitor.c, which can be copy-pasted from here (Example 8-1): http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=0650&db=bks&fname=/SGI_Developer/books/IIDsktp_IG/sgi_html/ch08.html (replace GETfd with GETFD and simply compile it), I found this strange behavior:
When gam_server is not yet running, launching "./monitor /etc" sometimes starts up correctly, but sometimes exits with a "Connection refused" message. However, this leaves gam_server running in the background, and a subsequent attempt to launch "./monitor /etc" always succeeds as long as I keep gam_server running. When I kill it, the story starts from the beginning.
Is there maybe a race condition in the code that starts the daemon? I guess it fork+exec's gam_server and then tries to connect to it, but sometimes gam_server is not yet accepting connections by this time. Could this be the case?
Version details:
gamin 0.0.14
inotify patch 0.14 running, but gamin compiled against 0.13.1's header since it doesn't compile with 0.14
kernel based on 2.6.9-rc3
glib 2.4.7
glibc 2.3.3 (from Fedora Core 2)
gcc 3.3.4
As far as I can see, there's absolutely no synchronization between libgam and gam_server. libgam launches gam_server and then doesn't care about the child anymore; it tries to connect to the socket at most 25 times, usleeping 50000 between attempts (so it gives up after roughly 1.25 seconds). This is in libgamin/gam_api.c gamin_connect_unix_socket().
Here's my recommendation, roughly what I'd do, in some pseudo code:
inside libgamin, actually in gam_fork.c gamin_fork_server():
    pipe(pipefd);
    fork, setsid, fork...
    child:
        close(pipefd[0]);
        setenv("GAM_PIPEFD_NO", pipefd[1] (as text));
        execl...
    parent:
        close(pipefd[1]);
        read(pipefd[0], ...);
inside gam_server:
    when it's ready to serve requests (the socket is already created and listened on) then
        close(getenv("GAM_PIPEFD_NO"));
To summarize: create a pipe between the two processes; the parent reads from the pipe, and the server (child) closes its end of the pipe (so that the parent reads an EOF) as soon as it's ready to serve requests. Actually, some useful data could even be sent over this pipe, e.g. the word "OK", which the parent could check for to distinguish between "everything's OK" and "unexpected error (such as exec failed)". And then I don't know whether the retry heuristic will still be needed...
Patching the library is trivial, as only gam_fork.c gamin_fork_server() needs some modification. I haven't yet checked the code of gam_server to find where it needs to be patched.
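The handshake proposed above could look roughly like this in C. This is only a minimal sketch, not gamin code: spawn_and_wait(), notify_ready(), good_server() and bad_server() are hypothetical names, the "server" is a plain function run in the forked child instead of an execl'd gam_server, and the setsid/double fork is left out. It uses the "OK" variant, so a child that dies before it is ready (the parent only sees EOF) can be told apart from one that is really listening.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Environment variable name taken from the pseudo code above; it is a
 * proposal in this report, not something gamin is known to implement. */
#define READY_FD_ENV "GAM_PIPEFD_NO"

/* Server side: call once the socket is created and listening.
 * Writes "OK" to the inherited pipe and closes it. Returns 0 on success. */
static int notify_ready(void)
{
    const char *s = getenv(READY_FD_ENV);
    if (s == NULL)
        return -1;              /* not started by a handshake-aware parent */
    int fd = atoi(s);
    ssize_t n = write(fd, "OK", 2);
    close(fd);
    return n == 2 ? 0 : -1;
}

/* Client side: fork a child that runs server_main(), then block until the
 * child reports readiness. Returns 0 if "OK" arrived, -1 on EOF or error. */
static int spawn_and_wait(int (*server_main)(void))
{
    int pipefd[2];
    if (pipe(pipefd) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: keep only the write end and pass its number on as text.
         * pipe() descriptors are not close-on-exec, so a real
         * implementation could execl() the server at this point. */
        char num[16];
        close(pipefd[0]);
        snprintf(num, sizeof num, "%d", pipefd[1]);
        setenv(READY_FD_ENV, num, 1);
        _exit(server_main());
    }

    /* Parent: close the write end, otherwise read() never sees EOF. */
    close(pipefd[1]);
    char msg[2];
    size_t got = 0;
    ssize_t n;
    while (got < sizeof msg &&
           (n = read(pipefd[0], msg + got, sizeof msg - got)) > 0)
        got += (size_t)n;
    close(pipefd[0]);

    /* The stand-in children exit right away, so reap them here.
     * A real daemon keeps running and would not be waited for. */
    waitpid(pid, NULL, 0);

    return (got == 2 && msg[0] == 'O' && msg[1] == 'K') ? 0 : -1;
}

/* Stand-in for a server that starts up fine. */
static int good_server(void)
{
    /* socket(), bind(), listen() would happen here */
    return notify_ready();
}

/* Stand-in for a server that dies before it is ready (e.g. exec failed). */
static int bad_server(void)
{
    return 1;
}
```

With this in place, spawn_and_wait(good_server) returns 0 only after the child has signalled readiness, and spawn_and_wait(bad_server) returns -1 as soon as the child exits, so the caller would not need to guess how long to keep retrying connect().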
FYI, the link to monitor.c is broken.
This URL should work: http://web.archive.org/web/20030419104847/http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=0650&db=bks&fname=/SGI_Developer/books/IIDsktp_IG/sgi_html/ch08.html
Created attachment 63798: monitor.c
monitor.c (fixed version) attached, so that you don't have to bother with copy-pasting from HTML. Simply compile with "gcc -o monitor monitor.c -lgamin-1"
gamin is not under active development anymore and has not seen code changes for many years. Its codebase has been archived: https://gitlab.gnome.org/Archive/gamin/commits/master Closing this report as WONTFIX as part of Bugzilla Housekeeping to reflect reality. Please feel free to reopen this ticket (or rather transfer the project to GNOME Gitlab, as GNOME Bugzilla is deprecated) if anyone takes the responsibility for active development again.