GNOME Bugzilla – Bug 395488
[PATCH] gnome-session does not wait long enough for dbus-daemon to start
Last modified: 2008-10-15 12:11:40 UTC
Please describe the problem:
The new gnome-session automatically spawns dbus-daemon in session mode, as most desktop components need it now. However, there is a problem. dbus-daemon wants to write its child PID twice: once before dbus-daemon forks its child, then again once the dbus-daemon child starts (from within the child process). Unfortunately, there is a big chance that by the time the child process writes the PID, the pid_fd within gnome-session has already been closed. This results in a SIGPIPE within dbus-daemon, and dbus-daemon dies.

We were running with a previous version of this patch in our local ports tree for some time, and never saw this problem. However, the race was always there. It seems GNOME 2.17 is quicker, and more likely to encounter this problem. The attached patch reads the dbus-daemon PID twice before proceeding. This allows dbus-daemon to stay running.

Steps to reproduce:
1. Start gnome-session without an existing dbus-daemon session bus running

Actual results:
dbus-daemon starts, then dies on SIGPIPE. This causes all dbus-dependent desktop components to either die (SIGABRT) or misbehave.

Expected results:
dbus-daemon should continue to run for the life of the GNOME session.

Does this happen every time?
No, but in GNOME 2.17, it is much more likely to happen.

Other information:
Here is a ktrace snippet showing this problem.
Here, PID 1956 is the dbus-daemon spawned by gnome-session, and 1957 is the dbus-daemon child spawned by the first dbus-daemon:

  1956 dbus-daemon CALL fork
  1956 dbus-daemon RET fork 1957/0x7a5
  1956 dbus-daemon CALL write(0x12,0x8208ba0,0x5)
  1956 dbus-daemon GIO fd 18 wrote 5 bytes
       "1957
       "
  1956 dbus-daemon RET write 5
  1956 dbus-daemon CALL exit(0)
  1952 gnome-session RET wait4 1956/0x7a4
  1952 gnome-session CALL read(0xe,0xbfbfe720,0x100)
  1952 gnome-session GIO fd 14 read 73 bytes
       "unix:path=/var/tmp/dbus-j2wceYnoKN,guid=724d6adba50748aea077b50045a674\
       18
       "
  1952 gnome-session CALL read(0x11,0xbfbfe620,0x100)
  1952 gnome-session GIO fd 17 read 5 bytes
       "1957
       "
  1952 gnome-session RET read 5
  1952 gnome-session CALL close(0xe)
  1952 gnome-session RET close 0
  1952 gnome-session CALL close(0x11)
  1952 gnome-session RET close 0
  1952 gnome-session CALL gettimeofday(0xbfbfe818,0)
  1952 gnome-session RET gettimeofday 0
  1957 dbus-daemon RET fork 0
  1957 dbus-daemon CALL open(0x8085b09,O_RDWR,<unused>0x2813a138)
  1957 dbus-daemon NAMI "/dev/null"
  1957 dbus-daemon RET open 4
  1957 dbus-daemon CALL dup2(0x4,0)
  1957 dbus-daemon RET dup2 0
  1957 dbus-daemon CALL dup2(0x4,0x1)
  1957 dbus-daemon RET dup2 1
  1957 dbus-daemon CALL dup2(0x4,0x2)
  1957 dbus-daemon RET dup2 2
  1957 dbus-daemon CALL umask(S_IWGRP|S_IWOTH)
  1957 dbus-daemon RET umask 18/0x12
  1957 dbus-daemon CALL setsid
  1957 dbus-daemon RET setsid 1957/0x7a5
  1957 dbus-daemon CALL getpid
  1957 dbus-daemon RET getpid 1957/0x7a5
  1957 dbus-daemon CALL write(0x12,0x8208ba0,0x5)
  1957 dbus-daemon RET write -1 errno 32 Broken pipe
  1957 dbus-daemon PSIG SIGPIPE SIG_DFL
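The last three trace lines (a write to a pipe with no reader, errno 32, then SIGPIPE with the default disposition) can be reproduced in isolation. The following is a minimal Python sketch, not code from gnome-session or dbus-daemon. The parent closes the read end of a pipe before the child writes, which stands in for gnome-session closing pid_fd. The function name child_fate and the exit code 3 are made up for illustration.

```python
import os
import signal


def child_fate(sigpipe_default):
    """Fork a child that writes a PID line to a pipe nobody is reading.

    Returns a short description of how the child ended.
    """
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # reader is gone, like gnome-session closing pid_fd

    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the forked dbus-daemon.
        code = 1
        try:
            disposition = signal.SIG_DFL if sigpipe_default else signal.SIG_IGN
            signal.signal(signal.SIGPIPE, disposition)
            os.write(write_fd, b"1957\n")  # second PID write
            code = 0
        except BrokenPipeError:
            # Only reached when SIGPIPE is ignored: write fails with EPIPE.
            code = 3
        finally:
            os._exit(code)

    os.close(write_fd)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return "killed by " + signal.Signals(os.WTERMSIG(status)).name
    return "exited with %d" % os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(child_fate(sigpipe_default=True))
    print(child_fate(sigpipe_default=False))
```

With the default disposition the child is terminated by the signal, as in the trace. With SIGPIPE ignored the write merely fails with EPIPE and the process can carry on.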
Created attachment 80045 Read the dbus-daemon PID twice to avoid a race to a SIGPIPE
Created attachment 80046 Read the dbus-daemon PID twice to avoid a race to a SIGPIPE
Actually, the change to the read_line() function is what broke this. The previous implementation did not set done to TRUE once the '\n' had been read. Therefore, the second read() was invoked, and this picked up the PID from the child.
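To illustrate the difference described here, below is a hypothetical Python sketch of the two loop behaviours. The real read_line() is C code in gnome-session and is not shown in this report, so the function read_pid_lines, its stop_at_newline flag, and the byte-at-a-time reads are assumptions made only for the example. Both PID writes are queued up front to keep the demonstration deterministic; in the real race the second write arrives later.

```python
import os


def read_pid_lines(fd, stop_at_newline):
    """Read from fd one byte at a time.

    stop_at_newline=True mimics a loop that sets done = TRUE at the first
    newline; False mimics a loop that keeps calling read() until EOF.
    """
    data = bytearray()
    while True:
        chunk = os.read(fd, 1)
        if not chunk:  # EOF: every writer has closed its end
            break
        data += chunk
        if stop_at_newline and chunk == b"\n":
            break
    return bytes(data)


def demo(stop_at_newline):
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"1957\n")  # PID written by the parent dbus-daemon
    os.write(write_fd, b"1957\n")  # PID written again by the forked child
    os.close(write_fd)
    try:
        return read_pid_lines(read_fd, stop_at_newline)
    finally:
        os.close(read_fd)


if __name__ == "__main__":
    print(demo(stop_at_newline=True))
    print(demo(stop_at_newline=False))
```

The loop that stops at the newline returns after the first PID and lets the caller close the descriptor early. The loop that reads to EOF keeps the descriptor open until both writers are finished.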
Shouldn't dbus-daemon handle SIGPIPE instead?
Also, when I run dbus-daemon from the command line, I get this:

  $ dbus-daemon --fork --session --print-address 1 --print-pid 1
  unix:abstract=/tmp/dbus-IPUOan09lL,guid=763085997106771d83a25d0045ae7aef
  23983

It only prints the PID once on my system. (And really, having --no-fork would mitigate this bug too.)
Lowering severity a bit.
*** Bug 336237 has been marked as a duplicate of this bug. ***
Jan, Joe: I really think this has to be handled on the dbus side. Is there a bug in dbus bugzilla about this?
Not yet, but I have written the code to do it. I just need to file the bug.
I have filed https://bugs.freedesktop.org/show_bug.cgi?id=10929 to track this on the dbus side.
As an aside, I think it's slightly odd to start dbus from gnome-session; the intent (and afaik practice to date) is to start it from the X session launch scripts, so gnome-session would inherit dbus from there.

i.e. "dbus-launch gnome-session" basically

Don't see any reason to bother with the code to do this in gnome-session, it's just extra work, esp. since every major distribution has dbus-launch in their X scripts already afaik.

I'll comment on the dbus bug in dbus bugzilla.
Well, dbus bugzilla is busted, so a quick comment here - I don't think the pid should be written twice. Reading it twice, or ignoring sigpipe, sounds like a workaround rather than a real fix. Looking at the code in bus/bus.c it looks like it just writes it twice - once in dbus_become_daemon and once in bus.c. Though I admit I could be misunderstanding things since fork() always gets confusing. There's a ChangeLog entry mentioning bug #1720 as possibly related or possibly when the bug was introduced, but since bugzilla.freedesktop.org is hosed I can't go see what that bug was about.
Executing dbus with dbus-launch gnome-session will break some things. One of them is the gnome settings daemon, which then knows nothing about gnome-keyring. Launching evolution with a hotkey using that method will cause it to ask for a password for every mail folder you try to access on an IMAP server, just because it doesn't know where the keyring socket is. As long as dbus doesn't have an option to pass environment variables to it just like bonobo did, I don't see other options than launching it from gnome-session.
So the problem is you need gnome-session to set env variables that dbus-launched apps will then see? I hadn't thought of that, though - wouldn't the modern approach be to use dbus instead of the env variables (i.e. have a method on gnome-keyring to ask it for its socket name)?

fwiw, a patch to put env variable support in dbus-daemon seems sensible and would probably be easier than screwing with launching dbus.
I think gnome-keyring has that option for modern versions. There's still the session management variables that are exported by gnome-session that don't make it to dbus then. A method to inject environment variables into a running dbus would be nice in this case.
wasn't there a thread on desktop-devel or somewhere about this env variable thing? It seems like I remember one. It would be good to fix, so gnome-session doesn't need all the launch-dbus-daemon logic. it's probably about as easy to fix as this bug, anyhow. (though of course both should be fixed)
(In reply to comment #16)
> wasn't there a thread on desktop-devel or somewhere about this env variable
> thing? It seems like I remember one.

Yup, there was. (I'd hunt it down and provide a URL, but IIRC it wasn't interesting. Someone reported some problem and you were like "oh yeah, we should have an interface to set environment variables in dbus-daemon's environment", and that was the end of the thread.)

> It would be good to fix, so gnome-session doesn't need all the
> launch-dbus-daemon logic.

The original patch to add the features wasn't to solve this problem though. It was to make GNOME work on systems that can't easily start dbus-daemon from the xsession scripts, because they don't install dbus by default. And if gnome-session starts up and dbus-daemon *isn't* running, for whatever reason, it's probably not a good idea to continue logging in without it. So having code to start it in gnome-session seems good. It could probably be simplified by having it just run "dbus-launch --sh-syntax --exit-with-session" though, and then directly putenving the output, rather than all the complicated stuff there now.
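As a rough illustration of the "run dbus-launch --sh-syntax and putenv the output" idea, here is a small Python sketch. It is not the gnome-session implementation, which is C. The helper names parse_sh_syntax and apply_to_environment are hypothetical, and SAMPLE is an assumed example of the Bourne-shell-style output, reusing the address and PID printed earlier in this report rather than output captured from a real run.

```python
import os


def parse_sh_syntax(output):
    """Turn 'NAME=value;' lines into a dict, skipping 'export NAME;' lines."""
    env = {}
    for raw in output.splitlines():
        line = raw.strip().rstrip(";").strip()
        if not line or line.startswith("export "):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue
        # Strip one layer of matching single or double quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        env[name] = value
    return env


def apply_to_environment(env):
    """Export the parsed variables so child processes inherit them."""
    os.environ.update(env)


# Assumed sample of sh-syntax output (illustrative values only).
SAMPLE = (
    "DBUS_SESSION_BUS_ADDRESS='unix:abstract=/tmp/dbus-IPUOan09lL,"
    "guid=763085997106771d83a25d0045ae7aef';\n"
    "export DBUS_SESSION_BUS_ADDRESS;\n"
    "DBUS_SESSION_BUS_PID=23983;\n"
)

if __name__ == "__main__":
    # In real use the text would come from running
    # dbus-launch --sh-syntax --exit-with-session and capturing its stdout.
    parsed = parse_sh_syntax(SAMPLE)
    apply_to_environment(parsed)
    print(parsed)
```

The appeal of this approach is that the caller only reads finished output from a helper that has already exited, so there is no pipe left open for a forked daemon to write to.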
libdbus will autolaunch dbus-daemon automatically in the broken case where it isn't running (though I do feel this is a broken case that should never happen on a properly-configured system). For example if you log in to a remote system via ssh and have no DBUS_SESSION_BUS_ADDRESS, autolaunch should occur as a broken fallback (given dbus >= 1.0). If you just need the autolaunch you shouldn't need special code in gnome-session - the special code is only needed because you need to set env variables and dbus can't do it for you right now. If a system doesn't want to install dbus by default, then I think a "test -e /usr/bin/dbus-launch" in the X init scripts would be a fine hack for the developers of that system. But, at the same time, we should just say gnome requires dbus, which means if the script contains a line to exec gnome-session, on that same line it should be able to assume dbus existence.
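The decision an X init script would make under this proposal can be sketched as follows. This is a hypothetical Python rendering (such scripts are normally shell), and the function name session_command is invented for the example. It assumes the script only needs to choose between running gnome-session directly and wrapping it in dbus-launch.

```python
import os
import shutil


def session_command(env, dbus_launch_path):
    """Choose the command line used to start the session.

    env: mapping of environment variables (for example os.environ).
    dbus_launch_path: path to dbus-launch, or None if it is not installed.
    """
    if env.get("DBUS_SESSION_BUS_ADDRESS"):
        # A session bus is already advertised; just inherit it.
        return ["gnome-session"]
    if dbus_launch_path:
        # Equivalent of "dbus-launch gnome-session" in the X scripts.
        return [dbus_launch_path, "--exit-with-session", "gnome-session"]
    # No dbus-launch found: fall back to libdbus autolaunch at first use.
    return ["gnome-session"]


if __name__ == "__main__":
    print(session_command(os.environ, shutil.which("dbus-launch")))
```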
*** Bug 507662 has been marked as a duplicate of this bug. ***
In the new code base (in trunk now), Dan implemented the dbus handling exactly as he describes in comment #17. Should we consider this fixed then? Is there something else that should be properly handled?
Everything is working on FreeBSD with gnome-session 2.22 and dbus 1.1.20.
I've been having this problem for a while on Ubuntu gutsy, with occasional failures to start. (gnome-session-2.20.1-0ubuntu1, dbus 1.1.1-3ubuntu4), so have tried applying the changes shown in the patch (2007-01-11 18:18 UTC). This left me with a very broken system. I've now checked and the code in 2.22 is somewhat different to that for 2.20, so surely this patch is obsolete and ought to be marked as such.
This doesn't apply to 2.23/2.24 anymore since we changed the way we launch dbus. See bug #546863 for details.
(In reply to comment #11)
> as an aside, I think it's slightly odd to start dbus from gnome-session; the
> intent (and afaik practice to date) is to start it from the X session launch
> scripts, so gnome-session would inherit dbus from there.
>
> i.e. "dbus-launch gnome-session" basically
>
> Don't see any reason to bother with the code to do this in gnome-session, it's
> just extra work, esp. since every major distribution has dbus-launch in their X
> scripts already afaik.
>
> I'll comment on the dbus bug in dbus bugzilla
>

Sorry if I'm off-topic/in the wrong area, but I think I may be having problems around this bug. I'm having issues around dbus launching hp-toolbox (and various other 'hplip' apps). I am also getting problems launching epiphany. I'm running gnome in ubuntu (hardy) on a 64 bit machine. I'm connecting via ssh (the machine is not local), but had the hplip problems when I was at the machine's terminal too.

Is this a good place to post, or is there better?

Thanks for your time, and apologies in advance for my lack of experience.

Martin