GNOME Bugzilla – Bug 634988
my computer is too fast
Last modified: 2012-09-23 19:28:42 UTC
I have a pretty fast system with solid state disks. When I log in to my machine there is no GTK theme, just the ugly default one. If I check the xsession error log, I see a message to the effect of "Only one X Settings provider may be running at once". It happens 100% of the time, and the only way to fix it is to kill the settings daemon and restart it.
I thought that maybe my login happened too quickly and the XSettings manager from gdm had not exited yet, so I tried making my login slower:
root@velocity:~# echo sleep 1 > /etc/X11/Xsession.d/50slowdownthere
That fixed it, 100% of the time.
Either we should change something about the login process to make sure the settings daemon from gdm has fully exited before starting the user's gnome-settings-daemon, or we should make gnome-settings-daemon more robust against finding a display with an existing XSettings manager (maybe it should wait internally for a second or two before giving up).
Are you sure about the error message? It doesn't seem to come from gnome-settings-daemon in any case. What session type are you running? Is it gnome-shell or classic?
"classic", as per Ubuntu 10.10.
** (gnome-settings-daemon:7613): WARNING **: You can only run one xsettings manager at a time; exiting
** (gnome-settings-daemon:7613): WARNING **: Unable to start xsettings manager: Could not initialize xsettings manager.
This looks like a GDM bug to me. Ray?
BTW, lots of people are getting this same bug at https://bugs.launchpad.net/gnome-settings-daemon/+bug/649809
It does indeed look like a GDM bug.
Ryan, can you please replicate it and try getting the g_debug from client_proxy_signal_cb() output somewhere?
gdm does the equivalent of:
  kill (pid_of_gnome_session, SIGTERM);
wait_again:
  waitpid (pid_of_gnome_session, &status, 0);
  if (waitpid returned EINTR) goto wait_again;
So GDM is waiting for gnome-session to exit. gnome-session must be exiting early, before gnome-settings-daemon exits.
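For reference, the wait loop described above could be written as compilable C roughly as follows. This is only a sketch of the pattern, not GDM's actual source: the function names `terminate_and_wait` and `spawn_sleeper` are made up for illustration, and the child here is a dummy process rather than gnome-session.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Ask a child process to terminate and block until it has been reaped.
 * Returns 0 on success (wait status stored in *status), -1 on error. */
int terminate_and_wait(pid_t pid, int *status)
{
    if (kill(pid, SIGTERM) == -1)
        return -1;

    for (;;) {
        pid_t r = waitpid(pid, status, 0);
        if (r == pid)
            return 0;
        if (r == -1 && errno == EINTR)
            continue;           /* interrupted by a signal: wait again */
        return -1;
    }
}

/* Hypothetical helper for trying the function out: fork a child that
 * idles until a signal arrives. Returns the child's pid, or -1. */
pid_t spawn_sleeper(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        for (;;)
            pause();            /* default SIGTERM action ends the child */
    }
    return pid;
}
```

Note that this only guarantees the direct child has exited. Processes started by that child, and any state they left in the X server, are not covered by the wait.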
Created attachment 184538
Patch to make g-s-d wait a little bit for gdm's g-s-d to end
I don't like it. Feels like it's working around the problem.
Yes, I don't like it either, but I have debugged this issue a lot and confirmed that gdm waits correctly for gnome-session to end and that gnome-session waits for g-s-d to end (I even tested adding a sleep(30) to g-s-d to try to replicate this on my not-so-fast machine). The issue really seems to be with X: the property set by the xsettings manager is not removed immediately when g-s-d dies, but a few milliseconds later, which is enough to run into this problem on fast machines.
i think the issue is that even though g-s-d has been killed, X needs to notice this and update itself internally accordingly. that doesn't happen before X gets the new request. we could add some new serialisation in gdm (ie: waiting until we see the XSettings provider go away from X's point of view before starting the desktop) but i actually like this approach because it reduces the amount of serialisation -- no extra delays when they are not needed.
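The "retry only when needed" idea can be sketched in C like this. It is not the attached patch: the names `wait_for_manager_exit`, `manager_present_fn` and `fake_manager_present` are hypothetical, and the check for an existing manager is left as a callback because the real test would have to ask the X server. The point of the sketch is that no delay is paid when the first check already finds the display free.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: returns true while another XSettings manager
 * still appears to be present on the display. */
typedef bool (*manager_present_fn)(void *user_data);

/* Check up to max_tries times, sleeping delay_ms between checks.
 * Returns true as soon as no manager is present, false if we gave up.
 * If the first check succeeds, no sleep happens at all. */
bool wait_for_manager_exit(manager_present_fn present, void *user_data,
                           int max_tries, long delay_ms)
{
    struct timespec delay;
    delay.tv_sec = delay_ms / 1000;
    delay.tv_nsec = (delay_ms % 1000) * 1000000L;

    for (int i = 0; i < max_tries; i++) {
        if (!present(user_data))
            return true;
        if (i + 1 < max_tries)
            nanosleep(&delay, NULL);    /* e.g. 100 ms between checks */
    }
    return false;
}

/* Stand-in callback for experiments: pretends the old manager goes
 * away after *remaining more checks. */
bool fake_manager_present(void *user_data)
{
    int *remaining = user_data;
    if (*remaining > 0) {
        (*remaining)--;
        return true;
    }
    return false;
}
```

With a 100 ms delay and a handful of tries, the worst case stays well under the one-second sleep used in the original workaround, while the common case costs a single check.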
Review of attachment 184538:
as for the patch, i looked at it earlier on irc and it seems fine. i might add a comment /* 100ms */ in the appropriate spot for clarity. rodrigo said that he intends to test the patch in ubuntu to seek feedback before pushing it upstream.
It also happens for Debian users sometimes. Thanks for the analysis.
Yes, already tested and confirmed fixed by several Ubuntu users: https://bugs.launchpad.net/gnome-settings-daemon/+bug/649809 (see the last few comments)
(In reply to comment #10)
> i think the issue is that even though g-s-d has been killed, X needs to notice
> this and update itself internally accordingly. that doesn't happen before X
> gets the new request.
That sounds very unlikely. This would be a tiny race.
I've requested a freeze exception for Rodrigo's patch: https://mail.gnome.org/archives/release-team/2011-March/msg00566.html
Ray: release team is asking for your opinion
We have 2/2 but I think there is no harm in waiting a couple of days for feedback from some gdm people before we proceed.
This proposed solution looks like a hack to me. It may address a symptom, but I imagine it is not solving the underlying problem. Since GDM was rewritten, numerous bugs have been exposed caused by the GDM welcome session not exiting cleanly and leaving things in bad state for the user session which is started after authentication. Bug #607658 is an example. I know that we have also had some problems on Solaris with stale Xatoms left behind by the GDM session causing odd behaviors in the user session.
It does not seem like we have a full understanding of all the issues and possible race conditions that could cause issues with GDM stopping the welcome session and starting the user session. So I anticipate that we will continue to find similar problems until we have a more robust solution.
That said, if this hack makes GNOME work better, I see no harm in it. This late in the release cycle, such a minimally intrusive fix may make the most sense. I recommend planning to remove this hack after doing the 3.0 release and working towards actually fixing the underlying problems properly in a more sane way. I should think we could fix the user session or the Xserver so that GDM can be provided with a more clear indication of when it is safe to go ahead and start the user session. Fixing this on a per-application basis (like this patch fixing gnome-settings-daemon), seems a temporary fix at best.
(In reply to comment #18)
> Fixing this on a per-application basis (like this patch fixing
> gnome-settings-daemon), seems a temporary fix at best.
While the fix certainly needs to be made a proper one (probably in the X server), I don’t think anything other than gnome-settings-daemon needs fixing. The processes are all guaranteed to have exited before we start the user session, and the only thing that is supposed to leave things in the X server is g-s-d.
(In reply to comment #14)
> (In reply to comment #10)
> > i think the issue is that even though g-s-d has been killed, X needs to notice
> > this and update itself internally accordingly. that doesn't happen before X
> > gets the new request.
>
> That sounds very unlikely. This would be a tiny race.
it's enough in fast machines, it seems
(In reply to comment #20)
> (In reply to comment #14)
> > (In reply to comment #10)
> > > i think the issue is that even though g-s-d has been killed, X needs to notice
> > > this and update itself internally accordingly. that doesn't happen before X
> > > gets the new request.
> >
> > That sounds very unlikely. This would be a tiny race.
>
> it's enough in fast machines, it seems
If the race is tiny, and it's indeed the problem, why do we wait for 0.1 seconds?
(In reply to comment #21)
> (In reply to comment #20)
> > (In reply to comment #14)
> > > (In reply to comment #10)
> > > > i think the issue is that even though g-s-d has been killed, X needs to notice
> > > > this and update itself internally accordingly. that doesn't happen before X
> > > > gets the new request.
> > >
> > > That sounds very unlikely. This would be a tiny race.
> >
> > it's enough in fast machines, it seems
>
> If the race is tiny, and it's indeed the problem, why do we wait for 0.1
> seconds?
the problem was fixed for people adding a sleep 1, so it's anything less than 1 second. If you have a better number, I'm ok with changing the patch, but I don't know exactly how many milliseconds it is.
Review of attachment 184538:
Got 2 approvals from r-t, so pushed this to master. Keeping the bug open though, so that we get a better fix for 3.0.1
Workaround committed, hence removing blocker flag.
Has the bug already been reported against the X server? Otherwise I can do it if no one steps in.
This is almost certainly not an X bug.
(In reply to comment #26)
> This is almost certainly not an X bug.
Where would it be then? I don’t see many possibilities here:
- a design error in the Xsettings specification;
- the Xsettings specification being wrongly implemented;
- the X server not providing accurate information.
All that's here is a race that occurs between the XSettings provider starting and the previous one exiting after it has been *requested* to exit. There could be very many things that delay this request from being turned into an actual exit. It could also be that it does exit, but X doesn't notice right away that the socket is dropped -- X can't do anything better in this case anyway since it's up to the OS to schedule it.
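The gap between "requested to exit" and "actually exited" is easy to see with plain POSIX calls. In the sketch below (an illustration only, with a made-up function name and a dummy child standing in for the old settings daemon), kill() returns as soon as the signal is queued, and a non-blocking waitpid() right afterwards normally still finds the child running while it performs its simulated cleanup.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Fork a child that takes cleanup_ms to exit after receiving SIGTERM.
 * Returns 1 if the child was still running right after kill() returned,
 * 0 if it had already exited, -1 on error. */
int still_running_after_sigterm(long cleanup_ms)
{
    sigset_t term, old;

    /* Block SIGTERM before fork() so the child can collect it with
     * sigwait() no matter how early the parent sends it. */
    sigemptyset(&term);
    sigaddset(&term, SIGTERM);
    if (sigprocmask(SIG_BLOCK, &term, &old) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {
        int sig;
        struct timespec cleanup;
        cleanup.tv_sec = cleanup_ms / 1000;
        cleanup.tv_nsec = (cleanup_ms % 1000) * 1000000L;
        sigwait(&term, &sig);           /* the exit *request* arrives */
        nanosleep(&cleanup, NULL);      /* simulated shutdown work */
        _exit(0);                       /* the actual exit */
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    if (pid == -1)
        return -1;
    if (kill(pid, SIGTERM) == -1)
        return -1;

    /* Non-blocking check: 0 means the child has not exited yet. */
    pid_t r = waitpid(pid, NULL, WNOHANG);
    int alive = (r == 0);

    /* Reap the child so no zombie is left behind. */
    while (r != pid) {
        r = waitpid(pid, NULL, 0);
        if (r == -1 && errno != EINTR)
            return -1;
    }
    return alive;
}
```

Anything that starts during that window and looks for state owned by the old process can still find it, which is the shape of the race described in this comment.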
(In reply to comment #28)
> All that's here is a race that occurs between the XSettings provider starting
> and the previous one exiting after it has been *requested* to exit.
I thought that the gnome-session process in the login session would exit only after g-s-d has exited. If not, that could be fixed in gnome-session.
so one possibility is gnome-session could be crashing after it tells all the clients to go away but before they actually do. If that happened then login would proceed before the clients finished exiting. desrt is going to try to set up a reproduction environment when he gets home and attach to the gnome-session process to see if it's crashing.
I'm sorry Ray -- I forgot that I don't have this bug on any of my machines anymore since I moved from Ubuntu to Debian on my home machine (which was the only one I ever experienced the problem with). Perhaps Rodrigo can help more.
Rodrigo, would you mind gdb attaching to gdm's gnome-session and trying to log in, reproduce, etc and see if gnome-session ends up exiting or crashing?
I also wonder, when this happens, if there's a message like this:
WARNING: Client '/org/gnome/SessionManager/ClientN' failed to reply before timeout
in /var/log/messages or /var/log/gdm/:0-greeter.log
I'm sorry, but I haven't been able to reproduce this problem at all. Lots of Ubuntu users were getting it, though (before the fix was done), so I'll try to get someone with enough knowledge to do the debugging.
So what do we do with this bug now? Closing as OBSOLETE?
Well, without anyone to reproduce there's not much we can do I guess. We could revert the workaround in an attempt to make the problem happen again, but I think we have bigger fish to fry. Let's WONTFIX for now, and revisit later if it ever comes up again.
I've been profiling g-s-d startup time. The make-it-work hack adds a whopping 400ms to the g-s-d startup time. Has any work been done to address the actual problem since 3.0 or are we destined to work around this forever? Thanks.
i'd say we should revert it for the 3.4 devel period, see if the problem comes back to a reproducible state, and if so, fix it.
Has this been reverted? 3.4 development is underway now...
Yes, already reverted in git master
okay, hopefully the bug will start happening again so we can fix it.
We completely nuked the d-bus autostart capability of gnome-settings-daemon, so it should now get started through gnome-session only. Do you still see this Ryan?
I haven't seen this in a long time... I stopped using that computer, though :)
(In reply to comment #43)
> I haven't seen this in a long time... I stopped using that computer, though :)
You should! It's fast! Let's close this then.