GNOME Bugzilla – Bug 427022
greeter hangs
Last modified: 2007-04-12 07:49:54 UTC
With SVN head the greeter hangs before it appears. Only recently.
Created attachment 85918 [details] bt of greeter
Created attachment 85919 [details] bt of slave
First off, it looks like there was a bad svn merge that caused an extra copy of some code to be added: http://svn.gnome.org/viewcvs/gdm2/trunk/daemon/slave.c?r1=4738&r2=4762 However, when that extra code is removed it still hangs at the other occurance of that function:
+ Trace 125576
Which was added here: http://svn.gnome.org/viewcvs/gdm2/trunk/daemon/slave.c?view=diff&r1=4737&r2=4738 If I comment out that section it starts fine.
Created attachment 85923 [details] [review] just #if out the offending code I removed the duplicate code from svn. This just #if 0 out the code that hangs.
Caused by: http://svn.gnome.org/viewcvs/gdm2?view=rev&revision=4738 From bug #108820
Thanks for catching this. I've tested the functionality fixed by bug #108820 and it still works great after your change. I also committed some fixes to fix the remaining bugs in bug #108820, so I think that bug is also now finally closed. I'm marking this bug as fixed.
I didn't commit the patch yet. I only removed the obviously duplicate code. So, this bug isn't fixed. I haven't been following bug #108820 so I can't be sure that this is the correct fix... Takao, what do you think?
Okay. I'm confused here. Probably because the attachment in comment #4 above seems to be messed up somehow. It doesn't seem to relate to GDM code at all. Could you provide a patch of what you think should change, and I'll review it with Takao and verify that it works okay?
Created attachment 86161 [details] [review] just #if out the offending code Wow, not sure what happened there. I assume that bugzilla messed up since I don't think that attachment came from my computer and it isn't even a patch... Here's the patch I was referring to.
OK, I guess that slave sent GDM_ALWAYS_RESTART but greeter does not send STX, then gdm_slave_greeter_ctl() is loop in gdm_fdgetc(). E.g. does anybody add printf() debug codes besides my patch? I had a similar experience when I had debuged the code. We need to use gdm_error() for slave and gdm_common_error() for greeter.
William - I don't see this problem on Solaris. Let me explain what this code is trying to do: 1. When you change a language GDM now asks you if you want to restart the GUI in the new language. If you pick "Yes", then the GUI tells the daemon to set always_restart to TRUE. Then the daemon restarts the GUI. 2. If the user then changes the language again, it will restart again. So, when the daemon starts the GUI, it tells it the current state via GDM_ALWAYS_RESTART, which should cause the GUI to call gdm_lang_op_always_restart in gui/gdmlanguages.c. This function sets the state and sends back a STX to indicate that it got the message. The code looks correct to me, and since it works on Solaris, I think something odd may be going on. Could you perhaps do as Takao suggests and see if adding debug explains anything? Is something in the greeter causing gdm_lang_op_always_restart to not get called, or perhaps there is a race condition that is causing the daemon to get confused?
As I was digging into bug #428630, I noticed that I could reproduce the hang. It only happens when the Face Browser is turned on. The problem is that you *have* to call run_pictures () before using the socket for other purposes. We currently send picture data over the pipe because the root user has better permissions for reading files from user's $HOME directory than the gdm user (which is used for running the actual GUI's). There has been some debate about whether it'd be better to rip out this code and make the GUI's access face images, and expect user's to set file permissions properly for this to work to get rid of this uglyness (and also to avoid using the socket for so much data - it's probably slow). I'd entertain discussion on this, or finding a better way to send the image data to the GUI's. That said, this hanging bug is fixed. Now the code is fixed so run_pictures is called before the code you #ifdef'ed out. Should now work without you needing to ifdef out anything.