GNOME Bugzilla – Bug 473480
hangs when restarting login screen with new chosen language
Last modified: 2007-09-28 20:08:11 UTC
Please describe the problem:
After changing the language in GDM, a dialog asks "Do you wish to restart the login screen with chosen Language". After pressing "Yes", only the mouse pointer is shown; GDM hangs and does not restart. It starts again only after being killed.

Steps to reproduce:
1. Change the language at the GDM login screen.
2. A dialog offers to restart the login screen.
3. Press "Yes".

Actual results:
Only the mouse pointer is shown; the login screen does not come back.

Expected results:
The GDM login screen should come back.

Does this happen every time?
Yes

Other information:
What version of GDM are you using? Note there was a similar problem in 2.19.4 that was fixed in 2.19.5. If you are using 2.19.4 or earlier, I'd upgrade to the latest GDM. If you are using the latest GDM, could you turn on debug by running gdmsetup and checking debug messages on the security tab? Then restart gdm by running gdm-restart, and recreate the problem. You should then find a bunch of gdm-related messages at the end of your syslog. Please try to locate the messages that seem to relate to the failure, and attach them to this bug report. The syslog can be found at /var/log/messages or /var/adm/messages, depending on your OS.
My gdm version is 2.19.6. I found that gdm is blocked in the call to gdm_fdgetc (greeter_fd_in) at line 5598 of slave.c. It seems that not all write ends of the pipe behind greeter_fd_in were closed when gdmgreeter shut down, so gdm_fdgetc blocks instead of returning EOF immediately. I guess it should close slave_fd_out before calling gdm_fdgetc.
It would be helpful if you were able to provide a GDM debug log so I can see what messages are going over the pipe. This might help highlight what is causing the problem. Do you have the face browser running or not?
Created attachment 94972 [details]
The messages file from /var/log/
Unfortunately, the GDM debug messages don't include any information about the messages passed between the greeter and the slave via the stdin/stdout interface. Could you perhaps add some gdm_debug lines to the gdm_slave_greeter_ctl function to print out what messages the slave is receiving? It might also be useful to put some other gdm_debug lines in this function just to see what it's doing when it hangs. Also, is the greeter hanging as well? If so, what is it doing? Could you provide a stack trace of it when it hangs?
> My gdm version is 2.19.6.
> I found that gdm is blocked in the call to gdm_fdgetc (greeter_fd_in) at
> line 5598 of slave.c. It seems that not all write ends of the pipe behind
> greeter_fd_in were closed when gdmgreeter shut down, so gdm_fdgetc blocks
> instead of returning EOF immediately. I guess it should close slave_fd_out
> before calling gdm_fdgetc.

I found that the greeter has already exited while gdm is hanging. I guess the bug is caused by gdm itself holding a write end (slave_fd_out) of the pipe behind greeter_fd_in, so a read on greeter_fd_in cannot return EOF immediately.
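The pipe behaviour described in this comment can be reproduced outside GDM. The following is a minimal sketch in plain POSIX C, not GDM code: the function name `pipe_read_status` and the enum are hypothetical, and the read end is made non-blocking only so that the "would block forever" case shows up as EAGAIN instead of an actual hang.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum pipe_status { PIPE_EOF, PIPE_WOULD_BLOCK, PIPE_ERROR };

/*
 * Create an empty pipe and try to read one byte from it.
 * If keep_writer_open is zero, every write end is closed first and
 * read() reports EOF. If it is non-zero, one write end stays open in
 * this process, so there is no EOF; a blocking read would wait forever.
 */
static enum pipe_status pipe_read_status(int keep_writer_open)
{
    int fds[2];
    if (pipe(fds) != 0)
        return PIPE_ERROR;

    /* Non-blocking read end, so the stuck case returns EAGAIN. */
    int flags = fcntl(fds[0], F_GETFL);
    if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fds[0]);
        close(fds[1]);
        return PIPE_ERROR;
    }

    if (!keep_writer_open)
        close(fds[1]);

    char c;
    ssize_t n = read(fds[0], &c, 1);
    int saved_errno = errno;

    enum pipe_status status = PIPE_ERROR;
    if (n == 0)
        status = PIPE_EOF;
    else if (n < 0 && (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK))
        status = PIPE_WOULD_BLOCK;

    close(fds[0]);
    if (keep_writer_open)
        close(fds[1]);
    return status;
}
```

Calling `pipe_read_status(0)` models the case where all writers are gone, and `pipe_read_status(1)` models a process that still holds a write end, which is the situation suspected here.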
If the greeter has exited, then this is probably your bug. It shouldn't exit. Can you find out if it is crashing? Can you log into a session via some other means? If so, you could try setting DOING_GDM_DEVELOPMENT=1 and then run gdmgreeter. Does it fail to start? Does it print out any useful debug messages?
When gdm forks and starts the greeter process, it does not close the write end of the communication pipe. This leaves gdm blocked in the read system call on the read end of that pipe. That is the cause. I just checked the HEAD version in svn, and there the write end of the pipe is closed, so I think this bug should not happen with HEAD.
We have this downstream as well in openSUSE 10.3 (GNOME 2.19.x) https://bugzilla.novell.com/show_bug.cgi?id=308378
Can you verify that this bug doesn't happen in the latest GDM? If so, please close this bug, or let me know so I can close it.
I have been attempting to fix this bug downstream in openSUSE 10.3 Beta 3. I've tested with GDM head and GDM 2.19.8. The bug exists in 2.19.7, 2.18.8 and head. Brian, any tips for debugging the problem? My current thinking is that the dup'd file descriptor owned by the daemon is not getting closed in the workflow. I see the greeter hang when the user picks a new language from the greeter and then attempts to restart GDM.
When you say it happens in 2.19.7 and 2.19.8 and head, do you mean to say it doesn't happen in 2.19.6? If so, then this is really odd. Looking at the code, the changes between 2.19.6 and 2.19.7 are as follows:

- Updated DTD to better reflect how gdmgreeter works. This shouldn't affect GDM runtime at all.
- Some gdmgreeter xml theme improvements. I'd be really surprised if this affected runtime behavior.
- Fix some warnings (see bug #467401)
- Disable autocompletion in the face browser (bug #467335)

Perhaps you could back out these patches one at a time and see which one changes the behavior? Also, you might check whether openSUSE is adding any patches that might be affecting the behavior.
I tested the HEAD version of gdm, and it has this problem. I checked the fds of all processes in /proc/$PID/fd/ and found that the process /usr/libexec/at-spi-registryd holds the write end of the pipe as its stdout. I attached gdb to this process and ran 'call close (1)'; gdm then resumed and started a new gdmgreeter. Do you have any idea what the process /usr/libexec/at-spi-registryd is?
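The /proc check described in this comment can also be done programmatically. This is a rough, Linux-only sketch, assuming /proc is mounted and the caller may read the target process's fd directory; `fd_target` and `same_pipe` are hypothetical helper names, not part of GDM. On Linux both ends of a pipe show up as the same `pipe:[inode]` link target, which is what makes the comparison work.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the symlink /proc/<pid>/fd/<fd> into buf as a NUL-terminated
 * string. Returns 0 on success, -1 on failure. Linux-specific.
 */
static int fd_target(pid_t pid, int fd, char *buf, size_t len)
{
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/fd/%d", (long)pid, fd);

    ssize_t n = readlink(path, buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}

/*
 * Returns 1 if the two descriptors refer to the same pipe, 0 if they
 * do not, and -1 if either link could not be read.
 */
static int same_pipe(pid_t pid_a, int fd_a, pid_t pid_b, int fd_b)
{
    char a[128], b[128];
    if (fd_target(pid_a, fd_a, a, sizeof a) != 0 ||
        fd_target(pid_b, fd_b, b, sizeof b) != 0)
        return -1;
    if (strncmp(a, "pipe:[", 6) != 0)
        return 0;               /* first descriptor is not a pipe */
    return strcmp(a, b) == 0;
}
```

For the situation in this report, one would compare the slave's read descriptor with fd 1 of each suspect process, for example `same_pipe(slave_pid, greeter_fd_in, suspect_pid, 1)`, where the pid variables are placeholders.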
I haven't really read this bug closely (just going through my morning bug routine), but are we just missing a fcntl call to set close-on-exec somewhere?
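For reference, setting close-on-exec on a descriptor looks roughly like the sketch below. `set_cloexec` is a hypothetical helper name, and whether this is the right fix depends on which descriptor is actually leaking; it only helps for descriptors the exec'd program is not supposed to inherit.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Mark fd so that it is closed automatically across exec().
 * Returns 0 on success, -1 on failure (for example, an invalid fd).
 */
static int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    /* Preserve any other descriptor flags that are already set. */
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

The usual pattern is to call this right after the descriptor is created, before any fork() that might lead to an exec().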
at-spi-registryd is the process that is used to enable accessibility. It should only be started if GDM has accessibility turned on in the GDM configuration. I'm a bit confused that at-spi-registryd would want to talk with the daemon directly. My understanding is that the greeter launches at-spi-registryd when needed.
Right, that's why I'm wondering if the fd is just getting leaked across the fork() used to launch at-spi-registryd.
Oh, so what's probably going on is that the greeter uses its stdout for IPC with the slave instead of for console I/O. I guess gdm manually runs at-spi-registryd, and presumably uses g_spawn_* to do it. g_spawn_* won't close fd 1 before the exec, because it assumes the child will want to be able to write to the console. In this case, fd 1 isn't hooked up to the console, so it would be really bad if the child (at-spi-registryd) did try to talk over the fd. If the above guess is right, then the fix is probably to do something like

    fd = open("/dev/null", O_RDWR);
    dup2(fd, STDIN_FILENO);
    dup2(fd, STDOUT_FILENO);
    close(fd);

in a child setup function passed to the g_spawn call. Do you want to take a look at doing this, Brady? If not, I can do it, but not for a few days.
Thanks, Ray. If you were able to look into this that would be great.
Yeah, I'll definitely look into it if Brady doesn't. If Brady's looking into it as he mentioned in comment 11, then I'll stand by.
Created attachment 95776 [details] [review]
fix this problem

With the flags G_SPAWN_STDOUT_TO_DEV_NULL and G_SPAWN_STDERR_TO_DEV_NULL, g_spawn_* redirects the child's stdout and stderr to /dev/null before the exec, so the child no longer inherits the pipe. This patch fixes the problem.
ah even better! I forgot about those flags.
Huang, have you tested the patch and found it fixes your original problem?
yeah. It can fix this problem.
Thanks. I put this patch into 2.20 and SVN head.
Created attachment 96339 [details] [review]
ack savedie request before saving and dying

So there was a downstream Fedora report for this bug, too, and the above patch wasn't enough to fix the problem. I investigated, and the slave was stuck waiting for a response from the greeter that never came, because the greeter died without sending it. This behavior wasn't noticeable upstream because, when the greeter exited, the pipe the slave was waiting on closed, and the hangup made the slave stop waiting. We ship the reset-pam patch mentioned in another bug report, and that patch prevents the hangup from happening.
Thanks, patch applied to 2.20 and head branches.