Bug 473480 - hangs when restarting login screen with new chosen language
Status: RESOLVED FIXED
Product: gdm
Classification: Core
Component: general
Version: 2.19.x
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: GDM maintainers
GDM maintainers
Depends on:
Blocks:
 
 
Reported: 2007-09-04 08:25 UTC by Peng Huang
Modified: 2007-09-28 20:08 UTC
See Also:
GNOME target: ---
GNOME version: 2.19/2.20


Attachments
The messages file from /var/log/ (111.40 KB, text/plain)
2007-09-05 05:45 UTC, Peng Huang
fix this problem (861 bytes, patch)
2007-09-18 06:36 UTC, Peng Huang
ack savedie request before saving and dying (845 bytes, patch)
2007-09-28 18:12 UTC, Ray Strode [halfline]

Description Peng Huang 2007-09-04 08:25:12 UTC
Please describe the problem:
After changing the language at the GDM login screen, a dialog asks "Do you
wish to restart the login screen with chosen Language". After pressing "Yes",
only the mouse pointer is shown; GDM hangs and does not restart. It only
starts again after being killed.

Steps to reproduce:

1. Change Language at GDM login screen
2. A dialog asks whether to restart the login screen
3. Press "Yes"

Actual results:
Only the mouse pointer is shown; the login screen does not come back

Expected results:
The GDM login screen should come back

Does this happen every time?
YES

Other information:
Comment 1 Brian Cameron 2007-09-04 21:25:16 UTC
What version of GDM are you using?  Note there was a similar problem in
2.19.4 that was fixed in 2.19.5.  If you are using 2.19.4 or earlier, I'd upgrade to the latest GDM.

If you are using the latest GDM, could you turn on debug by running gdmsetup and checking debug messages on the security tab.  Then restart gdm by running gdm-restart, and recreate the problem.

Then you should find a bunch of gdm related messages at the end of your syslog.  Please try and locate the messages that seem to relate to the failure, and attach them to this bug report.  The syslog can be found at /var/log/messages or /var/adm/messages, depending on your OS.
Comment 2 Peng Huang 2007-09-05 02:07:20 UTC
My gdm version is 2.19.6.
I found that gdm is blocked in the call gdm_fdgetc (greeter_fd_in) at line
5598 of slave.c. It seems that not all write ends of the pipe behind
greeter_fd_in were closed when gdmgreeter shut down, so gdm_fdgetc blocks
instead of returning EOF immediately. I guess it should close slave_fd_out
before calling gdm_fdgetc.
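
The pipe behaviour described here can be reproduced in isolation. The following is only a sketch in plain POSIX C, not GDM code: read_end_state and demo_leaked_write_end are hypothetical names, and a dup() of the write end stands in for whatever extra copy is still open. The read end is made non-blocking so the demo cannot hang the way gdm does.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: 1 if a read on fd reports EOF, 0 if it would block
 * because a writer still exists, -1 on anything else. */
static int read_end_state(int fd)
{
    char c;
    ssize_t n = read(fd, &c, 1);

    if (n == 0)
        return 1;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;
    return -1;
}

/* Returns 0 if the pipe behaves as described: no EOF while a second copy of
 * the write end is open, EOF as soon as that copy is closed. */
int demo_leaked_write_end(void)
{
    int fds[2];
    int flags, leaked, before, after;

    if (pipe(fds) < 0)
        return -1;

    flags = fcntl(fds[0], F_GETFL);
    leaked = dup(fds[1]);              /* stands in for the extra write end */
    if (flags < 0 || leaked < 0 ||
        fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        if (leaked >= 0)
            close(leaked);
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    close(fds[1]);                     /* the original writer goes away */
    before = read_end_state(fds[0]);   /* a writer remains: would block */
    close(leaked);                     /* last write end closed */
    after = read_end_state(fds[0]);    /* now EOF */
    close(fds[0]);

    return (before == 0 && after == 1) ? 0 : -1;
}
```

With a blocking read end, as in the slave, the first read would simply never return while the extra write end stays open.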
Comment 3 Brian Cameron 2007-09-05 05:17:55 UTC
It would be helpful if you were able to provide a GDM debug log so I can see what messages are going over the pipe.  This might help highlight what is causing the problem.

Do you have the face browser running or not?
Comment 4 Peng Huang 2007-09-05 05:45:03 UTC
Created attachment 94972 [details]
The messages file from /var/log/
Comment 5 Brian Cameron 2007-09-08 06:57:36 UTC
Unfortunately, the GDM debug messages don't include any information about the messages passed between the greeter and the slave via the stdin/stdout interface.

Could you perhaps add some gdm_debug lines to the gdm_slave_greeter_ctl function to print out what messages the slave is receiving?  It might also be useful to put some other gdm_debug lines in this function just to see what it's doing when it hangs.

Also, is the greeter also hanging?  If so, what is it doing?  Could you provide a stack trace of it when it hangs?
Comment 6 Peng Huang 2007-09-08 13:58:15 UTC
> My gdm version is 2.19.6.
> I found that gdm is blocked in the call gdm_fdgetc (greeter_fd_in) at line
> 5598 of slave.c. It seems that not all write ends of the pipe behind
> greeter_fd_in were closed when gdmgreeter shut down, so gdm_fdgetc blocks
> instead of returning EOF immediately. I guess it should close slave_fd_out
> before calling gdm_fdgetc.
> 
I found that the greeter has already exited while gdm is hanging. I guess the bug is caused by gdm holding a write end (slave_fd_out) of the pipe behind greeter_fd_in, so a read on greeter_fd_in cannot return EOF immediately.
Comment 7 Brian Cameron 2007-09-10 17:02:58 UTC
If the greeter has exited, then this is probably your bug.  It shouldn't exit.  Can you find out if it is crashing?  Can you log into a session via some other means?  If so, you could try setting DOING_GDM_DEVELOPMENT=1 and then run gdmgreeter.  Does it fail to start?  Does it print out any useful debug messages?

Comment 8 Peng Huang 2007-09-11 00:57:03 UTC
When gdm forks and starts the greeter process, it does not close the write end of the communication pipe. That causes gdm to block in the read system call on the read end of the pipe. That is the cause. I just checked the HEAD version in svn, and there the write end of the pipe is closed, so I think this bug should not happen there.
Comment 9 JP Rosevear 2007-09-12 17:49:34 UTC
We have this downstream as well in openSUSE 10.3 (GNOME 2.19.x)
https://bugzilla.novell.com/show_bug.cgi?id=308378
Comment 10 Brian Cameron 2007-09-12 18:28:53 UTC
Can you verify that this bug doesn't happen in the latest GDM?  If so, please close this bug, or let me know so I can close it.
Comment 11 Brady Anderson 2007-09-13 20:52:32 UTC
I have been attempting to fix this bug downstream in openSUSE 10.3 Beta 3.  I've tested with GDM head and GDM 2.19.8.  The bug exists in 2.19.7, 2.19.8 and head.  Brian, any tips for debugging the problem?  My current thinking is that the dup'd file descriptor owned by the daemon is not getting closed in the workflow.  I see the greeter hang when the user picks a new language from the greeter and then attempts to restart GDM.
Comment 12 Brian Cameron 2007-09-15 06:24:15 UTC
When you say it happens in 2.19.7 and 2.19.8 and head - do you mean to say it doesn't happen in 2.19.6?  If so, then this is really odd.  Looking at the code, the changes between 2.19.6 and 2.19.7 are as follows:

- Updated DTD to better reflect how gdmgreeter works.  This shouldn't affect
  GDM runtime at all.
- Some gdmgreeter xml theme improvements.  I'd be really surprised if this 
  affected runtime behavior.
- Fix some warnings (see bug #467401)
- Disable autocompletion in the face browser (bug #467335)

Perhaps you could back out these patches one at a time and see which one changes the behavior?  Also you might check if openSUSE is adding any patches that might be affecting your behavior.
Comment 13 Peng Huang 2007-09-17 06:46:56 UTC
I tested the HEAD version of gdm, and it has this problem. I checked the fds of all processes in /proc/$PID/fd/ and found that the process /usr/libexec/at-spi-registryd holds the write end of the pipe as its stdout. I attached gdb to this process and ran 'call close (1)', after which gdm resumed and started a new gdmgreeter. Do you have any idea what the process /usr/libexec/at-spi-registryd is?
Comment 14 Ray Strode [halfline] 2007-09-17 15:08:36 UTC
I haven't really read this bug closely (just going through my morning bug routine), but are we just missing a fcntl call to set close-on-exec somewhere?
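
For reference, marking a descriptor close-on-exec with fcntl usually looks like the sketch below. set_cloexec is a hypothetical helper name, not something from the GDM tree, and whether this is the right fix for the descriptor in question is not established at this point in the thread.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: mark fd close-on-exec so it is not inherited across
 * exec. Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags < 0)
        return -1;
    /* Preserve any descriptor flags that are already set. */
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) < 0 ? -1 : 0;
}
```

The flag only takes effect at exec time; a child that forks without exec'ing still keeps its copy of the descriptor.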
Comment 15 Brian Cameron 2007-09-17 17:12:24 UTC
at-spi-registryd is the process that is used to enable accessibility.  It should only be started if GDM has accessibility turned on in the GDM configuration.  I'm a bit confused that at-spi-registryd would want to talk with the daemon directly.    My understanding is that the greeter launches at-spi-registryd when needed.
Comment 16 Ray Strode [halfline] 2007-09-17 17:17:34 UTC
Right, that's why i'm wondering if the fd is just getting leaked across the fork() used to launch at-spi-registryd.
Comment 17 Ray Strode [halfline] 2007-09-17 17:28:22 UTC
Oh, so what's probably going on is the greeter uses its stdout for IPC with the slave instead of for console i/o.  I guess gdm manually runs at-spi-registryd, and presumably uses g_spawn_* to do it.

g_spawn_* won't close fd 1 before the exec, because it assumes the child will want to be able to write to the console.  In this case, fd 1 isn't hooked up to the console, so it would be really bad if the child (at-spi-registryd) did try to talk over the fd.

If the above guess is right then the fix is probably to do something like 

fd = open("/dev/null", O_RDWR);
dup2(fd, STDIN_FILENO);
dup2(fd, STDOUT_FILENO);
close(fd);

in a child setup function passed to the g_spawn call.  Do you want to take a look at doing this, Brady?  If not, I can do it, but not for a few days.
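
Fleshed out with error handling, and using plain fork() instead of g_spawn_* so it stands alone, the idea might look like the sketch below. redirect_stdio_to_dev_null and demo_child_releases_pipe are hypothetical names; the child simulates a helper whose stdout starts out as the IPC pipe, and pause() stands in for the exec'd long-running program.

```c
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Point stdin and stdout at /dev/null. Meant to run in the child between
 * fork and exec. Returns 0 on success, -1 on error. */
int redirect_stdio_to_dev_null(void)
{
    int fd = open("/dev/null", O_RDWR);
    int rc = 0;

    if (fd < 0)
        return -1;
    if (dup2(fd, STDIN_FILENO) < 0 || dup2(fd, STDOUT_FILENO) < 0)
        rc = -1;
    if (fd > STDOUT_FILENO)   /* do not close fd if it already is 0 or 1 */
        close(fd);
    return rc;
}

/* Returns 0 if the parent sees EOF on the pipe while the child is still
 * alive, i.e. the redirect made the child let go of the write end. */
int demo_child_releases_pipe(void)
{
    int fds[2], status;
    char c;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: begin with the pipe as stdout, as the helper would. */
        if (dup2(fds[1], STDOUT_FILENO) < 0)
            _exit(1);
        close(fds[0]);
        close(fds[1]);
        if (redirect_stdio_to_dev_null() < 0)
            _exit(1);
        for (;;)
            pause();          /* stands in for the exec'd helper */
    }

    close(fds[1]);
    n = read(fds[0], &c, 1);  /* returns 0 once no write end is left */
    close(fds[0]);
    kill(pid, SIGKILL);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    /* Killed by our signal means the child was still running at EOF time. */
    return (n == 0 && WIFSIGNALED(status)) ? 0 : -1;
}
```

Without the redirect, the read in the parent would block for as long as the child lives, which matches the hang seen with at-spi-registryd.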
Comment 18 Brian Cameron 2007-09-17 17:42:30 UTC
Thanks, Ray.  If you were able to look into this that would be great.
Comment 19 Ray Strode [halfline] 2007-09-17 17:57:45 UTC
Yeah, I'll definitely look into it if Brady doesn't.  If Brady's looking into it like he mentioned in comment 11, then I'll stand by.
Comment 20 Peng Huang 2007-09-18 06:36:55 UTC
Created attachment 95776 [details] [review]
fix this problem

With the flags G_SPAWN_STDOUT_TO_DEV_NULL and G_SPAWN_STDERR_TO_DEV_NULL, g_spawn_* redirects the child's stdout and stderr to /dev/null before exec, so the child no longer holds the pipe. This patch fixes the problem.
Comment 21 Ray Strode [halfline] 2007-09-18 13:56:50 UTC
ah even better! I forgot about those flags.
Comment 22 Ray Strode [halfline] 2007-09-18 13:58:12 UTC
Huang, have you tested the patch and found it fixes your original problem?
Comment 23 Peng Huang 2007-09-18 14:23:50 UTC
yeah. It can fix this problem.
Comment 24 Brian Cameron 2007-09-18 18:41:10 UTC
Thanks.  I put this patch into 2.20 and SVN head.
Comment 25 Ray Strode [halfline] 2007-09-28 18:12:43 UTC
Created attachment 96339 [details] [review]
ack savedie request before saving and dying

So there was a downstream Fedora report for this bug, too, and the above patch wasn't enough to fix the problem.  

I investigated, and the slave was stuck waiting for a response from the greeter that never came because the greeter died without sending it.

This behavior wasn't noticeable upstream, because when the greeter exited, the pipe the slave was waiting for a response on closed, and it stopped waiting because of the hangup.  We ship the reset-pam patch mentioned in another bug report that prevents a hangup from happening.
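
The ordering the patch title describes (acknowledge first, then save and die) can be sketched with two pipes and a forked stand-in for the greeter. This is an illustration only: REQ_SAVEDIE and ACK are made-up byte values, not GDM's real protocol opcodes, and greeter_stand_in and demo_ack_before_exit are hypothetical names.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define REQ_SAVEDIE 'S'   /* hypothetical request byte, not GDM's opcode */
#define ACK         'A'   /* hypothetical acknowledgement byte */

/* Greeter stand-in: acknowledge the request first, then exit. */
static void greeter_stand_in(int from_slave, int to_slave)
{
    char req;

    if (read(from_slave, &req, 1) == 1 && req == REQ_SAVEDIE) {
        char ack = ACK;
        if (write(to_slave, &ack, 1) != 1)
            _exit(1);
        /* Saving state would happen here, after the ack is on the wire. */
        _exit(0);
    }
    _exit(1);
}

/* Slave stand-in: returns 0 if the ack arrived and the child exited cleanly. */
int demo_ack_before_exit(void)
{
    int to_greeter[2], to_slave[2], status, ok;
    char req = REQ_SAVEDIE, reply = 0;
    pid_t pid;

    if (pipe(to_greeter) < 0)
        return -1;
    if (pipe(to_slave) < 0) {
        close(to_greeter[0]);
        close(to_greeter[1]);
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(to_greeter[0]);
        close(to_greeter[1]);
        close(to_slave[0]);
        close(to_slave[1]);
        return -1;
    }
    if (pid == 0) {
        close(to_greeter[1]);
        close(to_slave[0]);
        greeter_stand_in(to_greeter[0], to_slave[1]);
    }

    close(to_greeter[0]);
    close(to_slave[1]);
    ok = write(to_greeter[1], &req, 1) == 1 &&
         read(to_slave[0], &reply, 1) == 1 &&
         reply == ACK;
    close(to_greeter[1]);
    close(to_slave[0]);
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (ok && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Because the reply is written before the process exits, the waiting side gets its answer even in a setup where the exit itself does not produce a hangup on the pipe.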
Comment 26 Brian Cameron 2007-09-28 20:08:11 UTC
Thanks, patch applied to 2.20 and head branches.