After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 427022 - greeter hangs
greeter hangs
Status: RESOLVED FIXED
Product: gdm
Classification: Core
Component: general
unspecified
Other Linux
: Normal normal
: ---
Assigned To: GDM maintainers
GDM maintainers
Depends on:
Blocks:
 
 
Reported: 2007-04-06 19:10 UTC by William Jon McCann
Modified: 2007-04-12 07:49 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
bt of greeter (684 bytes, text/plain)
2007-04-06 19:16 UTC, William Jon McCann
  Details
bt of slave (749 bytes, text/plain)
2007-04-06 19:17 UTC, William Jon McCann
  Details
just #if out the offending code (287.99 KB, patch)
2007-04-06 19:52 UTC, William Jon McCann
none Details | Review
just #if out the offending code (696 bytes, patch)
2007-04-11 12:13 UTC, William Jon McCann
none Details | Review

Description William Jon McCann 2007-04-06 19:10:31 UTC
With SVN head the greeter hangs before it appears.  Only recently.
Comment 1 William Jon McCann 2007-04-06 19:16:58 UTC
Created attachment 85918 [details]
bt of greeter
Comment 2 William Jon McCann 2007-04-06 19:17:20 UTC
Created attachment 85919 [details]
bt of slave
Comment 3 William Jon McCann 2007-04-06 19:43:09 UTC
First off, it looks like there was a bad svn merge that caused an extra copy of some code to be added:
http://svn.gnome.org/viewcvs/gdm2/trunk/daemon/slave.c?r1=4738&r2=4762

However, when that extra code is removed it still hangs at the other occurance of that function:

  • #0 __kernel_vsyscall
  • #1 __read_nocancel
    from /lib/libc.so.6
  • #2 gdm_fdgetc
    at misc.c line 1079
  • #3 gdm_slave_greeter_ctl
    at slave.c line 5203
  • #4 gdm_slave_greeter_ctl_no_ret
    at slave.c line 5228
  • #5 gdm_slave_greeter
    at slave.c line 2784
  • #6 gdm_slave_run
    at slave.c line 1556
  • #7 gdm_slave_start
    at slave.c line 890
  • #8 gdm_display_manage
    at display.c line 324
  • #9 gdm_start_first_unborn_local
    at gdm.c line 259

Which was added here:
http://svn.gnome.org/viewcvs/gdm2/trunk/daemon/slave.c?view=diff&r1=4737&r2=4738

If I comment out that section it starts fine.
Comment 4 William Jon McCann 2007-04-06 19:52:47 UTC
Created attachment 85923 [details] [review]
just #if out the offending code

I removed the duplicate code from svn.  This just #if 0 out the code that hangs.
Comment 5 William Jon McCann 2007-04-06 19:55:47 UTC
Caused by:
http://svn.gnome.org/viewcvs/gdm2?view=rev&revision=4738

From bug #108820
Comment 6 Brian Cameron 2007-04-10 03:58:12 UTC
Thanks for catching this.  I've tested the functionality fixed by bug #108820 and it still works great after your change.  I also committed some fixes to fix the remaining bugs in bug #108820, so I think that bug is also now finally closed.

I'm marking this bug as fixed.
Comment 7 William Jon McCann 2007-04-10 13:43:05 UTC
I didn't commit the patch yet.  I only removed the obviously duplicate code.  So, this bug isn't fixed.

I haven't been following bug #108820 so I can't be sure that this is the correct fix... Takao, what do you think?
Comment 8 Brian Cameron 2007-04-11 07:28:30 UTC
Okay.  I'm confused here.  Probably because the attachment in comment #4 above seems to be messed up somehow.  It doesn't seem to relate to GDM code at all.

Could you provide a patch of what you think should change, and I'll review it with Takao and verify that it works okay?

Comment 9 William Jon McCann 2007-04-11 12:13:58 UTC
Created attachment 86161 [details] [review]
just #if out the offending code

Wow, not sure what happened there.  I assume that bugzilla messed up since I don't think that attachment came from my computer and it isn't even a patch...

Here's the patch I was referring to.
Comment 10 Takao Fujiwara 2007-04-11 15:26:38 UTC
OK, I guess that slave sent GDM_ALWAYS_RESTART but greeter does not send 
STX, then gdm_slave_greeter_ctl() is loop in gdm_fdgetc().

E.g. does anybody add printf() debug codes besides my patch? I had a similar experience when I had debuged the code.
We need to use gdm_error() for slave and gdm_common_error() for greeter.

Comment 11 Brian Cameron 2007-04-12 03:18:12 UTC
William - I don't see this problem on Solaris.  Let me explain what this code is trying to do:

1. When you change a language GDM now asks you if you want to restart the 
   GUI in the new language.  If you pick "Yes", then the GUI tells the
   daemon to set always_restart to TRUE.  Then the daemon restarts the GUI.
2. If the user then changes the language again, it will restart again.

So, when the daemon starts the GUI, it tells it the current state via GDM_ALWAYS_RESTART, which should cause the GUI to call gdm_lang_op_always_restart in gui/gdmlanguages.c.  This function sets the state and sends back a STX to indicate that it got the message.

The code looks correct to me, and since it works on Solaris, I think something odd may be going on.  Could you perhaps do as Takao suggests and see if adding debug explains anything?  Is something in the greeter causing gdm_lang_op_always_restart to not get called, or perhaps there is a race condition that is causing the daemon to get confused?


Comment 12 Brian Cameron 2007-04-12 07:49:54 UTC
As I was digging into bug #428630, I noticed that I could reproduce the hang.  It only happens when the Face Browser is turned on.

The problem is that you *have* to call run_pictures () before using the socket for other purposes.  We currently send picture data over the pipe because the root user has better permissions for reading files from user's $HOME directory than the gdm user (which is used for running the actual GUI's).

There has been some debate about whether it'd be better to rip out this code and make the GUI's access face images, and expect user's to set file permissions properly for this to work to get rid of this uglyness (and also to avoid using the socket for so much data - it's probably slow).  I'd entertain discussion on this, or finding a better way to send the image data to the GUI's.

That said, this hanging bug is fixed.  Now the code is fixed so run_pictures is called before the code you #ifdef'ed out.  Should now work without you needing to ifdef out anything.