Bug 152906 - PostSession not evaluated upon client disconnects
Status: RESOLVED FIXED
Product: gdm
Classification: Core
Component: general
Version: 2.6.0.x
Hardware: Other Linux
Importance: Urgent blocker
Target Milestone: ---
Assigned To: GDM maintainers
QA Contact: GDM maintainers
Depends on:
Blocks:
Reported: 2004-09-17 13:21 UTC by tmp
Modified: 2005-10-06 19:53 UTC
See Also:
GNOME target: ---
GNOME version: 2.5/2.6


Attachments
patch fixing the problem (2.65 KB, patch)
2005-10-06 19:51 UTC, Brian Cameron
committed

Description tmp 2004-09-17 13:21:21 UTC
Because the following potential bug prevents us from using gdm in our
terminal/server environment, I've marked it as "blocker" and "urgent";
we really want to use gdm as soon as this is fixed.

Context: A client/server environment

1) For testing purposes, add "touch /tmp/postsession_evaluated" in
PostSession/Default at the server

2) Connect to gdm at the server through xdmcp from a client

3) Login as a user (from the client but on the server)

4) Pull out the power cable on the client (or simply kill the X server by
pressing Ctrl-Alt-Backspace)

5) Notice that the PostSession script is NOT getting evaluated - even if
gdm.conf contains a PingIntervalSeconds = 15

If, instead of pulling the power cable in 4), you just log out nicely, the
PostSession script IS evaluated.
Comment 1 Daniel Hedblom 2005-02-21 10:46:35 UTC
Can confirm this behaviour. Very annoying with 300+ Terminal Server users. This
is also happening on desktop systems. It makes them behave "Windows like", i.e.
the user reboots because the processes don't die on login/logout.
Comment 2 Brian Cameron 2005-04-12 17:22:45 UTC
I've looked at this problem a bit, and at first glance it looked like the logic
should work.  The ping alarm callback notices that a session is started and uses
setjmp/longjmp to call functions that should clear the session.  But then I
started looking more into how setjmp/longjmp work, and I think that this might
be causing the problem you report.  My schedule is getting a bit tight, so it
might take me a while to fix this.  But if you have some time to help test some
things, that could speed getting a fix into CVS.

Here is my analysis so far...

Looking at daemon/slave.c, it seems that when the XDMCP client drops the
connection, GDM2 finds out about it in gdm_slave_alrm_handler.  slave.c sets up
a jump via setjmp() in gdm_slave_start, and gdm_slave_alrm_handler ends up
calling longjmp with JMP_SESSION_STOP_AND_QUIT if session_started is TRUE.
The same logic of using JMP_SESSION_STOP_AND_QUIT is used in
gdm_slave_term_handler for TERM/INT signal handling and in the function
gdm_slave_xioerror_handler.  I suspect that these are also broken.  It would
be interesting to test whether the PostSession script also fails to get
evaluated if the daemon is sent a TERM/INT signal.

Reading the setjmp manpage, it seems that this function saves the stack and
the registers off to the side and when longjmp is called, it replaces the
stack and registers with the values they had when setjmp was called and
also sets the program counter so that the program thinks that the call to
setjmp just returned with the return value as specified (in this case
JMP_SESSION_STOP_AND_QUIT).
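
For readers unfamiliar with the mechanism described above, here is a minimal standalone sketch of a setjmp/longjmp round trip. It is not GDM code: demo_env, demo_handler, demo_setjmp_roundtrip and DEMO_JMP_STOP_AND_QUIT are hypothetical names chosen for illustration, and the "handler" is an ordinary function rather than a real signal handler.

```c
#include <setjmp.h>

/* Hypothetical jump code; GDM's real constants live in daemon/slave.c. */
enum { DEMO_JMP_STOP_AND_QUIT = 2 };

static jmp_buf demo_env;

/* Stands in for a handler that bails out by jumping back to setjmp(). */
static void demo_handler(void)
{
    longjmp(demo_env, DEMO_JMP_STOP_AND_QUIT);
}

/* Returns the value setjmp() yields when it "returns" the second time,
 * or -1 if control ends up somewhere unexpected. */
int demo_setjmp_roundtrip(void)
{
    switch (setjmp(demo_env)) {
    case 0:
        /* First, direct return from setjmp(): do the normal work. */
        demo_handler();          /* never returns */
        return -1;
    case DEMO_JMP_STOP_AND_QUIT:
        /* Second return, caused by longjmp() in demo_handler(). */
        return DEMO_JMP_STOP_AND_QUIT;
    default:
        return -1;
    }
}
```

Calling demo_setjmp_roundtrip() yields DEMO_JMP_STOP_AND_QUIT: setjmp first returns 0, then appears to return again with the value handed to longjmp. setjmp is used directly as the switch expression because the C standard only permits it in a few such contexts.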

So then the logic in gdm_slave_start calls term_session_stop_and_quit.  But
since the setjmp function was called before gdm_slave_run was started
(which calls gdm_slave_session_start, which sets the session_started global),
doesn't this mean that the pc and stack get reset so the daemon doesn't
know anything about the session anymore?  Not only would session_started
be back to FALSE, but the "d" structure probably isn't set up either.
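
The question above can be checked in isolation. The sketch below (hypothetical names, not GDM code) sets a global flag after setjmp has been called and then jumps back. Under the C standard, objects with static storage duration keep whatever value they had at the time of the longjmp call; only non-volatile automatic variables of the function that called setjmp, if modified in between, become indeterminate.

```c
#include <setjmp.h>
#include <stdbool.h>

static jmp_buf demo_start_env;
static bool demo_session_started = false;   /* stand-in for a global flag */

/* Stands in for the code that starts a session and is later interrupted. */
static void demo_run_session(void)
{
    demo_session_started = true;    /* modified AFTER setjmp() was called */
    longjmp(demo_start_env, 1);     /* jump back, as a handler would */
}

/* Returns the value of the global flag as seen after the longjmp. */
bool demo_flag_after_longjmp(void)
{
    demo_session_started = false;
    if (setjmp(demo_start_env) == 0) {
        demo_run_session();         /* never returns */
    }
    /* Reached only via longjmp(): the global still holds its new value. */
    return demo_session_started;
}
```

demo_flag_after_longjmp() returns true, which suggests that a global such as session_started would not be reset merely because longjmp was called.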

I'm not sure my analysis here is completely on-base, but we could try testing
this theory by removing the setjmp/longjmp logic and just calling
term_session_stop_and_quit directly from gdm_slave_alrm_handler.  I suspect that
this would fix the problem.  A similar change might be necessary in the other
areas where this is being used (like the TERM/INT handler).  

I'm not exactly clear why setjmp/longjmp is being used here.  I've pinged George
to ask him if there might be a reason that he's aware of why the setjmp/longjmp
logic is being used.
Comment 3 Brian Cameron 2005-08-15 19:06:37 UTC
From the gdm@sunsite.dk mail alias, some discussion relating to this bug.

> George Lebl said
>> Brian Cameron said:
>>
>> I believe in the situation you describe, the gdm_slave_xioerror_handler
>> will get called to process the signal and this should notice that the
>> session was started and call term_session_stop_and_quit, which will call
>> gdm_slave_quick_exit.
>> 
>> I suspect this might not be working due to the setjmp/longjmp logic
>> because calling longjmp will return the state of the program to when
>> setjmp was called, so the state of the global variables may get lost
>> causing gdm to "forget" it has a running session and causing the
>> PostSession to not get called.  You can refer to the bug report
>> mentioned above for more information.
>
> longjmp doesn't change the heap (and thus not global variables), only the stack.
>
> xioerror and signals are a problem since longjmp is the only way to do work
> outside the context of a signal handler.  You MUST longjmp if you want to
> call certain system calls to avoid hangs / memory corruption etc...  Of
> course longjmp brings its own problems.  You simply cannot call session_stop
> from xioerror_handler since that might be inside a signal handler.  Xlib
> sucks this way.
>
> It must be a problem of logic, not with the longjmp.  It would be interesting
> to find out what the state of the globals is when stop_session is called.
> In particular, when setjmp returns JMP_SESSION_STOP_AND_QUIT, what is the
> state of the variables mentioned in
>
> 	/* only if we're not hanging in session stop and getting a
> 	   TERM signal again */
>	if (in_session_stop == 0 && session_started)
>		gdm_slave_session_stop (d->logged_in && login != NULL,
>					TRUE /* no_shutdown_check */);
>
> Is slave_session_stop called at all?  If so, are 'd->logged_in' and 'login' in
> a wrong state?
>
> It could be that there is some sort of race happening with signal handlers.
> These are very hard to catch.
>
>> I would add some gdm_debug() calls to the code and verify that this
>> is the problem.  If so, we could rip out the setjmp/longjmp code
>> and fix the code so it does the same thing without using jumping.
>> I suspect that this will fix the problem.  Could you help with
>> this?
>
> You cannot do most things out of signal handlers or out of the xioerror
> handler without getting undefined behaviour.  You NEED longjmp to deal with
> xioerror because of the way it works.  If you rip out the longjmp stuff you
> will bring back even worse problems that happen on things like ctrl-alt-bs.
>
> The correct fix would be to rewrite the slave to be totally event oriented
> with a proper mainloop.  This would NOT be trivial and would need to either
> introduce threads or rewrite the synchronous parts of Xlib inside gdm.  Such
> a rewrite would (if done properly which is not an easy task) solve all the
> race issues with the signal handlers.  The signal handlers would just trip
> over a global variable and break the mainloop like they do in the master
> daemon.  You would still however need longjmp for xioerror, unless you would
> not use Xlib at all.
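
The "trip over a global variable and break the mainloop" pattern mentioned above might look roughly like the following. This is a sketch only, assuming a plain polling loop; demo_quit_requested, demo_term_handler and demo_mainloop are hypothetical names and none of this is taken from the GDM master daemon.

```c
#include <signal.h>

/* Flag written by the handler and read by the loop. */
static volatile sig_atomic_t demo_quit_requested = 0;

/* The handler does nothing except record that a signal arrived. */
static void demo_term_handler(int signo)
{
    (void) signo;
    demo_quit_requested = 1;
}

/* Runs a toy main loop until SIGTERM is seen and returns the number of
 * iterations.  For the demo, the loop sends itself SIGTERM on iteration
 * `raise_after` (which must be >= 1, otherwise the loop never ends). */
int demo_mainloop(int raise_after)
{
    int iterations = 0;

    demo_quit_requested = 0;
    signal(SIGTERM, demo_term_handler);

    while (!demo_quit_requested) {
        iterations++;               /* one unit of "event" work */
        if (iterations == raise_after)
            raise(SIGTERM);         /* handler runs before raise() returns */
    }

    /* Cleanup such as stopping the session would go here, in normal
     * context rather than inside the signal handler. */
    signal(SIGTERM, SIG_DFL);
    return iterations;
}
```

For example, demo_mainloop(3) returns 3. As the reply notes, this approach covers ordinary signals but not the Xlib I/O error handler, which is not expected to return.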
Comment 4 Brian Cameron 2005-10-06 19:51:43 UTC
Created attachment 53132
patch fixing the problem

Patch provided by Jerry DeLapp <jgd@lanl.gov> in private email with me.
Comment 5 Brian Cameron 2005-10-06 19:53:43 UTC
Fixed in CVS head and 2.12 branch.  Note discussion in gdm-list archives in
August-September 2005 regarding this fix.