Bug 73179 - Connection Timeout Incorrectly Handled
Status: RESOLVED FIXED
Product: Pan
Classification: Other
Component: general
Version: 0.11.2.91
Hardware: Other Linux
Importance: Normal major
Target Milestone: 0.11.3
Assigned To: Charles Kerr
Charles Kerr
Depends on:
Blocks:
 
 
Reported: 2002-03-02 04:56 UTC by john
Modified: 2004-12-22 21:47 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description john 2002-03-02 04:56:02 UTC
A timeout on any one of the established connections seems to stall or
break all of the other established connections.

I normally have 4 connections, and a server timeout on any of them causes
all connections to time out or block fatally.  This makes the network
connection chunk badly, which in turn exacerbates the problem.
Comment 1 Sven Neuhaus 2002-04-04 10:25:39 UTC
I think I have the same problem. It causes another problem: Despite
having pan configured to use 3 connections (1 reserved for interactive
use), I get 

** WARNING **: Legitimerung fehlgeschlagen: 502 Authentication failed:
max sessions per user (3) exceeded

from my newsserver. Apparently, pan uses more than 3 connections at a
time (maybe it doesn't always close connections properly before
opening a new one?)... I see connections in state FIN_WAIT2 in
"netstat" output.

Tested with 0.11.2.91 on RH7.2. News provider: news.clara.net.
Please change severity to "major".
Let me know if you need logs or something!
Comment 2 Charles Kerr 2002-04-05 06:30:28 UTC
Confirmed, I'm seeing this too.

Looks like Pan is somehow dropping a connection before it's fully closed.
Comment 3 Charles Kerr 2002-04-05 07:33:17 UTC
Actually what I'm seeing is sockets in a SOCK_WAIT state.  This
means that the server has closed the socket, but Pan hasn't closed yet...?

To duplicate: rev up Pan to four connections doing a quick task like
reading an image.  After that, just do one image at a time so that
one connection gets exercised but three sit idle.  After a few
minutes when the idle-disconnect code comes along, the sockets
seem to be closed, but netstat shows that one or two are in
SOCK_WAIT.

Possibly-related-but-maybe-not dept: the SO_KEEPALIVE that we set
on each socket seems to be misguided.  SO_KEEPALIVE sends pinglets
down the wire every once in a while, but according to RFC 1122
section 4.2.3.6, the keep-alive interval must default to no less
than two hours, which is far past our three-minute timeout.  We may
want to remove that code from Pan.  Agree/disagree?
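
For readers unfamiliar with the option, here is a minimal sketch of how SO_KEEPALIVE is switched on and off with the standard sockets API. The helper names (set_keepalive, keepalive_enabled, keepalive_fires_before_idle_timeout) are hypothetical and are not taken from Pan's source; the last helper only restates the arithmetic in the comment above.

```c
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not Pan's code): turn SO_KEEPALIVE on or off.
 * Returns 0 on success, -1 on error. */
int set_keepalive(int fd, int on)
{
    int value = on ? 1 : 0;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &value, sizeof value);
}

/* Hypothetical helper: 1 if SO_KEEPALIVE is set, 0 if not, -1 on error. */
int keepalive_enabled(int fd)
{
    int value = 0;
    socklen_t len = sizeof value;

    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &value, &len) != 0)
        return -1;
    return value != 0;
}

/* The first keep-alive probe is only sent after the connection has been
 * idle for keepalive_idle_secs. If the application drops idle sockets
 * sooner than that, the probe never gets a chance to fire. */
int keepalive_fires_before_idle_timeout(long keepalive_idle_secs,
                                        long idle_timeout_secs)
{
    return keepalive_idle_secs < idle_timeout_secs;
}
```

With a two-hour keep-alive default (7200 seconds) and a three-minute idle disconnect (180 seconds), keepalive_fires_before_idle_timeout(7200, 180) is 0, which is the point being made: the option has no practical effect here.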
Comment 4 Christophe Lambin 2002-04-05 16:45:00 UTC
Regarding SO_KEEPALIVE: agreed.

Regarding FIN_WAIT2 / TIMEOUT: I see a couple of issues here. Firstly,
queue.c::socket_cleanup() does the following in case of an error:

    pan_socket_putline (socket, "QUIT\r\n");
    pan_object_unref (PAN_OBJECT(socket));

So, we issue the 'quit' command and immediately close the socket,
without waiting for the reply. I guess the newsserver could wait on
this (now invalid) socket for a while, trying to send the reply. This
could explain the 'max. sessions' errors.  Unfortunately, my ISP
upgraded their newsserver, and I can't reproduce this problem (so,
unfortunate for this bug, good for me :-)).

Secondly, I see a possible design problem: to close a session, we
issue a 'QUIT' through nntp_disconnect() and then close the socket.
However, the server will also close the socket upon receiving the
quit. So, both ends will try to close at the same time. Not sure if
that's a good idea (my TCP bible's at work, so I can't validate this
right now).  I've managed to get rid of the FIN_WAITs by introducing
a 50 ms sleep between the nntp_disconnect() and the closing of the
socket, though.
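
One alternative to a fixed sleep, sketched here only as an illustration and not as the change that was committed, is to send QUIT, shut down the write side, and wait (with a timeout) for the server to close its end before closing the descriptor. The function name graceful_quit and its timeout parameter are assumptions made for this example; it uses plain POSIX calls rather than Pan's pan_socket_* wrappers.

```c
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef MSG_NOSIGNAL
#define MSG_NOSIGNAL 0  /* not defined on every platform; fall back to plain send */
#endif

/* Hypothetical helper (not Pan's code).
 * Sends QUIT, then drains the socket until the peer closes its side.
 * timeout_ms applies to each wait, not to the whole exchange.
 * Returns 0 if the peer closed in time, -1 otherwise.
 * The descriptor is closed in both cases. */
int graceful_quit(int fd, int timeout_ms)
{
    static const char quit[] = "QUIT\r\n";
    char buf[512];
    int result = -1;

    if (send(fd, quit, sizeof quit - 1, MSG_NOSIGNAL)
            == (ssize_t)(sizeof quit - 1)) {
        /* We will send nothing more, but we can still read the reply. */
        shutdown(fd, SHUT_WR);

        for (;;) {
            struct pollfd pfd = { .fd = fd, .events = POLLIN };
            ssize_t n;

            if (poll(&pfd, 1, timeout_ms) <= 0)
                break;                  /* timeout or poll error: give up */

            n = recv(fd, buf, sizeof buf, 0);
            if (n == 0) {               /* EOF: the server closed its side */
                result = 0;
                break;
            }
            if (n < 0)
                break;                  /* read error */
            /* n > 0: reply bytes such as "205 ...", discard and keep reading */
        }
    }

    close(fd);
    return result;
}
```

The timeout keeps a silent server from blocking the client forever; in that case the socket is still closed, just without the confirmation that the server finished first.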
Comment 5 Christophe Lambin 2002-04-05 22:37:27 UTC
Committed on the pan-0-11-fix branch.

Sven: do you use CVS?  If so, could you update to the latest version
on this branch and see if these changes improve the situation?  If
you're still having problems, send in a run log. You can find
instructions for doing this at http://pan.rebelbase.com/bugreport.html.


Comment 6 Charles Kerr 2002-04-12 17:18:09 UTC
John, Sven, are you still seeing this behavior in CVS?
Comment 7 Sven Neuhaus 2002-04-15 15:13:47 UTC
Sorry, don't have a build environment at the moment. Can you point me
to some devel RPMs for RH7.2 so I can build CVS again?
Comment 8 Christophe Lambin 2002-04-15 22:15:05 UTC
What do you need/have?  You should be able to build the pan-0-11-fix
branch on a RH7.2 system without installing too many packages (I do).

Are you referring to gtk2? You only need that to build the HEAD
branch. The pan-0-11-fix branch is still using gtk+-1.2. You can check
out a copy on that branch with the following command: 'cvs co -r
pan-0-11-fix pan'.


Comment 9 Charles Kerr 2002-04-16 15:11:40 UTC
From: Sven Neuhaus
Date: 16 Apr 2002 13:04:40 +0200

Hi,

The bugzilla for gnome is hosed this morning.
I built the pan-0-11-fix branch from CVS today and it seems to have
fixed the problem. Thanks!

-Sven
Comment 10 Sven Neuhaus 2002-04-16 15:25:57 UTC
OK, bad news: the error occurred again a bit later.
However, this time "netstat" showed only 3 connections, and all 3 in
state "established". So I'm not sure whether it is a Pan problem or
whether the news server is just overly sensitive.
Comment 11 Sven Neuhaus 2002-04-16 15:47:27 UTC
Any suggestions on how we can nail this down properly? Is there a debug
log where Pan writes the timestamp of every connection opened and closed?
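
As an illustration of the kind of log being asked for, a small sketch that stamps each connection event with a UTC time follows. Both function names and the line format are hypothetical; nothing here describes a logging facility that Pan is known to have.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper (not Pan's code): format one log line such as
 * "<date> <time> UTC open fd=5" into out.
 * Returns the number of characters written, or -1 on error or truncation. */
int format_conn_event(char *out, size_t len, time_t when,
                      const char *event, int fd)
{
    char stamp[32];
    struct tm *utc = gmtime(&when);
    int n;

    if (utc == NULL ||
        strftime(stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", utc) == 0)
        return -1;

    n = snprintf(out, len, "%s UTC %s fd=%d", stamp, event, fd);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Hypothetical helper: write the event to stderr with the current time. */
void log_conn_event(const char *event, int fd)
{
    char line[128];

    if (format_conn_event(line, sizeof line, time(NULL), event, fd) > 0)
        fprintf(stderr, "%s\n", line);
}
```

Calling log_conn_event("open", fd) right after a connect and log_conn_event("close", fd) right before a close would give a timeline that can be compared against the server's "max sessions" errors.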