GNOME Bugzilla – Bug 346994
Gossip/Loudmouth crashes on network errors
Last modified: 2006-10-26 08:23:12 UTC
Steps to reproduce:
1. gossip -n
2. Chat -> Connect
3. 'The Application "gossip" has quit unexpectedly'

Stack trace:
Backtrace was generated from '/usr/bin/gossip'

Using host libthread_db library "/lib/tls/i686/cmov/libthread_db.so.1".
[Thread debugging using libthread_db enabled]
[New Thread -1226045760 (LWP 7225)]
0xffffe410 in __kernel_vsyscall ()
+ Trace 69249
Thread 1 (Thread -1226045760 (LWP 7225))
Other information: I use Debian SID and it crashes with:
* libloudmouth1-0 1.1.2-2
* gossip 0.12-2
Oops, I forgot to copy/paste the protocol log:

~$ LANG=C LM_DEBUG=NET gossip
Going to connect to jabber.belnet.be
Trying 2001:6a8:3c80::64 port 5222...
Connection failed.
Connection failed: Connection refused (error 111)
Trying 193.190.198.23 port 5222...
Connection success.
SEND:
-----------------------------------
<?xml version='1.0' encoding='UTF-8'?>
-----------------------------------
SEND:
-----------------------------------
<stream:stream xmlns="jabber:client" xmlns:stream="http://etherx.jabber.org/streams" to="jabber.belnet.be" id="msg_1">
-----------------------------------
~$
OK, it's IPv6 related; if I try to connect to an IPv4-only server, it works.

Tip: jabber.belnet.be has both IPv4 and IPv6 addresses, but the Jabber daemon only listens on IPv4.
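For context, the behaviour you would expect when a host has several addresses is to fall back to the next one whenever a connect is refused. Here is a minimal standalone sketch of that pattern in plain POSIX C; this is illustrative only, not loudmouth code, and the host and port are simply the ones from this report:

/* Minimal sketch (plain POSIX, not loudmouth code): try every address
 * returned for a host and only give up once all of them were refused. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/socket.h>

static int
connect_any (const char *host, const char *port)
{
        struct addrinfo hints, *res, *ai;
        int fd = -1;

        memset (&hints, 0, sizeof (hints));
        hints.ai_family = AF_UNSPEC;      /* both IPv6 and IPv4 records */
        hints.ai_socktype = SOCK_STREAM;

        if (getaddrinfo (host, port, &hints, &res) != 0)
                return -1;

        for (ai = res; ai != NULL; ai = ai->ai_next) {
                fd = socket (ai->ai_family, ai->ai_socktype, ai->ai_protocol);
                if (fd < 0)
                        continue;
                if (connect (fd, ai->ai_addr, ai->ai_addrlen) == 0)
                        break;            /* connected, stop trying */
                close (fd);               /* refused: clean up, try the next */
                fd = -1;
        }

        freeaddrinfo (res);
        return fd;
}

int
main (void)
{
        int fd = connect_any ("jabber.belnet.be", "5222");

        printf (fd >= 0 ? "connected\n" : "all addresses failed\n");
        if (fd >= 0)
                close (fd);
        return 0;
}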
I guess I have to enable ipv6 somehow on my machine to reproduce? I registered an account and can successfully log in.
(In reply to comment #3)
> I guess I have to enable ipv6 somehow on my machine to reproduce?

Yes, you need a working IPv6 connection to reproduce; enabling IPv6 support in the kernel (with "modprobe ipv6") isn't enough. You may need to set up a tunnel via a tunnel broker if your ISP doesn't provide you with a native IPv6 connection.
OK, thanks. If someone else can debug this I'd be happy.
Since IPv6 isn't readily available to everyone, is there any chance you could run this in gdb and provide a backtrace?
(In reply to comment #6)
> Since IPv6 isn't readily available to everyone, is there any chance you could
> run this in gdb and provide a backtrace?

What do you need exactly? After re-reading http://live.gnome.org/GettingTraces it seems to me that I already provided the trace in the initial post. Sorry if I'm saying HUGE stupidities, I'm not very familiar with gdb :/
Sorry, I was talking crap; I missed the backtrace, not sure how :/ Yeah, by the looks of it, it is a Loudmouth issue.
I can reproduce this crash with loudmouth and Gossip HEAD.
I don't think this problem is IPv6-specific. If a hostname has more than one A record and the first one is not listening, it crashes too.

For example, I added this to /etc/hosts:

127.0.0.1      plop.foo
193.190.198.23 plop.foo

and it crashes too.
(In reply to comment #10)
> I don't think this problem is IPv6-specific. If a hostname has more than one
> A record and the first one is not listening, it crashes too.
>
> For example, I added this to /etc/hosts:
>
> 127.0.0.1      plop.foo
> 193.190.198.23 plop.foo
>
> and it crashes too.

When I try to connect to plop.foo, of course.

(gdb) thread apply all bt full
+ Trace 72586
Thread 1 (Thread -1226455376 (LWP 9876))
Ah, good catch, that helps a lot with debugging this :) However, I can't reproduce the crash; I just get an error callback that propagates up to Gossip. There is a bug in loudmouth though, where it doesn't get the socket error correctly, which could be the reason. Do you think you could try a patch for loudmouth for me?
Created attachment 73048 [details] [review]
Fixes one problem in loudmouth

If anyone wants to test this patch for loudmouth 1.1.x, that would be great.
It keeps crashing with the patch.
That was quick, thanks :) OK, good to know.
Do you still get the same stack trace, by the way?
OK, I've looked into this some more; it has nothing to do with whether an address has multiple records. It's just the same old problem with non-working connections. The problem leads to some random memory corruption, so all the stack traces look different. I'm working on a loudmouth fix.
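For what it's worth, the general direction of the fix is to make sure the io watch installed for a connection attempt is removed before the attempt's data is freed, so a stale callback can never fire on freed memory. Below is a rough, self-contained GLib sketch of that pattern; ConnectData, connect_cb and connect_data_free are illustrative names, not the actual loudmouth symbols.

/* Rough GLib sketch of the cleanup pattern; the names are illustrative,
 * not the actual loudmouth symbols. */
#include <glib.h>
#include <unistd.h>

typedef struct {
        GIOChannel *io_channel;   /* channel for the in-progress connect */
        guint       watch_id;     /* source id returned by g_io_add_watch() */
} ConnectData;

static gboolean
connect_cb (GIOChannel *source, GIOCondition condition, gpointer user_data)
{
        /* In the real code this is where a failed attempt would be retried
         * with the next resolved address. Returning FALSE removes the
         * source, so the watch never fires again. */
        g_print ("watch fired (condition %d)\n", condition);
        return FALSE;
}

static void
connect_data_free (ConnectData *data)
{
        /* Remove the watch first so its callback can never run with a
         * pointer to freed memory. */
        if (data->watch_id != 0) {
                g_source_remove (data->watch_id);
                data->watch_id = 0;
        }
        if (data->io_channel != NULL) {
                g_io_channel_unref (data->io_channel);
                data->io_channel = NULL;
        }
        g_free (data);
}

int
main (void)
{
        int          fds[2];
        ConnectData *data;

        pipe (fds);

        data = g_new0 (ConnectData, 1);
        data->io_channel = g_io_channel_unix_new (fds[0]);
        data->watch_id = g_io_add_watch (data->io_channel,
                                         G_IO_IN | G_IO_ERR | G_IO_HUP,
                                         connect_cb, data);

        /* Abandoning the attempt: the watch is removed before the struct
         * is freed, so no stale callback is left behind. */
        connect_data_free (data);

        close (fds[0]);
        close (fds[1]);
        return 0;
}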
I'll use this bug as the general network problem crasher bug.
*** Bug 356430 has been marked as a duplicate of this bug. ***
*** Bug 357642 has been marked as a duplicate of this bug. ***
*** Bug 357643 has been marked as a duplicate of this bug. ***
I think the severity could be raised
We don't really use the severity/priority fields.
*** Bug 358258 has been marked as a duplicate of this bug. ***
*** Bug 358121 has been marked as a duplicate of this bug. ***
I don't know if it's relevant, but I discovered that in the function _lm_connection_succeeded(), connect_data->connection takes a strange value after line 401:

Breakpoint 6, _lm_connection_succeeded (connect_data=0x8315dc8) at lm-connection.c:401
401             connection->io_watch_in =
(gdb) display *connect_data
4: *connect_data = {connection = 0x81d5ed0, resolved_addrs = 0x8315d50, current_addr = 0x8315d90, fd = 21, io_channel = 0x8126910}
(gdb) n
413             connection->io_watch_err =
4: *connect_data = {connection = 0x1, resolved_addrs = 0xb78f6bcb, current_addr = 0x8315628, fd = 0, io_channel = 0x8126910}
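Side note: a cheap way to catch this kind of clobbering earlier is to stamp the struct with a magic value and assert on it at the top of every callback, so the first stale use aborts immediately instead of corrupting memory further. The sketch below is a generic debugging aid with made-up names, not part of loudmouth.

/* Hypothetical debugging aid (not loudmouth code): poison a struct on free
 * and assert on a magic field before every use. */
#include <glib.h>

#define CONNECT_DATA_MAGIC 0xC0FFEE01u

typedef struct {
        guint32  magic;
        gpointer connection;
        int      fd;
} ConnectData;

static ConnectData *
connect_data_new (gpointer connection, int fd)
{
        ConnectData *data = g_new0 (ConnectData, 1);

        data->magic = CONNECT_DATA_MAGIC;
        data->connection = connection;
        data->fd = fd;
        return data;
}

static void
connect_data_check (ConnectData *data)
{
        /* Call this first thing in every callback that receives the struct. */
        g_assert (data != NULL && data->magic == CONNECT_DATA_MAGIC);
}

static void
connect_data_free (ConnectData *data)
{
        data->magic = 0;        /* poison so any stale user trips the assert */
        g_free (data);
}

int
main (void)
{
        ConnectData *data = connect_data_new (NULL, -1);

        connect_data_check (data);   /* fine while the struct is alive */
        connect_data_free (data);
        return 0;
}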
The reasons for this bug are known already; it just needs to be fixed:
http://developer.imendio.com/issues/browse/LM-59
http://developer.imendio.com/issues/browse/LM-60
Laurent also reported https://launchpad.net/distros/ubuntu/+source/loudmouth/+bug/64618 and asked me to apply http://developer.imendio.com/issues/secure/attachment/10096/stale-source-unreffed.patch (of LM-60) and get it into Ubuntu. As I don't suffer from the problem myself, I'd like to hear your input on it - it looks fairly safe though.
Yes, that patch should be safe (it's kind of a workaround and might not be needed when LM-59 is fixed, but doesn't harm in the meantime).
Thanks.
Both LM-59 and LM-60 have been committed to loudmouth, so I'll close this.