GNOME Bugzilla – Bug 700332
Crash in end_gnutls_io after suspending system
Last modified: 2018-03-21 00:02:46 UTC
A user at Launchpad has reported a Geary crasher which is occurring inside GIO 2.32.3, specifically the TLS layer: https://bugs.launchpad.net/geary/+bug/1178102

The stack trace:

0xa8becdf5 in end_gnutls_io (error=0x9d4ff09c, status=-52, gnutls=0x897de90) at gtlsconnection-gnutls.c:491
491 gtlsconnection-gnutls.c: No such file or directory.
(gdb) bt
+ Trace 231945
This appears to be the offending line (gtlsconnection-gnutls.c:491):

  G_TLS_CONNECTION_GNUTLS_GET_CLASS (gnutls)->failed (gnutls);

Note that the TLS connections are issuing BAD_IDENTITY warnings when connecting; no idea if that's relevant here. It looks like the user can reproduce this quite easily with Geary.
> This appears to be the offending line (gtlsconnection-gnutls.c:491)
>
> G_TLS_CONNECTION_GNUTLS_GET_CLASS (gnutls)->failed (gnutls);

Yes, and there's not really any plausible reason it would crash on that line... if "gnutls" is bad, then it should have crashed sooner, and if it's not, then it shouldn't have crashed at all...

The stack trace also looks weird: note that frames 0 and 1 are both in end_gnutls_io(). So maybe some sort of stack corruption? Running it under valgrind might be useful.
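For readers unfamiliar with GObject class macros: the quoted line amounts to loading the class pointer from the instance, loading a function pointer from the class, and making an indirect call. A minimal sketch of that dispatch pattern in plain C follows, with no GLib dependency. Conn, ConnClass, CONN_GET_CLASS and the conn_* functions are hypothetical stand-ins for illustration, not the glib-networking types.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a GObject-style instance/class pair.
 * These are NOT the real GLib or glib-networking types. */
typedef struct Conn Conn;

typedef struct {
    void (*failed)(Conn *conn);   /* virtual method slot, like ->failed */
} ConnClass;

struct Conn {
    ConnClass *klass;             /* pointer to the class structure */
    int failed_calls;             /* counts how often failed() ran */
};

/* Simplified analogue of G_TLS_CONNECTION_GNUTLS_GET_CLASS (obj):
 * it only reads the class pointer stored in the instance. */
#define CONN_GET_CLASS(obj) ((obj)->klass)

/* Default implementation installed in the class structure. */
static void conn_failed_default(Conn *conn)
{
    conn->failed_calls++;
}

static ConnClass conn_class = { conn_failed_default };

void conn_init(Conn *conn)
{
    conn->klass = &conn_class;
    conn->failed_calls = 0;
}

/* Analogue of the line at gtlsconnection-gnutls.c:491:
 * load conn->klass, load klass->failed, then call through it. */
void conn_report_failure(Conn *conn)
{
    CONN_GET_CLASS(conn)->failed(conn);
}
```

In this sketch, a fault at the dispatch line needs either the instance's class pointer or the function pointer slot to be invalid at the moment of the call. That is consistent with the suspicion that something else, such as memory corruption, is involved.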
Is it possible it's a threading or concurrency issue? The key point is that the user is suspending the system while Geary is running. I wonder if a background thread is closing or cleaning up the connection when this method is running. It would be a tight timing hole, but if the system is just waking up, then the network stack would undoubtedly be cleaning up dead connections as everything comes back to life.
(In reply to comment #2)
> Is it possible it's a threading or concurrency issue?

In that case, a gdb "thread apply all bt" would be useful. But yeah, the fact that the crash occurs inside a gio op that has been pushed off to another thread is definitely suspicious.
Created attachment 245778: gdb output
I've attached the gdb output. Apparently the user can reproduce this quite easily.
(In reply to Dan Winship from comment #1)
> > This appears to be the offending line (gtlsconnection-gnutls.c:491)
> >
> > G_TLS_CONNECTION_GNUTLS_GET_CLASS (gnutls)->failed (gnutls);
>
> Yes, and there's not really any plausible reason it would crash on that
> line... if "gnutls" is bad, then it should have crashed sooner, and if it's
> not, then it shouldn't have crashed at all...
>
> The stack trace also looks weird: note that frames 0 and 1 are both in
> end_gnutls_io(). So maybe some sort of stack corruption? Running it under
> valgrind might be useful.

Yeah, for this one the stack trace doesn't seem actionable. We'll need to catch this one in valgrind/asan/tsan.

I'm not expecting a response, since this bug is old and there's a layer of indirection here, but I'll set it to NEEDINFO anyway in case it's still reproducible and you are able to get diagnostics.
Closing this bug report as no further information has been provided. Please feel free to reopen this bug report if you can provide the information that was asked for in a previous comment. Thanks!