GNOME Bugzilla – Bug 631804
Evolution aborts with pthread error in imapx_server_dispose()
Last modified: 2013-09-14 16:53:47 UTC
GThread-ERROR **: file gthread-posix.c: line 385 (g_thread_join_posix_impl): err or 'Resource deadlock avoided' during 'pthread_join (*(pthread_t*)thread, &ignor e)' aborting... Was running under Valgrind, didn't get to see a backtrace or anything. Will see if it happens again...
Program received signal SIGABRT, Aborted.
+ Trace 224171
Thread 140735634454288 (LWP 26471)
Thread 31 (Thread 0x7fff917fb710 (LWP 26471))
Thread 1 (Thread 0x7ffff5208980 (LWP 26425))
I got similar crash in 2.91.2. I moved to mail view from calendar view, then clicked on Inbox, clicked on 'x' to quit evolution and it crashed.
+ Trace 224218
I saw this only once, and I understand this as the thread has the last reference to the CamelImapxServer object, and during the "shutdown" signal emission is this reference removed, and object about to dispose, and because the dispose is now waiting on the thread to finish (my change, and correct), then the pthread detects that the thread wants to wait on itself, which is obviously impossible, then it aborts. Possibilities: a) invoke "shutdown" signal on the main thread, in some timeout routine - it may mean that there should be one more synchronization mechanism for indication that the thread properly gone, otherwise we would be able to finalize the CamelImapxServer object before the timeout will be reached b) do the unref of the 'conn' in camel-imapx-conn-managet.c:free_connection on timeout, which will ensure the object's reference will be removed only from the main thread. The b) seems like easier way, but does it make any sense?
Created attachment 173067 [details] [review] Don't join parser thread Or we could just stop joining that thread altogether. When I go offline I get a different deadlock now, with idle:
+ Trace 224290
Thread 44 (Thread 0x7fffa37a8710 (LWP 25504))
Hm, cannot reproduce that deadlock now.
(In reply to comment #4) > Or we could just stop joining that thread altogether. Yes, that can do it too. Though I'm afraid it can have some side effects, like the server will be gone before the thread finishes, which means the thread will access freed memory. The thread cannot add a reference to the server, as otherwise the dispose will be never called, and the thread will never stop (maybe except of going online/offline, it needs more testing). So I think there should be some synchronizing mechanism to know whether the thread gone or not. Probably some EFlag. > @@ -5182,6 +5177,8 @@ imapx_server_get_message (CamelIMAPXServer *is, This chunk is for other bug, isn't it?
(In reply to comment #6) > (In reply to comment #4) > > Or we could just stop joining that thread altogether. > > Yes, that can do it too. Though I'm afraid it can have some side effects, Hm, very true. I was thinking that it already *has* a reference, otherwise it couldn't be dropping it and thus ending up in imapx_server_dispose(). But that's only a temporary one when it sends the signal. In that case I'm inclined to revert to the simple and dirty option: --- a/camel/providers/imapx/camel-imapx-server.c +++ b/camel/providers/imapx/camel-imapx-server.c @@ -4944,7 +4944,8 @@ imapx_server_dispose (GObject *object) QUEUE_UNLOCK (server); if (server->parser_thread) { - g_thread_join (server->parser_thread); + if (server->parser_thread != g_thread_self()) + g_thread_join (server->parser_thread); server->parser_thread = NULL; } > > @@ -5182,6 +5177,8 @@ imapx_server_get_message (CamelIMAPXServer *is, > > This chunk is for other bug, isn't it? Yeah, no idea why that was lying around uncommitted. Pushed now.
If, and only if, you want to use this change, then do the join on idle or timeout, because otherwise you'll end-up with a memory leak and it's pity to see all your memory leak bugs filled and then this :) > > This chunk is for other bug, isn't it? > > Yeah, no idea why that was lying around uncommitted. Pushed now. Do not forget to close/update the bug, if not done already.
Created attachment 173174 [details] [review] patch (In reply to comment #8) > If, and only if, you want to use this change, then do the join on idle or > timeout, because otherwise you'll end-up with a memory leak and it's pity to > see all your memory leak bugs filled and then this :) Oh yeah. My original attempt did get that far, but then I ripped it all out in favour of the 'don't join at all' version. > Do not forget to close/update the bug, if not done already. I don't think I got round to filing one for that, for some reason.
(In reply to comment #9) > > Do not forget to close/update the bug, if not done already. > > I don't think I got round to filing one for that, for some reason. Sure you did, see bug #631754 (I know I was "reviewing" it recently).
Oh, so I did. Thanks :)
OK, the patch looks good, from the functionality point of view. Just format new code in evo's way - add new lines where they should be, add spaces where they should be, and please commit to master and gnome-2-32. Thanks.
Created commit 0a2718c in eds master (2.91.3+) Created commit 7231e71 in eds gnome-2-32 (2.32.1+)
Ho hum - if the thread is created to be joined, and we fail to join it - then we leak address space; this exacerbates our "thread 2840" type problems, and eventually causes extremely odd bugs & crashes. ie. every joinable thread we create, must be joined after it exits - or we'll get even worse at this ... - surely ?
(In reply to comment #14) > ie. every joinable thread we create, must be joined after it exits - or we'll > get even worse at this ... - surely ? Yup, that's basically what the patch does. Or I didn't get your complaint/comment.