GNOME Bugzilla – Bug 221604
Calls to timeout_cb lead to crashes
Last modified: 2005-11-15 02:22:21 UTC
Description of problem:
Calls to timeout_cb seem to lead to crashes. While it's not clear to me whether timeout_cb itself or the way it gets called has a problem, there are too many crashes related to its invocation to ignore.

Steps to reproduce the problem:
No clear pattern. Most likely to happen when a time-consuming operation is performed. Looks related to mail checking, particularly in the background. Does not require user interaction. No apparent OS version dependency. No clear Evolution version dependency; earlier reports are from around version 0.99.

Additional Information:
This seems to be related to the following bugs:

21219 Moving something (sent message or rule-based move)
21488 17709 18133 19433 17888 18689 No user interaction with Evolution
20394 15702 16455 Importing Pine data
16012 Importing Netscape data
18109 18276 Fetching POP mail
17548 17976 20953 login/send/receive (IMAP server)
17985 16453 login/send/receive No mail server specified
18458 16480 After deleting accounts
18454 After creating IMAP accounts.
16621 15855 After setup
17623 Reading

The most commonly seen sequence in backtraces is:

#N   <signal handler called>
#N+1 timeout_cb(data=param_A) at camel-remote-store.c:219
#N+2 mail_receive_uri ()
#N+3 mail_msg_wait_all ()
#N+4 thread_received_msg (e=param, m=param2)
#N+5 thread_dispatch (din=param)
#N+6 pthread_start_thread

This is seen in bugs 21219, 21488, 20394, 17709, 16453, 21372, 16445, 18109, 17023, 20953, 18298, 16621, 17796, 18133, 18276, 18882, 17888, 17985, 18454, 18689, 18892.

The second pattern is:

#N   <signal handler called>
#N+1 timeout_cb (data=0x82286e8) at camel-remote-store.c:219
#N+2 timeout_timeout (mm=0x821af78) at mail-session.c:636
#N+3 mail_msg_received (e=param1, msg=param2, data=0x0) at mail-mt.c:500
#N+4 0x400736f5 in thread_received_msg (e=param1, m=param2)

Note the suspicious data=0x0. This happens with bugs 15702, 17548, 16012, 18458, 19433, 15855, 17623. (Yes, I know: maybe I'm mixing two different bugs, but maybe not.)
Please note that some of those are already marked as duplicates. Please review 16453, 16235, 15794. All bugs noted here have backtraces; it's up to the developers to find something useful. Others are marked as notabug or invalid. To look for more related bugs, make a Bugzilla query using timeout_cb. You'll be surprised. Since it would be an administrative mess to close / reopen / reassign duplicates: a) I will add as duplicates of this bug only those still new/unconfirmed. b) Please let me know when this is solved so that the other bug branches can be closed appropriately.
*** bug 221219 has been marked as a duplicate of this bug. ***
*** bug 221488 has been marked as a duplicate of this bug. ***
*** bug 220394 has been marked as a duplicate of this bug. ***
*** bug 217709 has been marked as a duplicate of this bug. ***
*** bug 221372 has been marked as a duplicate of this bug. ***
*** bug 215702 has been marked as a duplicate of this bug. ***
*** bug 221926 has been marked as a duplicate of this bug. ***
*** bug 221948 has been marked as a duplicate of this bug. ***
Please check bug 221948 stacktrace. If you think it should be an attachment, please let me know.

By fejj on bug 221948
============================
hmmm, this *might* be a duplicate of the timeout_cb bug. Not really sure though... maybe this also solves the mystery of the timeout_cb bug? if the crash is really happening in some i/o call but the other stacks didn't show that, then we could have a winner. Then again, this might simply be crashing because of some stack corruption (which is what NotZed thinks is happening in the timeout_cb crashes).
=================
*** bug 222076 has been marked as a duplicate of this bug. ***
*** bug 222064 has been marked as a duplicate of this bug. ***
Agreed Gerado, the problem is I've already looked at this a LOT and haven't been able to find anything (one of my bugs covers this btw). 21948 is an interesting case though; it warrants closer inspection of that (rather awful) code, since there seems to be no way to reproduce the bug.
*** bug 219985 has been marked as a duplicate of this bug. ***
*** bug 223105 has been marked as a duplicate of this bug. ***
Jeff, I wonder if that folder_info_free change would affect this any ...
hmmm, yea - maybe.
*** bug 223260 has been marked as a duplicate of this bug. ***
*** bug 216453 has been marked as a duplicate of this bug. ***
*** bug 223851 has been marked as a duplicate of this bug. ***
*** http://bugzilla.ximian.com/show_bug.cgi?id=22469 has been marked as a duplicate of this bug. ***
I had another very long look at this today. I think I have discovered a possible case, but I cannot see how it can come about. If a store sets its state to offline without removing its timeout handler, it may be possible for the timeout to continue to execute against a stale object. I re-verified the timeout handling code. There is no way it can execute on a stale timeout object. I think removing the timeout code entirely or changing it completely may be the only way to fix this problem, if it is indeed due to camel and not some random external corruption. I also wonder: has this been reported for the 1.1.x series?
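To make the suspected case concrete, here is a minimal sketch in C of the rule "remove the timeout handler before the store goes offline". It is not the Camel code: the timer table and every name in it (timeout_register, timeout_remove, timeout_dispatch, store, store_connect, store_disconnect, store_timeout_cb) are hypothetical stand-ins invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical timer table; none of these names come from Camel. */
typedef int (*timeout_fn)(void *data);

typedef struct {
    timeout_fn fn;
    void *data;
    int active;
} timeout_slot;

#define MAX_TIMEOUTS 8
static timeout_slot slots[MAX_TIMEOUTS];

/* Register a callback; returns an id > 0, or 0 if the table is full. */
static int timeout_register(timeout_fn fn, void *data)
{
    for (int i = 0; i < MAX_TIMEOUTS; i++) {
        if (!slots[i].active) {
            slots[i].fn = fn;
            slots[i].data = data;
            slots[i].active = 1;
            return i + 1;
        }
    }
    return 0;
}

/* Deactivate a callback; id 0 means "nothing registered" and is ignored. */
static void timeout_remove(int id)
{
    assert(id >= 0 && id <= MAX_TIMEOUTS);
    if (id > 0)
        slots[id - 1].active = 0;
}

/* Run every active callback once; returns how many were run. */
static int timeout_dispatch(void)
{
    int ran = 0;
    for (int i = 0; i < MAX_TIMEOUTS; i++) {
        if (slots[i].active) {
            slots[i].fn(slots[i].data);
            ran++;
        }
    }
    return ran;
}

/* Hypothetical store object that owns one timeout handler. */
typedef struct {
    int online;
    int timeout_id;
    int pings;
} store;

static int store_timeout_cb(void *data)
{
    store *s = data;   /* would be a stale pointer if the store were freed first */
    s->pings++;
    return 1;
}

static store *store_connect(void)
{
    store *s = calloc(1, sizeof *s);
    if (s == NULL)
        return NULL;
    s->online = 1;
    s->timeout_id = timeout_register(store_timeout_cb, s);
    return s;
}

/* Remove the handler first, so dispatch can never reach a store
 * that is offline or about to be freed. */
static void store_disconnect(store *s)
{
    timeout_remove(s->timeout_id);
    s->timeout_id = 0;
    s->online = 0;
}
```

In this sketch, calling store_disconnect() and then free() on the store is safe because the handler is gone before the memory is. If store_disconnect() only cleared the online flag, the next timeout_dispatch() would call store_timeout_cb on freed memory, which is the kind of stale-object access described above.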
We don't have 1.1.x snapshots (yet -- sigh) so I don't think there is a high chance of people seeing it in 1.1.
*** bug 224153 has been marked as a duplicate of this bug. ***
*** bug 224198 has been marked as a duplicate of this bug. ***
*** bug 224242 has been marked as a duplicate of this bug. ***
I submitted a patch to evolution-patches that might affect this, from fixing another bug. Have to see how it goes.
*** bug 223862 has been marked as a duplicate of this bug. ***
well, this is now fixed in cvs due to the complete removal of CamelRemoteStore and thus, no more timeout_cb function :-)
*** bug 231746 has been marked as a duplicate of this bug. ***
*** bug 226260 has been marked as a duplicate of this bug. ***
*** bug 232295 has been marked as a duplicate of this bug. ***