GNOME Bugzilla – Bug 342545
exchange-connector crashes caused by libsoup
Last modified: 2006-06-07 18:08:24 UTC
Steps to reproduce:
1. From an e-mail account that is not an Exchange account, send an e-mail to the Exchange account.
2. Select the Exchange account and click the Receive button repeatedly.
3. Sometimes exchange-connector crashes.

Stack trace:
The stack trace when exchange-connector crashes:

** (evolution-exchange-storage:6075): WARNING **: Unexpected error 0 (null) from POLL
(evolution-exchange-storage:6075): GLib-GObject-WARNING **: instance of invalid non-instantiatable type `account-removed'
Program received signal SIGSEGV, Segmentation fault.
+ Trace 68362
Thread 1098457760 (LWP 6075)
which indicates that while io_cleanup is being called on msg (0x80cd128), new_iostate is called on the same msg (0x80cd128). Is this correct? I think this may be the problem.
Sorry, I forgot something when entering "Other information". The updated "Other information" follows.

Other information:
Every time this crash happens, I find that lines 746 and 747 of new_iostate

    746: if (priv->io_data)
    747:         io_cleanup (msg);

are executed, and when I set a breakpoint at line 747 I get this stack trace:
+ Trace 68430
which indicates that while io_cleanup is being called on msg (0x80cd128), new_iostate is called on the same msg (0x80cd128). Is this correct? I think this may be the problem.
We found the cause. When msg->status_code is 440 (E2K_HTTP_TIMEOUT), io_read calls the status-code callback fba_timeout_handler registered by exchange-connector, which requeues the msg (sets msg->status to SOUP_MESSAGE_STATUS_QUEUED). When io_read then calls soup_message_io_finished, that function calls soup_connection_disconnect (as the trace in "Other information" shows), which emits a disconnect signal on the connection. When the signal handler connection_closed runs for this signal, the msg is sent again. But on leaving soup_message_io_finished, the reference count of the msg is 0. So when io_read is later called on the msg as the reply arrives from the Exchange server, a segmentation fault occurs.

The most surprising thing is that in most cases connection_closed is not called, even though soup_connection_disconnect emits the disconnect signal. So not every msg with status_code 440 triggers the bug, only those for which connection_closed ends up being called.
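For illustration only, the sequence described above can be modelled in a few lines of plain C. This is a toy sketch, not libsoup code: ToyMsg, toy_send, toy_connection_closed, toy_io_cleanup and toy_handle_timeout are hypothetical names invented for this note, and the model tracks only the message status and whether a send was started while cleanup was still running. It does not model reference counting or the real I/O state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for this sketch; these are not the libsoup types. */
typedef enum { MSG_QUEUED, MSG_RUNNING, MSG_FINISHED } MsgStatus;

typedef struct {
    MsgStatus status;
    int in_cleanup;     /* 1 while the toy cleanup step is running */
    int sends_started;  /* how many times a send was started */
    int nested_sends;   /* sends started while cleanup was still running */
} ToyMsg;

/* Start sending the message and note whether this happened mid-cleanup. */
static void toy_send(ToyMsg *msg)
{
    assert(msg != NULL);
    if (msg->in_cleanup)
        msg->nested_sends++;
    msg->status = MSG_RUNNING;
    msg->sends_started++;
}

/* Models the closed-connection handler: send anything that is QUEUED. */
static void toy_connection_closed(ToyMsg *msg)
{
    if (msg->status == MSG_QUEUED)
        toy_send(msg);
}

/* Models cleanup: tearing down I/O disconnects the connection, and the
 * closed handler may or may not fire (the report says it usually does not). */
static void toy_io_cleanup(ToyMsg *msg, int closed_handler_fires)
{
    msg->in_cleanup = 1;
    if (closed_handler_fires)
        toy_connection_closed(msg);
    msg->in_cleanup = 0;
}

/* Models the 440 path: the handler requeues, then I/O is finished. */
static void toy_handle_timeout(ToyMsg *msg, int closed_handler_fires)
{
    msg->status = MSG_QUEUED;            /* what the timeout handler does */
    toy_io_cleanup(msg, closed_handler_fires);
    if (msg->status == MSG_QUEUED)
        toy_send(msg);                   /* "restarted" branch */
    else
        msg->status = MSG_FINISHED;      /* "finished" branch */
}
```

Calling toy_handle_timeout with closed_handler_fires set to 1 starts the send from inside cleanup and then takes the "finished" branch, which mirrors the problematic path in the analysis. With 0, the message is restarted normally after cleanup.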
I think what exchange-connector wants to do when the connection times out is to resend the message, which is reasonable. In this case, soup_message_restarted should be called in soup_message_io_finished to resend the message. But currently the message is resent while io_cleanup is being called, and after that msg->status is running, so soup_message_finished is called instead of soup_message_restarted.

I changed

    static void
    soup_message_io_finished (SoupMessage *msg)
    {
            g_object_ref (msg);
            io_cleanup (msg);
            if (SOUP_MESSAGE_IS_STARTING (msg))
                    soup_message_restarted (msg);
            else
                    soup_message_finished (msg);
            g_object_unref (msg);
    }

to

    static void
    soup_message_io_finished (SoupMessage *msg)
    {
            g_object_ref (msg);
            if (SOUP_MESSAGE_IS_STARTING (msg))
                    soup_message_restarted (msg);
            else
                    soup_message_finished (msg);
            io_cleanup (msg);
            g_object_unref (msg);
    }

to ensure the message is restarted before io_cleanup, so it would be sent in io_cleanup. It works fine now, but I am not sure whether this is correct.
Created attachment 66227
patch

You can't move the io_cleanup call because then if the message starts getting re-sent by the soup_message_restarted() call, then io_cleanup() will free the *new* iostate rather than the *old* iostate.

Fundamentally, the problem seems to be that:

(a) soup_session_requeue_message sets the message status to QUEUED, but leaves its iostate unchanged

(b) soup-session-async.c:run_queue() believes it can send off any QUEUED message whenever it wants to, regardless of its iostate.

Ideally I think we'd want a new SoupMessageStatus value "REQUEUED", but that would break API/ABI. The attached patch makes run_queue() refuse to requeue messages that are still being processed by soup-message-io. I think this should fix the problem. Can you test it and get back to me?
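As a rough sketch of the idea behind the patch (not the patch itself, which is in attachment 66227), a queue runner can skip any QUEUED message whose I/O state is still attached and pick it up on a later pass. Everything named below (ToyMsg, toy_io_in_progress, toy_run_queue) is hypothetical and exists only for this illustration. The real types and checks in libsoup may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for this sketch; these are not the libsoup types. */
typedef enum { MSG_QUEUED, MSG_RUNNING } MsgStatus;

typedef struct {
    MsgStatus status;
    void *io_data;      /* non-NULL while message I/O is still attached */
    int sends_started;  /* how many times a send was started */
} ToyMsg;

/* Hypothetical helper: is the I/O layer still working on this message? */
static int toy_io_in_progress(const ToyMsg *msg)
{
    return msg->io_data != NULL;
}

/* Models one pass of a queue runner. Returns how many sends it started. */
static int toy_run_queue(ToyMsg *msgs, size_t n)
{
    int started = 0;

    assert(msgs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        ToyMsg *msg = &msgs[i];

        if (msg->status != MSG_QUEUED)
            continue;
        /* The guard: a requeued message whose old I/O has not been torn
         * down yet is left alone until a later pass. */
        if (toy_io_in_progress(msg))
            continue;

        msg->status = MSG_RUNNING;
        msg->io_data = msg;   /* placeholder for a freshly created I/O state */
        msg->sends_started++;
        started++;
    }
    return started;
}
```

With this guard the status alone no longer decides whether a message may be sent, which addresses point (b) without adding a new status value.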
I tested the patch, and I can not reproduce the bug any more. It works fine.
Great. Thanks for figuring this out. Fixed in CVS, and will be released in 2.2.93 by Monday (for GNOME 2.14.2)
*** Bug 322901 has been marked as a duplicate of this bug. ***