Bug 608327 – Cannot recover after connection lost




Summary:	Cannot recover after connection lost


Status:	RESOLVED FIXED

Product:	evolution-mapi
Classification:	Applications
Component:	miscellaneous
Version:	3.5.x
Hardware:	Other Linux

Importance:	Normal major
Target Milestone:	---
Assigned To:	evolution-mapi-maint
QA Contact:	evolution-mapi-maint

URL:
Whiteboard:

Duplicates:	618415 669219
Depends on:
Blocks:

Reported:	2010-01-28 11:14 UTC by Milan Crha
Modified:	2012-07-17 10:26 UTC

See Also:
GNOME target:	---
GNOME version:	2.29/2.30

Attachments
ema patch (mailter part) (34.53 KB, patch) 2012-07-16 15:59 UTC, Milan Crha	committed

Description Milan Crha 2010-01-28 11:14:46 UTC

Moving this upstream from a downstream bug report
https://bugzilla.redhat.com/show_bug.cgi?id=558651
which also contains a few console logs of the issue. I skipped some parts of the investigation and am pasting only the description and findings here.

Description of problem:
In order for evolution to function properly, it is necessary to quit and
restart it periodically throughout the day.  It reaches a point where no new
messages are brought over, or messages are partially brought over (from
exchange).  Verification against outlook or outlook web show that there ARE new
messages, but none are transmitted, or they are transmitted without subjects,
or with subjects but no body, or the headers come over, but the message cannot
be read.

Version-Release number of selected component (if applicable):
Same behaviour is apparent in 2.28.2 and 2.29.5 of evolution / 0.28.2 and
0.29.5 of evolution-mapi

How reproducible:
constantly throughout the day.  I haven't found the catalyst yet (though it
could very well be that exchange gets busy and doesn't respond in a timely
manner)

Steps to Reproduce:
1.start evolution
2.use it for some period of time
3.

Actual results:
messages stop coming over from exchange, or come over missing subject/content
matter.

Expected results:
messages come over when send/receive is pressed, or when the configured time
period elapses for it to automatically check for new messages, and I expect
messages to come over in their entirety.
-----------------------------------------------------------------------------

Thanks for the update. I hoped there would be something specific in the logs,
but unfortunately there isn't. What I see in the evo console log is that it
works properly until about the middle, when the connection to the server was
probably lost, and it prints the only error on the console:
> camel-mapi-provider-WARNING **: Could not get folder list..
after that it fails to operate.

The gdb log doesn't show anything unusual, just a few threads due to db-summary.

From the above it seems it cannot recover after the connection to the exchange
server is lost.

-----------------------------------------------------------------------------

I have the exact same behaviour, though my ability to send messages drops out
first, then I lose the ability to receive any new messages. This has also been
confirmed using outlook/OWA.

I am running:

kernel 2.6.31.9-174.fc12.i686.PAE
evolution-exchange-2.28.2-1.fc12.i686
evolution-mapi-0.28.2-1.fc12.i686  

-----------------------------------------------------------------------------

I've noticed the same thing regarding the ability to send messages.  It will
place new messages in the outbox, but will not send them, merely popping up an
error that it is unable to send.  But that would fall under what was said
above, that it never successfully recovers from losing its connection to
exchange.

Comment 1 Akhil Laddha 2010-05-12 03:53:54 UTC

*** Bug 618415 has been marked as a duplicate of this bug. ***

Comment 2 Aaron 2010-12-13 23:08:31 UTC

I can confirm this bug. If I disconnect then reconnect to the MAPI server, my connection is restored and my new e-mails arrive.

Comment 3 Derek Atkins 2011-09-19 12:44:59 UTC

It's been 9 months since the last comment on this bug.  Has there been any progress?

This is particularly evident from the calendar feature, and I can reliably reproduce this problem with my environment where I need OpenVPN to connect to the exchange server.  It seems (to me) that the calendar does not honor the "work offline" or does not reconnect properly.

Comment 4 Milan Crha 2011-09-19 14:01:03 UTC

There is no progress on this, I'm sorry. I briefly tried to reproduce this and realized that if the offline state is detected by the NetworkManager plugin, the factories aren't notified at all, because they listen to a GConf key that is used only when the user goes offline/online manually; thus the factories think they are still online. The code would need more changes too: it tries to close the connection properly, but that usually times out because the server is not reachable, and the timeout is so long that it blocked reconnecting during my tests.
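
To illustrate the notification gap described above, here is a minimal sketch in Python (for brevity; it is not the actual evolution or factory code). It assumes a single state object that combines both sources, the user's manual offline/online choice and the network availability, and notifies listeners only when the effective state changes. All names (OnlineState, set_user_online, set_network_up) are hypothetical.

```python
class OnlineState:
    """Hypothetical combined online state: user choice AND network availability."""

    def __init__(self):
        self._user_online = True   # toggled by File -> Work Offline/Online
        self._network_up = True    # toggled by a network monitor
        self._listeners = []

    @property
    def online(self):
        # Effective state: online only if both sources agree.
        return self._user_online and self._network_up

    def subscribe(self, callback):
        self._listeners.append(callback)

    def _update(self, **changes):
        before = self.online
        for name, value in changes.items():
            setattr(self, name, value)
        # Notify only when the effective state actually changed.
        if self.online != before:
            for callback in self._listeners:
                callback(self.online)

    def set_user_online(self, value):
        self._update(_user_online=value)

    def set_network_up(self, value):
        self._update(_network_up=value)
```

With something like this, a listener learns about the offline state regardless of whether the user or the network monitor caused it.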

Comment 5 Derek Atkins 2011-09-19 14:50:29 UTC

This issue occurs also when manually using File -> Work Offline.

File -> Work Offline
move networks (change IP)
restart VPN
File -> Work Online

And now calendar is no longer accessible to me until I kill (manually) the calendar factory and restart evo.

I can reproduce this quite reliably here.  I am happy to help test as best I can.  (I'm using Fedora 15)

Comment 6 Adam DiFrischia 2012-02-02 13:48:52 UTC

This is still happening all the time in Evolution 3.2.3-1 coupled with evolution-mapi-3.2.3-1 in Fedora 16. At work, while on our company network everything is fine, but if I undock my laptop before it suspends and Evolution is still open, the next morning when I come in Evolution is frozen and unresponsive. Even after telling it to dismiss the warning about lost connection, and despite being on the same internal network, until I kill and restart Evolution it won't work properly.

Comment 7 Adam DiFrischia 2012-02-02 13:49:39 UTC

(In reply to comment #6)
> This is still happening all the time in Evolution 3.2.3-1 coupled with
> evolution-mapi-3.2.3-1 in Fedora 16. At work, while on our company network
> everything is fine, but if I undock my laptop before it suspends and Evolution
> is still open, the next morning when I come in Evolution is frozen and
> unresponsive. Even after telling it to dismiss the warning about lost
> connection, and despite being on the same internal network, until I kill and
> restart Evolution it won't work properly.

Wanted to add that it's kernel 3.2.2-1.

Comment 8 Milan Crha 2012-02-10 10:22:18 UTC

*** Bug 669219 has been marked as a duplicate of this bug. ***

Comment 9 Milan Crha 2012-02-10 10:24:17 UTC

From bug #669219: current evolution-mapi (3.3.5) deadlocks when going offline with a disconnected network. There don't seem to be any timeouts set on the tevent requests.
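
As a rough illustration of what a bounded wait buys here, the following Python sketch runs a possibly hanging close in a worker thread and waits for it only up to a deadline. It is an assumption-laden sketch of the general technique, not how tevent, samba4 or evolution-mapi implement it; close_with_deadline and its parameters are hypothetical names.

```python
import threading


def close_with_deadline(close_fn, deadline_s):
    """Run close_fn in a daemon thread and wait at most deadline_s seconds.

    Returns True if close_fn finished in time, False if the wait was
    abandoned (the worker thread is left behind and not cancelled).
    """
    done = threading.Event()

    def worker():
        try:
            close_fn()
        finally:
            done.set()

    threading.Thread(target=worker, daemon=True).start()
    return done.wait(deadline_s)
```

The caller can then go offline or reconnect even when the close never returns, at the cost of leaving the stuck worker behind.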

Comment 10 Milan Crha 2012-07-12 11:40:02 UTC

This is partly a problem in samba4. With alpha18 the connection doesn't time out, as in the backtrace below. It's better with beta2, where I get MAPI_E_CALL_FAILED, which is significantly better. Evo-mapi doesn't recover, though.

Comment 11 Milan Crha 2012-07-12 11:42:47 UTC

This is from alpha18. beta2 has almost the same backtrace, but as I said, it times out on its own after a minute.

Trace 230502

Thread 2 (Thread 0x7f2e6ffff700 (LWP 8082))

#0  epoll_wait from /lib64/libc.so.6
#1  epoll_event_loop at ../tevent_standard.c line 281
#2  std_event_loop_once at ../tevent_standard.c line 565
#3  _tevent_loop_once at ../tevent.c line 504
#4  tevent_req_poll at ../tevent_req.c line 210
#5  dcerpc_binding_handle_call at ../librpc/rpc/binding_handle.c line 542
#6  dcerpc_EcDoRpc_r at gen_ndr/ndr_exchange_c.c line 12090
#7  emsmdb_transaction at libmapi/emsmdb.c line 446
#8  emsmdb_transaction_wrapper at libmapi/emsmdb.c line 598
#9  OpenFolder at libmapi/IMsgStore.c line 96
#10 e_mapi_connection_open_personal_folder at e-mapi-connection.c line 1043
#11 cmf_open_folder at camel-mapi-folder.c line 80
#12 mapi_folder_synchronize_sync at camel-mapi-folder.c line 1685
#13 mapi_refresh_folder at camel-mapi-folder.c line 898
#14 mapi_folder_refresh_info_sync at camel-mapi-folder.c line 1558
#15 camel_folder_refresh_info_sync at camel-folder.c line 3927
#16 refresh_folders_exec at mail-send-recv.c line 1044
#17 mail_msg_proxy at mail-mt.c line 423
#18 ?? from /lib64/libglib-2.0.so.0
#19 ?? from /lib64/libglib-2.0.so.0
#20 start_thread from /lib64/libpthread.so.0
#21 clone from /lib64/libc.so.6

Comment 12 Milan Crha 2012-07-16 15:59:34 UTC

Created attachment 218923
ema patch (mailter part)

for evolution-mapi;

This enables reconnect for the MAPI mailer account only. Note that it requires samba4 beta2. The book/cal backends are waiting for the outcome of [1]; only then will I close this bug report.

[1] https://mail.gnome.org/archives/evolution-hackers/2012-July/msg00008.html

Comment 13 Milan Crha 2012-07-16 16:01:26 UTC

Created commit 4f82687 in ema master (3.5.5+)

Comment 14 Derek Atkins 2012-07-16 16:47:38 UTC

Milan, I don't understand what Samba has to do with anything?  I'm running against Exchange and still having this issue.  It's purely an Evolution issue.

Comment 15 Milan Crha 2012-07-17 05:16:12 UTC

evolution-mapi is using OpenChange's libmapi to communicate with Exchange servers. OpenChange uses samba4 to connect to Exchange servers. Thus samba4 has everything to do with this.

If you look at the backtrace in comment #11, the gen_ndr/ and libmapi/ functions are from OpenChange, which calls a librpc function provided by samba4, as is tevent at the top of the backtrace. Earlier versions of samba4 (before beta2, which I'm testing with right now) didn't set a timeout on simple RPC calls, so the call just got stuck with no way to recover. Beta2 times out the request after several seconds (I didn't measure it; it's about a minute), so the caller, in our case evolution-mapi through OpenChange, can eventually recover and act accordingly.
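
The recovery pattern this makes possible can be sketched roughly as follows, in Python for brevity. This is only an illustration of "drop the connection when a call fails, reconnect, retry", not the code from the attached patch; CallFailed stands in for a MAPI_E_CALL_FAILED-style error, and Session, connect_fn and op are hypothetical names.

```python
class CallFailed(Exception):
    """Stand-in for a call failure reported after the lower layer timed out."""


class Session:
    """Hypothetical wrapper that reconnects once a call reports failure."""

    def __init__(self, connect_fn):
        self._connect_fn = connect_fn
        self._conn = None

    def call(self, op, retries=1):
        for attempt in range(retries + 1):
            if self._conn is None:
                # (Re)establish the connection lazily before each attempt.
                self._conn = self._connect_fn()
            try:
                return op(self._conn)
            except CallFailed:
                # Treat the connection as dead so the next attempt reconnects.
                self._conn = None
                if attempt == retries:
                    raise
```

The important part is that the failure is reported at all: without a timeout in the lower layer, the except branch is never reached and nothing above it can recover.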

Comment 16 Milan Crha 2012-07-17 09:34:32 UTC

I managed to fix this for book/cal parts as well, thus this is finally finished.

Created commit 467b2c3 in ema master (3.5.5+)

Comment 17 Milan Crha 2012-07-17 10:26:09 UTC

Oops, the previous commit had an issue in EBookBackend: it connected to the server twice, preventing the local cache update if the book was set for offline use. Follow-up commit c5001b6.