GNOME Bugzilla – Bug 746276
e_client_cache_get_client_sync() gets stuck
Last modified: 2016-05-16 14:22:11 UTC
I frequently get a task stuck at "Filtering new messages in 'INBOX' (n% complete)", where n is some value less than 100. Sometimes the "(n% complete)" part is not included in the message. When this happens, I am unable to expunge the folder. The only way to resolve it is to restart evolution. Here's the stack trace of the evolution process:
Trace 234854
Thread 7 (Thread 0x7f32f6e18700 (LWP 15022))
Thread 2 (Thread 0x7f32f5615700 (LWP 19802))
Thanks for the bug report. The backtrace shows two threads running a junk test during message filtering, each on a different message of the same account. One thread is waiting for a response from some address book; the other is waiting on an EClientCache to get an ESource corresponding to some address book. I suppose the problem is that the address book doesn't respond and the junk test code waits for it indefinitely. According to the backtrace, the operation cannot even be cancelled. Finding out why the address book is waiting for so long may help to identify the issue, or to figure out what else failed here. What is your exact evolution version, please? Evolution 3.12.11 contains a change in the EClientCache [1], which may influence this. [1] https://git.gnome.org/browse/evolution/commit/?id=0e6ad402fdd0
(In reply to Milan Crha from comment #1)
> Finding out why the address
> book is waiting for so long may help to identify the issue.

Indeed. Any hints on finding that out?

> Or maybe to figure out what else failed here. What is your exact evolution
> version, please?

evolution-3.12.11-1.01.fc21

Which is upstream's 3.12.11-1 + the upstream patch to allow expunging in vfolders.

> Evolution 3.12.11 contains a change in the EClientCache
> [1], which can have influence on this.
>
> [1] https://git.gnome.org/browse/evolution/commit/?id=0e6ad402fdd0

Looking in the tarball that the RPM was built from, that patch seems to be there.
Try to get a backtrace of the running evolution-addressbook-factory (with the debuginfo package for evolution-data-server installed) to see what it is doing. The command can look like:

   $ gdb --batch --ex "t a a bt" -pid=`pidof evolution-addressbook-factory` &>bt.txt

Please check bt.txt for any private information, like passwords, email addresses, server addresses, and so on. I usually search for "pass" at least (quotes for clarity only).
*** Bug 746500 has been marked as a duplicate of this bug. ***
I tried to reproduce it here, but with no luck. I also checked the code to see whether the change mentioned in comment #1 could cause any harm (it's one of the possibilities), but the calls to e_client_cache_get_client_sync() seem to be correct, at least at first glance. I thought there might be some issue with the main context, like a wrong thread-default context being used for the EBookClient, or the main thread context being blocked for some reason, causing this stuck state, but none of that was seen here. I can try to create a debugging patch, which would print some hopefully useful information on the console, and create a scratch build for Fedora 21, if you'd be willing to give it a try. If you get this stuck state frequently, then it'll be easier for you to provide the debug log. May I do that?
Created attachment 300730 [details] addressbook-factory stack trace
(In reply to Milan Crha from comment #5)
> I can try to create a debugging patch, which would print some
> hopefully-useful information on the console, and create a scratch build for
> Fedora 21, if you'd be willing to give it a try.

Most certainly!
*** Bug 747009 has been marked as a duplicate of this bug. ***
Thanks for the update. The address book factory is idle, which might mean that the response from it is hidden somewhere. I'll try to cook the debug patches and provide packages for Fedora 21 with them (I will most likely touch both evolution-data-server and evolution).
Here [1] is the scratch build. It covers evolution only, for now. Please download it, install it and run it. It'll print many things to the console. Capture the output by the usual means, like:

   $ evolution &>log.txt

It logs only pointers, source (book/calendar/...) names and some other hopefully useful information, but nothing really private, I hope. Thanks in advance.

[1] http://koji.fedoraproject.org/koji/taskinfo?taskID=9398945
Hrm. It is strange. My system is not seeing the koji build you did as upgrading my existing version:

Examining /var/tmp/yum-root-vZb5zL/evolution-3.12.11-1.1.fc21.x86_64.rpm: evolution-3.12.11-1.1.fc21.x86_64
/var/tmp/yum-root-vZb5zL/evolution-3.12.11-1.1.fc21.x86_64.rpm: does not update installed package.
Error: Nothing to do

and rpm even doesn't see it as upgrading the existing evolution installation:

file /usr/share/locale/zh_TW/LC_MESSAGES/evolution-3.12.mo from install of evolution-3.12.11-1.1.fc21.x86_64 conflicts with file from package evolution-3.12.11-1.01.fc21.x86_64

But my installed evolution looks fine to me:

# rpm -qi evolution
Name : evolution
Version : 3.12.11
Release : 1.01.fc21
Architecture: x86_64

I've never seen RPM this confused before. Certainly I can force the install/upgrade of the koji build, but the above is just slightly concerning.
It will be due to the version. If you downgrade first, then it will work. By the way, what does that "1.01" contain?
(In reply to Milan Crha from comment #12)
> it will be due to the version.

Why is that? Both packages have the same name.

> If you downgrade first, then it will work.

If the problem were that RPM thinks that your package is older than mine, it would not complain of file conflicts but would say that a newer package is already installed. RPM is not understanding that your and my packages are both "evolution" for some strange reason.

> By
> the way, what does that "1.01" contain?

The patch to allow vfolders to be expunged.

Can you attach your debug patch to this bug?
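
One plausible explanation for the messages above is that the two release strings, "1.01.fc21" and "1.1.fc21", compare as equal, because rpm ignores leading zeros in numeric segments. In that case neither package counts as newer, and a plain install then reports file conflicts. The following is a simplified Python sketch of that segment-wise comparison, not rpm's actual implementation: the function names are hypothetical, and epoch, tilde and caret handling are left out.

```python
import re


def split_segments(version):
    # Runs of digits or runs of letters; everything else only separates segments.
    return re.findall(r"[0-9]+|[A-Za-z]+", version)


def compare_versions(a, b):
    """Return -1, 0 or 1, roughly in the way rpm orders version strings."""
    seg_a, seg_b = split_segments(a), split_segments(b)
    for x, y in zip(seg_a, seg_b):
        if x.isdigit() and y.isdigit():
            # int() drops leading zeros, so "01" and "1" are the same number.
            x_num, y_num = int(x), int(y)
            if x_num != y_num:
                return -1 if x_num < y_num else 1
        elif x.isdigit() != y.isdigit():
            # A numeric segment is treated as newer than an alphabetic one.
            return 1 if x.isdigit() else -1
        elif x != y:
            return -1 if x < y else 1
    # All shared segments are equal: the string with more segments is newer.
    if len(seg_a) == len(seg_b):
        return 0
    return -1 if len(seg_a) < len(seg_b) else 1


print(compare_versions("1.01.fc21", "1.1.fc21"))
```

Under these assumptions the two releases are indistinguishable to the comparison, while a stock release such as "1.fc21" would sort as older than either of them.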
Created attachment 301046 [details] evo-gn746276-01.patch The debug patch for evolution. To answer the rest, it was only my first idea of the things involved. No doubt, I can be wrong.
*** Bug 747271 has been marked as a duplicate of this bug. ***
Created attachment 301083 [details] evo-gn746276-02.patch Updated debug patch, which may address the issue. The problem is the lack of thread safety in EAsyncClosure (or in the way it is used here). Bad timing made the EAsyncClosure finish before it was run, so the call to run did not know that the operation was already over.
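
To illustrate the ordering problem, here is a small Python model of it, not the GLib C code from the patch. It assumes the wait behaves like a loop whose "stop" request is forgotten if it arrives before the loop starts; the class and function names (RacyClosure, SafeClosure, finish_before_run) are hypothetical. The safe variant records completion in a flag protected by a lock, so a waiter that starts late still sees it.

```python
import threading


class RacyClosure:
    """Models a wait whose stop request is lost if it arrives before run()."""

    def __init__(self):
        self._lock = threading.Lock()
        self._wakeup = threading.Condition(self._lock)
        self._running = False
        self.result = None

    def finish(self, result):
        with self._lock:
            self.result = result
            self._running = False  # has no lasting effect if run() starts later
            self._wakeup.notify_all()

    def run(self, timeout):
        with self._lock:
            self._running = True  # overwrites an earlier finish()
            # Returns False on timeout; without a timeout this would hang.
            return self._wakeup.wait_for(lambda: not self._running, timeout)


class SafeClosure:
    """Remembers completion in a flag, so the order of calls does not matter."""

    def __init__(self):
        self._lock = threading.Lock()
        self._wakeup = threading.Condition(self._lock)
        self._finished = False
        self.result = None

    def finish(self, result):
        with self._lock:
            self.result = result
            self._finished = True
            self._wakeup.notify_all()

    def run(self, timeout):
        with self._lock:
            return self._wakeup.wait_for(lambda: self._finished, timeout)


def finish_before_run(closure_cls, timeout=0.2):
    closure = closure_cls()
    worker = threading.Thread(target=closure.finish, args=("done",))
    worker.start()
    worker.join()  # force the bad timing: finish() completes before run()
    return closure.run(timeout)


print(finish_before_run(RacyClosure), finish_before_run(SafeClosure))
```

In this model the racy variant times out when the result arrives first, which corresponds to the stuck filtering task, while the flag-based variant returns immediately.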
Just a note, a patch for evolution-data-server from [1] would be a better choice, adding thread safety on each usage of the EAsyncClosure, not only to this exact place in evolution. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1207661#c5
Thanks to Brian I was able to investigate this further, and it led me to more findings and related circumstances. The EAsyncClosure thread safety is needed:

Created commit d22de17 in eds master (3.17.1+) [1]
Created commit 7473c39 in eds gnome-3-16 (3.16.2+)

Here [2] is a test build of evolution-data-server for Fedora 21 with the change applied.

I made a similar change in evolution itself too:

Created commit 07a0565 in evo master (3.17.1+)
Created commit 13d5477 in evo gnome-3-16 (3.16.2+)

[1] https://git.gnome.org/browse/evolution-data-server/commit/?id=d22de17
[2] http://koji.fedoraproject.org/koji/taskinfo?taskID=9478633
*** Bug 747647 has been marked as a duplicate of this bug. ***
*** Bug 748290 has been marked as a duplicate of this bug. ***
*** Bug 752353 has been marked as a duplicate of this bug. ***
I'm reopening this. All the above changes are still wrong, as can be reproduced with steps from bug #752353.
Let's avoid the e_async_closure() in this function completely, at the price of some semi-duplicated code. It helps a lot, so it is worth it.

Created commit a6aa24f in evo master (3.17.4+)
Created commit edd9a7f in evo gnome-3-16 (3.16.5+)
*** Bug 753308 has been marked as a duplicate of this bug. ***
*** Bug 766378 has been marked as a duplicate of this bug. ***