GNOME Bugzilla – Bug 620095
Deadlock while filling folder list in Contacts
Last modified: 2010-05-31 20:47:37 UTC
When going to the contacts section, and selecting "properties" from the RMB menu for the Exchange's "Contacts" database, Evolution stops responding and eats up 100% of the CPU. After attaching to the running evolution process with gdb, I can step through the exchange-mapi-account-setup.c code, and I get this: ---8<--- check_node (ts=0x3697e60 [GtkTreeStore], folder=0x3b50e40, iter=...) at exchange-mapi-account-setup.c:414 414 while (gtk_tree_model_iter_next (ts_model, &iter) && !status) { (gdb) n 419 } (gdb) n 414 while (gtk_tree_model_iter_next (ts_model, &iter) && !status) { (gdb) n 419 } (gdb) n 414 while (gtk_tree_model_iter_next (ts_model, &iter) && !status) { (gdb) n 415 status = check_node (ts, folder, iter); (gdb) n 414 while (gtk_tree_model_iter_next (ts_model, &iter) && !status) { (gdb) n 415 status = check_node (ts, folder, iter); (gdb) n 414 while (gtk_tree_model_iter_next (ts_model, &iter) && !status) { (gdb) n 415 status = check_node (ts, folder, iter); (gdb) n 414 while (gtk_tree_model_iter_next (ts_model, &iter) && !status) { (gdb) n 419 } (gdb) n 414 while (gtk_tree_model_iter_next (ts_model, &iter) && !status) { (gdb) n 415 status = check_node (ts, folder, iter); (gdb) n 414 while (gtk_tree_model_iter_next (ts_model, &iter) && !status) { ---8<--- And go on for pretty much as much time as I had to spend. I can detach from the process, let it run for a few hours and come back to find it at the exact same code lines.
I've been running Evolution under gdb since yesterday, and it doesn't look like its going to finish what its doing soon. I've been looking at the relevant code (at http://git.gnome.org/browse/evolution-mapi/tree/src/account-setup-eplugin/exchange-mapi-account-setup.c?h=gnome-2-30#n391 ) and while I don't understand what it actually tries to achieve, the algorithms still looks suspicious to me and its not clear when check_node() returns TRUE to add_to_store(), for example: Supposedly the top level has one child (A), that has two children (A1,A2): * check_node() is called on A * check_node(A) doesn't find parent_folder_id * check_node(A) lists A's children and calls each one in turn: * check_node() is called on A1: * check_node(A1) finds parent_folder_id * check_node(A1) appends folder to A1 * check_node(A1) returns TRUE * check_node() is called on A2 (discarding the TRUE return value from check_node(A1)): * check_node(A2) doesn't find parent_folder_id * check_node(A2) doesn't have children * check_node(A2) doesn't have more siblings * check_node(A2) returns FALSE * the FALSE return status is stored * check_node(A) doesn't find any more children * check_node(A) doesn't find any more siblings * check_node(A) returns FALSE * add_to_store() gets FALSE return value from check_node and adds folder again, this time under A. It looks like the same folder can be added twice under, unless you can guarantee that folders are always listed in ordered DFS, and I see no such guarantee anywhere in the code - on the contrary, the fact that check_node does scan through all the children and siblings in order, and does not abort after the first TRUE return value from a recursive call, seems to suggest that DFS order is not expected.
Created attachment 162370 [details] [review] patch to fix the recursion problem in check_node() There also seems to be a problem in the recursion behavior itself, as it iterates over the same nodes multiple times in the last while loop in check_node(): - Suppose the following folder list: A, B, C - check_node(A) is called - check_node(A) finds no parent, has no children, then calls in a loop to check_node(B) followed by check_node(C) (assuming neither returns true). - check_node(B) finds no parent, has no children, then calls check_node(C) in a loop. We can see that check_node(C) is called twice because iter in is passed to check_node by value. With long folder lists the performance can quickly degrade catastrophically. See attached patch that fixes this to implement tail-recursion properly. I've tested this on Fedora 13, and with the attached patch the properties dialog for the address book appears almost instantly :-)
Thanks for a bug report and a patch. It seems to produce still the same tree for me, thus I'm committing it to master and stable.
Created commit 782dc90 in ema master (0.31.3+) Created commit d46f5bf in ema gnome-2-30 (0.30.2+)
Thanks :-)