GNOME Bugzilla – Bug 271113
Contact searching is very slow.
Last modified: 2013-09-10 14:04:09 UTC
Simple field-test searches on a ~500 contact local addressbook take between 0.10 and 0.50 seconds on my pretty nice computer. These times are for a synchronous call to e_book_get_contacts. One-tenth of a second may sound speedy, but even a small number of searches (say, 10) add up quickly. This particularly plagues the BBDB e-plugin (evolution/plugins/bbdb) which appears to lock up when you reply to messages with large recipient lists, because so much time is spent synchronously querying the addressbook.
One solution might be to make the BBDB plugin run in a thread, but this will doubtless introduce all kinds of new complexity, and will still leave us with slow query performance in Evolution's addressbook. So I decided to poke around inside the beast. As far as I can tell, here's what happens when you call e_book_get_contacts:
1. The query is passed to the file backend: e-book-backend-file.c, e_book_backend_file_get_contact_list(). It is sent over CORBA.
2. The db3 database is opened. The backend iterates over every single record in the database and:
   2a. Reads the entire vcard out of the database (including embedded base64 images).
   2b. Converts the vcard to an EContact: e-book-backend-sexp.c, e_book_backend_sexp_match_vcard().
   2c. Runs the sexp query against the EContact.
   2d. Destroys the EContact.
   2e. If the query matched the EContact, the entire vcard string is added to a list of matches.
3. When all the vcard match strings are collected, they get sent over CORBA to the caller: e-data-book.c, e_data_book_respond_get_contact_list().
4. The list of vcard strings is then converted to EContacts on the client side: e-book-listener.c, impl_BookListener_respond_get_contact_list().
5. e_book_get_contacts returns the list of EContacts.
Step 2b seems to consume 70-90% of the total query time, based on some GTimer instrumentation. Probably the best way to fix this is to augment or replace the db3 database with a searchable index of commonly-searched fields.
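To make the index idea concrete, here is a minimal sketch in Python (not the EDS C code). Everything in it is an assumption for illustration: the VCARDS dictionary stands in for the db3 store, the helper names are hypothetical, field extraction ignores vCard parameters and folded lines, and matching is made case-insensitive by choice. The point is only that an exact-match query can be answered from a prebuilt map without parsing any vcard at query time.

```python
from collections import defaultdict

# Hypothetical stand-in for the db3 store: contact id -> raw vCard text.
VCARDS = {
    "id-1": "BEGIN:VCARD\r\nFN:Nat Example\r\nEMAIL:nat@nat.org\r\nEND:VCARD\r\n",
    "id-2": "BEGIN:VCARD\r\nFN:Foo Bar\r\nEMAIL:foo@bar\r\nEND:VCARD\r\n",
}


def extract_field(vcard, name):
    """Return the values of all lines shaped like NAME:value.

    Simplification: lines with parameters (NAME;TYPE=...:value) and
    folded lines are not handled.
    """
    prefix = name + ":"
    return [line[len(prefix):] for line in vcard.splitlines()
            if line.startswith(prefix)]


def build_index(vcards, fields=("EMAIL", "FN")):
    """Build {field: {lowercased value: set of contact ids}} once, up front."""
    index = {field: defaultdict(set) for field in fields}
    for contact_id, vcard in vcards.items():
        for field in fields:
            for value in extract_field(vcard, field):
                index[field][value.lower()].add(contact_id)
    return index


def lookup_is(index, field, value):
    """Answer an exact-match query from the index alone (no vCard parsing)."""
    return sorted(index[field].get(value.lower(), set()))


INDEX = build_index(VCARDS)
print(lookup_is(INDEX, "EMAIL", "nat@nat.org"))
```

The cost of parsing moves from every query to index build time (and to whenever a contact changes), which is the trade-off a summary of commonly-searched fields would make.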
I noticed there is an addressbook.db.summary file already in my ~/.evolution. I'm not sure what that's all about. Another option would be to just do a full-text search of the vcard strings, instead of converting to an EContact. For example, when someone searches for (is "email" "nat@nat.org"), you could search the vcard text for "EMAIL:nat@nat.org^M".
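A rough sketch of that full-text idea, in Python rather than the backend's C, assuming the query has already been reduced to a field name and a value (the function name is hypothetical). It looks for the literal "EMAIL:<address>" followed by a carriage return, anchored at the start of a line so that a property such as X-EMAIL does not match by accident.

```python
def vcard_has_email(vcard, address):
    """Cheap text test for (is "email" address) without building a contact.

    Limitations of this sketch: it is case-sensitive, and it misses
    lines that carry parameters (EMAIL;TYPE=WORK:...) or that are folded
    across several physical lines, both of which vCard allows.
    """
    needle = "EMAIL:" + address + "\r"
    # Match only at the start of the text or right after a line break.
    return vcard.startswith(needle) or ("\n" + needle) in vcard


card = "BEGIN:VCARD\r\nFN:Nat Example\r\nEMAIL:nat@nat.org\r\nEND:VCARD\r\n"
print(vcard_has_email(card, "nat@nat.org"))
```

Because of those limitations, a text test like this is probably safest as a prefilter for the common case, with the full EContact conversion kept as the fallback for anything it cannot decide.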
I should also note that I noticed while I was in there that I am listed as the sole author of many of those files, most of which I haven't touched in nearly five years. Which I think is because no one else has wanted to take responsibility for them. :-)
Created attachment 44560: patch to test
The attached patch uses the summary for get_contact_list as well. I have tried the "(is \"email\" \"foo@bar\")" and "(beginswith \"full_name\" \"foo\")" queries, and things seem to be faster with the summary. The full name comparison in the summary seems to differ from e_book_backend_sexp_match_contact and looks broken, so "(is \"full_name\" \"foo\")" queries do not return results when the complete name is given as part of the query with the above patch. But that is a totally different issue, to be fixed in the summary.
This patch seems to improve query performance for email and full name searches by 2 orders of magnitude. Really, really nice.
Created attachment 44561: fix for is full_name query
Siva's patches seem to be in HEAD now. For 127 sequential searches against email/full_name, I get an average search time of 0.014 secs, which is at least 10x better than what we saw without the summary files. Max was 0.0858, min was 0.00128. What's strange is the queries seem to get slower over time. I'll attach a graph.
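For anyone who wants to reproduce this kind of measurement, a minimal timing harness might look like the sketch below. It is written in Python for brevity and is not the instrumentation used above; run_query is a hypothetical callable standing in for one synchronous search. Keeping the per-query durations in order, rather than only the average, is what makes a slowdown over time visible.

```python
import time


def time_queries(run_query, queries):
    """Run each query sequentially and return the per-query wall-clock times."""
    durations = []
    for query in queries:
        start = time.perf_counter()
        run_query(query)  # hypothetical: one synchronous search
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return the count, mean, min and max of a non-empty list of durations."""
    return {
        "count": len(durations),
        "mean": sum(durations) / len(durations),
        "min": min(durations),
        "max": max(durations),
    }


# Example with a do-nothing query function; real numbers depend on the backend.
samples = time_queries(lambda query: None, ['(is "email" "foo@bar")'] * 5)
print(summarize(samples)["count"])
```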
Created attachment 44568: A graph of query times
Someone suggested that the query time spikes we see around query #85 are due to EDS spawning new threads for each search. This is possible, but seems strange considering that these are sequential, synchronous (not simultaneous/async) queries.
Not sure if this can be closed now, since the patch made HEAD and seems to solve the problem.
nat, are you content with closing this one here? at least punting target milestone from 2.1 to 2.2.x.
The patch at bug 312690 would help to improve the performance.
adding perf keyword
at least retargeting from 2.2 to 2.4, as no more fixes will go into the 2.2.x series. sush: please review http://bugzilla.gnome.org/buglist.cgi?short_desc_type=allwordssubstr&short_desc=&product=Evolution&product=Evolution-Data-Server&long_desc_type=allwordssubstr&long_desc=&status_whiteboard_type=allwordssubstr&status_whiteboard=&keywords_type=anywords&keywords=perf%2C+memory&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&emailreporter1=1&emailtype1=substring&email1=burtonini.com&emailtype2=substring&email2=&bugidtype=include&bug_id=&changedin=&chfieldfrom=&chfieldto=Now&chfieldvalue=&cmdtype=doit&newqueryname=&order=Reuse+same+sort+as+last+time&field0-0-0=noop&type0-0-0=noop&value0-0-0= to improve this here. thanks in advance.
The patch in bug 312690 was committed as well. Can we close this? Or at least mark the patches as committed if they really are in CVS?
first patch has been committed: http://cvs.gnome.org/viewcvs/evolution-data-server/addressbook/backends/file/e-book-backend-file.c?r1=1.26&r2=1.27 issue of the second patch has also been already fixed in the code. closing this bug as fixed. kmaraas: thanks for poking. :-)