GNOME Bugzilla – Bug 652172
EBookView: Notify with UIDs Only
Last modified: 2013-09-14 16:53:49 UTC
EBookView emits signals when contacts are changed, added, or removed. But those signals sends lists of entire EContacts, containing the entire contact data, though we often have all the data already and just want to know which contacts have changed. e_book_get_book_view() takes an EBookQuery which could be extended to take a new flag to make the EBookView send only the UID in the signals' EContacts. For instance, "X-EDS-QUERY-FLAG-NOTIFY-UID-ONLY". The UID is already stored separately for use as the primary key of the local-storage backend's BDB database, so no VCard parsing should be necessary to provide it. This would allow easier integration with the QContact API, greatly improving performance. We plan to provide a patch for this soon, but please let us know if it sounds acceptable.
What's the proposed API? It's not immediately obvious how flags can be set for EBookQuery, at least to me.
Bug #652171 mentions setting flags via the query language, so I guess it is the same here. Ross already suggested a different approach on chat: add a new API for setting flags on an EBookView. I know that I originally proposed the query extensions mechanism because I wanted to avoid modifications to the API methods and D-Bus interface underneath. But I agree with Ross that a proper solution is to do it via EBookView.
The proposed implementation changed a few times, I'm about ready to write this up so I'll just jot down here what I plan to do. Hopefully you will have time to correct me if I've misunderstood. a.) Add api to EBookView to set whether the full vcards should be sent in notifications or if only the uids should be sent. b.) Add api to EGdbusBookView to set this on a backend view, probably this should be e_[gdbus_]book_view_set_flags() and use a flags type so that any boolean options in the future could be added to the flags without having to pave out new dbus api every time. c.) Handle the new flags api only in the 'file' backend. The new flag will be E_BOOK_UIDS_ONLY or such, if the file backend "book view" is flagged with UIDS_ONLY then any notifications will be sent as shallow vcards (from what I understand its a string vcard with only the UID field). The result as I understand it would be: o There is no change to e_book_get_book_view(), this function creates a proxy object (EGdbusBookView) to receive notifications from and as far as I can see does not result in transferring any contacts. o After creating a view, one can then call a new api e_book_view_set_flags (E_BOOK_VIEW_UIDS_ONLY) and then that particular view will only ever send the UID fields on the contact vcards transfered during notifications.
> c.) Handle the new flags api only in the 'file' backend. The new flag > will be E_BOOK_UIDS_ONLY or such, if the file backend "book view" is > flagged with UIDS_ONLY then any notifications will be sent as shallow > vcards (from what I understand its a string vcard with only the UID > field). This is a crucial point. I believe that if setting the UIDS_ONLY flag has this result, it'll be identical to requesting a view and setting the "requested fields" to just "UID". In that case, no new EBookView API is needed. We can achieve the same result by implementing a special case in the file backend based on the "requested fields". My understanding was that UIDS_ONLY would go one step further: instead of returning a vCard, it returns just the UID string via D-Bus. Murray just pointed out that the EBookView signals still require a EContact. We could keep that in place and simply populate that EContact with an empty vcard and then setting the UID retrieved from D-Bus.
Yes, I agree. The meaning of the requested_fields parameter is unclear, and it's not used really anyway. The notification signals seem to be the only way to get data from the EBookView, so it's natural to think that requested_fields should influence only those signals. I suggest doing this and only trying to add new signals (that send UID ints instead of EContact*s) if the first step doesn't provide a big enough performance gain.
(In reply to comment #5) > I suggest doing this and only trying to add new signals (that send UID ints > instead of EContact*s) if the first step doesn't provide a big enough > performance gain. Agreed.
Created attachment 189779 [details] [review] Patch that implements 'requested_fields' parameter of e_book_get_book_view() to a certain extent I've created a branch and called it 'openismus-work' where I have a handful of commits towards this. The branch is based on evolution-data-server gnome-2-32 branch and the attached patch is a compilation of the commits on the new branch. Here's a run down of what the patch/branch does at this point: o Extend the 'org.gnome.evolution.dataserver.addressbook.Book.getBookView' dbus api to include the 'requested_fields' parameter. o Handle the said parameter in EDataBook o EDataBookView now stores the requested fields and has a e_data_book_get_requested_fields() api. o The 'file' backend (e-book-backend-file.c) Now check the requested fields on a given view, if there is only one requested field and that field is the UID, then shallow vcards are send with only the UID field. o Added addressbook/tests/test-ebook-get-book-view-uid-only.c which creates a book view with the requested_fields set and asserts that only the UID field is returned through the notifications. o Consequently, I needed to adjust a signal's signature and found that the marshaller files were committed, so the branch also removes the committed marshallers and fixes the Makefile to use glib-genmarshal. That said, the tests are still sometimes randomly failing on the same "cannot connect to local addressbook" errors and I'm not sure how to fix that. There are also some warning messages from e-addressbook-factory which I suspect are normal but I might be wrong so I'll quote them here: Output from running e-addressbook-factory with gdb: ---------------------------------------------------- [Thread debugging using libthread_db enabled] e-data-server-Message: adding type `EBookBackendGroupwiseFactory' e-data-server-Message: adding type `EBookBackendVCFFactory' e-data-server-Message: adding type `EBookBackendFileFactory' e-data-server-Message: adding type `EBookBackendWebdavFactory' e-data-server-Message: adding type `EBookBackendGoogleFactory' [New Thread 0x7ffff2af5700 (LWP 29772)] [New Thread 0x7ffff22f4700 (LWP 29773)] Migrating cached backend data rmdir /home/tristan/.evolution/cache FAILED: Directory not empty (contents follows) tmp Migrating local backend data rmdir /home/tristan/.evolution FAILED: Directory not empty (contents follows) cert8.db secmod.db addressbook key3.db calendar camel-cert.db mail cache tasks memos Server is up and running... [New Thread 0x7ffff1af3700 (LWP 29804)] book_view file uref [Thread 0x7ffff1af3700 (LWP 29804) exited] (e-addressbook-factory:29769): libebookbackend-WARNING **: libdb error: /opt/devel/share/evolution/addressbook/1307874004.29788.0@ragamuffin/addressbook.db: unable to flush: No such file or directory [New Thread 0x7ffff1af3700 (LWP 29805)] book_view file uref [Thread 0x7ffff1af3700 (LWP 29805) exited] (e-addressbook-factory:29769): libebookbackend-WARNING **: libdb error: /opt/devel/share/evolution/addressbook/1307874004.29788.1@ragamuffin/addressbook.db: unable to flush: No such file or directory Bye. ---------------------------------------------------- For some reason after a while the e-addressbook-factory exits normally and says "Bye", I suspect I'm not running the addressbook daemon properly.
Also I forgot to mention: I'm not sure if e-book-backend-file.c could be optimized further. Currently we still call DBcursor->c_get() to iterate through the database while we really only need the key (not sure if passing NULL as data there is acceptable or if it would even be faster to fetch the next key without fetching it's data).
I did some initial profiling and extended the test-ebook-get-book-view-uid-only.c test. Currently I have it set up to generate 1000 contacts and I've compared with smaller and larger vcards. Using smaller vcards containing the fields E_CONTACT_FULL_NAME, E_CONTACT_NICKNAME and email fields E_CONTACT_EMAIL_1-4 I got the following results: --------------------------- (ran the test several times, here are 2 samples) Loading all data from book view synchronously finished in 0.098129 seconds Loading only uids from book view synchronously finished in 0.083867 seconds Loading all data from book view asynchronously finished in 0.123957 seconds Loading only uids from book view asynchronously finished in 0.088376 seconds Loading all data from book view synchronously finished in 0.093787 seconds Loading only uids from book view synchronously finished in 0.094309 seconds Loading all data from book view asynchronously finished in 0.125320 seconds Loading only uids from book view asynchronously finished in 0.099812 seconds --------------------------- Using larger vcards (containing 24 additional strings each vcard) I got these following results: --------------------------- (2 samples of the test output using large vcards) Loading all data from book view synchronously finished in 0.365277 seconds Loading only uids from book view synchronously finished in 0.084443 seconds Loading all data from book view asynchronously finished in 0.306958 seconds Loading only uids from book view asynchronously finished in 0.097560 seconds Loading all data from book view synchronously finished in 0.347809 seconds Loading only uids from book view synchronously finished in 0.091495 seconds Loading all data from book view asynchronously finished in 0.263470 seconds Loading only uids from book view asynchronously finished in 0.101330 seconds -------------------------- The verdict so far I guess is that the performance gain of loading a book view with only UID field updates is noticeably faster with sparsely populated vcards and dramatically faster when dealing with richly populated vcards. The test case itself is available in the 'openismus-work' branch as addressbook/tests/ebook/test-ebook-get-book-view-uid-only.c
(In reply to comment #9) > I did some initial profiling and extended the > test-ebook-get-book-view-uid-only.c test. [...] > The verdict so far I guess is that the performance gain of loading > a book view with only UID field updates is noticeably faster with > sparsely populated vcards and dramatically faster when dealing > with richly populated vcards. Thanks for testing, this is a useful confirmation of the initial gut feeling that it would be worthwhile. On which hardware was this? This is likely to be CPU-bound, so the improvements on lower end Netbook-class devices should be more obvious than on desktop (multi-)processors.
It does not seem to build here: e-gdbus-egdbusbook.c: In function 'e_gdbus_book_default_init': e-gdbus-egdbusbook.c:497:19: error: '_e_gdbus_gdbus_cclosure_marshaller_BOOLEAN__OBJECT_STRING' undeclared (first use in this function) e-gdbus-egdbusbook.c:497:19: note: each undeclared identifier is reported only once for each function it appears in
I think you forgot to git add e-gdbus-marshal.h ?
(In reply to comment #10) > (In reply to comment #9) > > I did some initial profiling and extended the > > test-ebook-get-book-view-uid-only.c test. > > [...] > > > The verdict so far I guess is that the performance gain of loading > > a book view with only UID field updates is noticeably faster with > > sparsely populated vcards and dramatically faster when dealing > > with richly populated vcards. > > Thanks for testing, this is a useful confirmation of the initial gut feeling > that it would be worthwhile. On which hardware was this? This is likely to be > CPU-bound, so the improvements on lower end Netbook-class devices should be > more obvious than on desktop (multi-)processors. These tests were done on a hp pavilion dv6000 laptop which is about 4 or 5 years old with 2GB of ram and 2 AMD Turion 64bit cores clocked at 800MHz (so I've got 1.6GHz here). (In reply to comment #12) > I think you forgot to git add e-gdbus-marshal.h ? As was discussed on irc, the patch removes the previously committed marshallers and fixes the Makefile.am to automatically generate them, running autogen.sh apparently fixes the problem (however I'm a bit baffled, normally I think Makefile updates itself from Makefile.am).
(In reply to comment #13) > As was discussed on irc, the patch removes the previously committed > marshallers and fixes the Makefile.am to automatically generate them Could you submit that small change upstream, please, assuming that it's not corrected in master already.
Because the of the problems raised regarding applying downstream patches to tarballs in connection with generated marshaller files (i.e. patching marshallers.list in a tarball fails to rebuild the marshallers)... I've reformed the patchset on the openismus-work branch so that it no longer changes EDS to generate the marshallers. Perhaps I should rebase this off the new meego-pim branch ? (I presume the new meego-pim branch is based on gnome-2-32 branch anyway...)
Yes, please rebase on meego-eds branch.
Done.
This is now in the meego-eds branch: http://git.gnome.org/browse/evolution-data-server/commit/?h=meego-eds&id=2e1487f48fd4467cdff5d12f5eb17cca76e9829a (though the commit message is not ideal.) Please update this bug later when we have some idea of how much this improves performance for QtContacts-EDS. I guess we will then need to make some similar change to the new EClient API in master, though that might not have the same problem.
I updated the EDS package in MeeGo and it does not work for me, QtContacts-eds still receives full contact vCards in notifications. So I decided to run the unit tests and I get: FAIL: test-ebook-get-book-view-uid-only Both in eds-meego and openismus-work branches. I'm running the tests from inside a meego chroot.
After a hack-then-giveup-and-change-my-whole-environment nightmare caused by the dependency on an old version of the standalone gdbus-codegen: I can confirm that the test works in a normal Linux (jhbuild) environment, with this patch to make sure that the right services are actually used: http://mail.gnome.org/archives/evolution-hackers/2011-June/msg00039.html You won't need that in a meego environment, as you only have one install prefix anyway, of course. I guess we need to try using that to figure out what's wrong.
(In reply to comment #19) > I'm running the tests from inside a meego chroot. Are there some simple instructions to follow so I can get the same environment that you are using?
http://wiki.meego.com/SDK/Docs/1.2/Building_a_MeeGo_chroot_on_Linux I'm using the following image: http://download.meego.com/snapshots/latest-1.2/images/meego-tablet-ia32-pinetrail/meego-tablet-ia32-pinetrail-1.2.0.90.4.20110616.2.img
The unit test failure may be caused by something else: ** (process:24091): WARNING **: failed to add contact to addressbook: `local:/tmp/ebook-test-Z6K2WV/': Cannot add contact: db error 0x2 (No such file or directory) FAIL: test-ebook-get-book-view-uid-only
Actually, I guess that several of the other tests fail too. Maybe you do indeed need my (in-progress) patch for the unit test environment.
Yes, a few other tests fail too. However, I still believe there is something wrong with the patch since I can reproduce the problem in QtContact-EDS by simply adding an ASSERT in the callback: This fails: Q_ASSERT(e_contact_get_const(contact, E_CONTACT_FULL_NAME) == NULL);
I get the view as follows: EBookQuery *query = e_book_query_any_field_contains(""); GList *requested_field = g_list_append(NULL, (gpointer)e_contact_field_name (E_CONTACT_UID)); e_book_get_book_view(m_ebook, query, requested_field, // Optimization: we only need the uid 0, // All changes &m_ebookview, &gError); e_book_query_unref(query); g_list_free(requested_field);
I'll set up a meego environment on Monday. If you have a small test case that you know fails then that could save some time.
Tristan informed me that his patch currently only works for initial notifications, not for the later ones, which is why my QtContacts-EDS unit tests were not passing. The patch should be extended to support all later notifications. Initial notifications don't actually matter since we don't want to receive them at all.
Ok, I have a patch on 'openismus-work' branch that should fix things: ------------------------------------------------------------- commit 30035bd6454893583955f293d957450d6ea8de89 Author: Tristan Van Berkom <tristan.van.berkom@gmail.com> Date: Fri Jun 17 17:47:56 2011 +0900 Make EBookBackend notify changes with only the UID if UID is the only requested field. Also extended the test-ebook-get-book-view-uid-only.c test case to test that future changes also come with only the UID. ------------------------------------------------------------- An implication of the change is that future changes in general are all checked and trimmed down regardless if the local file backend is in use or not... (I presume that is alright unless there is a reason to restrict the behavior to the local file backend...).
(In reply to comment #29) > An implication of the change is that future changes in general are all > checked and trimmed down regardless if the local file backend is in > use or not... (I presume that is alright unless there is a reason > to restrict the behavior to the local file backend...). That is exactly what we want, assuming you mean that the trimming is done if the current view has the "uid only flag" set. The goal always was to have this working as much as possible for other backends besides file, although that is where we need it first.
I confirm that the new patch is now working for me in QtContacts-EDS tests. Thanks Tristan.
Christophe, please remember also to avoid use of the EContact API on the EContacts that you get from the notification signals. If possible, use only the base class's EVCard API. That avoids some parsing and caching of the VCard, even though that VCard will now be small.
Created attachment 190699 [details] [review] Patch to add the E_BOOK_CLIENT_VIEW_DIRECTORY flag [in master] This patch accomplishes the same but in master. For an initial attempt I changed the api slightly because it seemed to make more sense now that e_book_client_get_view() comes with no 'requested_fields' parameter anymore. The attached patch basically does the same as the patch for meego-eds branch except that it introduces a new flag on the view (or in master, the EBookClientView) called E_BOOK_CLIENT_VIEW_DIRECTORY. If an EBookClientView is flagged as a 'directory' then only UIDs are sent (the idea is that you never have the full contents of the contacts when you are only 'browsing the directory'). Possible issues: o Perhaps we want another name than "DIRECTORY", I was thinking something like "INDEX" or "TABLE_OF_CONTENTS" was going to be a nicer name than "UIDS_ONLY" which is rather bland. o Currently only the file backend checks the flags, perhaps a more general solution with slightly more processing on the server side is possible by catching/translating the vCards at the libedata-book level.
For the record the above attached patch is commit 5f83fc43276181409b21688681658d1e9938f83f on 'openismus-work-master' branch. There is also commit 996ee2901e1a1c85e331198e4529e35ec1c1c392 on the same branch which adds tests/libebook/client/test-client-get-view-directory.c and tests the new flag. I noticed that the new EClient api takes a long time to establish connections and setup contacts in the test environment (or, maybe it's just my own environment/setup that is messed up).
(In reply to comment #32) > Christophe, please remember also to avoid use of the EContact API on the > EContacts that you get from the notification signals. If possible, use only the > base class's EVCard API. That avoids some parsing and caching of the VCard, > even though that VCard will now be small. Thanks, I fixed this in https://meego.gitorious.org/meego-middleware/qtcontacts-eds/commit/2f35225e56c2222afff63f738410c862be62b0dc
Review of attachment 189779 [details] [review]: This patch has been committed into the target meego-eds branch, only the patch for master branch remains.
I'm against these changes, they only makes API harder to read, with "if this, then that else if this then another thing else...". The intention of e_book_client_view_set_fields_of_interest() was just for your cases, you'll still receive the vcard, but it will be smaller, if possible. Use that API (and extend backends appropriately), please.
Ok great to get some feedback, I've looked over the master branch again and grasped the fields-of-interest api which is indeed perfect for two of our patches. (Also, I hope it's fine for returning UID+REV for cases like SyncEvolution, I suppose it's fine to force the user to use the EBookClientView api rather than adding more api bloat to both EBookClient and EBookClientView). I'm rewriting them right now and should have some new iterations very soon. Thanks again for reviewing this.
Created attachment 192634 [details] [review] New patch for uid-only-notifications in book views [targetting master] Here is a revised attempt using the fields-of-interest approach which I had completely missed last time around. It does 2 things basically: a.) Modifies the local file backend to notify with UID-only vCards when there is exactly one 'fields-of-interest' set which is the UID field. b.) Modifies e_book_backend_notify_update() to filter out anything that is not in the fields-of-interest generically (if there are any fields-of-interest set). Part 'b' here is optional but probably a decent idea, that code path if I understand correctly is called generically from EDataBook in response to any add/modify contact calls, so the backend does not have control over the content in those notifications I think (its a result of doing e_data_book_respond_create() or e_data_book_respond_modify()). Anyway, perhaps 'b' is a good idea or not... without doing it generically in EDataBook, the file backend would have to manually override some more vfuncs like EBookBackend->modify_contact() and filter the EContact from there before passing it to e_data_book_respond_modify(). On 'openismus-work-master' branch, this patch is now the following commit: commit eb226464c13cf505aad01e93e871dcaf892ef8ad Author: Tristan Van Berkom <tristan.van.berkom@gmail.com> Date: Sun Jun 26 12:05:54 2011 -0400 Handle fields-of-interest for the UID field in the file backend. This patch modifies the local file addressbook backend to notify with shallow vcards bearing only the UID if e_book_client_view_set_fields_of_interest() has been called with only the UID field requested. Additionally, the patch adds a filtering algorithm in e_book_backend_notify_update() to automatically filter vcard notifications in response to a client adding or modifying a contact. And the test case has been modified a bit, that commit is now: commit 6dcd866a3cf362b5a5c50f53141a7431ee22174e Author: Tristan Van Berkom <tristan.van.berkom@gmail.com> Date: Sun Jun 26 12:14:13 2011 -0400 Added test-client-uid-only-view.c test case. This test case ensures that an EBookClientView with the e_book_client_view_set_feilds_of_interest() set to only the E_CONTACT_UID field, notifies with shallow vcards holding only the contact UID.
Created attachment 192642 [details] [review] Patch to really handle 'fields-of-interest' with the local addressbook backend [targetting master]. Ok this patch for master is even better and more general and also addresses bug 652179 in the same patch. The patch does the following: o Makes local addressbook backend filter the notify vCards using the fields-of-interest provided by the user if any o The local addressbook backend makes a special case if only the UID is specified (because that's the only thing the local backend can really do in a more optimized way than filtering a contact). o e_book_backend_notify_update() also does the generic filtering o e_contact_copy_attributes() is added in this patch because it's more efficient and elegant than calling e_contact_get_attributes() and e_contact_set_attributes(). The commit: commit dd6a5777279c1d8f88a592d34c76a58d6d31f974 Author: Tristan Van Berkom <tristan.van.berkom@gmail.com> Date: Sun Jun 26 12:05:54 2011 -0400 Handle fields-of-interest for local addressbook backend. This patch modifies the local file addressbook backend to notify with vcards holding only the fields requested in the EBookClientView's fields-of-interest. A special case it made if only the UID is requested, in that case the notification vCard is created from an id and no additional parsing and filtering is needed. Additionally, the patch adds a filtering algorithm in e_book_backend_notify_update() to automatically filter vcard notifications in response to a client adding or modifying a contact. This patch should address both bugs: https://bugzilla.gnome.org/show_bug.cgi?id=652179 https://bugzilla.gnome.org/show_bug.cgi?id=652172 Also, now there is an added test case for the revision clause, so the 2 test cases are: commit afa57da1bfbd5026cb8268fbc26ce3605fd476e8 Author: Tristan Van Berkom <tristan.van.berkom@gmail.com> Date: Mon Jul 25 19:12:57 2011 -0400 Adding test-client-revision-view.c This test asserts that e_book_client_view_set_fields_of_interest() is working properly with the local addressbook backend with regards to views setup to only notify with the UID+REVISION. commit c2bc0eec517f47ed56476d0200abcbfd0558597a Author: Tristan Van Berkom <tristan.van.berkom@gmail.com> Date: Sun Jun 26 12:14:13 2011 -0400 Added test-client-uid-only-view.c test case. This test case ensures that an EBookClientView with the e_book_client_view_set_feilds_of_interest() set to only the E_CONTACT_UID field, notifies with shallow vcards holding only the contact UID.
The intended logic behind set_fields_of_interest is written above the function: /** * e_book_client_view_set_fields_of_interest: * @view: An #EBookClientView object * @fields_of_interest: List of field names in which the client is interested * @error: A #GError * * Client can instruct server to which fields it is interested in only, thus * the server can return less data over the wire. The server can still return * complete objects, this is just a hint to it that the listed fields will * be used only. The UID field is returned always. Initial views has no fields * of interest and using %NULL for @fields_of_interest will unset any previous * changes. * * Some backends can use summary information of its cache to create artifical * objects, which will omit stored object parsing. If this cannot be done then * it will simply return object as is stored in the cache. **/ Maybe it's not the best intention, but I hoped to provide some API where can benefit all parts (server and client) without adding much new code into any of them. The last paragraph in the above comment describes it the best, I believe. Think of the SQLite DB cache/summary as well, then you might see what I'm aiming at.
(In reply to comment #41) > The intended logic behind set_fields_of_interest is written above the function: [...] > > Maybe it's not the best intention, but I hoped to provide some API where can > benefit all parts (server and client) without adding much new code into any of > them. The last paragraph in the above comment describes it the best, I believe. > Think of the SQLite DB cache/summary as well, then you might see what I'm > aiming at. I think it's the right intention, the key point being that you are not guaranteed that vCards will be filtered down to the fields-of-interest, meaning that the backends don't always implement it or might choose not to if it's judged inefficient and slow. I'm not sure I understand how this meshes in with the EBookBackendSummary though, in my understanding this is whats going on (with the file backend): o file backend loads up the summary (some important and frequently consulted fields of the vcards are in the summary) o The summary is constantly updated in the backend o The summary is consulted when notifying the initial content of the EBookClientView and only used for queries that are known to match a subset of the contacts in the summary and only require the fields for those contacts as well (I believe it's assumed that all contacts have an entry in the summary if the backend uses a summary anyway). (actually, during the initial content notifications is the only place where the file backend plays a direct role in notifications) o After that, we trim down the vcards for notification based on the 'fields-of-interest', we never run the risk of, for example, filtering out some fields down to the 'fields-of-interest' and erroneously adding that vcard to the summary. However, some things I noticed while writing this comment: o The file backend in its current state seems to misuse the summary, it populates the summary with contacts and updates the summary contacts at the appropriate places while updating the BDB, however when a summary query is detected during BookView initial notifications, it still fetches the whole vcard from the BDB instead of just pulling it from the summary. However, pulling the raw vcard from the BDB is questionably faster than calling e_book_backend_summary_get_summary_vcard(), so that could very well just be intentional. o It's also possible that we want to add the 'fields-of-interest' hash table as an argument to the function: e_book_backend_summary_is_summary_query () (deprecate that api for a new one that includes the 'fields-of-interest' hash table). I.e., when only some 'fields-of-interest' are required, then it's possible that the query really becomes a summary query, which means in theory the backend should be able to pull up the contact data more efficiently from the summary cache. Finally I admit I don't fully understand what is meant by this piece of documentation: * Some backends can use summary information of its cache to create artificial * objects, which will omit stored object parsing. If this cannot be done then * it will simply return object as is stored in the cache. When you say "... will omit stored object parsing"... you refer to the abstract method that a backend uses to "parse the stored object" from it's persistence ? And if a contact cannot be loaded from the summary for a said query, then "... it will simply return object as is stored in the cache" ? I'm afraid this bit has me confused, does my patch make an incorrect usage of the 'fields-of-interest' api ?
Long story short (and that yours is really long :) ) theactual concept of a cache and a summary in book backends is quite ancient and is supposed to be replaced with the recently added EBookBackendSqliteDB structure, which may cover both summary and cache in one object. The advantage of moving to sqlite is also in dropping dependency on DB, which, if I recall correctly, is not an option on all platforms. Currently, the backends are checking whether the used query can be used against summary (whether it contains only fields which the summary uses), and if so, then it uses summary to get UIDs of contacts, which are later used to get contacts from a cache. The summary doesn't contain real vCards, as far as I can tell, the cache is used for that. (imagine also double storing of a vCard, once in summary, second time in a cache - it doesn't make much sense.) The new approach is similar, only with one object interaction instead of two. Concept of fields-of-interest with respect of a summary or cache is that the backend calls e_book_backend_sqlitedb_search() and gets the result. The function may check the query, and fields-of-interest, and if both query and fields-of-interest are stored in a summary, then it can skip real vCard parsing and provide an artificially created contacts which will contain only fields-of-interest (plus some mandatory fields forced by RFC for a vCard). The same might be doable with the current summary+cache concept, the SqliteDB only hides the internals under one function. The thing is that the real vCard is never changed, fields are never removed from it even the client wishes only partial contacts, because that is too inefficient, once the vCard is in a memory and the summary cannot be used for populating the result. I'm not sure whether the above text will help you at all.
(In reply to comment #43) > Long story short (and that yours is really long :) ) theactual concept of a > cache and a summary in book backends is quite ancient and is supposed to be > replaced with the recently added EBookBackendSqliteDB structure, which may > cover both summary and cache in one object. The advantage of moving to sqlite > is also in dropping dependency on DB, which, if I recall correctly, is not an > option on all platforms. > > Currently, the backends are checking whether the used query can be used against > summary (whether it contains only fields which the summary uses), and if so, > then it uses summary to get UIDs of contacts, which are later used to get > contacts from a cache. The summary doesn't contain real vCards, as far as I can > tell, the cache is used for that. (imagine also double storing of a vCard, once > in summary, second time in a cache - it doesn't make much sense.) The new > approach is similar, only with one object interaction instead of two. Unfortunately this redundancy does indeed exist with the current code (if some backends indeed use it as you describe). The EBookBackendSummary caches the summary data on an EBookBackendSummaryItem structure every time e_book_backend_summary_add_contact() is called. Later that cache can be used to construct a vcard with e_book_backend_summary_get_summary_vcard(). The file backend, does consult the uids from the summary as you describe instead of iterating over the BDB, then if it's a summary query it will go and load the raw entire vcard from the main BDB (not a partial vCard in a separately managed cache). > Concept of fields-of-interest with respect of a summary or cache is that the > backend calls e_book_backend_sqlitedb_search() and gets the result. The > function may check the query, and fields-of-interest, and if both query and > fields-of-interest are stored in a summary, then it can skip real vCard parsing > and provide an artificially created contacts which will contain only > fields-of-interest (plus some mandatory fields forced by RFC for a vCard). The > same might be doable with the current summary+cache concept, the SqliteDB only > hides the internals under one function. Ok this reveals your intent pretty clearly, yes I think it can be done with the current EBookBackendSummary + fields-of-interest but would probably be better done with the new e_book_backend_sqlitedb_search() function. Thanks for taking the time to explain, this technique indeed should be faster since you dont need to parse and regenerate a vCard (you only generate a vCard from a cache but avoid the parse step). Unfortunately it means I might have to do 2 implementations, because back in gnome-2-32 days the sqlitedb object does not exist yet. Perhaps for the old branch I can add the functions: - e_book_backend_summary_is_summary_query_with_fields_of_interest(); This would check both the query and the fields of interest. - e_book_backend_summary_get_summary_vcard_by_fields_of_interest(); This would create the cached vcard normally but only populate it with fields from the fields-of-interest. Then the local file backend will need to be fixed to actually use the summary cache (which does exist but is ignored) and filter the results by fields-of-interest only if it is indeed a summary query. Furthermore, to satisfy bug 652179 I would have to ensure that at least in the meego-eds branch, the REV field is actually part of the summary (the sqlite db thing has a more sophisticated approach allowing the 'store_vcard' parameter, but I dont think it's worth the effort replicating all of that just for a temporary old branch). For master, I will probably start by proposing exactly the same patch using EBookBackendSummary with a couple enhancements, and then possibly follow up with a port of the local file backend to use the new sqlitedb stuff (that porting work while important, might be orthogonal to this actual fix). > The thing is that the real vCard is never changed, fields are never removed > from it even the client wishes only partial contacts, because that is too > inefficient, once the vCard is in a memory and the summary cannot be used for > populating the result. > > I'm not sure whether the above text will help you at all. Yes it was very helpful thanks, I'm still getting a grip on all the sources and had not yet discovered the use of e-book-backend-sqlitedb.c.
I noticed myself the REV field not being part of the SQLite db summary, and I'm about to add it there. All the SQLite db book cache is too new, and it seems unfinished for some usages too, which is just subject to change. With respect of old branch, you might not, technically, change API there. And even adding things into API is probably not the best thing, though sometimes doable. It's up to you how you'll do that. I care of master branch only, because it's too much to take care of all the branches for me.
(In reply to comment #44) > (In reply to comment #43) > Unfortunately it means I might have to do 2 implementations, because back > in gnome-2-32 days the sqlitedb object does not exist yet. You are talking about the "report UID + REV" feature here, right? For that it is sufficient to focus on the current master branch with sqlite. For "report UIDs" we have a hack in MeeGo which is good enough as an interim solution (custom API). "report UID + REV" is an improvement for SyncEvolution, but SyncEvolution in MeeGo 1.2 will not take advantage of that, so it is not needed there.
Created attachment 192892 [details] [review] Patch to e-book-backend-sqlitedb.c [master] This patch fixes up sqlitedb.c so that fields can be more easily added and remove (and so that the search function filters out any unrelated fields). commit 6892c20d18d1f9cc9fe3cfaa88b6cc2ff315abec Author: Tristan Van Berkom <tristan.van.berkom@gmail.com> Date: Fri Jul 29 19:28:08 2011 -0400 Handle summary fields and fields of interest better in e-book-backend-sqlitedb.c This patch dramatically changes the sqlitedb cache code by introducing a table (array of structures) describing all of the fields which should be included in the (summary) cache. Thus, all code that treats the summary fields by hand previously now consults the cache generically. The REV field is added to the summary table, the UID is always returned in any results from e_book_backend_sqlitedb_search() and when 'fields_of_interest' is specified then the sqlite3 db will only be queried for the fields_of_interest + UID (thus only those fields will be present in any virtually created vcard objects).
Created attachment 192893 [details] [review] Patch to really handle 'fields-of-interest' with the local addressbook backend [targetting master, take 2]. This is the new version of the patch for master and uses the sqlitedb approach. The patch still contains the extra general filtering algorithm in e_book_backend_notify_update(), perhaps it should also be removed and the EBookBackendFile just implement the responses of ->add_contact() and ->modify_contact() using vcards from the sqlite cache. The commit on 'openismus-master-branch' (freshly rebased off of master today): commit 01f5a4102691f1e0eae4a8233859c1da0363e92a Author: Tristan Van Berkom <tristan.van.berkom@gmail.com> Date: Sun Jun 26 12:05:54 2011 -0400 Handle fields-of-interest for local addressbook backend. This patch refactors the local addressbook backend to use the new sqlitedb api instead of the old summary apis. The result is that vcards are virtually built from the sqlitedb cache when fields-of-interest is set. Additionally, the patch adds a filtering algorithm in e_book_backend_notify_update() to automatically filter vcard notifications in response to a client adding or modifying a contact. This patch should address both bugs: https://bugzilla.gnome.org/show_bug.cgi?id=652179 https://bugzilla.gnome.org/show_bug.cgi?id=652172
*** Bug 652179 has been marked as a duplicate of this bug. ***
Created attachment 192928 [details] [review] Patch to e-book-backend-sqlitedb.c [master, take 2] Here is a re-iteration of the sqlitedb patch which changes the api slightly to make it possible to use the sqlitedb cache for notification vcard contruction in response to contact additions and modifications. The new version of the sqlitedb commit on 'openismus-work-master' is: commit bfadfb4466d9c8859117ff10ae196312acda77f2 Author: Tristan Van Berkom <tristan.van.berkom@gmail.com> Date: Fri Jul 29 19:28:08 2011 -0400 Handle summary fields and fields of interest better in e-book-backend-sqlitedb.c This patch dramatically changes the sqlitedb cache code by introducing a table (array of structures) describing all of the fields which should be included in the (summary) cache. Thus, all code that treats the summary fields by hand previously now consults the cache generically. The REV field is added to the summary table, the UID is always returned in any results from e_book_backend_sqlitedb_search() and when 'fields_of_interest' is specified then the sqlite3 db will only be queried for the fields_of_interest + UID (thus only those fields will be present in any virtually created vcard objects). Additionally, e_book_backend_sqlitedb_get_vcard_string() and _get_contact() take a new 'GHashTable *fields_of_interest' argument for field filtering and e_book_backend_sqlitedb_is_summary_query() is an exported api which can be tested before calling e_book_backend_sqlitedb_get_vcard_string().
Created attachment 192929 [details] [review] Patch to really handle 'fields-of-interest' with the local addressbook backend [targetting master, take 3]. This I think is the ultimate approach using the sqlitedb cache completely to dish out virtually created vcards for notifications. Main differences from the last patch are: o EDataBookView does not parse and reconstruct the vcard o EBookBackend objects can implement the new ->notify_update vfunc in a custom way. o The file backend, when implementing ->notify_update() in response to a contact addition/modification, needs to notify all the views, so now for each view the file backend does: /* This is why it's important to export the 'is_summary_query()' api... */ if (is_summary_query (view_query, view_fields_of_interest)) { vcard = sqlitedb_get_vcard_string (db, uid, fields_of_interest); e_data_book_view_notify_update_prefiltered_vcard (view, vcard); } else e_data_book_view_notify_update_contact (view, contact); The new version of the commit on 'openismus-work-master' is: commit 8672ca1d2ae93056a097b6cfbfcfdf1547e2cb37 Author: Tristan Van Berkom <tristan.van.berkom@gmail.com> Date: Sun Jun 26 12:05:54 2011 -0400 Handle fields-of-interest for local addressbook backend. This patch refactors the local addressbook backend to use the new sqlitedb api instead of the old summary apis. The result is that vcards are virtually built from the sqlitedb cache when fields-of-interest is set. Additionally, the patch adds the EBookBackend->notify_update() vfunc which can be implemented for e_book_backend_notify_update(). The file backend uses this to notify with virtually created vcards from the sqlite cache in response to contact additions and modifications. This patch should address both bugs: https://bugzilla.gnome.org/show_bug.cgi?id=652179 https://bugzilla.gnome.org/show_bug.cgi?id=652172
Patch for sqlitedb looks good, I like the array of fields in the summary. There is one thing though, the e_book_backend_sqlitedb_is_summary_query(). It may not return TRUE if fields_of_interest are all in the DB, it might be considered as a special condition at the end, if the query fields are also part of the DB. Imagine a query which asks on a different fields then you are interested in. Patch for file book backend, looks good, I thought you'll remove most of the DB code, replacing it with sqlitedb, and only one place will be using the DB (if enabled in a compile time), when the sqlitedb is not populated, as you do it in the patch. I agree with this change: > o EDataBookView does not parse and reconstruct the vcard it was never intended to do that. Thanks for it. Overall, I ddin't test its compilation, but as long as it doesn't introduce any compiler warnings, and if you fix the first patch, then you can commit it to master. Thanks for a help with this.
(In reply to comment #52) > Patch for sqlitedb looks good, I like the array of fields in the summary. There > is one thing though, the e_book_backend_sqlitedb_is_summary_query(). It may not > return TRUE if fields_of_interest are all in the DB, it might be considered as > a special condition at the end, if the query fields are also part of the DB. > Imagine a query which asks on a different fields then you are interested in. Ok I refit this patch a little and am pushing it to master now. I tried initially as you said with an extra clause at the end but it was not sufficient, I ended up with something like this pseudo code: /* First check if the fields of interest are summary fields */ fields_are_summery = (fields && is_fields_of_interest (fields)); /* Possible early return for explicitly mentioned non-summary fields */ if (fields && !fields_are_summary) return FALSE; /* Treat the mystical sexp stuff with an exception: * * If its a general query with the special "x-evolution-any-field" * field it means it's a general query, in the case that you want * a book view with no special query and you want to report specific * fields of interest, we need to consider the special "any" field * to be a "summary field" */ while_interpreting_sexp_query () { if (fields_are_summary && this_field == "x-evolution-any-field") summary_query_is_also = TRUE; } /* At this point we return the result of the sexp code labyrinth, * if summary fields-of-interest are specified then the * 'x-evolution-any-field' is considered a valid summary field */ > > Patch for file book backend, looks good, I thought you'll remove most of the DB > code, replacing it with sqlitedb, and only one place will be using the DB (if > enabled in a compile time), when the sqlitedb is not populated, as you do it in > the patch. Sure I think we agree that that could be done orthogonally to this patch in the future. Another thing I suppose you would want to do is use a general sqlitedb database and not build a separate one for each EClient (the api seems to allow for this sharing but for the initial integration patch I figured this was not extremely important). > I agree with this change: > > o EDataBookView does not parse and reconstruct the vcard > it was never intended to do that. Thanks for it. > > Overall, I ddin't test its compilation, but as long as it doesn't introduce any > compiler warnings, and if you fix the first patch, then you can commit it to > master. Thanks for a help with this. Ok great thanks, I'm committing these to master now and closing the bug. Let me know if any problems turn up or if the above approach I described for the is_summary_query() is somehow wrong.
The described change looks good. If you search for the "x-evolution-any-field" term in sources then you may realize that there is some fixed check for this filter, which seems to work. I agree to consider this filter as being able to use with fields-of-interest.
You forgot to mention this bug number in your commit, which gave me more time to find this bug (somehow) ;) Anyway, the patch for sqlite part has a regression. I made a new constraints on the 'sexp' parameter of the e_book_backend_sqlitedb_search() and e_book_backend_sqlitedb_search_uids(). When this parameter is NULL or an empty string, then the caller asks for all available objects/uids stored inside the cache, but with your change the factory just crashes when evolution-mapi calls it this way. I fixed it in: Created commit 6241ddb in eds master (3.1.5+)
Ok I'm not sure this is the right fix, let's confirm. My primary concern right now is that changing this semantic will break how the sqlitedb_search() function is expected to react when called from the file backend. I suspect that when an EBookView setup with no query and no fields of interest set... will result in filtered vCards instead of the complete full vCards that one would expect in this condition. My understanding of the high-level EBook[Client]View apis is that: - a view setup with "no query" means "give me everything" (and to represent this under the hood I get the impression we are using a "(contains \"x-evolution-any-field\")" query, I might be wrong and real NULL queries actually make it to the backend). - a view setup with "no query" and "fields-of-interest" set means give me every vCard but only the fields-of-interest. Now when I implemented this in EBookBackendFile, I did so with this approach: if ((vcards = sqlitedb_search (query, fields-of-interest)) == NULL) vcards = load_vcards_from_real_bdb(); This is expected to work correctly because a NULL query and a NULL fields-of-interest should probably return NO vcards at all. (and we use the try/fallback approach because sqlitedb_search() does the is_summary_query() check internally and we dont want to run that check twice). This way we should properly fallback to loading *all data* for each vCard unless fields-of-interest was specified for a "no query" search. So I suspect that the right statement should be... instead of this: if (!sexp || e_book_backend_sqlitedb_is_summary_query (sexp, fields_of_interest))... It should rather be: if ((!sexp && book_backend_sqlitedb_is_summary_fields (fields_of_interest)) || (sexp && e_book_backend_sqlitedb_is_summary_query (sexp, fields_of_interest)))... I.e., it is indeed a summary query if there is no query AND fields-of-interest are indeed summary fields, or its a summary query if both the existing query and fields-of-interest (if present) are summary fields. And actually, maybe this whole api doesnt make much sense to begin with... if I search for: "every contact which has an email field that contains the text 'foo'" Why would we assume that we only want summary results for that query ? (what if I'm really after all the EContactPhoto fields for all the contacts that have the text 'foo' in an email field). A cleaner approach altogether would be to always return full vcards unless fields of interest is specified (i.e. query controls what cards to search for, fields-of-interest controls the type of vcard to return... having the query string effect the returned vCard is IMO a confusing api).
Actually after posting this comment it's became clear to me that the sqlitedb_search () api should probably change when considering this code: if ((vcards = sqlitedb_search (query, fields-of-interest)) == NULL) vcards = load_vcards_from_real_bdb(); It should rather include a 'gboolean *not_found' out parameter or such. gboolean not_found; if ((vcards = sqlitedb_search (query, fields-of-interest, ¬_found)) == NULL) { if (not_found == FALSE) vcards = load_vcards_from_real_bdb(); } The code in place does work, however in the case that the query & fields are indeed a summary query and the search fails to return any results we should not 'load_vcards_from_real_bdb()' (because we know the search has no results already so anything further is just redundant extra processing). I'll look into a patch for at least that tomorrow and reconsider this bug. Maybe I will reopen it or open a new one...
Tristan, you are writing too long posts, and I'm lazy to read it (but I did read it, of course). Basically, as you wrote at the end of the comment 56, (oh man, 56 comments...) * sexp is to filter for needed cards * fields_of_interest is to tell how rich the returned vCard can be, though it's rather "how poor", in a sense of the minimal fields expected to be returned. Having sexp with some cryptic predefined filter or NULL/empty-string for a usage inside backends should be equivalent. Of course, hiding this detail into two new API functions, to get only UIDS (maybe with rev) and full cards, for all stored objects in summary, might worth it, to not confuse API. I agree with you here. The issue with the code before my recent change was that the factory was crashing. What I didn't notice is the change in book_backend_sqlitedb_search_query(). I knew that Chen didn't make this complete, there are still some unfinished things, especially for a case where card is not stored in DB (where it might not be stored in DB for local backends, to give users a chance to recover after crash or a .db file corruption). Looking more deeply, I do not like that double processing in book_backend_sqlitedb_search_query(), it doesn't make sense to return only summary-driven artificial vCard when it is not stored in summary, and a full vCard when it is stored in summary. It should always return the same data, regardless where they are stored. It's making us trouble here, I believe. A new bug report will be better, to finish Chen's work. Let me know if you or me will do that. I'm fine with either option.
Milan, I'm not sure if you receive general bugmail for EDS, so in case you do not I'll mention it here. I've opened bug 656058 to track this and attached a problem statement followed by a couple patches.
(In reply to comment #59) > I'm not sure if you receive general bugmail for EDS, so in case you > do not I'll mention it here. You are right, I do not receive notifications of newly added bugs to eds, it would be just too much for me. Having here an explicit link to a follow-up bug is always a plus, from my point of view.