After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 686680 - Configuration of summary fields in the addressbook
Configuration of summary fields in the addressbook
Status: RESOLVED FIXED
Product: evolution-data-server
Classification: Platform
Component: Contacts
3.8.x (obsolete)
Other Linux
: Normal normal
: ---
Assigned To: evolution-addressbook-maintainers
Evolution QA team
Depends on:
Blocks: 687281
 
 
Reported: 2012-10-23 07:18 UTC by Tristan Van Berkom
Modified: 2013-09-14 16:56 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Fixed syntax errors in previous patch (9.82 KB, patch)
2012-11-07 07:31 UTC, Tristan Van Berkom
rejected Details | Review
Add string <--> enumeration converter utilities (needed by the ESourceAddressBookConfig extension) (3.26 KB, patch)
2012-11-07 10:43 UTC, Tristan Van Berkom
reviewed Details | Review
Add e_contact_field_type() introspection (with an added boxed type for string lists) (5.42 KB, patch)
2012-11-07 10:46 UTC, Tristan Van Berkom
committed Details | Review
Add ESourceAddressBookConfig extension (26.22 KB, patch)
2012-11-07 10:48 UTC, Tristan Van Berkom
reviewed Details | Review
Big refactoring of EBookBackendSqliteDB object (59.55 KB, patch)
2012-11-07 10:52 UTC, Tristan Van Berkom
reviewed Details | Review
Make EBookBackendFile consult the new extension and configure the SQLite object properly (6.49 KB, patch)
2012-11-07 10:54 UTC, Tristan Van Berkom
reviewed Details | Review
Test case using configurable summary fields (10.83 KB, patch)
2012-11-08 06:24 UTC, Tristan Van Berkom
accepted-commit_now Details | Review
Fix e_book_backend_sexp_match_contact() for multi-valued fields (2.25 KB, patch)
2012-11-08 08:27 UTC, Tristan Van Berkom
committed Details | Review
Benchmark: time to fetch contacts by UID (74.20 KB, image/png)
2012-11-08 10:21 UTC, Tristan Van Berkom
  Details
Benchmark: time to fetch contact by full name prefix (86.01 KB, image/png)
2012-11-08 10:26 UTC, Tristan Van Berkom
  Details
Benchmark: time to fetch contacts by email address prefix (83.87 KB, image/png)
2012-11-08 10:32 UTC, Tristan Van Berkom
  Details
Benchmark: time to fetch contacts by exact email address (83.40 KB, image/png)
2012-11-08 10:34 UTC, Tristan Van Berkom
  Details
Benchmark: time to fetch contacts containing a pattern in the email address (90.72 KB, image/png)
2012-11-08 10:38 UTC, Tristan Van Berkom
  Details
Add string <--> enumeration converter utilities (needed by the ESourceAddressBookConfig extension) (revision 2) (3.21 KB, patch)
2012-11-20 08:24 UTC, Tristan Van Berkom
committed Details | Review
Add utility function for string normalization (originally from EBookBackendSexp) (2.29 KB, patch)
2012-11-20 08:26 UTC, Tristan Van Berkom
committed Details | Review
Add ESourceBackendSummarySetup extension (26.80 KB, patch)
2012-11-20 08:30 UTC, Tristan Van Berkom
none Details | Review
Big refactoring of EBookBackendSqliteDB object (revision 2) (62.06 KB, patch)
2012-11-20 08:32 UTC, Tristan Van Berkom
committed Details | Review
Make EBookBackendFile consult the new extension and configure the SQLite object properly (revision 2) (3.15 KB, patch)
2012-11-20 08:34 UTC, Tristan Van Berkom
committed Details | Review
Test case using configurable summary fields (revision 2) (10.83 KB, patch)
2012-11-20 08:35 UTC, Tristan Van Berkom
committed Details | Review
EBookBackendSexp: use e_util_utf8_normalize() (3.36 KB, patch)
2012-11-20 08:37 UTC, Tristan Van Berkom
committed Details | Review
Add ESourceBackendSummarySetup extension (revision 3) (26.80 KB, patch)
2012-11-20 08:46 UTC, Tristan Van Berkom
committed Details | Review
Avoid errors stemming from conflicting summaries (12.00 KB, patch)
2012-11-23 09:53 UTC, Tristan Van Berkom
none Details | Review
Avoid errors stemming from conflicting summaries (revision 2) (12.82 KB, patch)
2012-11-23 10:22 UTC, Tristan Van Berkom
none Details | Review
Avoid errors stemming from conflicting summaries (revision 3) (18.67 KB, patch)
2012-11-23 11:37 UTC, Tristan Van Berkom
committed Details | Review

Description Tristan Van Berkom 2012-10-23 07:18:13 UTC
The field summaries in EDS use a hard-coded set of fields: 
   uid, rev, file_as, nickname, full_name, given_name, family_name, 
   email_1, email_2, email_3, email_4, is_list, list_show_addresses, 
   wants_html

Additionally, the indices are currently hard coded to be applied exclusively to
the "full_name" and "email_1" fields.

This list of fields stored for quick reference in the SQLite is useful for
email clients but ignores the needs of phone books or caller UIs.

As indices are the single-most important feature to improve raw database 
performance; mobile applications, for instance, would benefit from storing the 
phone number as an indexed field in the summary.

We would propose an ESourceExtension api for the local addressbook to configure
which EContactFields should be stored in their own columns in the summary,
and additionally configure which of those summary fields should be indexed for
even faster query results.

Will be attaching proposed patches shortly...
Comment 1 Tristan Van Berkom 2012-10-31 08:55:30 UTC
There's a minor problem with this I'd like to bring up for discussion.

The approach I have just about in place for this has the bad side
effect of breaking the "fields-of-interest" semantics, there are a
few ways to get around it but first here is the reasoning behind
why it's broken:

   a.) The SQLite indexes are only effective on case sensitive queries
       (actually the current code creates indexes that are never used in
       queries simply because the queries are case insensitive).

       I've found that "PRAGMA case_sensitive_like = ON" causes the
       indexes to be used and in this way, we have serious performance
       gains when querying address books with large number of contacts.

       To this end, in order to mimic the case insensitive searches I've
       simply normalized the data stored in the SQLite columns.

       However, this means the data stored in those columns can no longer
       be used to construct vcards.

   b.) It's important to support multi-valued EContactFields, so that
       queries can be done on those.

       For instance current SQLite code has special casing such as:

       if (query searches for 'email_1') try (email_2 && email_3 && email_4)

       So we don't want to be stuck with single valued EContactFields, as
       specially if we want indexed searches on phone numbers, we dont want
       to be stuck with only E_CONTACT_MOBILE and E_CONTACT_PHONE_HOME etc,
       rather to search for a single multivalued E_CONTACT_TEL gets you
       the result if the contact had "any telephone number matching
       the query".

       Currently this is straight forward enough, but it means we cant
       properly construct a VCard from the summary information if the
       fields-of-interest require inclusion of multi-attribute fields
       (because we lose the attribute TYPE=foo values for those values).


As far as I can see, the most important use case of the fields-of-interest
is to construct contacts with only the UID/REV fields... this is easily done
because we dont really need to normalize the UID/REV values as it's not so
important to have those particular fields in case insensitive queries
(so to clarify, queries which try to match the UID or REV fields would
always be case sensitive).

What I suggest:

   a.) If fields-of-interest is only the UID/REV... then construct the
       vcards from the results of the query directly (SELECT uid rev FROM ...)

   b.) If fields-of-interest contain other fields, then pull the full vcard
       from the DB... and filter out the unwanted data before returning
       the results.

       This is a little undesirable because it causes extra processing
       before returning the results, but I think the performance gained
       (in terms of query result time) clearly outweigh this concern.
Comment 2 Tristan Van Berkom 2012-11-07 07:31:25 UTC
Created attachment 228337 [details] [review]
Fixed syntax errors in previous patch

This patch is the same as the last one which adds
e_book_backend_sqlitedb_get/set_revision()

With a syntax error fixed, changed:
   UPDATE TABLE folders SET version = 2
to:
   UPDATE folders SET version = 2
Comment 3 Tristan Van Berkom 2012-11-07 07:32:32 UTC
Review of attachment 228337 [details] [review]:

Sorry for spam... I have many bug windows... this patch is for another bug.
Comment 4 Tristan Van Berkom 2012-11-07 10:43:09 UTC
Created attachment 228350 [details] [review]
Add string <--> enumeration converter utilities (needed by the ESourceAddressBookConfig extension)
Comment 5 Tristan Van Berkom 2012-11-07 10:46:44 UTC
Created attachment 228351 [details] [review]
Add e_contact_field_type() introspection (with an added boxed type for string lists)

This patch adds some introspection around EContactFields

This is needed for customizable summary fields to have their types recognized,
it also adds a boxed type to indicate which EContactFields are multivalued
(a list of allocated strings is returned by e_contact_get() for those types
of contact fields).

We now install a property which indicates this string list type, however it
does not effect the return value of e_contact_get(), it only advertises the
contact field with higher specificity.
Comment 6 Tristan Van Berkom 2012-11-07 10:48:21 UTC
Created attachment 228352 [details] [review]
Add ESourceAddressBookConfig extension

The new ESourceExtension is used to indicate which contact fields should be 
stored in the summary for quick lookups, and furthermore which contact fields
should be optimized for prefix or suffix lookups.
Comment 7 Tristan Van Berkom 2012-11-07 10:52:46 UTC
Created attachment 228353 [details] [review]
Big refactoring of EBookBackendSqliteDB object

The SqliteDB object now has a _new_full() constructor variant which allows specification of which fields should be in the summary.

Futhermore much work has been done towards optimizing searches, strings
are now normalized for fully indexed searches and maintained in reverse
for optimized suffix searches... multi-valued (attribute list) typed
contact fields are now also supported by the SQLite.

As a drawback, currently we do not construct vcards from the summary fields
but instead return the stored vcard directly... this is because the string
normalization optimizing the searches is no longer usable to create contact
vcards on the fly anymore... we may make an exception for the UID/REV case
in a later patch.
Comment 8 Tristan Van Berkom 2012-11-07 10:54:10 UTC
Created attachment 228354 [details] [review]
Make EBookBackendFile consult the new extension and configure the SQLite object properly

This patch makes the local file backend leverage the new features in the SQLite
object and consult the new configurations in the ESourceAddressBookConfig.
Comment 9 Tristan Van Berkom 2012-11-08 06:24:35 UTC
Created attachment 228436 [details] [review]
Test case using configurable summary fields

This test case adds some custom vcards and performs some queries on multi-attribute fields such as E_CONTACT_TEL and E_CONTACT_EMAIL
Comment 10 Tristan Van Berkom 2012-11-08 08:27:06 UTC
Created attachment 228440 [details] [review]
Fix e_book_backend_sexp_match_contact() for multi-valued fields

This patch fixes the fallback default comparisons concerning email & phone numbers

Those matches basically were hard coded to be restricted to the hard coded
value types defined in EContactField, so when querying for E_CONTACT_TEL
or E_CONTACT_EMAIL, if the vcard contains phone numbers or emails with 
TYPE attributes not accounted for in the hard coded list, it would not
return results.

Seems it was already properly done for chat addresses, just extending it
so that phone numbers and email address searches are also a bit more
flexible.
Comment 11 Tristan Van Berkom 2012-11-08 10:18:15 UTC
I ran some benchmarks today comparing performance of contact fetches
from EDS 3.4, 3.6 and with the features introduced by the attached patches
(those ones are marked as "Custom" in the charts I'll be attaching here).

We've previously been testing this with E_CONTACT_TEL since we're mostly
interested in better performance for telephone number lookups, but in
this case I swapped that out and used E_CONTACT_EMAIL instead.

This is the code snippet I used to configure the addressbook for
the "Custom" benchmarks:

e_source_address_book_config_set_summary_fields (config,
                                                 E_CONTACT_FULL_NAME,
                                                 E_CONTACT_FAMILY_NAME,
                                                 E_CONTACT_GIVEN_NAME,
                                                 E_CONTACT_EMAIL,
                                                 0);
e_source_address_book_config_set_indexed_fields (config,
                               E_CONTACT_GIVEN_NAME, E_BOOK_INDEX_PREFIX,
                               E_CONTACT_FAMILY_NAME, E_BOOK_INDEX_PREFIX,
                               E_CONTACT_FULL_NAME, E_BOOK_INDEX_PREFIX,
                               E_CONTACT_EMAIL, E_BOOK_INDEX_PREFIX,
                               E_CONTACT_EMAIL, E_BOOK_INDEX_SUFFIX,
                               0);
Comment 12 Tristan Van Berkom 2012-11-08 10:21:38 UTC
Created attachment 228451 [details]
Benchmark: time to fetch contacts by UID

This benchmark shows time to fetch contacts by UID

The main difference between this benchmark and previous versions is that
the index created for UID is actually used, and in previous versions
full table scans are used to fetch contacts by UID.
Comment 13 Tristan Van Berkom 2012-11-08 10:26:25 UTC
Created attachment 228452 [details]
Benchmark: time to fetch contact by full name prefix

This one shows that when searching for full name prefixes, the index is also
actually used.

I'll omit the "contains" searches, because they are rather similar to older
versions if EDS... this is because the E_CONTACT_FULL_NAME is in the summary
in older versions of EDS and with "contains" searches full table scans are needed
in any case, so there is no real difference there.

However this shows how it pays off to use prefix searches over contains searches.
Comment 14 Tristan Van Berkom 2012-11-08 10:32:22 UTC
Created attachment 228454 [details]
Benchmark: time to fetch contacts by email address prefix

This benchmark shows the performance gains when searching for emails by prefix
(when the email address is indeed configured to be in the summary and configured
to have a prefix index)

Here we see performance gains for 2 reasons, the searches are performed to
match *any* email address, i.e.:
  e_book_query_field_test (E_CONTACT_EMAIL, E_BOOK_QUERY_BEGINS_WITH, "pattern")

One reason is that such email searches never used the summary before
(actually there was code in SQLiteDB trying to match up "email" with 
"email_[1-4]" hard coded fields, but that code never ran because "email"
never qualified as a summary query).

The other reason is because the search is actually indexed.
Comment 15 Tristan Van Berkom 2012-11-08 10:34:36 UTC
Created attachment 228455 [details]
Benchmark: time to fetch contacts by exact email address

This one shows the times to fetch contacts exact matching emails.

It's a little bit more impressive than the email prefixes because
exact matches need to traverse less buckets in SQLite's btree to get
to the results, so the query time is much more stable.
Comment 16 Tristan Van Berkom 2012-11-08 10:38:37 UTC
Created attachment 228456 [details]
Benchmark: time to fetch contacts containing a pattern in the email address

This one is probably a more common search query used by Evolution.

It's a query of E_CONTACT_EMAIL with an E_BOOK_QUERY_CONTAINS search

While not as quick as other email searches (prefix & exact), it's still
a good performance gain when compared to parsing all the vcards and
searching for emails.

Furthermore, it will return results for ANY specified emails in the contact
vcards (not only emails marked as "work" or "home").
Comment 17 Milan Crha 2012-11-12 13:39:38 UTC
Review of attachment 228350 [details] [review]:

OK, doesn't look bad, except the coding style (we use tabs, not two spaces for indent), braces are not placed properly, and g_return_val_if_fail() should return gboolean, not gint. And at the end are added two new lines (I use to do that all the time myself too). Otherwise looks good, I suppose you copy&paste it from elsewhere?
Comment 18 Milan Crha 2012-11-12 13:44:26 UTC
Review of attachment 228351 [details] [review]:

Nice, I thought it'll be longer :) Feel free to commit. Also note that you can use g_list_free_full(), instead of a pair g_list_foreach() && g_list_free().
Comment 19 Milan Crha 2012-11-12 13:45:24 UTC
Review of attachment 228351 [details] [review]:

I forgot to add, please write to the public function comments also "Since: 3.8" at the end, thus it's known when it was added. Thanks.
Comment 20 Milan Crha 2012-11-12 16:40:36 UTC
Review of attachment 228352 [details] [review]:

You forgot to add config.h at the beginning of the new .c file (see e-book-client.c how it's done).

Would it be possible to set the same licence as is in other book files, like in the mentioned e-book-client.c?

There are used deprecated glib symbols, namely g_mutex_new() and g_mutex_free(). Could you replace them, please?

s/litteral/literal/ unless it's intentional

If you make the functions with variable parameters NULL terminated, instead of zero terminated, then you can use also G_GNUC_NULL_TERMINATED flag, which takes care and will warn if the caller doesn't end his/her list with the NULL. It's quite useful with gcc, I would say.

I'm not sure with the naming. The "Config" suffix is used in evolution itself for UI configuration, while this is non-UI thing. What about ESourceBackendSummarySetup? Is it good from your point of view too? Then you can place it into libedataserver/ and it'll be, maybe, once used for both book and calendar backends.

There should be some indication from the backend side that it supports this extension. I suggest to use some new static capability for it. (I didn't read other patches yet, if you've it done there, then even better.)

What will happen if there will be more applications, each requesting different set of summary fields? Will there be used a union of them, or a common fields, or... The thing is that the summary can get quite large in such situation, cannot it?
Comment 21 Milan Crha 2012-11-12 17:38:58 UTC
Review of attachment 228353 [details] [review]:

This doesn't apply cleanly on git master, unfortunately. I get:
13 out of 39 hunks FAILED -- saving rejects to file addressbook/libedata-book/e-book-backend-sqlitedb.c.rej
2 out of 2 hunks FAILED -- saving rejects to file addressbook/libedata-book/e-book-backend-sqlitedb.h.rej

I suppose you should always include UID and REV in the DB - in create_contacts_table()

String passed to g_set_error() and similar should be translated - either that, or use other mechanism how to warnd about an error (like g_print/g_message/g_warning). The thing is that GError-s can be user-visible strings.

g_utf8_strreverse() returns newly allocated buffer.

I know I added on various places constructions like this: (!error || (*error == NULL)), but I realized during the time that it is not correct. Better approach is to use a GError *local_error = NULL; and at the end, if it is not NULL, g_propagate_error (error, local_error); That way you do not depend on the callers will of passing in an error structure or not.

With respect of the string normalization, would it make sense to use chars_to_unistring_lowercase() from e-book-backend-sexp.c? It removes accents and makes the string lowercase, which was requested some time ago.

I didn't see anything else just from the code reading.

::: addressbook/libedata-book/e-book-backend-sqlitedb.c
@@ +572,3 @@
+		}
+
+		if ((ebsdb->priv->summary_fields[i].index & INDEX_SUFFIX &&

Include parenthesis here, like (a & b) != 0 &&... It seems to be shifted a bit in the statement

@@ +1487,2 @@
  *
+ * Deprecated: 3.6: Use e_book_backend_sqlitedb_check_summary_fields() instead.

I would keep there the "Since:" it didn't change. The Deprecated: should be 3.8 :)
Comment 22 Milan Crha 2012-11-12 17:42:19 UTC
Review of attachment 228354 [details] [review]:

Otherwise looks pretty straightforward.

::: addressbook/backends/file/e-book-backend-file.c
@@ +68,3 @@
+ * avoids a "statement with no effect" compiler warning.
+ * FIXME Use g_type_ensure() once we require GLib 2.34. */
+#define REGISTER_TYPE(type) \

eds does depend on the GLib 2.34, you can remove this and use the function directly.
Comment 23 Milan Crha 2012-11-12 17:45:32 UTC
Review of attachment 228436 [details] [review]:

Looks fine, though I didn't test it in runtime.
Comment 24 Milan Crha 2012-11-12 17:49:18 UTC
Review of attachment 228440 [details] [review]:

Looks fine, please commit (and a little nitpick).

::: addressbook/libedata-book/e-book-backend-sexp.c
@@ +185,3 @@
+
+	for (l = list; l; l = l->next) {
+		gchar *phone = l->data;

This can be 'const' too.
Comment 25 Tristan Van Berkom 2012-11-20 06:53:33 UTC
Comment on attachment 228351 [details] [review]
Add e_contact_field_type() introspection (with an added boxed type for string lists)

Committed as:

commit 58f38ab1969cb140167585cd8d8080489b49a339
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Fri Oct 5 18:35:05 2012 +0900
Comment 26 Tristan Van Berkom 2012-11-20 06:56:52 UTC
Comment on attachment 228440 [details] [review]
Fix e_book_backend_sexp_match_contact() for multi-valued fields

Changed to use a const pointer and committed as:

commit 9346175a64f01dcafbe492cad838c5d8dda99c20
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Thu Nov 8 17:13:07 2012 +0900
Comment 27 Tristan Van Berkom 2012-11-20 08:24:20 UTC
Created attachment 229445 [details] [review]
Add string <--> enumeration converter utilities (needed by the ESourceAddressBookConfig extension) (revision 2)

This patch fixes the issues mentioned in the review (yes, it suffered from copy&pastitice)
Comment 28 Tristan Van Berkom 2012-11-20 08:26:31 UTC
Created attachment 229446 [details] [review]
Add utility function for string normalization (originally from EBookBackendSexp)

This adds e_util_utf8_normalize() and is added here so it can be shared
with e-book-backend-sqlitedb.c & e-book-backend-sexp.c
Comment 29 Tristan Van Berkom 2012-11-20 08:30:18 UTC
Created attachment 229449 [details] [review]
Add ESourceBackendSummarySetup extension

This patch replaces the ESourceAddressBookConfig with ESourceBackendSummarySetup

However I did not put it in libedataserver, it's api is libebook specific
(i.e. it works directly with EContactFields, which I think is a nice api
to have).

Comment added specifying that this is really only used to setup an addressbook,
it cannot modify addressbook configurations after creation (however that
may be subject to change).

Other review comments regarding the ESourceAddressBookConfig have also been
addressed (i.e. use GMutex in non-deprecated way, etc.).
Comment 30 Tristan Van Berkom 2012-11-20 08:32:52 UTC
Created attachment 229450 [details] [review]
Big refactoring of EBookBackendSqliteDB object (revision 2)

This patch replaces the old patch for sqlitedb

The patch review comments have been addressed in this patch, and
e_book_backend_sqlitedb_new_full() now takes an ESourceBackendSummarySetup
object instead of 5 additional parameters (which we thought would be a nicer
API here).
Comment 31 Tristan Van Berkom 2012-11-20 08:34:24 UTC
Created attachment 229451 [details] [review]
Make EBookBackendFile consult the new extension and configure the SQLite object properly (revision 2)

This patch makes EBookBackendFile call e_book_backend_sqlitedb_new_full()
with the extension as argument (and replaces the older patch, also addressing
review comments such as: use g_type_ensure())
Comment 32 Tristan Van Berkom 2012-11-20 08:35:56 UTC
Created attachment 229452 [details] [review]
Test case using configurable summary fields (revision 2)

This patch replaces the old test case patch (updated for new api, and
also uses g_type_ensure())
Comment 33 Tristan Van Berkom 2012-11-20 08:37:40 UTC
Created attachment 229453 [details] [review]
EBookBackendSexp: use e_util_utf8_normalize()

This patch makes EBookBackendSexp use the new e_util_utf8_normalize() function
which is also used in EBookBackendSqliteDB instead of g_utf8_casefold().
Comment 34 Tristan Van Berkom 2012-11-20 08:46:43 UTC
Created attachment 229455 [details] [review]
Add ESourceBackendSummarySetup extension (revision 3)

This is a last minute documentation fix... with all the renaming I missed
one comment in e-book-type.c mentioning the old ESourceAddressBookConfig name.
(replaces the previous patch, of course)
Comment 35 Milan Crha 2012-11-20 10:50:41 UTC
Review of attachment 229445 [details] [review]:

Fix these two coding style things, please, and commit. Thanks. Make sure the commit log will not contain ESourceAddressBookConfig - it's renamed now, right?

::: libedataserver/e-data-server-util.c
@@ +1296,3 @@
+	value = g_ascii_strtoull (string, &endptr, 0);
+	if (endptr != string) /* parsed a number */
+		*enum_value = value;

add comment below if, basically make it:
   if (endptr != string) {
      /* parsed a number */
      *enum_value = value;
   } else {

@@ +1339,3 @@
+	for (i = 0; i < eclass->n_values; i++) {
+
+		if (eval == eclass->values[i].value) {

no need for an empty line?
Comment 36 Milan Crha 2012-11-20 11:02:03 UTC
Review of attachment 229446 [details] [review]:

::: libedataserver/e-data-server-util.c
@@ -529,0 +529,28 @@
+ * e_util_utf8_normalize:
+ * @str: a UTF-8 string
+ *
... 25 more ...

what about calling:
    if (!g_utf8_validate ())
        e_util_utf8_make_valid()
    e_util_utf8_remove_accents()
    g_utf8_strdown()
rather than this 'for'? My idea is to provide as much value as possible, rather than just panic and return NULL on any input errors (like different encoding being used in vCard, which did happen in the past)

@@ +567,3 @@
+}
+
+

extra empty line
Comment 37 Milan Crha 2012-11-20 11:13:20 UTC
Review of attachment 229455 [details] [review]:

Right, as long as it uses only book-specific columns it belongs there, not to libedataserver.

I noticed you left the _set_ functions 0-terminated, not changed it to NULL-terminated with added compiler checks as I suggested above. I'm not forcing you, just consider change this - the compiler can catch simple code typos much easier than anyone during runtime.
Comment 38 Milan Crha 2012-11-20 12:50:06 UTC
Review of attachment 229450 [details] [review]:

Please address the below, or feel free to ping me about any of them, and commit.

::: addressbook/libedata-book/e-book-backend-sqlitedb.c
@@ +116,3 @@
+ * rely on this. Assuming that the frequency of matching on these would be higher than
+ * on the other fields like email_2, surname etc. email_1 should be the primary email */
+static EContactField default_indexed_fields[] = {

I'm afraid users do not take it this way, furthermore, this way you limit searches in contact lists to the first email, which has no meaning with respect of priority. You said in the commit comment that the E_CONTACT_EMAIL is supposed to replace E_CONTACT_EMAIL_X, and I agree with that. Use it here, and in default_summary_fields too, please. (You'll fix one extra bug with it too, right?) :)

@@ +126,3 @@
+};
+
+

extra empty line

@@ +129,2 @@
 static const gchar *
+summary_dbname_from_field (EBookBackendSqliteDB *ebsdb, EContactField field)

the second argument might be on the second line, aligned with the first

@@ +142,2 @@
 static gint
+summary_index_from_field_name (EBookBackendSqliteDB *ebsdb, const gchar *field_name)

argument alignment

@@ +156,3 @@
+}
+
+

extra empty line

@@ +580,1 @@
 

kind of unneeded empty line

@@ +591,3 @@
+		if (ebsdb->priv->summary_fields[i].type == G_TYPE_STRING &&
+		    (ebsdb->priv->summary_fields[i].index & INDEX_SUFFIX) != 0) {
+			g_string_append  (string, ebsdb->priv->summary_fields[i].dbname);

two spaces after "g_string_append", on both lines

@@ +594,3 @@
+			g_string_append  (string, "_reverse TEXT, ");
+		}
+

unneeded empty line

@@ +610,3 @@
+	/* Create indexes on the summary fields configured for indexing */
+	for (i = 0; success && i < ebsdb->priv->n_summary_fields; i++) {
+

empty line

@@ +643,1 @@
 

empty line

@@ +872,3 @@
+	new_field.field = field;
+	new_field.dbname = dbname;
+	new_field.type   = type;

as we spoke on IRC, extra spaces being used on the above line, same as in argument lists and variables.

@@ +924,3 @@
+ * @store_vcard: True if the vcard should be stored inside db, if FALSE only the summary fields would be stored inside db.
+ * @fields: An array of #EContactFields to be used for the summary
+ * @n_fields: The number of summary fields in @fields

replace parameters with the 'setup' extension :)

@@ +955,3 @@
+	EContactField *fields;
+	EContactField *indexed_fields;
+	EBookIndexType *index_types;

at least index_types should be initialized to NULL

@@ +961,3 @@
+	gboolean had_error = FALSE;
+	GArray *summary_fields;
+	gint n_fields, n_indexed_fields, i;

these n_* should be also initialized

@@ +967,3 @@
+
+	/* No specified summary fields indicates the default summary configuration should be used */
+	if (n_fields <= 0) {

I see that one cannot redefine indexes, unless he/she will also define used columns. Maybe write a little note about that in a documentation blob.

@@ +1004,3 @@
+	ebsdb = e_book_backend_sqlitedb_new_internal (path, emailid, folderid, folder_name,
+						      store_vcard,
+						      (SummaryField *)summary_fields->data,

add a space in front of summary_fields->

@@ +1052,3 @@
+	gint i;
+
+	/* Create the default summery structs */

s/summery/summary/

@@ +1066,3 @@
+	ebsdb = e_book_backend_sqlitedb_new_internal (path, emailid, folderid, folder_name,
+						      store_vcard,
+						      (SummaryField *)summary_fields->data,

add a space in front of summary_fields->

@@ +1125,3 @@
+			val = e_contact_get (contact, ebsdb->priv->summary_fields[i].field);
+
+			/* Special exception, never normalize the UID string */

would it make sense to never normalize REV as well? These two identify the contact and its version.

@@ +1139,3 @@
+
+				str = sqlite3_mprintf ("%Q", reverse);
+				g_string_append (string, ", ");

does it cope with NULL properly? Just wondering, I guess your test suit has there NULL too.

@@ +1210,3 @@
+			values = e_contact_get (contact, priv->summary_fields[i].field);
+
+			for (l = values; (!error || (*error == NULL)) && l != NULL; l = l->next) {

Use local_error

@@ +1232,3 @@
+				}
+
+				success = book_backend_sql_exec (priv->db, stmt, NULL, NULL, error);

same as here should be the local_error.

I know the first iteration expects no updates of the cache after it is created, but if you use here columns from the extension, then any change to it will break contact insertions, especially when there will be added a column which is not in the summary.

@@ +2109,3 @@
+				value = convert_string_value (query_term_input, FALSE, TRUE, MATCH_BEGINS_WITH);
+			else
+				value = convert_string_value (query_term_input, TRUE, TRUE, MATCH_BEGINS_WITH);

UID comparison might be always MATCH_IS
Comment 39 Milan Crha 2012-11-20 12:53:06 UTC
Review of attachment 229451 [details] [review]:

Otherwise looks fine. Commit when the rest is in.

::: addressbook/backends/file/e-book-backend-file.c
@@ +64,3 @@
 #define EDB_NOT_OPENED_ERROR      EDB_ERROR(NOT_OPENED)
 
+

extra empty line

@@ +1251,3 @@
 	backup   = g_build_filename (dirname, "addressbook.db.old", NULL);
 
+	g_type_ensure (E_TYPE_SOURCE_BACKEND_SUMMARY_SETUP);

I do not think it's required here, the below call returns created object, which holds the type definition
Comment 40 Milan Crha 2012-11-20 12:56:47 UTC
Review of attachment 229452 [details] [review]:

Looks fine
Comment 41 Milan Crha 2012-11-20 13:01:22 UTC
Review of attachment 229453 [details] [review]:

Looks fine. I see you used my initial implementation of chars_to_unistring_lowercase() in e_util_utf8_normalize(). It's interesting like opinion changes during the time :)
Comment 42 Tristan Van Berkom 2012-11-22 04:04:05 UTC
Comment on attachment 229445 [details] [review]
Add string <--> enumeration converter utilities (needed by the ESourceAddressBookConfig extension) (revision 2)

committed as:

commit ffc180a64f8ff01664e4b719e9bb38a6adfca86f
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Mon Oct 15 15:45:35 2012 +0900
Comment 43 Tristan Van Berkom 2012-11-22 04:05:33 UTC
Comment on attachment 229446 [details] [review]
Add utility function for string normalization (originally from EBookBackendSexp)

Changed function to remove the loop and use e_util_utf8_make_valid() as requested/discussed.

committed as:

commit ffc180a64f8ff01664e4b719e9bb38a6adfca86f
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Mon Oct 15 15:45:35 2012 +0900
Comment 44 Tristan Van Berkom 2012-11-22 04:06:44 UTC
Comment on attachment 229455 [details] [review]
Add ESourceBackendSummarySetup extension (revision 3)

committed as:

commit 844f1b7168f00ab5a720c99dfbd099f2b86d44eb
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Mon Oct 8 14:18:10 2012 +0900
Comment 45 Tristan Van Berkom 2012-11-22 04:07:28 UTC
Comment on attachment 229450 [details] [review]
Big refactoring of EBookBackendSqliteDB object (revision 2)

committed as:

commit ccaf242b99a27416bc9f382f7e674748e9bffe26
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Wed Nov 7 17:19:57 2012 +0900
Comment 46 Tristan Van Berkom 2012-11-22 04:08:05 UTC
Comment on attachment 229451 [details] [review]
Make EBookBackendFile consult the new extension and configure the SQLite object properly (revision 2)

Committed as:

commit 10f430b4c08805995fa2867c5c62e8a4d4154ca0
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Wed Nov 7 17:31:10 2012 +0900
Comment 47 Tristan Van Berkom 2012-11-22 04:08:47 UTC
Comment on attachment 229452 [details] [review]
Test case using configurable summary fields (revision 2)

committed as:

commit 4cedb4f1cfc9e31031ec5140f4a3fd1918d6ed15
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Mon Oct 8 16:43:06 2012 +0900
Comment 48 Tristan Van Berkom 2012-11-22 04:10:58 UTC
Comment on attachment 229453 [details] [review]
EBookBackendSexp: use e_util_utf8_normalize()

committed as:

commit 8d5ebec209206a8fdde4562de0528c296e8f45c1
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Tue Nov 20 15:40:59 2012 +0900
Comment 49 Tristan Van Berkom 2012-11-22 04:12:51 UTC
(In reply to comment #43)
> (From update of attachment 229446 [details] [review])
> Changed function to remove the loop and use e_util_utf8_make_valid() as
> requested/discussed.
> 
> committed as:
> 
> commit ffc180a64f8ff01664e4b719e9bb38a6adfca86f
> Author: Tristan Van Berkom <tristanvb@openismus.com>
> Date:   Mon Oct 15 15:45:35 2012 +0900

Oops, copy/paste error, this one was actually:

commit b963d3e26ffc32dfa0238ba930519dbe0330e17a
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Mon Nov 19 20:19:43 2012 +0900
Comment 50 Tristan Van Berkom 2012-11-22 04:15:31 UTC
I'll close this one for now.

If there are issues stemming from the new feature set then let's treat
them as separate bugs.
Comment 51 Milan Crha 2012-11-22 15:48:07 UTC
I'm reopening this due to regression:

evolution-ews is also using sqlitedb, same as evolution-mapi, for the address book. After the update I cannot use these, I get errors on console of the test-name-selector test from eds/tests/libedataserverui:
   (lt-test-name-selector:31014): e-data-server-ui-WARNING **: Cannot open
   book: near "=": syntax error

   (lt-test-name-selector:31014): e-data-server-ui-WARNING **: Cannot open
   book: unrecognized token: "9ed3338c"

   (lt-test-name-selector:31014): e-data-server-ui-WARNING **: Cannot open
   book: near "+": syntax error

   (lt-test-name-selector:31014): e-data-server-ui-WARNING **: Cannot open
   book: near "-": syntax error

When selecting Contacts book from EWS. If I run addressbook factory from a console, then I see there these errors:
   (evolution-addressbook-factory:30822): GLib-WARNING **: GError set over
   the top of a previous GError or uninitialized memory.
   This indicates a bug in someone's code. You must ensure an error is NULL
   before it's set.
   The overwriting error message was: no such table: main.AAMkAGFhNWJkY2Vj
   LTZkMGYtNDAxZi05MzJiLWM5Nzk5YjNlZWE1MQAuAAAAAACiAADM9ruVQ6Evwv91CZjQAQA
   cr3hHVWVAS5inRJae6omjAAAAg5CeAAA=_lists

   (evolution-addressbook-factory:30822): GLib-WARNING **: GError set over...
   The overwriting error message was: no such table: main.9ed3338c-d0a4
   -40cf-95e7-12a93cde388c:Global Address List_lists

   ...The overwriting error message was: no such table: main.global-address
   -list_lists

I thought there is nothing to be done to have this working from other users of addressbook's DB summary.
Comment 52 Tristan Van Berkom 2012-11-23 09:53:46 UTC
Created attachment 229709 [details] [review]
Avoid errors stemming from conflicting summaries

This patch derives the summary information from the running database
at initialization time, so if the DB was previously configured with
something other than the summary setup passed, the inserts/selects should
still work properly.

Also note, this patch adds 2 fields to the main folders table (holding
information about which fields should be stored in the multi value
table).

Since we are still in a development cycle I did not increment the DB revision
a second time...

So I suspect there would be errors if you run this against a DB that was
created after the e_book_backend_get/set_revision() functions were added.
Comment 53 Tristan Van Berkom 2012-11-23 10:22:33 UTC
Created attachment 229711 [details] [review]
Avoid errors stemming from conflicting summaries (revision 2)

Same as last patch, but upgrades to version 3
Comment 54 Tristan Van Berkom 2012-11-23 11:37:03 UTC
Created attachment 229717 [details] [review]
Avoid errors stemming from conflicting summaries (revision 3)

Like the last patch but with following changes:
  o Some added protection against weird folder names (use %Q)
  o Use table aliases in complex queries
  o Included new debugging scheme based on env var (by Milan)
Comment 55 Milan Crha 2012-11-23 12:20:59 UTC
Good, it works with this, I didn't spot any error on ews book. Please do not forget of the indent for debugging, and commit. Thanks.
Comment 56 Tristan Van Berkom 2012-11-23 13:11:30 UTC
Comment on attachment 229717 [details] [review]
Avoid errors stemming from conflicting summaries (revision 3)

committed as:

commit 4130235c9c43e50094135e59d012b6a69327dfb2
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Fri Nov 23 18:45:33 2012 +0900