Bug 686681 – Direct read access to the addressbook.

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 686681 - Direct read access to the addressbook.


Summary:	Direct read access to the addressbook.


Status:	RESOLVED FIXED

Product:	evolution-data-server
Classification:	Platform
Component:	Contacts
Version:	3.8.x (obsolete)
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	evolution-addressbook-maintainers
QA Contact:	Evolution QA team

URL:
Whiteboard:

Depends on:	693779
Blocks:	687281

Reported:	2012-10-23 07:42 UTC by Tristan Van Berkom
Modified:	2013-09-14 16:56 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Add core support for direct read access (685.76 KB, patch) 2012-11-27 09:47 UTC, Tristan Van Berkom	none	Details \| Review
Make EBookBackendFile support DRA (Direct Read Access (4.28 KB, patch) 2012-11-27 09:49 UTC, Tristan Van Berkom	none	Details \| Review
Make test cases use DRA (2.58 KB, patch) 2012-11-27 09:50 UTC, Tristan Van Berkom	none	Details \| Review
Make EDataBook export direct read apis (41.71 KB, patch) 2013-02-14 15:09 UTC, Tristan Van Berkom	committed	Details \| Review
Patch to make EBookClient optionally use direct read access (14.52 KB, patch) 2013-02-14 15:14 UTC, Tristan Van Berkom	committed	Details \| Review
Add EBookBackend api for backends to support direct read access (3.21 KB, patch) 2013-02-14 15:17 UTC, Tristan Van Berkom	committed	Details \| Review
An EDataBookDirect object, which only wraps the dbus interface for backend convenience (4.23 KB, text/plain) 2013-02-14 15:19 UTC, Tristan Van Berkom		Details
EDataBookDirect header file (2.42 KB, text/plain) 2013-02-14 15:19 UTC, Tristan Van Berkom		Details
Patch to add the EDBusDirectBook dbus interface (2.87 KB, patch) 2013-02-14 15:20 UTC, Tristan Van Berkom	committed	Details \| Review
Make EBookBackendFile support DRA (Direct Read Access) - updated version (3.14 KB, patch) 2013-02-14 15:22 UTC, Tristan Van Berkom	committed	Details \| Review
Make test cases use DRA (1.89 KB, patch) 2013-02-14 15:23 UTC, Tristan Van Berkom	committed	Details \| Review
Make EDataBookView support dra (1.88 KB, patch) 2013-02-14 15:26 UTC, Tristan Van Berkom	committed	Details \| Review
Make EBookClientView support direct reads (8.30 KB, patch) 2013-02-14 15:29 UTC, Tristan Van Berkom	committed	Details \| Review
Test with concurrent views active (6.46 KB, text/plain) 2013-02-14 15:32 UTC, Tristan Van Berkom		Details

Description Tristan Van Berkom 2012-10-23 07:42:18 UTC

We would add additional but optional code paths to EBookClient and 
EBookClientView to directly access EDataBook and EDataBookView methods for reading contacts, bypassing D-Bus.

This will reduce the overheads of GVariant serialization and deserialization 
as well as the overhead of pushing these serialized vcards through the bus,
at least while accessing the addressbook for reading.

The code paths for write access would remain unaffected and would still use D-Bus. Write access to EDS would become exclusive for a single process, with concurrent direct read access for the other processes.

In future releases of EDS this could become the default mode, once we have
proven that possible limitations are insignificant for the important
use-cases.

Comment 1 Tristan Van Berkom 2012-11-16 10:47:44 UTC

As of now there is a functional implementation of this on 'openismus-work'
branch (based on the gnome-3-6 branch).

Before porting this to master and publishing patches, I wanted to raise
some API concerns, currently I have an API in place but it's probably not
good enough.

Firstly, let's explain a little bit what is needed in terms of inputs
in order to get direct read access up and running:

  o The full path to the shared library holding the given backend

    This actual file needs to be dlopened by the client directly in
    order to load the actual backend code needed for direct read access

  o The name of the EBackendFactory object... this is basically our
    entry point, when we call e_module_new() on the given backend
    library, this type is already registered... it's class provides
    us with access to the actual EBookBackend that we're going to run

What I currently have in place as user facing API is:

  o ESourceDirectAccess: An ESourceExtension which allows configuration
    of the first said inputs, if an addressbook is configured correctly
    with this extension... then it is available for direct read access.

  o e_book_client_new_direct():

    This api basically opens the addressbook in direct read access mode,
    if the ESourceDirectAccess extension is configured correctly then it
    will succeed to open the addressbook in direct read access mode.

Concerns
~~~~~~~~

  o Currently the APIs allow configuration of an addressbook for direct
    read access mode, when the underlying addressbook might not be safe
    to use with direct read access.

    I.e. currently this works with the SQLite local file backend because
    we enable WAL journal mode in the SQLite and rely on SQLite for concurrent
    read access to work... this might not be safe for other backends.

    So in other words, backends should be able to explicitly "opt-in" to
    this mode, perhaps with a specific D-Bus api used at initialization
    or by adding a "capability" to the backend capability properties.

  o e_book_client_new_direct() is explicit.

    Perhaps this should not be explicit, but rather another configuration,
    maybe this should be implied by the ESourceDirectAccess extension, even.

    The rationale is that, in the future we might want to automatically
    enable direct read access mode for clients without having to change
    any code on the client/caller side.

  o Environments and paths

    Here is another tricky issue, I'm not sure how evolution usually is
    setup to work on a system, but I think usually the data server processes
    run on behalf of the active user, in which case the data server persists
    data in $HOME relative directories (or XDG_*_HOME dirs).

    It is also conceivable to have a system setup where the data server
    runs with it's own user in a safe/walled account and manages data
    in that directory (possibly world readable at least where contact
    photo uris are stored).

    However; in the current setup with Direct Read Access mode enabled,
    the XDG env variables or HOME directory must be the _same_ for both
    the data server and the client accessing the data in direct read
    mode.

    This implies there may be some need to force the directly accessed
    backend (in the client's user space) to use some specific directories
    to access data owned by the data server.

API Proposal
~~~~~~~~~~~~

To address the afore mentioned concerns, this is a basic idea of what
we could do in terms of API.

  o ESourceDirectAccess extension

    Using the same name as the current extension, this extension would
    simply hold a single boolean configuration, i.e. whether this addressbook
    should be accessed with Direct Read Access or not.

  o Added D-Bus API

    If a given ESource is configured to desire Direct Read Access then
    it should go through an extra D-Bus api while e_client_open() is
    called on the given EBookClient.

    In addition to calling "open" over D-Bus, we could ask the backend
    to report the relevant information to the client over the bus:

       o Full path to the backend module

         (it's handy to have the backend itself report this because at
         least the backend knows what prefix it was compiled with and can
         reliably report the location where it is installed).

       o Name of the EBookBackendFactory type

         (i.e. EBookBackendFileFactory or other)

       o Full path to where the actual backend has it's sandbox

         (perhaps this third string should be some opaque configuration
         data that the given EBookBackend knows what to do with, for
         the local SQLite backend it would be a local directory holding
         the SQLite DB, for another backend, like the google backend
         it might be a URI or such)

    If this D-Bus method fails to report anything, then the given backend
    simply does not support Direct Read Access mode and the regular D-Bus
    pathways should be used to read the addressbook.

    Otherwise, the backend is loaded directly with the given full path
    and factory name, and then some added API such as:

      e_book_backend_configure (EBookBackend *backend, const gchar *string);

    is called to configure the backend with the last string returned in the
    D-Bus API... for the local file backend this string would of course
    be the location of the actual SQLite which must be accessed (this would
    of course override any XDG/HOME dirs which the local file backend
    would otherwise try to use to access the SQLite).

Well, those are my thoughts so far... any input on this is appreciated.

Thanks.

Comment 2 Matthew Barnes 2012-11-16 11:58:45 UTC

Per your API proposal, I think a better approach is to export a DirectAccess D-Bus interface for those backends that support it.  It would live alongside the "org.gnome.evolution.dataserver.AddressBook" interface at the same object path, and could be tested for on the client-side.  Instead of explicitly querying the backend with a method call for the information you listed above, it could simply be exported as read-only properties on this DirectAccess interface.

This approach has worked well for ESources.  In addition to a basic "Source" interface, the registry may export additional interfaces such as "Writable" and "Removable" which simultaneous advertises and enforces ESource capabilities, avoiding the need for a "capabilities" property.

I'm in the process of overhauling our backend APIs and converting the GDBus classes to be generated by gdbus-codegen (as ESource inferfaces already are), so they're much easier to maintain.  When that's done, I plan to push this approach further by splitting up the backend interfaces according to their capabilities and reducing or eliminating entirely the need for a static "capabilities" property.

Comment 3 Tristan Van Berkom 2012-11-27 09:47:32 UTC

Created attachment 229976 [details] [review]
Add core support for direct read access

This is a mega-huge patch which does the following things:

  o Splits up libebook into 2 libraries, libebook & libebook-contacts

    This is needed because now libebook will depend on libedata-book, instead
    of the other way around. In order for Direct Read Access to work
    the EBookClient needs to own an EDataBook

  o Added sync/async APIs for read operations to EDataBook

    These operations interact with the EDataBook thread pool,
    sync variants wait on queued operations to complete using
    a Mutex/Condition

  o Added e_book_client_new_direct()

    This attempts to create a direct read access mode client, if
    direct read access is not available for the given backend then
    a warning is printed and regular D-Bus methods are used instead.

  o Added EgdbusBookDirect

    A GDBus object with read only properties:
      - backend-path: The full path to the module holding the backend code
        (eg: libebookbackendfile.so)
      - backend-name: The name of the factory type loaded by the given
        module (eg: EBookBackendFileFactory)
      - config: A configuration string appropriate for the given backend
        (for the file backend, this is used to tell a directly loaded copy
        of the backend what is the directory containing the SQLite db, but
        can be used for other things depending on the backend supporting
        Direct Read Access).

  o Added EDataBookDirect

    A wrapper for the EgdbusBookDirect (since the actual GDBus types
    are private, the EDataBookDirect is creatable by the backends).

  o Added vfuncs to EBookBackendClass:

    ->get_direct_book(): If a backend implements DRA, then an EDataBookDirect
                         should be created and returned in this vfunc

    ->configure_direct(): This optional vfunc is called on a direct copy of
                          a given backend which supports DRA, the "config"
                          property is given to the direct copy using this
                          vfunc.

Comment 4 Tristan Van Berkom 2012-11-27 09:49:10 UTC

Created attachment 229977 [details] [review]
Make EBookBackendFile support DRA (Direct Read Access

This patch makes the EBookBackendFile support DRA by implementing the
extended vfuncs in EBookBackendClass

Comment 5 Tristan Van Berkom 2012-11-27 09:50:04 UTC

Created attachment 229978 [details] [review]
Make test cases use DRA

This patch makes the test cases for EBookClient all use direct read access.

Comment 6 Milan Crha 2012-11-27 10:02:59 UTC

(In reply to comment #2)
> When that's done, I plan to push this
> approach further by splitting up the backend interfaces according to their
> capabilities and reducing or eliminating entirely the need for a static
> "capabilities" property.

Replace one bool check with a type checking? It's no advantage, it's like using Javascript for a core desktop environment.

Comment 7 Milan Crha 2012-11-27 10:38:57 UTC

For the request itself, I would not do that, basically because of:
a) maintainability - who will take care of changes and possible bugs;
   it's always easier to take care of one interface, than many different
b) all described above sounds like a hack
c) I do not think the gain would be that significant, with compare of effort
   needed for maintainability. Tristan told me about ~5ms for fetch a contact
   with compare of <1ms with direct access. Well, on one side this change is
   unnoticeable by regular users, when fetching one contact; on the other
   side, it can be noticeable when fetching 1000 contacts, still they will
   be delivered in chunks in views, which is a preferred way to access
   book content in evolution's UI. If your application is using
   e_book_client_get_contacts(), then yes, you are "slowed down".

Comment 8 Mathias Hasselmann (IRC: tbf) 2012-11-27 14:20:25 UTC

5 ms * 30 contacts = 150 ms seconds per small desktop screen. that is way too slow for smooth scrolling. even for 10 contacts that's way too slow for smooth flick scrolling: when aiming for 60 fps we have just 16 ms for rendering a single frame. users notice if touch uis don't reach 60 fps.

Comment 9 Patrick Ohly 2012-11-27 16:45:41 UTC

(In reply to comment #8)
> 5 ms * 30 contacts = 150 ms seconds per small desktop screen. that is way too
> slow for smooth scrolling. even for 10 contacts that's way too slow for smooth
> flick scrolling: when aiming for 60 fps we have just 16 ms for rendering a
> single frame. users notice if touch uis don't reach 60 fps.

Scrolling through a long list of contacts without loading and buffering is one use case, although this probably will depend on some way of obtaining a sorted list of contacts first and then browsing through them with e_book_client_get_contacts_uids().

A more realistic use case is looking up a caller ID once in the current set of contacts, without bothering about future updates of that data. I'm about to implement that. The response time from EDS definitely matters.

Comment 10 Mathias Hasselmann (IRC: tbf) 2013-01-18 14:03:21 UTC

Philip van Hoof just reminded me of another reason why we need direct access __badly__.

With the current client-server architecture each read access wakes up the addressbook factory process, which contributes to battery drain.

Even worse this dependency on the factory process makes it harder, if not impossible to prioritize processes: For instance phone call handlers would run on highest priority, so that phone calls are handled instantly. Still its dependency on low-priority processes like the addressbook factory would reduce the phone call application's effective priority to the factory's priority - rendering all priorizations void. Now an obvious solution would be to also increase the addressbook factory's priority, but doing so will cause even bigger harm: Suddenly the addressbook factory would dominate all other clients, leading to slow or even stalled clients while the highly prioritized addressbook factory hammers the dbus socket.

Comment 11 Philip Van Hoof 2013-01-18 14:10:19 UTC

(In reply to comment #10)
> Philip van Hoof just reminded me of another reason why we need direct access
> __badly__.


Philip was actually trying to say that instead of copying what tracker-store has been doing for a few years now, why not use tracker-store for the purpose? It has ontology for contacts and calendar items, it still is faster than EDS (used to be magnitudes faster as proven by a study that we did in response to incorrect EDS propaganda during the few Intel-MeeGo months) and on a production phone (the N9) it is being used for that purpose too. On top is the KDE world also going in the direction of Nepomuk based storage for such data. So this step would make GNOME and KDE more compatible with each other (architecturally seen, but also with relation to eventually sharing data between applications and among applications).

I think that vertical applications with vertical storage databases, that don't or can't communicate or share their data horizontally with the other applications is the nineties.

Because copy-paste doesn't work well on touch screen, should applications consume other applications' data directly. Meaning you need infrastructure (sparql) and an ontology (a schema) to share this data.

Unfortunately is the Tracker project usually only seen as a file indexer. That's incorrect and not how its solely used on for example the N9. It serves a much bigger role in allowing all applications on the device to share each other's data directly.

Comment 12 Philip Van Hoof 2013-01-18 14:16:49 UTC

Btw. Our main reason for the direct access support (through SQLite's WAL journaling) was process prioritization (in particular for the N9's phone app, which needed absolute priority - also above tracker-store). Because if read queries would have to go through tracker-store, the phone app would have a prioritization issue / dependency on / with tracker-store. Meaning that tracker-store would have to run at a higher priority than the phone app (for the phone app to receive answers to its queries at the right prioritization).

Instead we moved the querying to the processes themselves. For writing, as SQLite isn't MVCC, we implemented FD passing (over D-Bus) between the app and tracker-store and tracker-store. Which handles prioritization, queuing, grouping, transaction logic, writing, WAL journal size containment, etc.

If EDS ever wants to be as efficient as the libtracker-sparql and tracker-store combination with its SQLite usage, it'll probably have to copy each and every aspect of what we did for Tracker.

or just use tracker. nih I mean meh.

Comment 13 Patrick Ohly 2013-01-18 17:23:30 UTC

Hello Philip,

I'm not going to enter another round of explanations why someone might prefer to use EDS instead of Tracker. The two systems are different enough to make someone prefer one or the other depending on requirements.

But as you are quoting out-dated performance results once more, let me augment that with some more recent information.

(In reply to comment #11)
> it still is faster than EDS
> (used to be magnitudes faster as proven by a study that we did in response to
> incorrect EDS propaganda during the few Intel-MeeGo months)

As you might remember, that orders of magnitude came from a performance bug in EDS which was fixed shortly after you highlighted that problem.

For a more thorough analysis of EDS performance, have a look at Mathias' benchmarks:
http://taschenorakel.de/mathias/2012/06/20/performance-and-memory-usage-of-evolution-data-server/

It would be interesting to implement the same benchmarks for QtContacts-Tracker and then run a comparison on the same hardware, in particular now that EDS has the much faster read performance.

Comment 14 Philip Van Hoof 2013-01-20 11:43:57 UTC

QtContacts-Tracker is an abstraction layer that isn't needed to do such a fair comparison. You should compare with libtracker-sparql which gives you direct usage of the SQLite cursor and makes it possible to optimize your queries. You could also use QSparql but even that would be a needless abstraction that would add an unfair advantage to EDS.

Given that EDS's libraries require a copy to a GList, it really can't be fast. Returning lists instead of iterators is almost always a design mistake whenever the to be returned data comes from a database-like source. For EDS to be fast, its client api must be redone (completely). Let me explain:

You are copying the entire dataset. If EDS's API would be interator (or enumerator or cursor) based, it could in the next() call use sqlite_step and then point to the const data as returned by SQLite's API. This const data needs no copy (but is invalid after the next sqlite_step).

For me it's not even necessary to do performance benchmarks as your copy can theoretically not be faster: you need a malloc and a memcpy. Not only will that always be slower, that malloc will also create memory fragmentation.

Note that Mathias and me already discussed in private what could be done about that in EDS; there are some things you could do to reduce the malloc overhead (if you know the exact size needed beforehand, like gslice and other magazine or other pre-allocated string bags) and to reduce memory fragmentation you could use GStringChunk and/or something like CamelString (or the pre-allocated string bag - with reference counting per child-string).

Plus, without these direct-access patches (which is actually an exact copy of what tracker-store and libtracker-sparql does) EDS is also copying all that data of the dataset, often in multiple directions, over IPC. Plus serialization and deserialization using slow DBus serialization which ALSO does unneeded extra mallocs all over the place. To me it's obvious why stuff is slow in EDS. With FD passing things would be a little bit better, except that you can't have process prioritization right and that server and clientside you are needlessly copying.

You can of course repeat the benchmark that we did. I posted its source code (for both the EDS based and the Tracker based application) in the E-mails.

Comment 15 Philip Van Hoof 2013-01-20 12:26:03 UTC

ps. If somebody is going to repeat the benchmark I need to point out that with domain indexes in ontology tracker's SPARQL endpoint can also still be made faster for for example contact purposes. Previous benchmark I didn't do this yet.

Some information here:
http://pvanhoof.be/blog/index.php/2010/06/30/domain-specific-indexes
http://pvanhoof.be/blog/index.php/2010/07/01/working-on-domain-specific-indexes
http://pvanhoof.be/blog/index.php/2010/07/07/domain-indexes-finished-technical-conclusions

Right now the Nepomuk / SPARQL endpoint isn't optimized a lot for contact-lookup purposes. It's already possible, for example, to fully avoid that the equivalent queries (equivalent to EDS's service operation contracts) cause full table scans. At that point the only overhead compared to raw SQLite use using cursor sqlite_step access is the SPARQL query parser (which is fast, doesn't use an AST), as the denormalization overhead (joins between tables) would be minimized.

Domain-indexes where not used a lot, as we optimized most queries to the point that it wasn't necessary anymore (we also sometimes optimized by changing the Nepomuk ontology).

Comment 16 Patrick Ohly 2013-01-20 13:44:06 UTC

(In reply to comment #14)
> QtContacts-Tracker is an abstraction layer that isn't needed to do such a fair
> comparison.

And leaving out that abstraction layer gives an unfair advantage to Tracker, because EDS includes that abstraction as part of its API [yes, that's less flexible, but that's not the point here].

"Find and read a contact" from an application perspective must include the work of turning the raw data in the database into something that the application can work with easily. For EDS, that is EContact. For Tracker, something like QContact.

I'm aware that adding that layer makes it very hard if not impossible to implement the performance improvements that you described (like avoiding memory copies). But for the current work (unified address book in IVI), the abstraction was explicitly requested, including separation of PIM storage from the UI process via IPC.

Comment 17 Philip Van Hoof 2013-01-20 21:02:53 UTC

I wonder what API? I guess I've been spoiled with SPARQL's flexibility and elegance a bit. Let me illustrate:

$ tracker-sparql -uq "insert { _:1 a nco:PersonContact; nco:nameFamily 'Van Hoof'; nco:nameGiven 'Philip' }"

$ tracker-sparql -q "select fn:concat('BEGIN:VCARD\nVERSION:2.1\nN:',?n,';',?s,'\nFN:',?n,' ',?s,'\nEND:VCARD') { ?c a nco:Contact; nco:nameFamily ?n; nco:nameGiven ?s } "
Results:
BEGIN:VCARD
VERSION:2.1
N:Van Hoof;Philip
FN:Van Hoof Philip
END:VCARD
$

I don't even need to write any software to get VCARDs out of Tracker's SPARQL endpoint, just use a command line using fn:concat.

Due to EDS' API (abstractions or clientside API design, whatever you call it); you can't get much better performance.

Whereas with tracker's SPARQL endpoint you can get better performance (lower level access, if you want to call it that way). You can also get higher level abstractions (using QSparql and QContacts-Tracker) and they are atm still faster than EDS, too. And if not, they can be made faster based on necessity thanks to the lower level possibilities.

Again. I don't call it lower-level API as libtracker-sparql is THE official API. The ones on top, are just that: APIs on top. So in my opinion is libtracker-sparql the equivalent to EDS's client APIs (the lowest level available). In fact I'd be glad to compare Tracker's FD passing and WAL direct-access with EDS's IPC (with WAL) if they were practical to use. Fine. We optimized Trcaker's FD passing and WAL too.

There is also still a lot of room in optimizing tracker's SPARQL endpoint itself and in optimizing Nepomuk's ontology. Plus you get multiple domains of metadata and storage, with linkage and relations between those domains and queries on those links. For example: which faces are on this photo, return them as contacts in a VCARD (like in example above).

Another example: perform a query that shows VCARDs of people on the photos that I took last week in Paris and that show the Eiffel tower in the center. This is a query that can actually be made with tracker if all software correctly integrates. Which on the N9, for example, a lot of applications do (install N9's face recognition, etc etc). This is the kind of use-cases that it got designed for.

Vertical databases which are inaccessible horizontally to applications, is in my opinion the nineties. Even the web is slowly migrating to cross domain querying (show me the people that on FB said that they watched this documentary on the BBC - note that the BBC has a SPARQL endpoint too for such purposes).

EDS is a vertical database that makes three domains of data available separately. It can't even do queries between the domains that it does support (calendars, todo, contacts) very well.

By the way, with a cursor (iterator, enumerator) API you could deliver a very nice API that doesn't need to copy the entire dataset. With a doubly linked list you always need to somehow copy the entire dataset. Copying the entire dataset, is slow and needless (if you show 5 contacts out of a 100,000 long list, you copied 100,000 contacts just to show 5. = silly).

ps. The dream of tracker was to expose this data and query capability to authorized external devices. For example your in-car infotainment system.

Comment 18 Tristan Van Berkom 2013-01-21 02:24:30 UTC

Philip,

(In reply to comment #14)
> You are copying the entire dataset. If EDS's API would be interator (or
> enumerator or cursor) based, it could in the next() call use sqlite_step and
> then point to the const data as returned by SQLite's API. This const data needs
> no copy (but is invalid after the next sqlite_step).

Thanks for your insightful and constructive comments, it's great that you've
taken an interest in EDS performance and went to such a great length to help
us improve things here.

I doubt that the iterator API approach could land in EDS any time soon as it would not be useful to networked clients, but it's certainly something to bear in mind for the direct access APIs.

Comment 19 Philip Van Hoof 2013-01-21 07:43:12 UTC

Unless you don't care about latency at all should a remote and a local API of course not be the same. A remote cursor is a rather bad I think. When not using direct-access for read queries (non-default, but possible) will Tracker's FD passing also pass the entire dataset between client and tracker-store and is the iterator api emulated using a simple array index++. The main reason for that was to allow tracker-store to remain mostly stateless. I would still avoid as much as possible to pass entire datasets arounds, though. Pass what is visible and pre-fetch what is expected to become visible soon.

Comment 20 Patrick Ohly 2013-01-21 12:47:50 UTC

(In reply to comment #17)
> I wonder what API?

Primarily I am thinking of the "read contact, modify it in memory, write back" cycle. Apps expect some kind of convenient to use in-memory representation of contact data for that.

> I guess I've been spoiled with SPARQL's flexibility and
> elegance a bit. Let me illustrate:
> 
> $ tracker-sparql -uq "insert { _:1 a nco:PersonContact; nco:nameFamily 'Van
> Hoof'; nco:nameGiven 'Philip' }"

I don't doubt that one can do that, but I can't envision normal app developers writing such code.
 
> $ tracker-sparql -q "select
> fn:concat('BEGIN:VCARD\nVERSION:2.1\nN:',?n,';',?s,'\nFN:',?n,'
> ',?s,'\nEND:VCARD') { ?c a nco:Contact; nco:nameFamily ?n; nco:nameGiven ?s } "
> Results:
> BEGIN:VCARD
> VERSION:2.1
> N:Van Hoof;Philip
> FN:Van Hoof Philip
> END:VCARD
> $
>
> I don't even need to write any software to get VCARDs out of Tracker's SPARQL
> endpoint, just use a command line using fn:concat.

That's a bold statement. Can it also handle special characters and escaping them? How about including PHOTO data? My expectation is that for standard compliant output, something on top of sparql will be needed.

Comment 21 Mathias Hasselmann (IRC: tbf) 2013-01-21 19:18:07 UTC

This bug report obviously isn't a good place to discuss Tracker or SPARQL. Please stay on topic guys.

Comment 22 Tristan Van Berkom 2013-02-14 15:09:47 UTC

Created attachment 236073 [details] [review]
Make EDataBook export direct read apis

This is at the core of direct read access and probably the most significant patch
introduced by direct read support.

It adds sync/async variants for read operations (as well as the 
open/close semantics, currently open is only done synchronously).

The patch contains some data structures for handling the direct calls
and interfacing with the EOperationPool (some semantics for delivering
async replies as well as blocking on a GCond structure for sync calls).

Comment 23 Tristan Van Berkom 2013-02-14 15:14:49 UTC

Created attachment 236075 [details] [review]
Patch to make EBookClient optionally use direct read access

This patch adds e_book_client_connect_direct_sync().

Currently I left out the async variant because all of the crazy chaining
of calls is a little complex to be introducing this late in the game, while
the older e_book_client_new_direct() has been tested for months.

If a book client is successfully opened in direct read access mode, then
the read APIs (get_contact() etc) will be forwarded to use EDataBook apis
directly.

This contains some data structures to properly deliver the async replies
to the calling thread context.

Comment 24 Tristan Van Berkom 2013-02-14 15:17:31 UTC

Created attachment 236076 [details] [review]
Add EBookBackend api for backends to support direct read access

This adds EBookBackend->get_direct_book() and EBookBackend->configure_direct()

As will be apparent from reading through the EBookBackendFile patch, these
interfaces allow the backend to report whether the backend supports direct
read access, and support to setup a backend instantiated by the client so that
it is sure to access the same data as the server.

Comment 25 Tristan Van Berkom 2013-02-14 15:19:23 UTC

Created attachment 236077 [details]
An EDataBookDirect object, which only wraps the dbus interface for backend convenience

This file is just how the backends interface with the configuration data related to direct read access.

As the D-Bus apis themselves are generally private, a wrapper is needed to expose
this API to backends.

Comment 26 Tristan Van Berkom 2013-02-14 15:19:53 UTC

Created attachment 236078 [details]
EDataBookDirect header file

Comment 27 Tristan Van Berkom 2013-02-14 15:20:58 UTC

Created attachment 236079 [details] [review]
Patch to add the EDBusDirectBook dbus interface

This adds the generated interface used to propagate the configuration data.

Comment 28 Tristan Van Berkom 2013-02-14 15:22:27 UTC

Created attachment 236080 [details] [review]
Make EBookBackendFile support DRA (Direct Read Access) - updated version

This patch implements the EBookBackend methods to advertise and configure
direct read access.

Comment 29 Tristan Van Berkom 2013-02-14 15:23:57 UTC

Created attachment 236081 [details] [review]
Make test cases use DRA

With this patch, setting DEBUG_DIRECT=1 in the environment will cause test
cases to run in Direct Read Access mode.

Comment 30 Tristan Van Berkom 2013-02-14 15:26:45 UTC

Created attachment 236083 [details] [review]
Make EDataBookView support dra

This patch makes EDataBookView only send notifications with the UID
over D-Bus.

The intention here is to reduce as much D-Bus overhead as possible in views.

Currently this is done by adding a special "x-evolution-uids-only" to
the fields of interest instead of expanding on API at all.

Comment 31 Tristan Van Berkom 2013-02-14 15:29:51 UTC

Created attachment 236084 [details] [review]
Make EBookClientView support direct reads

This one goes hand in hand with the EDataBookView patch...

With this patch, if an EBookClientView is created for an EBookClient
which is in Direct Read Access mode, then the BookView will also behave
a bit differently:

  a.) It will request that only UIDs be sent to the view using
      the special "x-evolution-uids-only" field of interest

  b.) Upon receiving any notifications, it will pull the data directly
      using the EDataBook read apis before publishing the results
      in the EDataBookView signals.

Comment 32 Tristan Van Berkom 2013-02-14 15:32:33 UTC

Created attachment 236085 [details]
Test with concurrent views active

This test case spawns 20 threads and creates an EBookClient & EBookClientView
in each thread all accessing the same addressbook.

It asserts that each view receives all 5 contacts in the test book.

(This test proved useful to find the concurrency problems in
EBookBackendSqliteDB, it will lockup if the EBookBackendSqliteDB patches
are not applied).

Comment 33 Tristan Van Berkom 2013-02-14 15:35:35 UTC

Something I should mention, is that the patch for EBookClient and
EBookClientView move the error definitions out of libebook (and into
the splitup library libebook-contacts which is created).

This is not ideal, but currently some references from EDataBook still
exist, which should be fixed before we can safely move those error
definitions to their appropriate places in libebook.

Actually, recent modifications in master's e-data-book.c make it
needed to pull the EBookClientError down below libedata-book in
the stack (if it were not for those recent modifications then
it could safely stay in libebook).

Comment 34 Milan Crha 2013-02-14 19:01:02 UTC

Due to the file movements I wasn't able to apply the patches as they are, I rather picked the commits from your openism branch. It also showed that you split patches, but not commits, which is fine.

Please add the e-destination.c/.h to libebook-contacts too, thus those files are together. I like how you added the libebook-contacts.h into libebook.h, that should make the transition transparent to others.

The change works fine with evolution, no extra changes were needed there. Let me try to read the patches briefly (too large for any real code reading).

Comment 35 Milan Crha 2013-02-14 19:05:33 UTC

Review of attachment 236073 [details] [review]:

OK

Comment 36 Milan Crha 2013-02-14 19:09:11 UTC

Review of attachment 236075 [details] [review]:

Overall fine. The patch doesn't show where the error codes came, I suppose they were moved during the file reorganization.

::: addressbook/libebook/e-book-client.c
@@ +275,3 @@
+				   gpointer user_data)
+{
+  GSimpleAsyncResult *result;

tabs please

Comment 37 Milan Crha 2013-02-14 19:12:23 UTC

Review of attachment 236076 [details] [review]:

I see there's no padding, thus you should update LIBEDATABOOK_CURRENT in configure.ac, the structure size changed

Comment 38 Milan Crha 2013-02-14 19:15:59 UTC

Comment on attachment 236077 [details]
An EDataBookDirect object, which only wraps the dbus interface for backend convenience

It's a file, not a patch :)

Comment 39 Milan Crha 2013-02-14 19:17:02 UTC

Comment on attachment 236078 [details]
EDataBookDirect header file

#ifndef __E_DATA_BOOK_DIRECT_H__

please remove those leading underscores from both sides.

Comment 40 Milan Crha 2013-02-14 19:18:12 UTC

Review of attachment 236079 [details] [review]:

OK

Comment 41 Milan Crha 2013-02-14 19:20:30 UTC

Review of attachment 236080 [details] [review]:

Nothing blocking for commit.

::: addressbook/backends/file/e-book-backend-file.c
@@ +1622,3 @@
+
+	priv = E_BOOK_BACKEND_FILE_GET_PRIVATE (backend);
+	priv->base_directory = g_strdup (config);

What if this is called multiple times? Like multiple clients accessing the book from multiple processes in the Direct Access Mode?

Comment 42 Milan Crha 2013-02-14 19:21:47 UTC

Review of attachment 236081 [details] [review]:

Looks fine

Comment 43 Milan Crha 2013-02-14 19:25:53 UTC

Review of attachment 236083 [details] [review]:

I guess this part will need a little polishing in the future, but nothing to be afraid of right now.

::: addressbook/libedata-book/e-data-book-view.c
@@ +833,3 @@
 	}
 
+	if (view->priv->send_uids_only == FALSE) {

I'd make it rather if (!view->priv->send_uids_only), though it's rather a personal preference. Comparing to FALSE is fine, comparing to TRUE is not (which you do not do here).

Comment 44 Milan Crha 2013-02-14 19:26:17 UTC

Review of attachment 236083 [details] [review]:

I guess this part will need a little polishing in the future, but nothing to be afraid of right now.

::: addressbook/libedata-book/e-data-book-view.c
@@ +833,3 @@
 	}
 
+	if (view->priv->send_uids_only == FALSE) {

I'd make it rather if (!view->priv->send_uids_only), though it's rather a personal preference. Comparing to FALSE is fine, comparing to TRUE is not (which you do not do here).

Comment 45 Milan Crha 2013-02-14 19:26:37 UTC

Review of attachment 236075 [details] [review]:

Overall fine. The patch doesn't show where the error codes came, I suppose they were moved during the file reorganization.

::: addressbook/libebook/e-book-client.c
@@ +275,3 @@
+				   gpointer user_data)
+{
+  GSimpleAsyncResult *result;

tabs please

Comment 46 Milan Crha 2013-02-14 19:32:07 UTC

Review of attachment 236084 [details] [review]:

Basically fine.

::: addressbook/libebook/e-book-client-view.h
@@ -71,3 @@
-	E_BOOK_CLIENT_VIEW_FLAGS_NONE           = 0,
-	E_BOOK_CLIENT_VIEW_FLAGS_NOTIFY_INITIAL = (1 << 0),
-} EBookClientViewFlags;

Where did these go?

Comment 47 Milan Crha 2013-02-14 19:33:35 UTC

Comment on attachment 236085 [details]
Test with concurrent views active

Not a patch :)

Comment 48 Milan Crha 2013-02-14 19:37:30 UTC

OK, I checked them all, (some even twice, I'm sorry, the bugzilla reported them as Draft, while it was already published). The attached patches seem a bit inconsistent, but basically fine. Feel free to commit, with minor changes I suggested above. IF some questions are obsolete (comment #46 for example) due to the mentioned inconsistency, then OK, no problem with that. As I said, I tries some basic book functionality from within evolution and it works, thus I suppose the flags didn't disappear, they are just in a different file, moved in a different patch/commit.

Comment 49 Tristan Van Berkom 2013-02-15 10:00:19 UTC

I'm pasting the commit log of the individual commits which led up to this below.

Each commit should follow eachother and build properly one after the
other (making it possible to bisect if anything did go wrong).

(Except for the minor vala build fixup which follows immediately after
the libebook / libebook-contacts library splitup, that commit only
fixes the vala build after the split, though).

I did address all of the stylistic comments which were made, also to
answer some questions (and make a couple comments):

  o EBookClientViewFlags was migrated into libebook-contacts library,
    this is because EDataBookView uses that type directly

  o EBookClientError was also migrated to libebook-contacts library,
    this is perhaps less than ideal, the only thing(s) holding this
    outside of the libebook library is a newly introduced function
    in e-data-book.c: data_book_convert_to_client_error().

  o The g_strdup() in EBookBackendFile's implemntation of
    EBookBackend->configure_direct() is not a problem as an EBookClient
    receives it's own copy of the backend (so that cannot be called
    multiple times, also note that that is never called on the server
    side, the server side provides the EDataBookDirect configuration
    and the client side backend uses that info to configure itself
    for direct access).

    I did consider creating some global hash table of EDataBooks to
    be shared (where possible) by the same process in the case
    multiple direct read access clients are created to access the
    same book, the complexity of doing that however proved to not
    be worth the effort (it could be considered though as a future
    improvement).

Here is the list of relevant commits:

commit 4fe9f583fc00be7f613467343aa3d12e808c04d1
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Fri Feb 15 18:34:05 2013 +0900

    Added concurrent view test: test-client-view-operations.c
    
    This tests concurrent access to EBookClientView, 20 threads each
    race to receive notifications from the same addressbook and the
    test asserts that all 5 contacts in the test addressbook are found.
    
    This is particularly interesting with DEBUG_DIRECT set (Direct Read Access),
    as it ensures that concurrent direct reads of the addressbook do not
    cause any lockups.

commit 8d80b9c69269c2723a572c09ea4ed2d2bfa5d63d
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Fri Feb 15 18:23:25 2013 +0900

    e-test-server-utils: Added DEBUG_DIRECT env var
    
    This is a temporary workaround to test Direct Read Access,
    a later patch will ensure all tests check both Direct/Normal read access
    modes.

commit 093d7900df82edc6dea261d1faabed1680c7ea67
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Fri Feb 15 18:26:14 2013 +0900

    EBookBackendFile: Add support for Direct Read Access.
    
    This patch simply implements the added backend vfuncs to advertize
    and configure itself in direct read access mode.

commit 87657065a524d9374eb2ca0cea65d94480dca2e8
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Fri Feb 15 18:15:18 2013 +0900

    Added Direct Read Access support to EBookClient
    
    Currently the only added api is e_book_client_connect_direct_sync().
    
    If the backend associated to the given ESource does not support direct
    read access, or the backend fails to open for any reason, then the
    client falls back to regular read access.
    
    EBookClientViews created from an EBookClient that is in direct read
    access mode differ in functionality, they only receive UID notifications
    over D-Bus and fetch the contact data directly.

commit ac48694373648fca68a87f2f829d65b2560a4e99
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Fri Feb 15 18:02:59 2013 +0900

    EDataBook / EDataBookView: Added Direct Read Access APIs / support
    
    EDataBook:
      Now can be created by the client and offers read only APIs for
      reading back contact data directly from a client side instantiated
      backend
    
    EDataBookView:
      Now use a special "x-evolution-uids-only" fields of interest in order
      to reduce D-Bus traffic, this makes it possible for EBookClientView
      to fetch contact data directly upon receival of notifications.

commit 384d6f5684e2b83b8582642867cc8809c26ae3c8
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Fri Feb 15 17:54:24 2013 +0900

    EBookBackend: Added apis for backends to support Direct Read Access
    
    e_book_backend_get_direct_book:
      A backend can return an EDataBookDirect from here to indicate
      that it supports direct read access, the properties of the EDataBookDirect
      are used by the client to access the book directly
    
    e_book_backend_configure_direct:
      A backend opened in direct read access mode can be configured with
      the string that it's server side counterpart provided, this string
      belongs to the backend; for the local file backend it is used as
      a path to ensure the same SQLite DB is opened by the client as the
      one running in the server.

commit 4be7d7511a36adbd28f4a067862d5c742714dd8d
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Fri Feb 15 17:49:13 2013 +0900

    Added EDataBookDirect & generated D-Bus counterpart.
    
    The EDataBookDirect is the server side wrapper allowing backends
    to report information about thier direct read access support.

commit 510b1f7c85a94b1c6a51f54e1cb2d45cad207d7b
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Fri Feb 15 18:46:05 2013 +0900

    Fixing broken vala makefile.
    
    My last commit pulled back in the ECalendar vala api, which is for
    some reason removed in master, fixed this by re-removing it from
    the vala Makefile.am.

commit 70fa6d5902dea98811648967f1a031229a83cdf6
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Fri Feb 15 17:26:30 2013 +0900

    Splits libebook into two libraries.
    
    This clears the way for Direct Read Access APIs to be integrated,
    since EBookClient now needs to access EDataBook apis, a common
    library is needed to share all the addressbook related objects
    related to EContact so that it can be used by both the server
    side addressbook backends and EDataBook, as well as EBookClient.
    
    Essentially this properly splits things up and avoids a nasty
    circular dependency issue.

commit e387605c40c987213ea1a5d68879fd8dbf858397
Author: Tristan Van Berkom <tristanvb@openismus.com>
Date:   Fri Feb 15 17:35:55 2013 +0900

    Fix EDataBook error conversions to include E_CLIENT_ERROR_OUT_OF_SYNC
    
    This causes the test-client-write-write test to pass again.