GNOME Bugzilla – Bug 725929
Long database pauses
Last modified: 2014-10-24 00:30:23 UTC
I've noticed for some time that Geary seems to pause "randomly" when I try to perform an operation -- press Archive and it takes a while for the conversation to be removed, or click on a conversation and the spinner spins for too long to load a single email. Geary doesn't hang, it's still responding to UI events, it's just that operations take too long to complete. After investigation we've found at least three database activities that are taking an inordinate amount of time, effectively locking the database: * In places in the code we're looping to translate identifiers into rowids. With large lists, this can take way too long. Rather, we should translate all of them in a single SELECT. * The background email prefetcher uses a particular database list operation which requires an INNER JOIN. On my database this particular operation takes 8 seconds to complete. We should either optimize this query or see if the email prefetcher can be modified not to require an INNER JOIN operation. * We've discovered that our id column in the search table is not aliased to the internal rowid, meaning that id lookups are taking O(n) time. (This also takes 8 seconds, interestingly). Charles has looked into this and developed a solution to get the performance we want here.
I've created a branch at wip/725929-db-opt with most of issue #1 taken care of. There are still some loops that can be unrolled (which I've marked with comments) but I'm less concerned about those because they're less-travelled code paths.
I've merged the search table improvements into wip/725929-db-opt.
After playing around with EXPLAIN QUERY PLAN and creating various indices for a while, we think we have a solution to the last slow query in the prefetcher. It hopefully fixes the problem if we have a covering index for that query. I've pushed the solution up to the same branch. Let's run this for a while and see if we feel comfortable with it.
All these problems are fixed, squashed in commit c3e5270.
Unfortunately, this one is back. I'm still seeing (on occasion) an 8s delay on the INNER JOIN operation. I also see a pause of similar length when the search populate check verifies that the FTS is sync'd, but this only happens once after startup and isn't as onerous. Regarding the INNER JOIN, I'm now inclined to try removing the code path that requires using it.
The INNER JOIN problem is taken care of by manually performing it (see commit 28629c). That commit also performs email create/merge operations in the database in chunks with a slight delay between chunks to keep the db available for other operations. It's not a complete solution for this problem (there are still some delays when indexing at startup, for example, and the manual INNER JOIN can take a long time with large transactions), but the situation is improved by this commit.
*** Bug 734916 has been marked as a duplicate of this bug. ***
Further discussion of the issue can be found at bug #734916. I believe I tried this before, but I'm going to give it a whirl again: on wip/725929-startup is a patch that reduces the number of rows the "manual INNER JOIN" has to consider when performing the SELECT. I thought I went down this route and saw no performance differences, but it's worth trying again. I'd like to try this out on my work and home machines before making a determination about a merge; anyone else feeling adventurous can give it a try too.
I've pushed the branch to master. Testing on my systems showed some improvement, enough to keep it. The other pauses discussed above remain. Pushed to master, commit 6672f8
I attacked the startup search indexing pause today and realized we can use a technique we've used elsewhere to increase database performance. Not only is startup indexing much snappier now, the usual use case -- where no indexing must be performed -- is almost instantaneous. I'm closing this ticket, as for me this was the final database pause of concern. If anyone finds another one, please open a new ticket for the specific problem. Pushed to master, commit 20c0bd