Bug 725929 – Long database pauses

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 725929 - Long database pauses


Summary:	Long database pauses


Status:	RESOLVED FIXED

Product:	geary
Classification:	Other
Component:	engine
Version:	master
Hardware:	Other Linux

Importance:	High normal
Target Milestone:	0.10.0
Assigned To:	Geary Maintainers
QA Contact:	Geary Maintainers

URL:
Whiteboard:

Duplicates:	734916 (view as bug list)
Depends on:
Blocks:

Reported:	2014-03-08 02:36 UTC by Jim Nelson
Modified:	2014-10-24 00:30 UTC

See Also:
GNOME target:	---
GNOME version:	---

Description Jim Nelson 2014-03-08 02:36:55 UTC

I've noticed for some time that Geary seems to pause "randomly" when I try to perform an operation -- press Archive and it takes a while for the conversation to be removed, or click on a conversation and the spinner spins for too long to load a single email.  Geary doesn't hang, it's still responding to UI events, it's just that operations take too long to complete.

After investigation we've found at least three database activities that are taking an inordinate amount of time, effectively locking the database:

* In places in the code we're looping to translate identifiers into rowids.  With large lists, this can take way too long.  Rather, we should translate all of them in a single SELECT.

* The background email prefetcher uses a particular database list operation which requires an INNER JOIN.  On my database this particular operation takes 8 seconds to complete.  We should either optimize this query or see if the email prefetcher can be modified not to require an INNER JOIN operation.

* We've discovered that our id column in the search table is not aliased to the internal rowid, meaning that id lookups are taking O(n) time.  (This also takes 8 seconds, interestingly).  Charles has looked into this and developed a solution to get the performance we want here.

Comment 1 Jim Nelson 2014-03-08 03:05:36 UTC

I've created a branch at wip/725929-db-opt with most of issue #1 taken care of.  There are still some loops that can be unrolled (which I've marked with comments) but I'm less concerned about those because they're less-travelled code paths.

Comment 2 Charles Lindsay 2014-03-10 18:52:03 UTC

I've merged the search table improvements into wip/725929-db-opt.

Comment 3 Charles Lindsay 2014-03-11 18:57:42 UTC

After playing around with EXPLAIN QUERY PLAN and creating various indices for a while, we think we have a solution to the last slow query in the prefetcher.  It hopefully fixes the problem if we have a covering index for that query.  I've pushed the solution up to the same branch.  Let's run this for a while and see if we feel comfortable with it.

Comment 4 Charles Lindsay 2014-03-12 23:45:44 UTC

All these problems are fixed, squashed in commit c3e5270.

Comment 5 Jim Nelson 2014-03-24 23:49:57 UTC

Unfortunately, this one is back.  I'm still seeing (on occasion) an 8s delay on the INNER JOIN operation.  I also see a pause of similar length when the search populate check verifies that the FTS is sync'd, but this only happens once after startup and isn't as onerous.

Regarding the INNER JOIN, I'm now inclined to try removing the code path that requires using it.

Comment 6 Jim Nelson 2014-06-12 00:55:01 UTC

The INNER JOIN problem is taken care of by manually performing it (see commit 28629c).  That commit also performs email create/merge operations in the database in chunks with a slight delay between chunks to keep the db available for other operations.  It's not a complete solution for this problem (there are still some delays when indexing at startup, for example, and the manual INNER JOIN can take a long time with large transactions), but the situation is improved by this commit.

Comment 7 Jim Nelson 2014-08-19 23:18:53 UTC

*** Bug 734916 has been marked as a duplicate of this bug. ***

Comment 8 Jim Nelson 2014-08-19 23:23:04 UTC

Further discussion of the issue can be found at bug #734916.

I believe I tried this before, but I'm going to give it a whirl again: on wip/725929-startup is a patch that reduces the number of rows the "manual INNER JOIN" has to consider when performing the SELECT.  I thought I went down this route and saw no performance differences, but it's worth trying again.  I'd like to try this out on my work and home machines before making a determination about a merge; anyone else feeling adventurous can give it a try too.

Comment 9 Jim Nelson 2014-08-22 23:04:14 UTC

I've pushed the branch to master.  Testing on my systems showed some improvement, enough to keep it.  The other pauses discussed above remain.

Pushed to master, commit 6672f8

Comment 10 Jim Nelson 2014-10-24 00:30:23 UTC

I attacked the startup search indexing pause today and realized we can use a technique we've used elsewhere to increase database performance.  Not only is startup indexing much snappier now, the usual use case -- where no indexing must be performed -- is almost instantaneous.

I'm closing this ticket, as for me this was the final database pause of concern.  If anyone finds another one, please open a new ticket for the specific problem.

Pushed to master, commit 20c0bd