GNOME Bugzilla – Bug 740879
SQlite Query Plan reduces performance of banshee
Last modified: 2016-12-29 18:14:40 UTC
Sqlite uses a new query plan starting with 3.8.6. This leads to an unusable banshee, because suddenly query to fill CoreCache take 50 seconds. Using the likelihood function in Sqlite this can be optimized. I'll attach a patch to fix this issue.
Created attachment 291765 [details] [review] Using UNLIKELY Statement to improve the performance
Not sure if it is possible to backport this also to Banshee 2.6, that would make Banshee usable on fedora again.
Thanks for this patch ! I'm affected by the issue, tracked it down to those queries, but then I never had time to find a solution. The UNLIKELY function was added in SQLite 3.8.1, and we currenlty only require sqlite >= 3.4. I'd have no problem bumping the dependency in git master, but I think doing it in the 2.6 branch would make harder to get this fix in distros currrently released. Do you think there would be a way to fix this issue with something else than UNLIKELY ?
I don't really now sqlite well enough to understand their optimizations. I found some first hints on Joins, that on some joins it would optimize, on some not. But no Join in this query. We could perhaps use it in git master, and people could perhaps backport the fix, if they have the new sqlite installed or we release a version just with this fix. Because distros need this fix only if the serve sqlite above 3.8.6
(In reply to comment #3) > The UNLIKELY function was added in SQLite 3.8.1, and we currenlty only require > sqlite >= 3.4. (In reply to comment #4) > We could perhaps use it in git master, and people could perhaps backport the > fix, if they have the new sqlite installed or we release a version just with > this fix. Because distros need this fix only if the serve sqlite above 3.8.6 What about using UNLIKELY conditionally, if SQLite version found is higher(or equal) to 3.8.1, use UNLIKELY, don't use it if not? With such a patch we could backport it to the stable branch (and later, in git master, we could simply raise the SQLite dependency, to simplify the code).
You think about a compile time decision or in runtime? How do we check Version Numbers efficiently? One can get a string from HyenaSqliteConnection:ServerVersion, but how to compare the string, quite complicated. So if anyone else, has an idea how to do that nicely. Feel free.
Oh I did find a nice way of doing it: SELECT sqlite_version ()>'3.8.9'; and you get 1 or 0;
But I don't have enough time at the moment.
Created attachment 292128 [details] [review] Add a condition on the UNLIKELY Statement This adds a conditional option to the UNLIKELY Statement. It also needs the foregoing patch. I suggest master only uses the first patch, while we backport this patch together with the other one for older versions.
Review of attachment 292128 [details] [review]: Made a mistake on the Version. New Patch comming.
Created attachment 292130 [details] [review] Add the correct condition on the UNLIKELY Statement
Hey Samuel thanks for the work! Small review: first, can you merge both patches in one, please? + var support_unlikely = connection.Query<Boolean> ("SELECT sqlite_version () >= '3.8.1'"); If you do it this way, this query would be run many times, how about doing it just once and storing it in a static field/property? + WHERE "+(support_unlikely?"UNLIKELY(":"")+"CoreCache.ModelID = {1}"+(support_unlikely?")":"") + @" AND Note: we use spaces normally to separate these things. Also, for this case, it's better to leave the parenthesis always there, and place a {4} token, which you can then replace with "" or "UNLIKELY" at the end of the String.Format() call.
Hey Andrés, I thought to keep the patches separate, as git master I think should not get the second patch. They are only run 4 times, for each Model once, so I thought it's no use to put them in a static field of the Model and I didn't want to change the DBConnection Object for such a static assertion. It already supports ServerVersion, but comparing Version Strings myself was to tedious.
(In reply to comment #13) > Hey Andrés, > > I thought to keep the patches separate, as git master I think should not get > the second patch. We normally don't do things this way. Master gets all the patches. It is the branches which may not get cherry-picked patches from master. So the correct way to do it is merge your 2 patches in 1. And then after we have applied that one to both branches, do a 2nd one which hardcodes "UNLIKELY" always, and raises SQLite version dependency on configuration phase, which will only be committed to master. > They are only run 4 times, for each Model once, so I thought it's no use to put > them in a static field of the Model Still, you're repeating the same line of code 4 times, surely the query can be put in a common place. > and I didn't want to change the DBConnection Object for such a static assertion. Why not? > It already supports ServerVersion, but comparing Version Strings myself was to tedious. True, but you could place the 'connection.Query<Boolean> ("SELECT sqlite_version () >= '3.8.1'");' line of code inside a property called UnlikelySupport or something.
Ok, I fixed this requests. Now we have a Patch against Hyena and one against Banshee. I'm traveling right now and I weren't able to build Banshee because I don't have the correct gstreamer bindings here ;). But building just the modules where the patch changes something worked. But perhaps still start Banshee after applying the patches. Sorry for that. Not sure how this submodule stuff works.
Created attachment 292181 [details] [review] A merged Patch with the corrections suggested It needs also the Hyena with the other patch applied
Created attachment 292182 [details] [review] Adds LikelihoodSupport Property to Database Connection
Managed to check the patches now, (back at home). They apply and work. I think bug #740879 is another person affected by the bug.
Sorry thought about bug #741274
*** Bug 741274 has been marked as a duplicate of this bug. ***
Comment on attachment 292181 [details] [review] A merged Patch with the corrections suggested > From 3e7a2b71db52012693c8fdf7cd56183872e91104 Mon Sep 17 00:00:00 2001 > From: Samuel Gyger <samuel@gyger.at> > Date: Sat, 29 Nov 2014 10:26:54 +0100 > Subject: [PATCH] Database: Hints for the query planner of sqlite. bgo#740879 Take a look at our convention for our commit messages please (we place the bug number at the end, but with parenthesis) > > Starting with sqlite3 3.8.6 it uses a new query plan. > To get back the performance on filling the CoreCache Table, > this commit provides hints to sqlite (using the UNLIKELY statement) > To not raise the dependency on sqlite we conditionaly use UNLIKELY > --- > .../Banshee.Collection.Database/DatabaseAlbumArtistListModel.cs | 2 +- > .../Banshee.Collection.Database/DatabaseAlbumListModel.cs | 2 +- > .../Banshee.Collection.Database/DatabaseArtistListModel.cs | 2 +- > .../Banshee.Collection.Database/DatabaseFilterListModel.cs | 5 ++++- > .../Banshee.Collection.Database/DatabaseQueryFilterModel.cs | 4 ++-- > .../Banshee.Collection.Database/DatabaseYearListModel.cs | 2 +- > 6 files changed, 10 insertions(+), 7 deletions(-) > (...) > diff --git a/src/Core/Banshee.Services/Banshee.Collection.Database/DatabaseFilterListModel.cs b/src/Core/Banshee.Services/Banshee.Collection.Database/DatabaseFilterListModel.cs > index 018a0f4..0fc510b 100644 > --- a/src/Core/Banshee.Services/Banshee.Collection.Database/DatabaseFilterListModel.cs > +++ b/src/Core/Banshee.Services/Banshee.Collection.Database/DatabaseFilterListModel.cs > @@ -124,7 +124,10 @@ namespace Banshee.Collection.Database > "{0}.{1} AND CoreTracks.TrackID = {0}.{2}", > FilteredModel.JoinTable, FilteredModel.JoinPrimaryKey, FilteredModel.JoinColumn) > : "CoreTracks.TrackID", > - filtered ? GetFilterFragment () : "" > + filtered ? GetFilterFragment () : "", > + connection.LikelihoodSupport > + ? "UNLIKELY" > + : "" I've rather placed these 3 lines in the same line, to follow the same pattern as with the "filtered" line. > ); > } > > diff --git a/src/Core/Banshee.Services/Banshee.Collection.Database/DatabaseQueryFilterModel.cs b/src/Core/Banshee.Services/Banshee.Collection.Database/DatabaseQueryFilterModel.cs > index a4c00fb..20ba42f 100644 > --- a/src/Core/Banshee.Services/Banshee.Collection.Database/DatabaseQueryFilterModel.cs > +++ b/src/Core/Banshee.Services/Banshee.Collection.Database/DatabaseQueryFilterModel.cs > @@ -56,10 +56,10 @@ namespace Banshee.Collection.Database > { > this.field = field; > this.select_all_fmt = select_all_fmt; > - > + Whitespace change here, I've removed it. > ReloadFragmentFormat = @" > FROM CoreTracks, CoreCache{0} > - WHERE CoreCache.ModelID = {1} AND CoreCache.ItemID = {2} {3} > + WHERE {4}(CoreCache.ModelID = {1}) AND CoreCache.ItemID = {2} {3} > ORDER BY Value"; > > QueryFields = new QueryFieldSet (query_filter_field); > diff --git a/src/Core/Banshee.Services/Banshee.Collection.Database/DatabaseYearListModel.cs b/src/Core/Banshee.Services/Banshee.Collection.Database/DatabaseYearListModel.cs > index b58d1df..5f96d2f 100644 > --- a/src/Core/Banshee.Services/Banshee.Collection.Database/DatabaseYearListModel.cs > +++ b/src/Core/Banshee.Services/Banshee.Collection.Database/DatabaseYearListModel.cs > @@ -47,7 +47,7 @@ namespace Banshee.Collection.Database > FROM (SELECT MIN(CoreTracks.TrackID) AS TrackID, CoreTracks.Year FROM CoreTracks GROUP BY CoreTracks.Year) AS CoreTracks > WHERE CoreTracks.Year IN > (SELECT CoreTracks.Year FROM CoreTracks, CoreCache{0} > - WHERE CoreCache.ModelID = {1} AND > + WHERE {4}(CoreCache.ModelID = {1}) AND > CoreCache.ItemID = {2} {3}) > ORDER BY Year"; > } > -- > 1.9.3 (In reply to comment #16) > Created an attachment (id=292181) [details] [review] > A merged Patch with the corrections suggested > > It needs also the Hyena with the other patch applied If you do this change in BansheeDbConnection instead of HyenaSqlConnection, then we don't need to split this fix in two patches. I've just done this. So I've committed the improved patch, with your name, and pushed it. Can you now create a patch that raised the dependency? (which will be committed only to master) Thanks!
Comment on attachment 292182 [details] [review] Adds LikelihoodSupport Property to Database Connection (In reply to comment #17) > Created an attachment (id=292182) [details] [review] > Adds LikelihoodSupport Property to Database Connection > From 94497b43c15650f9715aa51593512ea3496c30c8 Mon Sep 17 00:00:00 2001 > From: Samuel <samuel@gyger.at> > Date: Fri, 5 Dec 2014 14:17:24 +0100 > Subject: [PATCH] SqliteDatabase] Property for Likelihood Support > > Sqlite added the Possibility to give Hints > to the Query Planner by using LIKELIHOOD, > UNLIKELY and LIKELY. The support starts > with Version 3.8.1. > --- > Hyena.Data.Sqlite/Hyena.Data.Sqlite/HyenaSqliteConnection.cs | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/Hyena.Data.Sqlite/Hyena.Data.Sqlite/HyenaSqliteConnection.cs b/Hyena.Data.Sqlite/Hyena.Data.Sqlite/HyenaSqliteConnection.cs > index 51f89bd..7ab53ef 100644 > --- a/Hyena.Data.Sqlite/Hyena.Data.Sqlite/HyenaSqliteConnection.cs > +++ b/Hyena.Data.Sqlite/Hyena.Data.Sqlite/HyenaSqliteConnection.cs > @@ -106,6 +106,17 @@ namespace Hyena.Data.Sqlite > } > > public string ServerVersion { get { return Query<string> ("SELECT sqlite_version ()"); } } > + > + private bool? likelihood_support = null; > + public bool LikelihoodSupport { No need to be public here, internal is enough (fixed this in a second commit, because I just realised now..., sorry) > + get { > + if (likelihood_support.HasValue) { > + return likelihood_support.Value; > + } else { > + likelihood_support = Query<bool> ("SELECT sqlite_version () >= '3.8.1'"); > + return likelihood_support.Value; > + } Next time you need to do a lazy property like this, take in account the following: there's no need to write two return statements :) I've fixed this in the commit that is pushed: https://github.com/GNOME/banshee/commit/fd3f08ec702f3a38918e0a96d2fc01d74af64ee4#diff-3659df45f6bc44e5496a87b6b3b7b2d3R61 Thanks for your work Samuel! (BTW, I've also pushed this work to the stable branch)
Hi everyone! I just discovered this bug here via the Fedora bug report. I posted a similar (if not duplicate) bug under 740166 several weeks ago. Could anybody look into it, if it is the same bug? I don't really know and for me, the problem fixed with RedHat Bug 11611844 (https://bugzilla.redhat.com/show_bug.cgi?id=1161844) is not yet fixed. Thanks for your help!
(In reply to comment #23) > Could anybody look into it, if it is the same bug? Hey Philip, there's already a fix committed for this bug, so the best way to know if it's a duplicate or not, is testing the last banshee master (or the last stable branch) and check if it fixes the problem for you. Thanks
(In reply to comment #23) > I don't really know and for > me, the problem fixed with RedHat Bug 11611844 > (https://bugzilla.redhat.com/show_bug.cgi?id=1161844) is not yet fixed. Did you try the new packaged banshee from fedora-testing? See here how to install it: https://bugzilla.redhat.com/show_bug.cgi?id=1161844#c16
Hi Samuel! (In reply to comment #25) > Did you try the new packaged banshee from fedora-testing? > See here how to install it: > https://bugzilla.redhat.com/show_bug.cgi?id=1161844#c16 Thanks! The testing package worked out fine! Banshee is fast again. I will close the other bug.
*** Bug 740166 has been marked as a duplicate of this bug. ***
The fix does not address the root problem, which is badly written SQL. Banshee executes this query, which on my Thinkpad T420 with a fast SSD takes 77681 ms: INSERT INTO CoreCache (ModelID, ItemID) SELECT 971, CoreTracks.TrackID FROM ( SELECT MIN (CoreTracks.TrackID) AS TrackID, CoreTracks.Year FROM CoreTracks GROUP BY CoreTracks.Year ) AS CoreTracks WHERE CoreTracks.Year IN ( SELECT CoreTracks.Year FROM CoreTracks, CoreCache WHERE CoreCache.ModelID = 55 AND CoreCache.ItemID = CoreTracks.TrackID ) ORDER BY Year; Rewriting the query in a sensible way makes it execute in 62 ms: INSERT INTO CoreCache (ModelID, ItemID) SELECT DISTINCT 971, min (CoreTracks.TrackID) FROM CoreTracks, CoreCache WHERE CoreTracks.TrackID = CoreCache.ItemID AND CoreCache.ModelID = 55 GROUP BY CoreTracks.Year; Checked results with sqlite browser. There are more queries like this one, but they can all be rewritten quite easily. No need to fiddle with the execution plans (which will probably break again on the next sqlite update), no need to test for sqlite versions, no need to depend on cutting-edge sqlite versions, just write sensible SQL. Regards
Is this bug resolved or still needs optimization in SQlite ..?
using Arch Linux x64 3.18.2-2-ARCH I received an sqlite update - SQLite version 3.8.8.1 2015-01-20 16:51:25 Banshee is now working properly!
*** Bug 748444 has been marked as a duplicate of this bug. ***
*** Bug 748432 has been marked as a duplicate of this bug. ***
It should be noted that 748432 is a current issue with Banshee 2.6 on Ubuntu 15.04, in which 3.8.7.4 is standard
If I am currently running 2.6, what is the best way for me to get this fix so that I can actually use my music again? The discussion above mentioned backporting to 2.6, but it was not apparently done. I would prefer to not have to roll back sqlite, because then I would have to always remember to not update it. My music library is currently unusable (but videos work fine). If possible, I would also prefer to not be on an unstable version of Banshee, but if that's the only way, then so be it - but for a mission-critical bug like this, I am suprirsed that it hasn't been backported, or at least the dependencies for Banshee 2.6 to be modified to put a cap on the sqlite version.
Oh sweet jeebus yes. With 2.6.2-ubuntu5 that I just installed a few minutes ago, the speed issue is fixed. Thank you so much!
This bug is back (in banshee master) with the latest SQLite (3.10.0). Reason is the check whether SQLite supports UNLIKELY "SELECT sqlite_version () >= '3.8.1'" since '3.10.0' is NOT >= '3.8.1' compared as strings. Suggested patch attached.
Created attachment 318944 [details] [review] test for newer SQLite version
Hi Roderich, I'm extremely sorry for the delay, see inline: (In reply to Roderich Schupp from comment #36) > This bug is back (in banshee master) with the latest SQLite (3.10.0). > Reason is the check whether SQLite supports UNLIKELY > > "SELECT sqlite_version () >= '3.8.1'" > > since '3.10.0' is NOT >= '3.8.1' compared as strings. That is a really nice catch. I thought that relying on this kind of comparison was hooking on some SQLite magic to compare version numbers, but I guess I was wrong. (In reply to Roderich Schupp from comment #37) > Created attachment 318944 [details] [review] [review] > test for newer SQLite version That's a much better approach than the previously committed patch indeed. However, given that nowadays most distros ship versions higher than 3.8.1 anyway, I've decided that the simpler approach is to raise our dependency. I've committed the change to master and stable branches. (In reply to Marcello Perathoner from comment #28) > The fix does not address the root problem, which is badly written SQL. > > Banshee executes this query, which on my Thinkpad T420 with a fast SSD takes > 77681 ms: > ... > Rewriting the query in a sensible way makes it execute in 62 ms: > ... > There are more queries like this one, but they can all be rewritten quite > easily. No need to fiddle with the execution plans (which will probably > break again on the next sqlite update), no need to test for sqlite versions, > no need to depend on cutting-edge sqlite versions, just write sensible SQL. Marcello, that's certainly great insight from you. But given that this bug is so ancient, and in the end only focused on this LIKELY usage as a solution, let's close it, and we can discuss your fixes in this other bug moving forward: https://bugzilla.gnome.org/show_bug.cgi?id=771896 Thanks for your patience everyone. This problem has been fixed in our software repository. The fix will go into the next software release. Once that release is available, you may want to check for a software upgrade provided by your Linux distribution.
this bug appeared again in Linux Mint 18.1 after I updated my mint to this version the sql queries take too long (1-2 minutes), banshee version 2.6.3 sqllite version 3.11.0