After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 648592 - Freeze when importing collections larger than a few thousand songs (queries on CoreCache table, smartplaylist related?)
Freeze when importing collections larger than a few thousand songs (queries o...
Status: RESOLVED WONTFIX
Product: banshee
Classification: Other
Component: Importing
2.0.0
Other Linux
: Normal major
: ---
Assigned To: Banshee Maintainers
Banshee Maintainers
gnome[unmaintained]
: 661125 662561 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2011-04-25 09:22 UTC by Dave Wales
Modified: 2020-03-17 09:25 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Skip Smart Playlist update if import is running (1.37 KB, patch)
2011-06-10 00:19 UTC, Mackenan Grassi
none Details | Review
Allows OptimizeDatabase to be called more than simply at startup, and uses that during imports to prevent slowdowns (2.21 KB, patch)
2011-06-11 22:17 UTC, Mackenan Grassi
none Details | Review

Description Dave Wales 2011-04-25 09:22:36 UTC
Hey all, I have reported a bug similar to this in the past but i would like to say congrats on fixing the previous bug before 2.0 (it was a crash on import problem, all gone in 2.0.0).

System:
4ghz quad core 8gb DDR3 Ubuntu 11.04 (up to date) and banshee that is in the stock repos (2.0.0)

Bug:
Importing larger collections (Say larger than three or four thousand) causes banshee to freeze periodically and takes an overly long period of time (in relation to other media players).

I saw the thread for the buglist in the "forums" which detailed this bug, however it was insanely out of control and full of flaming.

Some ideas i thought of to fix this were:

1. Automatically disable all extensions that can slow the process down while doing the initial scan and subsequent scans. And tell the user something to this effect with a yes no box with a remember my choice tick box. 

2. Break the database up every five thousand (5000) items (or a smaller number if plausible say one thousand (1000)) and have a folder of database files instead of one almighty db file. I dont know if this would actually help, my extent of knowledge is only word-press admin grade. 

This would however need a master db outlining the first/last song that should be in each db to check for corruption and to make it work in the real world i imagine would be hard. However, i feel it makes sense as an option for another reason. That is, to refresh the database one would need only delete the master db file and all the other smaller db files would be rewritten or kept as they are depending on which portion of the db is wrong or needs updating.

B.t.w. after 3 hours i am sitting here at nine thousand (9000) songs out of forty thousand (40 000) imported if you were wondering about how severe the problem is, on a slower computer or over a network i imagine it would be implausible to import such a library.

If you wish to trial any of these ideas or other fixes i will happily be a test dummy.


Thanks,
Dave.
Comment 1 Dave Wales 2011-04-25 09:35:56 UTC
This can be reproduced at any time by deleting the banshee.db file.
Comment 2 Jason Smith 2011-05-17 21:18:00 UTC
This bug occurs on my system too. There are a series of SQL requests that are made to sqlite in order to recalculate the smart playlists periodically throughout the import process. These begin taking HUGE amounts of time after importing about 5k music files. It also seems these are going off on a timer ever 5 or so seconds, so the longer they take, the worse things get. Once banshee gets to the point of these queries taking 5+ seconds, that is pretty much all it seems to do at that point. Importing of music basically stops.
Comment 3 Jason Smith 2011-05-17 21:38:00 UTC
Dave,

On the off chance that you are still living without your music library. Whenever it starts getting too slow, just kill banshee and start it again, then rescan the library. You get a little further each time. After 5 or 6 iterations all my music was imported.
Comment 4 Dave Wales 2011-05-18 14:58:41 UTC
Hey Jason, as of 2.0.0 and including 2.1.0 i don't think it is possible to crash banshee if left to run its course on import. This however is incorrect if you keep clicking etc while its doing the sql requests you talked about as it just decides its taking to long and offers to close it for you. By morning it had imported all songs to the library, so its not like it doesn't do it, its just really slow as we have experienced.
Comment 5 Jason Smith 2011-05-18 15:08:57 UTC
On my system it never finished by morning. I let it run for 18 hours once...
Comment 6 Mackenan Grassi 2011-06-09 01:32:27 UTC
I can confirm what Jason said about the SQL requests. There's a conditional statement inside ImportTrack (found in DatabaseImportManager.cs) that attempts to limit the Smart Playlist updating to every 250 files or 20% of the import, whichever is more. This *should* make a maximum of 5 reloads during import. In reality, the Smart Playlists are reloaded every 250 files. I suspect this is because the queue is files to import is being updated as files are being imported and the SQL requests slow down the queuing of files, so the calculation of 20% is never greater than 250. 

I commented that conditional block out and was able to import a little over 7000 songs in 6 min 11 seconds (starting from a blank database). Only about 1 min 30 sec of that were actually importing the media files. The rest of the time was for updating the Smart Playlists. Doing the same import with the block of code uncommented resulted in only 3250 songs being imported in the same time. The first minute 1500 songs were imported, the next minute had the count up to 2250, then each minute after that only added another 250 songs.

Obviously, the attempt to minimize the Smart Playlist reloads is failing. On top of that, even if the code worked as intended it would make that 6 minute import something closer to 15 minutes (at a guess). I think the best option would be to delay the Smart Playlist reloading until all media files are imported. The second best option would be to ensure that all media files are queued up for importing before the first one even gets processed, possibly by implementing a Pause and Resume in QueuePipeline.cs (found in Hyena.Collection, it's a class that ImportManager inherits). Thoughts?
Comment 7 Mackenan Grassi 2011-06-10 00:19:28 UTC
Created attachment 189592 [details] [review]
Skip Smart Playlist update if import is running

It turned out that delaying the Smart Playlist update was much easier than I thought it would be. The Smart Playlists are updated by catching the TracksAdded event. I simply added a check to see if there's an import in process and, if so, skip the normal RefreshAndReload.

With this patch, an import of 7000 songs (starting from nothing) took 6 min 36 sec. Most of that time is updating the Smart Playlists. During the import, UI is fully functional and any media files already imported can be browsed, played, etc.
Comment 8 Andrés G. Aragoneses (IRC: knocte) 2011-06-10 10:01:18 UTC
Thanks for looking into this!

(In reply to comment #7)
> Created an attachment (id=189592) [details] [review]
> Skip Smart Playlist update if import is running
> 
> It turned out that delaying the Smart Playlist update was much easier than I
> thought it would be. The Smart Playlists are updated by catching the
> TracksAdded event. I simply added a check to see if there's an import in
> process and, if so, skip the normal RefreshAndReload.

With this patch, do the Smartplaylist reload at least at the end of the import?

> With this patch, an import of 7000 songs (starting from nothing) took 6 min 36

Compared to how long when not using the patch? It would be useful to put this info in the commit message too.

> sec. Most of that time is updating the Smart Playlists. During the import, UI
> is fully functional and any media files already imported can be browsed,
> played, etc.

This patch, being a one-liner, has high chances of being backported to 2.0 stable BTW. Let's hope a maintainer reviews it soon!
Comment 9 Mackenan Grassi 2011-06-10 17:39:21 UTC
Yes, the Smart Playlists reload at the end of the import. Without the patch, the same import took probably about an hour.

I'm starting to think that the real problem lies elsewhere though. Once the import is done, each Smart Playlist takes about a minute to reload as a result of two SQL requests that each take just under 30 seconds. These same SQL requests take just as long each subsequent time they're called (as a result of clicking on the Smart Playlist to make it the active view, or as a result of a TracksChanged event -- such as changing the rating, or letting a song play through). This same problem can be seen either with or without the patch. The funny thing is, these same SQL requests take no time at all if you simply close Banshee after the import and reopen it. Could it be a problem with sqlite after handling all those inserts? It's definitely a problem that needs much further inspection.

Right now, I think my patch just covers up the issue a bit. It doesn't fix whatever the real problem is.
Comment 10 Mackenan Grassi 2011-06-10 21:06:48 UTC
Here are the problematic SQL requests. I'm only listing two because each request is repeated for each Smart Playlist.

[4 Warn  14:04:36.285] Executed in 29270ms SELECT COUNT(*), SUM(CoreTracks.FileSize), SUM(CoreTracks.Duration) FROM CoreTracks CROSS JOIN CoreArtists,CoreAlbums, CoreSmartPlaylistEntries WHERE CoreArtists.ArtistID = CoreTracks.ArtistID AND CoreAlbums.AlbumID = CoreTracks.AlbumID  AND  CoreSmartPlaylistEntries.TrackID = CoreTracks.TrackID AND CoreSmartPlaylistEntries.SmartPlaylistID = 8 
[4 Warn  14:05:05.437] Executed in 29151ms 
                    DELETE FROM CoreCache WHERE ModelID = 59;
                        INSERT INTO CoreCache (ModelID, ItemID) SELECT 59, CoreTracks.TrackID FROM CoreTracks CROSS JOIN CoreArtists,CoreAlbums, CoreSmartPlaylistEntries WHERE CoreArtists.ArtistID = CoreTracks.ArtistID AND CoreAlbums.AlbumID = CoreTracks.AlbumID  AND  CoreSmartPlaylistEntries.TrackID = CoreTracks.TrackID AND CoreSmartPlaylistEntries.SmartPlaylistID = 8  ORDER BY 
                        CoreArtists.NameSortKey ASC,
                        CoreTracks.Year ASC,
                        CoreAlbums.TitleSortKey ASC,
                        CoreTracks.Disc ASC,
                        CoreTracks.TrackNumber ASC
Comment 11 Andrés G. Aragoneses (IRC: knocte) 2011-06-10 21:37:25 UTC
Interesting.

WRT the first one I wonder if:

- it would be fast to do the cross join with CoreSmartPlaylistEntries.SmartPlaylistID as the first element in the WHERE?

- would it be faster to replace "*" with "'x'"?

- why checking file size and duration? I thought this sum was only done for the whole library, not for playlists.

WRT the second one:

- why add an ORDERBY clause for an INSERT clause?

WRT both:

- why join with CoreAlbums and CoreArtists if there is no selection for them in the FROM and no specific filter? (I guess we could replace "CoreAlbums.AlbumID = CoreTracks.AlbumID" with "CoreTracks.AlbumID IS NOT NULL", or just get rid of that in case the element is not nullable and we're sure the referential integrity is respected... but are we using ForeignKeys? maybe not).

- not sure why using CROSS JOINs instead of INNER ones...
Comment 12 Mackenan Grassi 2011-06-11 22:17:06 UTC
Created attachment 189741 [details] [review]
Allows OptimizeDatabase to be called more than simply at startup, and uses that during imports to prevent slowdowns

I managed to get the import time for 7000 tracks to drop down to 1 min 20 seconds. The key to the problem was that none of these SQL requests were a problem after Banshee restarted. Obviously there must be something Banshee is doing on startup that fixed whatever problem there is after all those imports. The answer is that it calls BansheeDbConnection.OptimizeDatabase, which executes "ANALYZE". I made a quick change (without my prior patch) to change OptimizeDatabase to a public method and call it from DatabaseImportManager.ImportTrack (right before the PrimarySource.NotifyTracksAdded call). With that simple change the import completes faster than ever, the UI is usable during the import, and Smart Playlists are updated regularly throughout the import.
Comment 13 Mackenan Grassi 2011-06-20 15:04:54 UTC
Dave or Jason, any chance you could test this patch?
Comment 14 olivier dufour 2011-08-19 08:38:07 UTC
We need test to see if the optimize process do not take more time than it gain.
So need benchmark from people with big list of file.
Comment 15 Age Bosma (IRC: Forage) 2011-10-07 15:46:54 UTC
*** Bug 661125 has been marked as a duplicate of this bug. ***
Comment 16 Age Bosma (IRC: Forage) 2011-10-07 16:17:22 UTC
The same smart playlist freezes can be observed when connecting a device like an iPod or Android phone, as mentioned in bug #661125. Having just 250 tracks on the device is already sufficient though.

Also, the same playlist update issues occur when deleting tracks in addition to the update rate of the playlists when importing.

I've applied the patch but unfortunately the issue is not resolved for smart playlists when loading devices.

Running the optimize process does not address the real problem imho, or it least it's just a part of the problem. Updating should also be prevented until an import or delete of multiple tracks is finished.
Comment 17 Andrés G. Aragoneses (IRC: knocte) 2011-10-24 00:56:16 UTC
*** Bug 662561 has been marked as a duplicate of this bug. ***
Comment 18 Andrés G. Aragoneses (IRC: knocte) 2011-10-24 01:01:50 UTC
Do these freezes happen if you guys delete all smart playlists?
Comment 19 Trey Raymond 2011-10-24 21:42:42 UTC
I get it with no user created smart playlists, sql statements still hang querying those tables.  There are ten 'system' smart playlists that are repopulated if deleted, see the dupe ticket for my full info
Comment 20 André Klapper 2020-03-17 09:25:47 UTC
Banshee is not under active development anymore and had its last code changes more than three years ago. Its codebase has been archived.

Closing this report as WONTFIX as part of Bugzilla Housekeeping to reflect
reality. Please feel free to reopen this ticket (or rather transfer the project
to GNOME Gitlab, as GNOME Bugzilla is being shut down) if anyone takes the
responsibility for active development again.
See https://gitlab.gnome.org/Infrastructure/Infrastructure/issues/264 for more info.