GNOME Bugzilla – Bug 346949
[PATCH] The importing process is too slow
Last modified: 2008-03-25 23:52:48 UTC
Please describe the problem: I wanted to improve the speed of the importing process so I made some changes. Steps to reproduce: Actual results: Expected results: Does this happen every time? Other information:
Created attachment 68601 [details] [review] the patch This patch makes it considerably faster. I don't expect it to get committed in this state, it still requires some changes, but I'd like to hear some comments. The main reason for the increased speed is that the songs are added to the database in batches of 100 (that number should be tweaked). They are wrapped in a transaction and the QueueTransaction() call to the database thread doesn't block the calling thread. Most of the other code is there just to support this change. The ImportManager thread instantiates LibraryTrackInfo objects. That used to happen in a PlayerInterface event handler (for an event emitted from the ImportManager). The LibraryTrackInfo constructor automatically does the song metadata parsing. Once 100 LibraryTrackInfo objects have successfully been created they are added to the batch queue for another ImportManager thread to process them (that doesn't necessarily have to happen on a separate thread, perhaps it should all be in one thread?). That thread adds them to the database and adds their URIs to a temporary table used only during the importing process. Then the songs' TrackIDs are retrieved using those URIs from the temporary table and each of the LibraryTrackInfo objects is notified of their TrackID so they can emit their event. Is ImportManager supposed to be a singleton? It looks like one but the constructor is public. I removed the LoadFromDatabase() call in the LibraryTrackInfo constructor. It looked unnecessary. And it seems like it never found anything in the database anyway because it looks like the IDataReader never gets closed, otherwise that would have probably created some problems for the database thread. I replaced the Monitor.Pulse() calls in the database thread with an AutoResetEvent. When you send a Pulse signal to a thread that isn't yet waiting, it's like it never happened. So I think there is a possibility another thread might add a command to the queue and send the Pulse signal just after the database thread has finished dequeuing it but before it started waiting for the signal. And it wouldn't know a signal had been sent because it wasn't waiting at the time. So that command won't get executed until a new one is added. An AutoResetEvent remains signaled until a single waiting thread is released (it doesn't matter if there was no thread actually waiting for the signal when it was set). I marked the boolean flags that are accessed by different threads as volatile. I added the option of specifying the thread priority in Banshee.Base.ThreadAssist and all of the ImportManager threads are created with the lowest priority. I also added some comments in the code about stuff I was unsure of. There is one noticeable problem. I tried not to change the events that are emitted during the importing process and now the user interface is flooded with events. Every time a batch of songs is added to the database 100 events are emitted. I think perhaps the events should change to signal that “a certain number of songs have just been added” instead of “a song has just been added”. Then it would be possible to emit just one event per batch.
I like a lot of the concepts in this patch. I will review much more closely in the coming days. Thanks a lot Marin!
Oh wow. Marin: my apologies that your patch and excellent overview has been sitting here for some months now. It was marked as "reviewed" so I never saw it in the patch list. I'm setting priority high on this and will review it and see what's still applicable. The good news is that the import process is rather fast these days. I can import 5200 songs in about 3 minutes now (down from around 10). However, I'm sure there are still great pieces in your patch we can still use.
I'm sorry, I haven't been following Banshee lately. I'm barely familiar with the changes in the past few months and I know nothing about the changes in the code. But if I can somehow help you with this patch, let me know.
Created attachment 83419 [details] [review] banshee-queuedsqlite-autoreset.patch I started reviewing this (very nice) patch. I'm going to bring it in sync and split it up into smaller parts. Here's the first one: Convert all Monitor code to AutoResetEvent code. Should prevent very obscure deadlocks, minor improvements in the dispose method (no more spinlock). QueuedSqliteDatabase.cs | 37 ++++++++++++++++++++++--------------- 1 file changed, 22 insertions(+), 15 deletions(-)
Created attachment 83465 [details] [review] banshee-threadassist-priority.patch Add overloads to ThreadAssist.Spawn to pass a ThreadPriority. ChangeLog | 5 +++++ src/Core/Banshee.Base/Utilities.cs | 10 ++++++++-- 2 files changed, 13 insertions(+), 2 deletions(-)
Updating patch statuses. Will further review this when I find some time.
Trunk's importing code is largely rewritten, and seems to perform well. Closing this.