Bug 325215 – Improve startup time

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 325215 - Improve startup time


Summary:	Improve startup time


Status:	RESOLVED OBSOLETE

Product:	rhythmbox
Classification:	Other
Component:	playback
Version:	HEAD
Hardware:	Other All

Importance:	Normal enhancement
Target Milestone:	---
Assigned To:	RhythmBox Maintainers
QA Contact:	RhythmBox Maintainers

URL:
Whiteboard:

Duplicates:	325607 337778 338114 339588 (view as bug list)
Depends on:
Blocks:

Reported:	2005-12-29 12:06 UTC by Nicholas Allen
Modified:	2011-07-24 23:24 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
maybe a patch (1.50 KB, patch) 2005-12-30 09:37 UTC, Jonathan Matthew	accepted-commit_now	Details \| Review
better patch (4.55 KB, patch) 2006-01-03 23:12 UTC, Jonathan Matthew	none	Details \| Review
remember entry visibility (3.91 KB, patch) 2006-02-04 12:36 UTC, James "Doc" Livingston	committed	Details \| Review
yet another patch (7.00 KB, patch) 2006-04-25 02:51 UTC, Jonathan Matthew	committed	Details \| Review
a little bit further.. (4.02 KB, patch) 2006-05-06 04:38 UTC, Jonathan Matthew	rejected	Details \| Review
commit in chunks during db load (1.12 KB, patch) 2006-05-09 08:57 UTC, Jonathan Matthew	committed	Details \| Review
Quickstart Rhythmbox in less than 10 seconds (1.04 KB, application/octet-stream) 2006-05-10 21:09 UTC, Joel Bryan Juliano		Details
perform sync_mirror only when needed (10.01 KB, patch) 2006-06-26 10:42 UTC, James "Doc" Livingston	committed	Details \| Review
do recurses asynchronously (10.02 KB, patch) 2006-07-25 07:10 UTC, James "Doc" Livingston	none	Details \| Review
updated async recurse patch (16.40 KB, patch) 2006-08-11 15:14 UTC, James "Doc" Livingston	committed	Details \| Review

Description Nicholas Allen 2005-12-29 12:06:17 UTC

Please describe the problem:
I do not remember having noticed this problem on 0.9.0. When rb starts up it
loads the library from the XML file. The hard drive light is then on for about
30 seconds and although the rb window is displayed no songs appear during these
30 seconds. It does not seem to be parsing the tags as this takes much much
longer. I guess it is checking whether the files still exist or not. 

It would be nice if it could just immediately display the files and do this
check in the background. This would allow playback to begin immediately instead
of having to wait 30 seconds every time. If a song doesn't exist anymore and the
user tries to play it then it could just inform the user. This situation would
be rare anyway. I'm sure this is how it used to work as I never had to wait
before...

Steps to reproduce:
1. Have a large collection
2. Start rhythmbox
3. 


Actual results:
The tracks don't appear immediately but after a long delay

Expected results:
The tracks would immediately appear

Does this happen every time?
yes

Other information:

Comment 1 Jonathan Matthew 2005-12-29 12:53:39 UTC

Songs are no longer displayed immediately after the database is loaded.  Instead, we wait until we have checked that the file actually exists.  This was necessary for handling remote mounts that require gnome-vfs authentication.

I think the way the stat requests and results are queued is the problem here.  On a single CPU machine, it appears that the stat requests are all queued, then all processed, then the results are all processed, and then the songs are all added to the display.  On SMT or SMP machines, those steps are interleaved, and rhythmbox appears to start much more quickly.

Comment 2 Nicholas Allen 2005-12-29 15:04:08 UTC

Is there no better way to do this? Rhythmbox used to feel so fast and snappy and was one of the main reasons I prefer it to other media players. Is there no solution that can make this fast on an single CPU machine? Most people probably use single CPU and local files so it is a shame that they are penalised for remote mounts that they probably never use. 
Why can't rb just delay authentication until the user tries to play a song on a remote mount and then check it still exists? Itunes on windows works fine with SMB mounts and displays the tracks immediately so it must be possible somehow.

Comment 3 Jonathan Matthew 2005-12-30 09:37:17 UTC

Created attachment 56541 [details] [review]
maybe a patch

This won't make any difference to the total startup time, but it should make songs appear in the main window sooner.  I haven't made any effort to test with a cold disk cache, which is where it should actually make a difference.

Comment 4 James "Doc" Livingston 2005-12-30 13:11:26 UTC

Unfortunately it isn't safe for Rhythmbox to do *any* synchronous operation on remote files until the asynchronous stat has completed, and the easiest way to deal with it is not to display them until it is. Displaying them before it completes the file check can lead to RB blocking waiting for authentication or completely locking up.

That issue doesn't affect local files, so in theory we could display them immediately and hide then if they don't exist (the old behaviour). However there may be other parts of the code that have some subtle issues we haven't discovered yet.

Comment 5 James "Doc" Livingston 2005-12-30 13:12:24 UTC

The patch looks good, and seems to improve startup time a fair bit (for me).

Comment 6 Jonathan Matthew 2006-01-03 21:54:36 UTC

*** Bug 325607 has been marked as a duplicate of this bug. ***

Comment 7 Jonathan Matthew 2006-01-03 23:12:25 UTC

Created attachment 56740 [details] [review]
better patch

This one works much better for me.  It actually makes startup a couple of seconds slower, but the main window now appears after about 2 seconds, and loading the DB takes a further 10s or so.  It's reasonably responsive while the DB is loading, and I can start playback almost as soon as the window appears.

Comment 8 James "Doc" Livingston 2006-01-04 00:21:39 UTC

bug 323348 has a patch which makes the DB load in a thread. As with the other patches this makes RB finish loading slightly slower, but starts up much faster. It may need updating to cvs.

Comment 9 James "Doc" Livingston 2006-01-08 12:49:10 UTC

That patch looks okay to me. With a cold disc cache, this seems to improve start-up time a bit for me - probably because some of the the stats are sent to gnome-vfs earlier (or something).

Comment 10 Jonathan Matthew 2006-01-08 22:56:33 UTC

It still seems to build up a backlog of events to process.  It's better than without the patch, but when I start up with a cold cache there's still a period of 5 seconds or so where the main window is unresponsive.

Comment 11 Jonathan Matthew 2006-01-21 04:16:34 UTC

The patch from bug 323348 has been committed, so this patch is obsolete.  I'm still not completely satisfied with the cold disk cache startup time, so I'm not going to close this bug yet.

Comment 12 Alex Lancaster 2006-01-26 15:04:30 UTC

Moving to playback component.

Comment 13 James "Doc" Livingston 2006-01-31 04:13:44 UTC

Perhaps we should store the visibility in the on-disk db. Then if the file is local and was visible last time we ran, we make it initially visible (until a stat shows otherwise). That would be correct most of the time, and would alleviate the problems of waiting for stats to complete before tracks show up.

Comment 14 James "Doc" Livingston 2006-02-04 12:36:04 UTC

Created attachment 58699 [details] [review]
remember entry visibility

This patch stores entry visibility in the on-disk db. This means that local files will show up immediately (if they were visible last time RB ran), so start the time until everything shows up is greatly reduced.

Comment 15 Jonathan Matthew 2006-02-05 02:30:19 UTC

I think it'd be better to avoid saving the 'hidden' property when the value is false, but otherwise this looks like a good idea.

The other thing I think we need to look at is the rate at which we add new directory monitors.  When gnome-vfs is using gamin (rather than going directly to inotify), it looks like the buffers on the connection to gamin sometimes fill up, and rb blocks trying to write to it.  I think this is what causes the unresponsiveness I sometimes see on startup.

Comment 16 James "Doc" Livingston 2006-02-05 09:43:39 UTC

I've committed the patch to cvs, changing it so that it save hidden when false.

Slowing down the adding of directory monitors sound good if it will help.

Comment 17 Alex Lancaster 2006-02-20 11:57:15 UTC

(In reply to comment #15)

> The other thing I think we need to look at is the rate at which we add new
> directory monitors.  When gnome-vfs is using gamin (rather than going directly
> to inotify), it looks like the buffers on the connection to gamin sometimes
> fill up, and rb blocks trying to write to it.  I think this is what causes the
> unresponsiveness I sometimes see on startup.

It's possible this is what causes bug #331876.

Comment 18 James "Doc" Livingston 2006-04-09 09:25:08 UTC

*** Bug 337778 has been marked as a duplicate of this bug. ***

Comment 19 James "Doc" Livingston 2006-04-14 23:39:43 UTC

*** Bug 338114 has been marked as a duplicate of this bug. ***

Comment 20 James "Doc" Livingston 2006-04-14 23:44:59 UTC

Rhythmbox now had support for emitting profiling logging stuff, by running "strace -ttt -f -o logfile.txt rhythmbox"

You can even make nice pretty pictures with http://primates.ximian.com/~federico/docs/login-profile/plot-timeline.py

Comment 21 Jonathan Matthew 2006-04-24 22:20:27 UTC

*** Bug 339588 has been marked as a duplicate of this bug. ***

Comment 22 Jonathan Matthew 2006-04-25 02:51:23 UTC

Created attachment 64252 [details] [review]
yet another patch

I noticed we were wasting a lot of time in g_source_list_add, which sucks, because that's just a sorted list insertion.  We were calling g_idle_add twice for each entry loaded during startup.  This patch combines the two idle functions into one, combines entry lists from multiple commits into single calls to the idle function, and tweaks the way rhythmdb_idle_process_events decides whether to use a timeout or an idle function.  

This brings (warm) startup time down to about 3 seconds for me, and eliminates the case where it appeared to be adding entries to the library entry view one at a time, which was extremely slow.

Curiously, with this patch, it starts up *slower* with the main window hidden, because the main thread is mostly idle, so instead of the rhythmdb idle function emitting 2000 added entries at a time, it only does 1 or 2.

Comment 23 James "Doc" Livingston 2006-04-25 09:28:15 UTC

This reduces my warm startup time from around 6.5 cpu-sec to about 5, so +1 from me.

Comment 24 Jonathan Matthew 2006-04-25 22:02:50 UTC

Patch committed.  I wonder how much further we can take this..

Comment 25 Jonathan Matthew 2006-05-06 04:38:43 UTC

Created attachment 64902 [details] [review]
a little bit further..

This patch defers the initial query for the library source until the db load is complete.  This makes the contents of the library appear in one hit, rather than in batches.  This doesn't seem to improve the best case, but it makes startup time more consistent.

Comment 26 James "Doc" Livingston 2006-05-06 06:45:30 UTC

That makes startup slower for me, both in actual time and perceptually (since the track list shows up later). Currently the library-query is done in parallel with the db load (which is IO-heavy) and the patch makes it run in parallel with the playlist loading (which is CPU-heavy running their queries).

Comment 27 Andrew Conkling 2006-05-06 11:29:43 UTC

Pardon me for kinda jumping in here without an intimate knowledge of what happens, but I've observed something.  It seems that when I open RB, the db is immediately checked against the library, so it only shows up in chunks.  But (I imagine) that most people's libraries do not change so much as to necessitate that.

What if RB loaded the existing db (which would improve startup time), then worked on checking it (potentially more slowly, I don't know) against the library's contents.  

That may mean that the user won't be able to accurately browse the contents until they're done being verified, but that's really no different than the current situation: my list updates so heavily and jumpily that I can't really do anything until it's done loading anyway.  At least in this case you'd be able to start playing something... and again, I wager that whatever is played will most likely be a current file.

Comment 28 Jonathan Matthew 2006-05-06 23:57:54 UTC

(In reply to comment #26)
> That makes startup slower for me, both in actual time and perceptually (since
> the track list shows up later). Currently the library-query is done in parallel
> with the db load (which is IO-heavy) and the patch makes it run in parallel
> with the playlist loading (which is CPU-heavy running their queries).

Right.  I don't have many playlists where I was testing this, so I didn't notice that aspect of it.  I think a better approach would be to make the db loading code commit in chunks (RHYTHMDB_QUERY_MODEL_SUGGESTED_UPDATE_CHUNK, so it behaves the same as a query) rather than for each entry.

Perhaps we should also look at running the playlist queries sequentially, rather than in parallel?

(In reply to comment #27)
> What if RB loaded the existing db (which would improve startup time), then
> worked on checking it (potentially more slowly, I don't know) against the
> library's contents.  

This is exactly what we do now.

Comment 29 Andrew Conkling 2006-05-07 00:29:49 UTC

(In reply to comment #28)
> (In reply to comment #27)
> > What if RB loaded the existing db (which would improve startup time), then
> > worked on checking it (potentially more slowly, I don't know) against the
> > library's contents.  
> 
> This is exactly what we do now.

Ah, my apologies; I'm only observing, and I'm still running 0.9.4.

Comment 30 Jonathan Matthew 2006-05-09 08:57:13 UTC

Created attachment 65070 [details] [review]
commit in chunks during db load

Comment 31 Joel Bryan Juliano 2006-05-10 21:07:04 UTC

Hi, I made a simple script to quickstart rhythmbox by optimizing it's
XML database.

The script is just a basic optimizer for XML & HTML.
It can be used to any HTML & XML files.

I restart the system to check the difference of
non-optimized and optimized database.

I have 7814 songs, and The difference is when it's non-optimized,
rhythmbox will start loading the songs, counting 2 digits, (i.e. 40, 46,
49....7814) for 30 seconds or so, but when it's optimized, the counting
will skip to 3 digits to 4 digits in just seconds, (i.e. 146, 300,
615....7814), and rhythmbox starts much faster.

Rhythmbox will reformat the XML, so the script will be needed to run
again. I think it's good to run the script at startup because once
rhythmbox had been loaded, it'll be much faster, for both optimized and
non-optimized, so re-running the script is a good idea when the cached
memory had been cleared.

a backup of your current databases will be created (rhythmdb_backup.xml,
playlists_backup.xml), so just copy it to the rhythmdb.xml.

The databases are in $HOME/.gnome2/rhythmbox/ directory.

The script is attached in this message, (rhythmbox-quickstart)


JB

Comment 32 Joel Bryan Juliano 2006-05-10 21:09:00 UTC

Created attachment 65197 [details]
Quickstart Rhythmbox in less than 10 seconds

Rhythmbox Quickstart

Comment 33 Joel Bryan Juliano 2006-05-10 21:18:04 UTC

another quickstart is to cat rhythmdb.xml in /dev/null
so it will fills up the disk cache.

Comment 34 Andrew Conkling 2006-05-10 21:36:42 UTC

(In reply to comment #31)
> Hi, I made a simple script to quickstart rhythmbox by optimizing it's
> XML database.

If this is something that Rhythmbox can open (and more quickly!), maybe there'd be a way to optimize this so that Rhythmbox keeps it in this slimmer format?  If we could make it internal, this script would be obsolete (in a good way :).

Good find!

Comment 35 Jonathan Matthew 2006-05-10 22:07:50 UTC

Do you have any idea why this has the effect you claim it does?  I don't believe removing whitespace from the xml file will change the parser's behaviour in any significant way, and if it does, the parser needs to be fixed rather than the input.

Comment 36 James "Doc" Livingston 2006-05-11 02:23:11 UTC

The script doesn't produce any noticable improvement in startup time for me, and takes 15 seconds to run. I've never noticed XML parsing as significant on any profile I've done.

Comment 37 James "Doc" Livingston 2006-05-12 07:39:54 UTC

(In reply to comment #30)
> Created an attachment (id=65070) [edit]
> commit in chunks during db load

This looks fine to me.


(In reply to comment #28)
> Perhaps we should also look at running the playlist queries sequentially,
> rather than in parallel?

If you think it will help. I guess it could improve the single-cpu case slightly, at the expense of the multi-cpu case (where they can actually be done in parallel).

Comment 38 Jonathan Matthew 2006-05-13 07:10:58 UTC

OK, committed.

Comment 39 James "Doc" Livingston 2006-06-07 06:35:19 UTC

Since this is an open-ended "improve startup time" bug, retitling and changing to enhancement.

Comment 40 James "Doc" Livingston 2006-06-26 10:42:05 UTC

Created attachment 68026 [details] [review]
perform sync_mirror only when needed

I wrote a benchmarking test for rhythmdb loading, and profiling revealed that 15-20% of the CPU time was being spent in rhythmdb_entry_sync_mirrored. After looking closer at it, I noticed that it was being called from rhythmdb_entry_allocate, when the values were going to be changed soon after when the real ones got read from the on-disk DB.

My solution is have "dirty" flags for the values that get mirrored, and only generate the mirrored values when they are being accessed and the flag is set. As well as fixing the above case, it has other benefits because the mirrored values don't need to be created at all if the user doesn't have the columns visible in the UI (saving startup time and memory).


This patch includes the change, and the program tests/bench-rhythmdb-load. I haven't figured out why yet, but my benchmark only uses CPU in bursts - probably something to do with the "waiting for load-complete signal" code.


It is only an artificial benchmark, but loading my rhythmdb.xml file 100 times:
without patch:  53s user, 15s sys
with patch: 39s user, <1s sys

Warm-cache startup time of RB:
without patch (columns shown or hidden): 7.1s user, 0.5s sys
with patch (columns shown): 6.3s user, 0.3s sys
with patch (columns hidden): 5.4s user, 0.3s sys

Comment 41 beerfan 2006-07-04 19:15:34 UTC

(In reply to comment #4)
> Unfortunately it isn't safe for Rhythmbox to do *any* synchronous operation on
> remote files until the asynchronous stat has completed, and the easiest way to
> deal with it is not to display them until it is. Displaying them before it
> completes the file check can lead to RB blocking waiting for authentication or
> completely locking up.

I don't really understand what you mean here but the current behavior is pretty much a deal breaker for me. I have relatively few songs in my library (a little over 3000) which are located on my server and RB (0.9.3.1) takes forever (at least 2 minutes) to start. This is just ludicrous.

iTunes does not verify that songs in its database exist until they are clicked on or played and consequently loads immediately. Why would it be impossible to search for the song file ONLY when it's accessed?

Comment 42 Jonathan Matthew 2006-07-09 02:39:02 UTC

We still sometimes end up adding entries to the visible entry view in chunks of RHYTHMDB_QUERY_MODEL_SUGGESTED_UPDATE_CHUNK during startup, which is much slower than adding them thousands at a time.  I think this means the suggested update chunk size is too low, and that increasing it to perhaps 500 (as something of a middle ground) would be a good idea.

Also, the previous patch seems to have been committed, so I'm marking it as such.

Comment 43 James "Doc" Livingston 2006-07-25 07:10:47 UTC

Created attachment 69555 [details] [review]
do recurses asynchronously

Gnomevfs doesn't (currently) have any asynchonous directory operations, which means in turn rb_uri_handle_recursively block until it completes. During startup this can block the UI for a while, which appears to make it start slower.

This patch adds a rb_uri_handle_recursively_async, which currently uses a thread, and so does not block the UI. If at some point in the future gnomevfs gets async directory functions we can use those without changing rb_uri_handle_recursively_async's prototype.

Comment 44 James "Doc" Livingston 2006-08-11 15:14:08 UTC

Created attachment 70721 [details] [review]
updated async recurse patch

An update to the last patch which is more sane, and calls the callback in the main thread.

Comment 45 Jonathan Matthew 2006-08-13 09:15:18 UTC

looks OK to me.

Comment 46 Frank Russo 2006-11-02 17:23:27 UTC

Good afternoon,

I wrote the comment below, realized I sounded like a pompous arse, decided not to submit it, re-read it, realized that some of the points are somewhat valuable, and decided to submit it after all. Please ignore the parts where I sound like an arse.

I have a library of around 50,000 songs. The following occurs when I start Rhythmbox 0.9.6.

1: frusso@frusso-desktop:~$ rhythmbox
2: 20 seconds later, the gui comes up.
3: For 20 seconds, it counts my artists/songs (about 1300/50000)
4: For 3 minutes, one of my processors is at 100% (all IO wait) and the Gui is inaccessible (doesn't redraw if I drag another window over top of it). App is doing property_model_inserts
5: For another 3 minutes were in the same position as above, except the Gui will redraw. Playback is not available (RB is doing not/modified checks)
6: When it's done loading, it consumes 220MB of system memory
7: Everything works fine

Other info:
Songs are on a local disk 500GB SATA disk.

My 2 cents:
I am confused/disturbed by the above indications that you "Load the database". While I haven't had a chance to peruse the source yet, I sincerely hope that you are not loading the entire play list into system memory. Though looking at the property_model_inserts it's doing during this time, it appears that this is exactly the problem.

Databases should not have to "load". They're databases. If you "load" them, then they're not a DB anymore. Gui's should only read what they're required to display, and applications SIUD (select,insert, update, delete) changes directly to the database.

What effort would you be needed to use gnome-db/libdga for playlist/metadata handling? Startup would be instant, memory footprint would be greatly reduced.

Thank you for your time,
Frank Russo

Comment 47 Jonathan Matthew 2006-11-02 21:49:43 UTC

(In reply to comment #46)
> I am confused/disturbed by the above indications that you "Load the database". 
> While I haven't had a chance to peruse the source yet, I sincerely hope that
> you are not loading the entire play list into system memory.  Though looking at
> the property_model_inserts it's doing during this time, it appears that this is
> exactly the problem.

This is exactly what we do.  This is apparently not a problem for 99% of the user base.  Do you have a better alternative that doesn't destroy performance for smaller collections?

(there was another paragraph here, but you told me to ignore it)

> What effort would you be needed to use gnome-db/libdga for playlist/metadata
> handling?  Startup would be instant, memory footprint would be greatly reduced.

Substantial portions of the application would need to be redesigned to realise any benefit from this at all.

Comment 48 beerfan 2006-11-02 21:55:25 UTC

I recently discovered avahi and daap (specifically the mt-daapd aka firefly media server daemon) and the fact that rhythmbox supports daap. Accessing music via daap shares, even music on the same system, is so much faster (load the same set of songs in 3 seconds that takes close to a minute to load otherwise) and easier than letting rhythmbox load a "library" that it seems rather impractical to do so in reality. I might come to a different conclusion if rhythmbox had any sort of tag editing capability but it doesn't (that I can find).

I agree that Rhythmbox really needs of a database. I have far less files than Frank (a few thousand) and rhythmbox is still slow to start for me.

If nothing else, user expectations or requirements should be examined further. I expect a program to start up and be usable in a few seconds at most.

Comment 49 Nicholas Allen 2006-11-02 22:10:38 UTC

The ultimate way to work is to have a cache (say 2000 entries but the limit would not be hard coded) and when the user scrolls out of this area move the cached area around the database doing intelligent SQL queries to get the next set of data. The SQL queries should be done in a separate thread and the GUI can display "waiting for data" in the browser. The query would be executed after the user paused scrolling for a while. If the number of songs is very large then you don't want a scroll bar but page up/down and on the fly movement to search criteria as it is typed on the keyboard. On huge collections a scroll bar is nearly useless anyway...

Comment 50 Jonathan Matthew 2006-11-02 22:12:36 UTC

Rhythmbox loads my ~12000 song library and is usable in about 5 seconds.  Loading that same set of songs over DAAP takes much longer.  Also, tag editing has been enabled by default for the past few releases.

I don't think that switching to an SQL database will actually help anyone who doesn't have an exceptionally large collection.  I'm always happy to be proven wrong.  'proof' and 'uninformed speculation' are two entirely different things, of course.

Comment 51 Frank Russo 2006-11-03 10:26:29 UTC

I had never realized that 50K songs was an "exctionally large collection".  I'd consider 5M to be "exctionally large", but 50K is simply 5 years of my wife and I purchasing CD's, ripping Vinyl, and digitizing some reel-to-reel.

Generating 50K songs for regression testing is a reletively simple chore.  Just use MBrola and id3ed in a 'for' loop.  Something like:

for FOO in seq 1,1,50000 ; do MBrola ${FOO} ${FOO}.wav && oggenc -2 ${FOO}.wav && id3ed -s ${FOO} -i ${FOO}.ogg ; done

You now have 50,000 unique songs, with unique filenames, with unique id3 info.

On the topic at hand.  The playlist store should do just that.  Store the playlist.  It can be a DB, an xml file, a flat file, it doesn't matter.  The Gui should display the data.  The Gui need not hold more data than what it is currently displaying.  The application connects the Gui to the Store.  It does not need the entire store, just what the Gui asks it for, some prefetching, and the logic for things like sorts, change detection, and error handling.

Whether 500, 5000, 50000, or 5million songs, the application memory footprint and startup time should not differ (much).

I appologize for taking up (more of) your time.

Frank Russo

Comment 52 Andrew Conkling 2006-11-03 14:57:12 UTC

(In reply to comment #50)
> Rhythmbox loads my ~12000 song library and is usable in about 5 seconds. 
> Loading that same set of songs over DAAP takes much longer.  Also, tag editing
> has been enabled by default for the past few releases.

Well, you're apparently getting much better mileage out of RB than I am. I have about ~10000 songs and it takes much longer for me. (FWIW, mine are mostly FLAC.)

> I don't think that switching to an SQL database will actually help anyone who
> doesn't have an exceptionally large collection.  I'm always happy to be proven
> wrong.  'proof' and 'uninformed speculation' are two entirely different things,
> of course.

Well, if you could guide some of us people experiencing a lag in producing some hard numbers (or whatever would be needed), I think we'd be able to speak a bit more specifically here.

Comment 53 Andrew Conkling 2006-11-03 15:08:44 UTC

(In reply to comment #51)
> On the topic at hand.  The playlist store should do just that.  Store the
> playlist.  It can be a DB, an xml file, a flat file, it doesn't matter.  The
> Gui should display the data.  The Gui need not hold more data than what it is
> currently displaying.  The application connects the Gui to the Store.  It does
> not need the entire store, just what the Gui asks it for, some prefetching, and
> the logic for things like sorts, change detection, and error handling.

FWIW, let's take a lesson from Banshee here. They seem to reload the library
only when displaying it, and it always takes a minute or more (bug #325968),
even when switching from the library to a playlist to the library in rapid
succession. So, from my perspective (just "uninformed speculation"... more like
"observation") RB storing the library (in memory?) goes a long way to making
the UI responsive like you expect.

Comment 54 Andrew Conkling 2006-11-03 15:12:12 UTC

(In reply to comment #52)
> (In reply to comment #50)
> > Rhythmbox loads my ~12000 song library and is usable in about 5 seconds. 
> > Loading that same set of songs over DAAP takes much longer.  Also, tag editing
> > has been enabled by default for the past few releases.
> 
> Well, you're apparently getting much better mileage out of RB than I am. I have
> about ~10000 songs and it takes much longer for me. (FWIW, mine are mostly
> FLAC.)

And FWIW, my options are either NFS or DAAP; my music is on a network share. Inevitably, NFS produces a bottleneck; IIRC there were some bugs on exactly this problem.

Comment 55 James "Doc" Livingston 2006-12-03 12:14:12 UTC

Last patch committed to cvs.

Comment 56 Steve Fosdick 2007-01-12 14:28:26 UTC

Earlier when discussing the possibility of using a database engine to hold the tag database rather than an XML file that is loaded into memory it was suggested that a database would make startup time better at the expense of a responsive user interface, i.e. that there is a compromise between the two.

On one of my PCs my music collection is 10,579 tracks (54.1Gb).

Starting rhythmbox, after the window first appears on screen in takes between 5 and 7 seconds to get to the point where the status message at the bottom of the screen seems stable at the full 10,579 songs and during this time rhythmbox seems to be using lots of CPU.

Next there is a wait of aproximately 1 minute 45 seconds which appears to be mostly spent waiting for I/O. During this time the user interface is not responsive.

When everything is finally loaded the user interface is usable and responsive.

I also have music player daemon (mpd) installed on the same PC with the same music collection. The backend, i.e. mpd which is responsible for playing the songs and managing the database (but no GUI) is able to open/load the database and start playing in less than a second.

Starting gpmc a GNOME user interface to mpd it takes about the same length of time for the initial window to appear and there is no significant delay when browsing the via the user interface. The responsiveness of the GUI is good.

So to summarise the responsiveness of the GUI is similar between gmpc/mpd and rhythmbox but rythmbox has a long startup time which gmpc/mpd completely avoids.

Comment 57 Christophe Fergeau 2007-01-12 14:35:07 UTC

(In reply to comment #56)
> 
> So to summarise the responsiveness of the GUI is similar between gmpc/mpd and
> rhythmbox but rythmbox has a long startup time which gmpc/mpd completely
> avoids.
> 

How does that relate to using an xml file/using an sqlite (or whatever) file? Where are the traces/logs proving that most of the delays you're mentioning comes from reading/parsing the xml file and querying the parsed xml data stored in memory?
The 1min45 delay might be rhythmbox browsing through all the files to figure out which ones have been changed.

Comment 58 Frank Russo 2007-01-14 15:53:01 UTC

>The 1min45 delay might be rhythmbox browsing through all the files to figure
>out which ones have been changed.

Why would/should the check for new/modified/deleted files lock the UI?

Frank

Comment 59 Steve Fosdick 2007-01-15 13:21:14 UTC

(In reply to comment #57)

I will look into trying to generate a log that says exactly what rythmbox is doing during these delays.

What I was trying to establish in the last message though is that it is possible to do something similar to what ryhthmbox does, i.e. maintain a file on disk which contains the metadata from audio files and load it into memory at startup faster than rhythmbox currently does.  My message was not intended to blame a particular technology for the slow startup or blame the particular implementation used by rhythmbox.

To answer the other point, if I guess that the 5 to 7 seconds with the CPU busy is loading the XML file into memory and the 1m45s is cross checking this against the filesystem then there is still a significant performance gap as mpd does the filesystem crosscheck/update on the same music collection in 35s.

Steve.

Comment 60 Steve Fosdick 2007-01-15 13:53:50 UTC

(In reply to comment #57)

> Where are the traces/logs proving that most of the delays you're mentioning
> comes from reading/parsing the xml file and querying the parsed xml data
> stored in memory?

32585 1168867367.103843 access("MARK:             [rhythmdb.c rhythmdb_load_thread_main 2353] START loading db", F_OK <unfinished ...>
32585 1168867372.588512 access("MARK:         [rhythmdb.c rhythmdb_load_thread_main 2378] END loading db", F_OK) = -1 ENOENT (No such file or directory)

So according to the above trace on my collection loading the database takes about 5.5 seconds.

For some strange reason rhythmbox seems to have stopped doing the filesystem check part - I still have library watch enabled and a valid library location.

Steve.

Comment 61 Jonathan Matthew 2011-07-24 23:24:49 UTC

how long is a short piece of string?