GNOME Bugzilla – Bug 345688
Show and album covers from local files
Last modified: 2008-01-26 20:16:16 UTC
See also Bug #307848 The current artdisplay plugin by James "Doc" Livingston will download and cache JPG covers from Amazon, storing them here: ~/.gnome2/rhythmbox/covers/artist - album.jpg Or, if the download fails, a black list file is created with extension *.rb-blist I am about to attach a patch to look for local cover files BEFORE trying Amazon. The code looks for images in the folder where the track is, including "Cover.jpg" (and variants) and the Gnome Nautilus folder image ".folder.png"
Created attachment 67867 [details] modified /rhythmbox/plugins/artdisplay/CoverArtDatabase.py This was based on the version shipped with 0.9.5, to test just replace existing file: /rhythmbox/plugins/artdisplay/CoverArtDatabase.py A lot of this could be configurable, but that is currently beyond me.
Created attachment 67876 [details] [review] Convert file to patch Convert file to a patch against CVS HEAD.
Using this I get a lot of errors like the following: Traceback (most recent call last):
+ Trace 69010
self.set_entry(sp.get_playing_entry ())
self.art_db.get_pixbuf(db, entry, self.on_get_pixbuf_completed)
pixbuf = gtk.gdk.pixbuf_new_from_file (local_filename)
Created attachment 67886 [details] [review] Revised patch to /rhythmbox/plugins/artdisplay/CoverArtDatabase.py Good catch - fixed. Never removed elements from a python list while iterating over it. Doh. Top Tip: Simple debugging - Start rhythmbox from the command line, and then any python print statements in the plugin will be output there. P.S. Beware of whitespace - I had "issues" with this, I think the current file may have had a mixture of tabs and spaces...
Created attachment 67887 [details] Revised /rhythmbox/plugins/artdisplay/CoverArtDatabase.py Just in case the patch doesn't apply cleanly (due to tabs vs. spaces etc), this is my second attempt - again based on 0.9.5
Created attachment 67893 [details] /usr/local/lib/rhythmbox/plugins/artdisplay/CoverArtDatabase.py Again, this is a modified version of the file shipped with 0.9.5 Changes (1) Use liburl.unescape() to deal with turning URL's into files, rather than: location.replace("%20"," ").replace("%2C",",").replace("%26","&") etc (2) If only a single candiate image is found in the folder, it will return that - regardless of the filename. (3) Print errors to the console to help with any debugging.
Rather than converting the URI to a path and then using the os.* functions, it would probably be better to use gnomevfs. That way everything would Just Work for things that aren't file:// URIs.
Nice idea James. I have converted my code to use gnomevfs and it seems to work nicely for local files (I don't have any remote shared music to try it on...) However, the call to gtk.gdk.pixbuf_new_from_file() expects a plain filename - it won't take this string for example: "file:///home/maubp/mp3/David%20Bowie/Low/Cover.jpg" This must be coverted to: "/home/maubp/mp3/David Bowie/Low/Cover.jpg" e.g. import gnomevfs from urllib import unquote image_location = "file:///home/maubp/mp3/David%20Bowie/Low/Cover.jpg" ... uri = gnomevfs.URI(image_location) if uri.is_local : pixbuf = gtk.gdk.pixbuf_new_from_file(unquote(uri.path)) callback (entry, pixbuf) return else : print "Do not know how to handle this: " + image_location Any ideas? We COULD copy remote files to the ~/.gnome2/rhythmbox/covers/ cache, and load from there?
Created attachment 68063 [details] /usr/local/lib/rhythmbox/plugins/artdisplay/CoverArtDatabase.py Revised fix, uses gnomevfs to search the folder/directory where the track is. This was tested with a remote sftp folder, using a public/private key setup with passphrase and rhythmbox given permisson. Local image files are loaded directly; remote image files are copied to the cache and then loaded. To prevent re-copying of remote images I have changed the ordering of the lookup to check the cache BEFORE checking the folder the image is in: (1) Cached image (2) Image embedded in track, see bug 345975 (3) Image is same folder as strack (4) Search online (currently using Amazon), unless blacklisted Note that (1) and (2) could be switched without causing any problems at the moment, BUT by putting the cache at the top of the list we make it easy to replace the image for a given album WITHOUT having to edit embedded images, or remote files. i.e. If something like bug 307848 comment 71 is added, if the user provides a specific cover (by drag and drop or otherwise) can be handled by simple putting the new image in our cache (overwritting the old image if present).
Found a small bug in the networked files support: If I've found a PNG image I copy it to the cache with a JPG extension. I suppose we could covert these, or just look for both PNG and JPG images in the cache? Messy... Doc - How do you think we should handle covers for tracks on remote servers? I could make a local-only version of the code (still using gnomevfs) if you would rather review that, and postpone any decision on the network case.
Is there any reason to copy art from network locations into the cache, rather than just getting it from there every time? While it may save some bandwidth (if the "local" folder is actually local, and not on a network), it means you can't change the folder.jpg and have it get used.
Created attachment 68112 [details] /usr/local/lib/rhythmbox/plugins/artdisplay/CoverArtDatabase.py As per Doc's request in comment 11 if we find an image in the same folder as the music track, I load it from there (across the network if remote). As gtk.gdk.pixbuf_new_from_file will only take a (local) filename, I have used a combination of gtk.gdk.PixbufLoader() which takes raw data and gnomevfs.read_entire_file() to get the data. I have tested this for both local and SFTP access. Hopefully the OS or gnomevfs will cache the image for us, otherwise when playing a remote album the same image will be requested across the network again and again. But given the user is going to be needing a lot of bandwidth for the tracks themselves, the network should be fast enough.
Created attachment 68391 [details] [review] rb-art-local.patch New patch
I tried out Peter's patch but ended up rewriting it from scratch. This version splits out the local art code into a separate class (and a separate file). It uses gnomevfs.async to prevent any possibility of blocking. I've had to rewrite the AmazonCoverArtSearch and CoverArtDatabase slightly; there was an embarrasing amount of abstraction leakage. It should now be possible to easily add other art search providers (last.fm? Google Images?). The interface is: class ArtSearch: """a best-first image search""" def __init__ (self, loader): self.loader = loader def search (self, db, entry, on_search_completed_cb, *args): """begin search async call on_search_completed_cb (self, entry, results, *args) once results is opaque""" def search_next (self): """return True and later async call on_search_completed_cb (...) once or return False""" def get_best_match_urls (self, results): """convert an opaque results object to a list of image URLs""" The blacklist applies to remote search providers, not local ones. Images are loaded by slurping (using gnomevfs.async) into a buffer and then using GdkPixbufLoader. Cached images are written back out from the pixbuf as jpegs. Currently local images *are* cached; not sure whether this is correct.
I'm glad to see you are using gnomevfs.async as that was something I had thought of since submitting my last lump of code. I also like the use of mime types rather than simple using file extentions - but does this mean opening every single file to see if its an image? Won't that be slow? e.g. A big album folder with 20+ mp3/ogg files and one or two images? Also, this check: if f.mime_type.startswith ("image/") and f.permissions & gnomevfs.PERM_USER_READ: Wouldn't it be better done as two nested if statements to allow "lazy evaluation"? i.e. if f.mime_type.startswith ("image/") : if f.permissions & gnomevfs.PERM_USER_READ: Or, if the permission check is faster, do that first: if f.permissions & gnomevfs.PERM_USER_READ: if f.mime_type.startswith ("image/") : A minor point, when matching image filenames I used a case insensitive match (thus I didn't need to have "Cover" and "cover" in my short list). I agree that setting things up ready for the possibility of other online searches is a good idea (I was trying not to make too many changes in one go).
1. I use gnomevfs.FILE_INFO_FORCE_FAST_MIME_TYPE, which restricts gnomevfs to guessing the mime type from file extension (and other metadata), not sniffing the file contents. 2. Python uses lazy evaluation between boolean operators. I doubt there'll be any difference in speed between the two orderings. 3. The file_root () function is used to force file names to lowercase for matching. Maybe it's a little unclear?
1. I spotted that after writing comment 15 but wasn't sure exactly what it did. 2. Good point, lazy evaulation for "and" and "or", but not the "&" and "|" operators. My fault - I tend to make any lazy evaulation explicit (probably to much exposure to VB). 3. So it does - I should have search the patch really... Now you have pointed it out the lower() it seems clear enough; a little comment where the comparison is done wouldn't hurt [but only if you need to make other changes]. Another couple of questions: 4. Have you tested this with remote music files, e.g. over FTP, SFTP or some other fileshare? 5. Probably a quesition for James ("Doc"). Looking at artDisplay/Loader.py it will use gnomevfs if present, and fallback on urllib if not. Can we assume gnomevfs is present? If gnomevfs is missing we could fallback on the os module for local files (and ignore remote files). Or just ignore possible image files in the same folder as the music track and just use our cache/online searches.
3. Yes, it is a little unclear. I've made some improvements. 4. Yes, over sftp. 5. Personally I'd say rhythmbox-python should depend on python-gnomevfs. Rhythmbox itself uses gnomevfs everywhere so if python-gnomevfs isn't a dependency, python plugins will have to do all sorts of kludges to find out when the files are and aren't local.
Created attachment 68447 [details] [review] rb-art-local.patch
Plenty of changes: 1. Spaghetti code in CoverArtDatabase.py rewritten as a coroutine. Should make it much easier to understand. 2. Some readability fixes in LocalCoverArtSearch.py. 3. Blacklists only apply to remote search engines. 4. Covers are not cached for local search engines, so it shows up immediately if user changes covers. 5. Blacklist syntax change: instead of a zero-byte file, now a list of failed search engines. Will make it possible to add engines between versions. Future: add more info e.g. failure timestamps? 6. Only blacklist if empty search results returned; before we were blacklisting if no search results were returned, which would blacklist if network was down. Empty search results indicates we talked to the server and they had no syitable artwork.
This looks great to me, although I had to add "folder" to the IMAGE_NAMES list, since my art is called "folder.jpg" not ".folder.jpg". I'd be happy to commit this, if no-one has any issues and you don't have more updates. (In reply to comment #18) > 5. Personally I'd say rhythmbox-python should depend on python-gnomevfs. > Rhythmbox itself uses gnomevfs everywhere so if python-gnomevfs isn't a > dependency, python plugins will have to do all sorts of kludges to find out > when the files are and aren't local. Yes, I think we should make gnome-python (which include the gnomevfs bindings) a dependency of python plugin support. I can't see many people having pygtk, but not being able to get gnome-python. (In reply to comment #20) > 5. Blacklist syntax change: instead of a zero-byte file, now a list of failed > search engines. Will make it possible to add engines between versions. Future: > add more info e.g. failure timestamps? That sounds good. It would actually be good to record where and when we got the from too - since we're any art from amazon is supposed to be deleted and redownloaded every 90 days (it's in their T&C). (In reply to comment #14) > I've had to rewrite the AmazonCoverArtSearch and CoverArtDatabase slightly; > there was an embarrasing amount of abstraction leakage. It should now be > possible to easily add other art search providers (last.fm? Google Images?). Yay! I'd been planning do this for a while, but had never gotten around to it. Now I don't have to :)
(In reply to comment #21) > This looks great to me, although I had to add "folder" to the IMAGE_NAMES > list, since my art is called "folder.jpg" not ".folder.jpg". Should this list be exposed as a setting - using GCONF or even adding a GUI for the artdisplay preferences? (In reply to comment #21) > Yes, I think we should make gnome-python (which include the gnomevfs > bindings) a dependency of python plugin support. I can't see many people > having pygtk, but not being able to get gnome-python. Does this conflict with any plans to get rhythmbox to work on non-Linux platforms, e.g. Windows as discussed on the mailing list in December 2005? I would still check this in as is for now (with your addition to IMAGE_NAMES). (In reply to comment #21) > That sounds good. It would actually be good to record where and when we > got the [cover art] from too - since we're any art from amazon is supposed > to be deleted and redownloaded every 90 days (it's in their T&C). I've filed Bug 346852 – Delete covers downloaded from Amazon after 90 days
Created attachment 68708 [details] [review] updated patch This is the patch updated to cvs, with a small change: Coroutine is now in the "common plugin classes" (plugins/rb/) which are imported in the "rb" namespace. (In reply to comment #22) > (In reply to comment #21) > > This looks great to me, although I had to add "folder" to the IMAGE_NAMES > > list, since my art is called "folder.jpg" not ".folder.jpg". > > Should this list be exposed as a setting - using GCONF or even adding a GUI for > the artdisplay preferences? In GConf sounds good; also putting the list of gconf engines to use there sounds good too. > (In reply to comment #21) > > Yes, I think we should make gnome-python (which include the gnomevfs > > bindings) a dependency of python plugin support. I can't see many people > > having pygtk, but not being able to get gnome-python. > > Does this conflict with any plans to get rhythmbox to work on non-Linux > platforms, e.g. Windows as discussed on the mailing list in December 2005? If you want to use gconf from python, we need to depend on it anyway. It contains bindings for gnomevfs, gconf, libgnome/gnomeui, bonobo and gnomecanvas. RB already works fine on non-Linux systems (like Solaris, *BSD), but as to Windows it's mostly just a matter of getting all the dependencies built and handling any X-specific stuff we do. Nothing gnome-python binds should be an issue there.
Seems like there is a small issue with os.path.unlink. Python 2.4.3 (which ships with FC5) only has os.unlink not os.path.unlink. Traceback (most recent call last):
+ Trace 69255
self._resume ()
self._continuation.next ()
self.write_blist (blist_location, blist)
os.path.unlink (blist_location)
I've committed the patch to cvs, and fixed the issue Alex mentioned. Thanks for working on it, the code is *much* saner now.
Looking over the code (now that its been committed), I noticed in function get_pixbuf from: /rhythmbox/plugins/artdisplay/artdisplay/CoverArtDatabase.py This check is made before trying ANY image source: # If unknown artist and album there is no point continuing if st_album == "Unknown" and st_artist == "Unknown": callback (entry, None) return This check makes good sense for the Amazon search, but seems premature for the local file search. For example, consider a set of tracks stored in folder with a cover image, but without any track/album/artist tags (e.g. a set of WAV files, or simply an mp3 file with no ID3 tags). In this case, st_album and st_artist would be set to "Unknown", but a local file search would find the cover art anyway. Could we remove this check, and replace it with an Amazon specific check?
*** Bug 347571 has been marked as a duplicate of this bug. ***
See also #509507