GNOME Bugzilla – Bug 330766
Incorrectly downloading lugradio podcasts
Last modified: 2006-09-18 23:06:56 UTC
Please describe the problem: Downloading the lugradio feed http://lugradio.org/episodes.ogglow.rss saves the episodes as: lugradio-someepisode.ogg?podcast. Running Rhythmbox from Dapper, version: 0.9.3.1-0ubuntu1
Is the problem that it downloads it as "lugradio-someepisode.ogg?podcast" instead of "lugradio-someepisode.ogg", which means applications that go on the file extension don't see it? Fixing that will probably require us to think of a good file-naming algorithm. Simply chopping off anything after a "?" won't work too well, since some feeds use episodes like "download.php?episode=4".
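To illustrate why simply chopping at the "?" is not enough, here is a minimal Python sketch (not Rhythmbox code). The episode URLs and the helper name `naive_local_name` are made up for the example; only the "?podcast" suffix and the "download.php?episode=4" pattern come from this report.

```python
import posixpath
from urllib.parse import urlsplit


def naive_local_name(url):
    """Hypothetical helper: keep the last path component and drop the query."""
    return posixpath.basename(urlsplit(url).path)


# Illustrative URLs, not taken from any real feed.
print(naive_local_name("http://example.org/lugradio-someepisode.ogg?podcast"))
# lugradio-someepisode.ogg  (the name we want)
print(naive_local_name("http://example.org/download.php?episode=4"))
# download.php
print(naive_local_name("http://example.org/download.php?episode=5"))
# download.php  (collides with episode 4, and has no audio extension)
```

The first case comes out right, but every episode of a "download.php" style feed ends up with the same local name.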
https://lists.ubuntu.com/archives/ubuntu-users/2006-February/066931.html I suppose the main concern is that a file named like foo.ogg?podcast is probably going to cause problems on a lot of portable digital audio players.
The only thing I can think of doing would be to append the "right" extension, if it isn't already there. However: a) figuring out what the "right" extension is, is Very Hard. b) everything else I've seen (browsers, a few other podcast apps) also download it with that file name - presumably because of (a). If we wanted to implement this, we would need "deep typefinding" of files (telling us that a file is audio/mpeg data wrapped in ID3 tags), and some way of mapping this to a file extension. I don't know if this is easily doable.
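A rough Python sketch of the "append the right extension" idea, to show the shape of the problem rather than a real solution. It only looks at the first few bytes of the file; the function names and the tiny signature table are assumptions for illustration, and treating an ID3 tag as "this is MP3" is exactly the kind of guess that proper deep typefinding would have to replace.

```python
def guess_extension(head):
    """Guess an extension from the first bytes of a file (toy signature table)."""
    if head.startswith(b"OggS"):
        return ".ogg"  # Ogg container capture pattern
    if head.startswith(b"ID3"):
        return ".mp3"  # assumption: an ID3v2 tag wraps MPEG audio
    if len(head) >= 2 and head[0] == 0xFF and (head[1] & 0xE0) == 0xE0:
        return ".mp3"  # looks like an MPEG audio frame sync
    return None  # unknown: leave the name alone


def with_extension(name, head):
    """Append the guessed extension unless the name already ends with it."""
    ext = guess_extension(head)
    if ext is None or name.lower().endswith(ext):
        return name
    return name + ext


print(with_extension("lugradio-someepisode.ogg?podcast", b"OggS\x00\x02"))
print(with_extension("download.php?episode=4", b"ID3\x03\x00"))
```

Note that this gives names like "lugradio-someepisode.ogg?podcast.ogg": the extension is now usable, but the "?" is still in the file name.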
*** Bug 340463 has been marked as a duplicate of this bug. ***
Created attachment 70054
fix local podcast file names in most cases

This uses gnome_vfs_async_get_file_info to figure out the local file name. This follows redirects, so most of the time we get a useful unique local file name. This fixes most cases except the one originally reported in this bug, where the query string isn't used to construct a redirect, but (I'm guessing) just for statistics on the server. I have a few (hackish) ideas on how we can handle that, but I'd like to get some testing for this before I do anything silly. It also doesn't handle cases like bug 321991, where the query string contains the complete redirect URL.
Looks okay to me.
Works for me (with the exception that some files like the lugradio podcast files don't remove the "?podcast" because that's still in the resolved filename).
Patch #70054 really fixes bug #340463 rather than the actual problem reported in this bug (the extra "?podcast" in the lugradio filenames).
Created attachment 70239
combined patch

This includes the patch from bug 321991 (it touches the same code and I'm too lazy to keep them apart), and also checks for the query string at the end of the local file name, removing it if it matches the query string from the original URL in the feed, so it'll remove '?podcast' from the lugradio episodes.
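The query-string check described here can be sketched in Python roughly as follows. This is not the code from the patch (which is C against gnome-vfs); the function name `strip_feed_query` and the example URLs are hypothetical, and the sketch assumes the local name has already been resolved through any redirects.

```python
from urllib.parse import urlsplit


def strip_feed_query(local_name, feed_url):
    """Drop a trailing '?query' from local_name if it equals the feed URL's query."""
    query = urlsplit(feed_url).query
    if not query:
        return local_name
    suffix = "?" + query
    # Only strip when something is left over to serve as the file name.
    if local_name.endswith(suffix) and len(local_name) > len(suffix):
        return local_name[: -len(suffix)]
    return local_name


# Illustrative URLs; the real episode paths are not given in this report.
print(strip_feed_query("lugradio-someepisode.ogg?podcast",
                       "http://example.org/lugradio-someepisode.ogg?podcast"))
# lugradio-someepisode.ogg
print(strip_feed_query("episode4.mp3",
                       "http://example.org/download.php?episode=4"))
# episode4.mp3  (redirect already gave a clean name, nothing to strip)
```

Comparing against the query from the original feed URL, rather than cutting at any "?", means a name is only touched when the suffix is known to be leftover from the enclosure URL.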
(In reply to comment #9)
> Created an attachment (id=70239)
> combined patch

Works for me for 302 redirects like http://leoville.tv/podcasts/sn.xml. For the original URL reported here, http://lugradio.org/episodes.ogglow.rss, it removes the trailing stuff after the file extension, and it avoids duplicates for feeds like http://feeds.wnyc.org/onthemedia.
Any reason that this can't be committed now?
committed.
*** Bug 356604 has been marked as a duplicate of this bug. ***