After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 484768 - Use totem's playlist parser for Podcast parsing
Use totem's playlist parser for Podcast parsing
Status: RESOLVED FIXED
Product: rhythmbox
Classification: Other
Component: Podcast
unspecified
Other Linux
: Normal normal
: ---
Assigned To: RhythmBox Maintainers
RhythmBox Maintainers
Depends on:
Blocks: 355984 379932 427975 489520 489856 489874 498797
 
 
Reported: 2007-10-08 15:37 UTC by Bastien Nocera
Modified: 2008-01-05 06:01 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
rb-use-newer-plparser.patch (5.56 KB, patch)
2007-10-22 16:08 UTC, Bastien Nocera
needs-work Details | Review
rb-use-newer-plparser-2.patch (28.54 KB, patch)
2007-10-23 16:35 UTC, Bastien Nocera
none Details | Review
rb-use-newer-plparser-3.patch (31.03 KB, patch)
2007-10-23 17:04 UTC, Bastien Nocera
none Details | Review
rb-use-newer-plparser-4.patch (33.50 KB, patch)
2007-10-24 16:14 UTC, Bastien Nocera
none Details | Review
rb-use-newer-plparser-5.patch (35.47 KB, patch)
2007-10-24 16:53 UTC, Bastien Nocera
none Details | Review
rb-use-newer-plparser-6.patch (37.07 KB, patch)
2007-10-24 17:38 UTC, Bastien Nocera
none Details | Review
rb-use-newer-plparser-7.patch (39.54 KB, patch)
2007-10-25 10:01 UTC, Bastien Nocera
none Details | Review
rb-use-newer-plparser-8.patch (44.73 KB, patch)
2007-11-30 17:19 UTC, Bastien Nocera
none Details | Review
rb-use-newer-plparser-9.patch (54.62 KB, patch)
2007-11-30 18:12 UTC, Bastien Nocera
none Details | Review

Description Bastien Nocera 2007-10-08 15:37:03 UTC
The feature is in current Totem trunk.
Comment 1 Bastien Nocera 2007-10-22 16:08:45 UTC
Created attachment 97641 [details] [review]
rb-use-newer-plparser.patch

First patch to use the new playlist parser all over.
Comment 2 Bastien Nocera 2007-10-23 16:35:13 UTC
Created attachment 97738 [details] [review]
rb-use-newer-plparser-2.patch

Port the podcast parsing to using totem-pl-parser. Adds support for Atom podcasts.
Comment 3 Bastien Nocera 2007-10-23 17:04:15 UTC
Created attachment 97740 [details] [review]
rb-use-newer-plparser-3.patch

Add support for itpc schemes, and itunes podcasts.

I need to:
- move the podcast detection code to the URI helpers
- make the shell detect and handle RSS and Atom feeds, so they get added to the Podcast manager instead of as playlists (eek!)
- once that's added, make sure we can't create duplicate feeds
Comment 4 Bastien Nocera 2007-10-24 16:14:11 UTC
Created attachment 97787 [details] [review]
rb-use-newer-plparser-4.patch

Updated to support OPML files.
Comment 5 Bastien Nocera 2007-10-24 16:53:47 UTC
Created attachment 97789 [details] [review]
rb-use-newer-plparser-5.patch

Add rb_uri_could_be_podcast() and use it.
Comment 6 Bastien Nocera 2007-10-24 17:38:18 UTC
Created attachment 97791 [details] [review]
rb-use-newer-plparser-6.patch

- Fix detection for itms: URLs (needs Totem trunk)
Only the command-line integration left to do...
Comment 7 Bastien Nocera 2007-10-25 10:01:21 UTC
Created attachment 97831 [details] [review]
rb-use-newer-plparser-7.patch

Passing http://planet.lugradio.org/hashlugradio/releases/episodes/feed/atom/ on the command-line doesn't work as expected (adds an iRadio). So I added a hack for it in rb-file-helpers.c.

This adds command-line support. It will also switch to the Podcast source when added. Once installed, you should be able to go to:
http://www.itunes.com/podcast?id=207870198
and click on "I have iTunes" to make it load directly in Rhythmbox

It's ready to commit as far as I'm concerned, please test thoroughly.
Comment 8 Bastien Nocera 2007-10-29 13:49:02 UTC
The only thing that seems to be missing, is better error reporting when an itms, or itunes music store URL is passed that isn't a Podcast... Also, would it make sense to have the "might be podcast" function to totem's playlist parser?
Comment 9 Jonathan Matthew 2007-11-04 08:59:06 UTC
This doesn't work for most of the podcasts I'm subscribed to, as they are mostly detected as application/xml (rather than application/rss+xml), and totem_pl_parser_add_xml_feed just returns TOTEM_PL_PARSER_RESULT_ERROR.  I added a dumb implementation of that function - read the feed, if is_rss() { add_rss() }, etc. - to get past that.

A number are also detected as text/plain, so they don't work at all:
http://dot.cult.bg/notitle/wp/?cat=10&feed=rss2
http://www.npr.org/rss/podcast.php?id=35
http://thestreamguide.com/xml.php?channelid=58
http://www.mashit.com/?feed=rss2&category_name=podcast

I haven't tried anything relating to atom or opml.
Comment 10 Bastien Nocera 2007-11-04 13:10:53 UTC
Which version of Totem are you using, and which version of shared-mime-info?

I've committed a fix to shared-mime-info to up the priority of the RSS magic, so it doesn't get detected as an XML file. I could parse all those podcasts with test-parser, in totem 2.21.1.

Could you let me know if those work now?
Comment 11 Jonathan Matthew 2007-11-04 14:45:09 UTC
I'm using current svn totem, shared-mime-info 0.22.  I've copied the rss-related bits from shared-mime-info cvs (and run update-mime-database) and it doesn't seem to have helped.  I'll debug further tomorrow, hopefully.
Comment 12 Jonathan Matthew 2007-11-05 03:05:12 UTC
Setting the 'force' property on the parser seems to help.  With that and the current rss bits from shared-mime-info (and without my add_xml_feed hack), almost all the podcasts I use are identified and parsed correctly.

Feeds served from feedburner.com include too much xml junk at the start for the magic in shared-mime-info to work:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="http://feeds.feedburner.com/~d/styles/rss2enclosuresfull.xsl" type="text/xsl" media="screen"?><?xml-stylesheet href="http:
//feeds.feedburner.com/~d/styles/itemcontent.css" type="text/css" media="screen"?><rss ...

which is apparently 267 characters, but the magic only checks the first 256.  Ugh.

The old parser handles the 'smart quote' entities (&#xxxx; encoding) in this feed properly, but the totem parser leaves them encoded:

http://www.mashit.com/?feed=rss2&category_name=podcast

Comment 13 Bastien Nocera 2007-11-05 16:32:58 UTC
(In reply to comment #12)
> Setting the 'force' property on the parser seems to help.  With that and the
> current rss bits from shared-mime-info (and without my add_xml_feed hack),
> almost all the podcasts I use are identified and parsed correctly.
> 
> Feeds served from feedburner.com include too much xml junk at the start for the
> magic in shared-mime-info to work:
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <?xml-stylesheet
> href="http://feeds.feedburner.com/~d/styles/rss2enclosuresfull.xsl"
> type="text/xsl" media="screen"?><?xml-stylesheet href="http:
> //feeds.feedburner.com/~d/styles/itemcontent.css" type="text/css"
> media="screen"?><rss ...
> 
> which is apparently 267 characters, but the magic only checks the first 256. 
> Ugh.

But they should be recognised as XML, and then tested. is_xml_feed in totem's plparser needs to be implemented.

Do you have a test URL please?

> The old parser handles the 'smart quote' entities (&#xxxx; encoding) in this
> feed properly, but the totem parser leaves them encoded:
> 
> http://www.mashit.com/?feed=rss2&category_name=podcast

Fixed:
2007-11-05  Bastien Nocera  <hadess@hadess.net>

        * src/plparse/xmllexer.c: (lexer_get_token),
        (lexer_decode_entities): Fix decoding of decimal entities,
        useful for Podcasts (see #484768)
Comment 14 Jonathan Matthew 2007-11-06 06:29:26 UTC
Here's one: http://feeds.feedburner.com/dppodcasts
Comment 15 Bastien Nocera 2007-11-06 10:13:45 UTC
(In reply to comment #14)
> Here's one: http://feeds.feedburner.com/dppodcasts

Fix in totem:
2007-11-06  Bastien Nocera  <hadess@hadess.net>

        * src/plparse/totem-pl-parser-podcast.c:
        (totem_pl_parser_add_xml_feed): Implement, fixes parsing
        of feeds from Feedburner.com (see #484768)

Better?
Comment 16 Jonathan Matthew 2007-11-06 10:44:02 UTC
Thanks, they're working now.
Comment 17 Jonathan Matthew 2007-11-06 13:50:18 UTC
It's not exactly a real-world test case, but I just hacked up a perl script to create an OPML file from rhythmdb.xml, then used that to add the feeds to a new db file, and everything came through correctly.
Comment 18 Bastien Nocera 2007-11-06 14:07:53 UTC
(In reply to comment #17)
> It's not exactly a real-world test case, but I just hacked up a perl script to
> create an OPML file from rhythmdb.xml, then used that to add the feeds to a new
> db file, and everything came through correctly.

Great. Would OPML save support in totem-pl-parser be useful then?
Comment 19 Jonathan Matthew 2007-11-07 12:38:50 UTC
I think it would be useful.

As a slightly more relevant test, I found that the OPML from here: http://share.opml.org/toppodcasts/ was handled correctly.
Comment 20 Bastien Nocera 2007-11-07 13:27:23 UTC
(In reply to comment #19)
> I think it would be useful.

Ok. I'll leave it to bug #489856 though, as I'd rather get this code released before putting more work into it.

> As a slightly more relevant test, I found that the OPML from here:
> http://share.opml.org/toppodcasts/ was handled correctly.

Cool!
Comment 21 Bastien Nocera 2007-11-30 17:19:59 UTC
Created attachment 99916 [details] [review]
rb-use-newer-plparser-8.patch

- Set the locale in the test-parser, so we can output UTF-8
- Fix stupid bug that made the podcasts never update, even when forced
Comment 22 Bastien Nocera 2007-11-30 18:12:08 UTC
Created attachment 99919 [details] [review]
rb-use-newer-plparser-9.patch

With the gecko plugin from bug 489874 integrated.
Comment 23 Jonathan Matthew 2007-12-01 10:41:54 UTC
The only remaining question I have about this is about dependencies.  If we depend on totem 2.21+, will it still be practical to build rhythmbox trunk on recent desktop distribution releases (fedora 8, ubuntu gutsy, etc.)?

I've seen from some recent totem changes that you're planning to split the playlist parser out into a separate module.  I definitely wouldn't have a problem with this going in once that's done.  Is this likely to happen soon?
Comment 24 Bastien Nocera 2007-12-03 13:16:34 UTC
2007-12-03  Bastien Nocera  <hadess@hadess.net>

        * configure.ac: Require totem-pl-parser 2.21.4

        * lib/rb-file-helpers.c: (rb_uri_could_be_podcast):
        * lib/rb-file-helpers.h: Add helper function to detect
        whether it could be a podcast from the URI passed

        * data/rhythmbox.schemas: Handle feed://, itpc:// and itms:// URIs

        * podcast/Makefile.am:
        * podcast/rb-podcast-manager.c:
        (rb_podcast_manager_subscribe_feed),
        (rb_podcast_manager_parse_complete_cb),
        (rb_podcast_manager_thread_parse_feed),
        (rb_podcast_manager_save_metadata),
        (rb_podcast_manager_insert_feed):
        * podcast/rb-podcast-parse.c: (playlist_metadata_foreach),
        (playlist_started), (playlist_ended), (entry_metadata_foreach),
        (entry_parsed), (rb_podcast_parse_load_feed),
        (rb_podcast_parse_channel_free):
        * podcast/rb-podcast-parse.h: Use totem-pl-parser to parse Podcast
        files (including Atom files) (Closes: #355984), as well as OPML files

        * shell/rb-playlist-manager.c: (handle_playlist_entry_cb),
        (playlist_load_started_cb), (playlist_load_ended_cb),
        (rb_playlist_manager_parse_file), (_is_dirty_playlist),
        (rb_playlist_manager_save_playlists):
        * shell/rb-shell-player.c: (rb_shell_player_class_init),
        (rb_shell_player_handle_eos_unlocked), (rb_shell_player_init),
        (playlist_entry_cb), (open_location_thread),
        (rb_shell_player_open_location), (rb_shell_player_sync_play_order),
        (rb_shell_player_do_previous), (rb_shell_player_do_next_internal),
        (rb_shell_player_entry_activated_cb):
        * shell/rb-shell.c: (handle_playlist_entry_cb),
        (rb_shell_load_uri): Update API to totem-pl-parser 2.21.4, detect
        podcasts and OPML files (Closes: #489520) and add them to the
        podcast source

        * podcast/test-podcast-parse.c: (main): Call setlocale
        so we print UTF-8 characters

        (Closes: #484768)
Comment 25 Paul Drain 2008-01-05 05:10:12 UTC
(In reply to comment #23)
> The only remaining question I have about this is about dependencies.  If we
> depend on totem 2.21+, will it still be practical to build rhythmbox trunk on
> recent desktop distribution releases (fedora 8, ubuntu gutsy, etc.)?
> 
> I've seen from some recent totem changes that you're planning to split the
> playlist parser out into a separate module.  I definitely wouldn't have a
> problem with this going in once that's done.  Is this likely to happen soon?
> 

Was there ever an answer to this?

I've been playing with various iPod bugs and patches on my Ubuntu Gutsy desktop recently and basically backed out the -9 patch to get trunk to build.

I don't use Podcasts personally, and Bastien's patch currently backs out quite well, but it's not really a solution for current "Desktop" users or if any major work goes into the podcast parsing in the near future (ie. before GNOME 2.22).
Comment 26 Jonathan Matthew 2008-01-05 06:01:16 UTC
(In reply to comment #25)

> Was there ever an answer to this?

The totem/totem-plparser split occurred just before the patch was committed.  If it hadn't, I'm sure I would have objected by now.
 
> I've been playing with various iPod bugs and patches on my Ubuntu Gutsy desktop
> recently and basically backed out the -9 patch to get trunk to build.
>
> I don't use Podcasts personally, and Bastien's patch currently backs out quite
> well, but it's not really a solution for current "Desktop" users or if any
> major work goes into the podcast parsing in the near future (ie. before GNOME
> 2.22).

If you're already building rhythmbox from source, you can certainly build totem-plparser too.  It's a very simple library that has few dependencies.