After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 551402 - Cannot update list of episodes in podcasts
Cannot update list of episodes in podcasts
Status: RESOLVED FIXED
Product: rhythmbox
Classification: Other
Component: Podcast
0.11.x
Other All
: Normal normal
: ---
Assigned To: RhythmBox Maintainers
RhythmBox Maintainers
: 558391 588170 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2008-09-08 17:30 UTC by Ricardo Fernández Pascual
Modified: 2010-03-20 12:01 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
rhythmbox -d output (982.56 KB, application/x-gzip)
2008-09-09 16:01 UTC, Ricardo Fernández Pascual
  Details
rhythmbox_feedupdate.log (3.86 KB, text/plain)
2009-07-03 13:14 UTC, Martin Häger
  Details
rhythmbox_feedadd.log (41.50 KB, text/plain)
2009-07-03 13:17 UTC, Martin Häger
  Details
Syncs podcasts to exactly what is in the feed (6.62 KB, patch)
2010-01-28 00:29 UTC, Chad Braun-Duin
needs-work Details | Review
Syncs podcasts to exactly what is in the feed (Rev 2) (4.09 KB, patch)
2010-01-30 00:40 UTC, Chad Braun-Duin
needs-work Details | Review
Sets deleted podcast posts to hidden (8.59 KB, patch)
2010-02-28 17:58 UTC, Chad Braun-Duin
needs-work Details | Review
cleaned up - set deleted podcast posts to hidden (7.69 KB, patch)
2010-03-13 22:06 UTC, Chad Braun-Duin
needs-work Details | Review
cleaned up (2) - set deleted podcast posts to hidden (6.41 KB, patch)
2010-03-13 23:41 UTC, Chad Braun-Duin
committed Details | Review

Description Ricardo Fernández Pascual 2008-09-08 17:30:59 UTC
Please describe the problem:
The list of episodes of a podcast is not updated automatically, nor can I update it via the "Update podcast provider" (or something like that, I am using the Spanish translation) context menu option.

However, If I remove the podcast and add it again, I see the new episodes.


Steps to reproduce:
1. Add a podcast. For example http://www.rtve.es/podcast/SSATEL.xml.
2. Wait a few days
3. Check that the podcast does have new episodes (using another client, like wget or firefox dynamic bookmarks)
4. Notice that rhythmbox does not show any new episode.
5. Try to update using the context menu, and see that nothing happens.

Actual results:
Nothing. That's the problem ;-)

Expected results:
New episodes should appear.

Does this happen every time?
As far as I can see, it does. Caveat: I have only tried with feeds from rtve.es for now.

Other information:
The new episodes do appear if you remove and then add the podcast again.
Comment 1 Bastien Nocera 2008-09-08 22:04:07 UTC
Could you please get the output of "rhythmbox -d" when reproducing the problem, and mention which version of Rhythmbox you're using?
Comment 2 Ricardo Fernández Pascual 2008-09-09 16:01:43 UTC
Created attachment 118365 [details]
rhythmbox -d output
Comment 3 Ricardo Fernández Pascual 2008-09-09 16:14:17 UTC
I am using version 0.11.6 (from rhythmbox-0.11.6-2.fc9.x86_64 rpm).

About the submitted trace:

- At 17:53:32 I select the podcast. Eight entries appear in the list of episodes.
- At 17:54:06 I select the update action from the context menu (actually I finish clicking it at 17:54:12). Still eight entries only.
- At 17:55:14 I remove the podcast. Shortly after that, I add the same podcast again. When I select it now, it has nine episodes (that is, it now has last night's program too).

I have noticed that the feeds from rtve.es are not entirely correct, I think. At least the dates are not encoded using RFC 822, hence they cannot be parsed correctly by rhythmbox. But I think that should not make any difference with respect to this bug.
Comment 4 Bastien Nocera 2008-09-09 16:41:24 UTC
The dates are parsed fine. It seems that it doesn't see any new entries, probably because the file is still the old one.

Do you use a proxy of any kind? Does your ISP use transparent proxy?
Comment 5 Ricardo Fernández Pascual 2008-09-09 17:52:08 UTC
No, the dates are not parsed correctly. But this is not rhythmbox's fault, as the format used by rtve.es is braindead: "dd-mm-yyyy hh:mm" which is undistinguishable with "mm-dd-yyyy hh:mm" (that's what rhythmbox picks, I get the month and the day reversed). Anyway, I think this is not related at all with the problem, and I don't care that much about the dates (I have already sent a "bug report" to rtve.es, fwiw). Sorry from bringing this up in the previous comment, it was offtopic.

I am not using a proxy, and I doubt that there is any transparent proxy involved (I cannot know for sure, but I have never had any problem that could be explained by a transparent proxy).

Moreover, wget does always get an up to date file. Firefox dynamic bookmarks work correctly too with these feeds. And remember that even rhythmbox does get the updated file if I remove the podcast and add it again, only not when clicking the update context menu option.

Thank you for your help.
Comment 6 Bastien Nocera 2008-09-09 18:19:58 UTC
(In reply to comment #5)
> No, the dates are not parsed correctly. But this is not rhythmbox's fault, as
> the format used by rtve.es is braindead: "dd-mm-yyyy hh:mm" which is
> undistinguishable with "mm-dd-yyyy hh:mm" (that's what rhythmbox picks, I get
> the month and the day reversed). Anyway, I think this is not related at all
> with the problem, and I don't care that much about the dates (I have already
> sent a "bug report" to rtve.es, fwiw). Sorry from bringing this up in the
> previous comment, it was offtopic.

They *are* being parsed properly, using totem-pl-parser (which is what Rhythmbox uses):
$ ./test-parser http://www.rtve.es/podcast/SSATEL.xml | grep publication-date
	publication-date = '2008-09-09T03:21:00Z' (1220930460/'09-09-2008 03:21')
	publication-date = '2008-08-09T03:23:00Z' (1218252180/'08-09-2008 03:23')
	publication-date = '2008-05-09T03:22:00Z' (1210303320/'05-09-2008 03:22')
	publication-date = '2008-04-09T03:22:00Z' (1207711320/'04-09-2008 03:22')
	publication-date = '2008-03-09T03:22:00Z' (1205032920/'03-09-2008 03:22')
	publication-date = '2008-02-09T03:22:00Z' (1202527320/'02-09-2008 03:22')
	publication-date = '2008-01-09T03:22:00Z' (1199848920/'01-09-2008 03:22')
	publication-date = '2010-05-08T04:51:00Z' (1273294260/'29-08-2008 04:51')
	publication-date = '2008-01-07T11:55:00Z' (1199706900/'01-07-2008 11:55')

> I am not using a proxy, and I doubt that there is any transparent proxy
> involved
<snip>

There's another case for which Rhythmbox doesn't add entries: when the newest post's publication-date isn't newer than the last one we've seen from the podcast.

So please check whether the dates are parsed properly, and if they're not, mention which version of evolution-data-server and totem-pl-parser you're using.
Comment 7 Ricardo Fernández Pascual 2008-09-10 08:14:31 UTC
> They *are* being parsed properly, using totem-pl-parser (which is what
> Rhythmbox uses):

No, they aren't. The parser may not detect any error, but the month and the day are reversed. For example:

'08-09-2008 03:23' is interpreted as '2008-08-09T03:23:00Z', but should be '2008-09-08T03:23:00Z'.

Or worse:

'29-08-2008 04:51' is interpreted as '2010-05-08T04:51:00Z', but should be 
'2008-08-29T04:51:00Z'. This one is obviously wrong.

There is no way to write a computer program to correctly parse these dates (in Spanish format, dd-mm-yyyy) and still be able to correctly parse the American format (mm-dd-yyyy) without extra information, because some dates are ambiguous.

> There's another case for which Rhythmbox doesn't add entries: when the newest
> post's publication-date isn't newer than the last one we've seen from the
> podcast.

I think this is a bug, because dates may be incorrect (as this bug shows). You should not trust the dates provided by the feed. Also, I don't think that feeds are _required_ to only add entries in chronological order (and even if they were, some feeds would not respect the requirement).

> So please check whether the dates are parsed properly, and if they're not,
> mention which version of evolution-data-server and totem-pl-parser you're
> using.

I am using evolution-data-server-2.22.3-2.fc9.x86_64 and totem-pl-parser-2.22.3-1.fc9.x86_64.

I think that, to fix this bug, rhythmbox should add all entries in the feed regardless of their dates (as long as they have not been added already, of course).





Comment 8 Martin Häger 2009-07-03 13:12:01 UTC
I'm seeing this problem too in git master (3b8c0f2cfeb8c533264ff5f8d03d1894a702ad5d), that is, version 0.12.2.91.

For the following feed:
http://rss.conversationsnetwork.org/series/stackoverflow.xml

Rhythmbox does not, when updating the feed on Fri, 3 Jul 2009 14:09:00 CEST, add the latest episode (#60 as of writing) that was published on Wed, 1 Jul 2009 00:00:00 CDT (see rhythmbox_feedupdate.log). When afterwards deleting/re-adding the feed, it adds episode #60 (see rhythmbox_feedadd.log).

However, the following feed has not yet had this problem as far as I recall:
http://feeds.feedburner.com/linuxoutlaws

Using:
  totem-plparser 2.26.2
  evolution-data-server 2.26.2
Comment 9 Martin Häger 2009-07-03 13:14:25 UTC
Created attachment 137786 [details]
rhythmbox_feedupdate.log
Comment 10 Martin Häger 2009-07-03 13:17:01 UTC
Created attachment 137787 [details]
rhythmbox_feedadd.log
Comment 11 Chad Braun-Duin 2010-01-28 00:29:47 UTC
Created attachment 152459 [details] [review]
Syncs podcasts to exactly what is in the feed

This patch changes the way Rhythmbox updates its podcast feeds. This patch always syncs the podcasts shown on screen with what the feed contains. 

Previously, the code would only add a new entry if the post-time for that entry happened after the last-overall-update time for the feed. This scenario didn't hold true for a lot of feeds. Furthermore, even if it did hold true some of the time for a feed, this condition would only be hit consistently if you kept Rhythmbox always running. If not, you could miss some podcast entries.

My patch does away with all of that. Simply put, after my patch is applied, you will always see every podcast entry that is currently in the feed.
Comment 12 Jonathan Matthew 2010-01-29 12:03:56 UTC
Review of attachment 152459 [details] [review]:

::: podcast/rb-podcast-manager.c
@@ +712,3 @@
 
+	rb_debug ("got file info results for %s", 
+                 get_remote_location (data->entry));

Please don't reformat code you haven't modified.  It's harder to review patches when half of the patch doesn't have any actual changes in it.

@@ +1107,3 @@
+	RhythmDBQueryModel *existing_entries = NULL;
+        existing_entries = rhythmdb_query_model_new_empty (db);
+        rhythmdb_do_full_query (db, RHYTHMDB_QUERY_RESULTS (existing_entries),

This seems like it wouldn't be a very reliable way of identifying duplicates.  What are we trying to catch here?  The same episode showing up in the feed with multiple URIs?

@@ +1121,3 @@
+                filesize,
+                RHYTHMDB_QUERY_END);
+        if (existing_entries != NULL) {

existing_entries cannot be NULL here, and even if it was, the previous call to rhythmdb_do_full_query would have crashed, so this check is useless.

@@ +1989,3 @@
+                download_entries = g_list_prepend (download_entries, post_entry);
+                new_last_post = item->pub_date;
+            }

you seem to have reformatted this entire block of code, which makes it very hard to see what you've actually changed.
Comment 13 Chad Braun-Duin 2010-01-29 13:39:36 UTC
Thanks for the review. Would it make sense for me to redo my changes
by modifying the least amount of code possible? If so, I can do that
and submit another patch.

Chad
Comment 14 Chad Braun-Duin 2010-01-30 00:40:33 UTC
Created attachment 152609 [details] [review]
Syncs podcasts to exactly what is in the feed (Rev 2)

How does this revision look?

I agree that the extra check I've added to rb_podcast_manager_add_post is a bit clunky. What is really needed is a rhythmdb_entry_lookup_by_mount_point function, since the unique identifier for a podcast-entry (the location) changes to the mount_point attribute after the podcast file is downloaded. Therefore, a data entry would have a different location (the url) than the same db entry (the local file path). Thus, duplicates will occur for all downloaded entries. What do you think?
Comment 15 Jonathan Matthew 2010-02-14 01:04:44 UTC
Review of attachment 152609 [details] [review]:

::: podcast/rb-podcast-manager.c
@@ +1108,3 @@
+         * this check is necessary since after an entry's file is downloaded,
+         * the location stored in the db changes to the local file path
+         * instead of the url. The url moves to the mount-point attribute.

Then this should query by mount point, rather than messing around with the title and file size.

@@ +1972,3 @@
+                 * because that check was not a great
+                 * indicator of whether or not to insert a post_entry.
+                 * For some podcasts, this condition was never met.

Well, that's one perspective.  My view on this is that it's stupid for podcasts to post entries out of order, and that this patch is working around that.

This removes the ability for the user to delete podcast entries, since they'll just show up again next time the feed is updated.  We could possibly hide entries, rather than deleting them.
Comment 16 Chad Braun-Duin 2010-02-17 02:04:44 UTC
(In reply to comment #15)
> Review of attachment 152609 [details] [review]:
> 
> ::: podcast/rb-podcast-manager.c
> @@ +1108,3 @@
> +         * this check is necessary since after an entry's file is downloaded,
> +         * the location stored in the db changes to the local file path
> +         * instead of the url. The url moves to the mount-point attribute.
> 
> Then this should query by mount point, rather than messing around with the
> title and file size.
> 
> @@ +1972,3 @@
> +                 * because that check was not a great
> +                 * indicator of whether or not to insert a post_entry.
> +                 * For some podcasts, this condition was never met.
> 
> Well, that's one perspective.  My view on this is that it's stupid for podcasts
> to post entries out of order, and that this patch is working around that.
> 
> This removes the ability for the user to delete podcast entries, since they'll
> just show up again next time the feed is updated.  We could possibly hide
> entries, rather than deleting them.

What would your ideal solution be? Querying by MountPoint and hiding deleted entries? If so, I can go after that.
Comment 17 Jonathan Matthew 2010-02-17 02:27:32 UTC
I think that would work and would not break anything else, so yes.
Comment 18 Chad Braun-Duin 2010-02-28 17:58:48 UTC
Created attachment 154903 [details] [review]
Sets deleted podcast posts to hidden

Per your last review, this patch sets deleted podcast posts to hidden so they do not return after the user does an update.
Comment 19 Jonathan Matthew 2010-03-07 00:57:59 UTC
Review of attachment 154903 [details] [review]:

I think this is roughly in the right shape now, but there are some style issues and the like that need to be fixed.

::: podcast/rb-podcast-manager.c
@@ +1120,3 @@
+                RHYTHMDB_QUERY_PROP_EQUALS,
+                RHYTHMDB_PROP_SUBTITLE,
+                subtitle,

Why do we still need to check the subtitle here?

@@ +1126,3 @@
+                RHYTHMDB_QUERY_PROP_EQUALS,
+                RHYTHMDB_PROP_HIDDEN,
+                FALSE,

Why are we excluding hidden entries here?

@@ +1701,3 @@
                                 	RHYTHMDB_QUERY_END);
 
+		GtkTreeIter iter;

Variable declarations go at the start of the block.  Move this back where it was.

@@ +1969,3 @@
+                /* 
+                 * Always try to add a post_entry.
+                 * Removed the check of whether or not the entry's post_time >

Don't comment on code that isn't there any more.  Just explain what it does now.

::: rhythmdb/rhythmdb-query-model.h
@@ +100,2 @@
 RhythmDBQueryModel *	rhythmdb_query_model_new_empty		(RhythmDB *db);
+RhythmDBQueryModel *	rhythmdb_query_model_new_empty_with_hidden		(RhythmDB *db);

I don't think we want a new constructor for this.  If you want a query model that includes hidden entries, you just need to set the show-hidden property on it after creating it.

::: rhythmdb/rhythmdb-tree.c
@@ +1949,3 @@
+                                return FALSE;
+                        }
+                    RHYTHMDB_PROPERTY_COMPARE (!=)

No need for all of this, just change the strcmp in RHYTHMDB_PROPERTY_COMPARE to g_strcmp0.

::: rhythmdb/rhythmdb.h
@@ +390,2 @@
 void		rhythmdb_entry_delete	(RhythmDB *db, RhythmDBEntry *entry);
+void		rhythmdb_entry_set_visibility	(RhythmDB *db, RhythmDBEntry *entry, gboolean visible);

This is an internal rhythmdb function, it shouldn't be exposed.

::: sources/rb-podcast-source.c
@@ +942,3 @@
+            /* set podcast entries to invisible instead of deleted so they will
+             * not reappear after the podcast has been updated*/
+            rhythmdb_entry_set_visibility (source->priv->db, l->data, FALSE);

use this instead:
  
  GValue v = {0,};
  g_value_init (&v, G_TYPE_BOOLEAN);
  g_value_set_boolean (&v, TRUE);
  rhythmdb_entry_set (source->priv->db, l->data, RHYTHMDB_PROP_HIDDEN, &v);
  g_value_unset (&v);

rhythmdb_entry_set_visibility should only be used inside rhythmdb itself.  Yes, it's more verbose, but we have to deal with that everywhere.
Comment 20 Chad Braun-Duin 2010-03-13 22:06:55 UTC
Created attachment 156078 [details] [review]
cleaned up - set deleted podcast posts to hidden

I fixed up all of the style issues from my previous patch. Most were fairly easy fixes. The only thing that tripped me up was setting a query model's show_hidden property. To me it seems like to do this directly requires exposing private fields and/or functions. Instead, I created a simple "setter" method called "rhythmdb_query_model_set_show_hidden". It is used in rb-podcast-manager.c on lines 1692 and 1859.

Please let me know if this is bad style. If so, please help me out with the direction I should take on this. To me, it seems like any other solution breaks encapsulation in some way. I could be missing something, though. Any help is appreciated.

Thank you for taking the time to review my code and provide feedback. I hope we are close to a final solution.

Chad
Comment 21 Jonathan Matthew 2010-03-13 22:32:04 UTC
Review of attachment 156078 [details] [review]:

Just one last thing and I think this is ready.

::: rhythmdb/rhythmdb-query-model.h
@@ +142,3 @@
 									 const char *plural);
+void                    rhythmdb_query_model_set_show_hidden    (RhythmDBQueryModel *model,
+                                                                    gboolean hidden);

Sorry, I should have explained this better. 'show-hidden' is exposed as GObject property, so to set it, all you need to do is:

  g_object_set (model, "show-hidden", TRUE, NULL);

See the GObject documentation for more information on properties: http://library.gnome.org/devel/gobject/stable/gobject-properties.html
Comment 22 Chad Braun-Duin 2010-03-13 23:41:58 UTC
Created attachment 156089 [details] [review]
cleaned up (2) - set deleted podcast posts to hidden

now using g_object_set. Thanks for the explanation.
Comment 23 Jonathan Matthew 2010-03-17 12:31:02 UTC
*** Bug 558391 has been marked as a duplicate of this bug. ***
Comment 24 Jonathan Matthew 2010-03-17 12:31:11 UTC
*** Bug 588170 has been marked as a duplicate of this bug. ***
Comment 25 Jonathan Matthew 2010-03-20 12:00:45 UTC
I cleaned the patch up a bit (tabs vs spaces, minor comment formatting things) and pushed it as commit 99f6d8a.  Thanks for working on this, Chad.