GNOME Bugzilla – Bug 379932
Don't parse bad dates as 1970
Last modified: 2008-06-03 17:41:02 UTC
Please describe the problem:
Sometimes I get podcasts dated 1 January 1970. I guess this is because Rhythmbox fails to parse the date. Setting the date to the current date would probably make sorting by date easier.

Steps to reproduce:
1. Try to add this podcast: http://podcast.dr.dk/p3/rssfeed/desortespejdere.xml

Actual results:
A lot of files dated 1 January 1970.

Expected results:
Unparsable dates are set to the current date, which would probably make sorting by date easier.

Does this happen every time?

Other information:
Have also seen the same with a couple of Symantec podcasts available here: http://www.symantec.com/about/news/podcasts/index.jsp
I ran into this bug too on http://podcast.dr.dk/p4/rssfeed/bornholmaktuelt.xml

I had a look at the XML for the feed. Dates like "<pubDate>28 Mar 2007 10:28:18 GMT</pubDate>" get displayed OK. Similarly, dates in April get displayed OK. But dates like "<pubDate>02 maj 2007 12:34:19 GMT</pubDate>" get displayed as 1970-01-01. "Maj" is Danish for "May".

As far as I can read from the source file podcast/rb-podcast-parse.c::rb_podcast_parse_date(), it goes to great lengths to find a valid date, but always uses strptime(), which uses the current locale. I confirmed this by deleting the feed, running Rhythmbox with LANG set to "da_DK" and re-importing the feed. Now the dates get displayed correctly. (Oddly, just restarting Rhythmbox with LANG changed does *not* change the display?)

According to http://www.apple.com/itunes/store/podcaststechspecs.html#_Toc526931694 it is a "common mistake" not to follow RFC 2822 (http://www.faqs.org/rfcs/rfc2822.html section 3.3 I presume).

My conclusion: Rhythmbox goes out of its way to parse dates. But the fault (at least in my case) lies with the feed... I'll have to drop them a mail I guess...
Upon further investigation, it turns out that even though the feed (in my case) does not conform to the standard, neither does Rhythmbox(!)

Cause: Rhythmbox uses strptime() to parse dates, which always uses the current locale.

Result: users with a non-English locale will have problems with conforming feeds, because the English month names required by RFC 2822 are not recognised.
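To illustrate the point about strptime(), here is a rough sketch of a locale-independent parser for the common RFC 2822 form seen in these feeds. It is not the Rhythmbox or totem-pl-parser code; parse_rfc2822_date() and days_from_civil() are hypothetical names, and the sketch makes simplifying assumptions: seconds are required, the month must match the English abbreviation exactly, and only "GMT", "UT" or a numeric "+hhmm"/"-hhmm" zone is accepted.

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* English month abbreviations, hard-coded so parsing never depends on LANG. */
static const char *const MONTHS[12] = {
    "Jan", "Feb", "Mar", "Apr", "May", "Jun",
    "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
};

/* Days between 1970-01-01 and the given Gregorian date (month is 1..12). */
static long long days_from_civil(long long y, int m, int d)
{
    long long era, yoe, doy, doe;

    y -= (m <= 2);                      /* treat Jan/Feb as end of previous year */
    era = (y >= 0 ? y : y - 399) / 400;
    yoe = y - era * 400;                /* year within the 400-year era */
    doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;
    doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + doe - 719468;
}

/* Hypothetical helper, not taken from Rhythmbox or totem-pl-parser.
 * Accepts "[Wed, ]28 Mar 2007 10:28:18 GMT" or a numeric zone such as "+0200".
 * Returns 0 and fills *out on success, -1 if the string is not understood. */
int parse_rfc2822_date(const char *s, time_t *out)
{
    int day, year, hour, min, sec, month = 0, i;
    long offset = 0;
    char mon[4], zone[6];
    const char *comma = strchr(s, ',');

    if (comma != NULL)
        s = comma + 1;                  /* skip the optional day-of-week */

    if (sscanf(s, " %2d %3s %4d %2d:%2d:%2d %5s",
               &day, mon, &year, &hour, &min, &sec, zone) != 7)
        return -1;

    for (i = 0; i < 12; i++)
        if (strcmp(mon, MONTHS[i]) == 0)
            month = i + 1;
    if (month == 0)                     /* e.g. Danish "maj" is rejected here */
        return -1;

    if (year < 1900 || day < 1 || day > 31 || hour < 0 || hour > 23 ||
        min < 0 || min > 59 || sec < 0 || sec > 60)
        return -1;

    if (strcmp(zone, "GMT") != 0 && strcmp(zone, "UT") != 0) {
        if (strlen(zone) != 5 || (zone[0] != '+' && zone[0] != '-'))
            return -1;
        for (i = 1; i < 5; i++)
            if (zone[i] < '0' || zone[i] > '9')
                return -1;
        offset = ((zone[1] - '0') * 10 + (zone[2] - '0')) * 3600L
               + ((zone[3] - '0') * 10 + (zone[4] - '0')) * 60L;
        if (zone[0] == '-')
            offset = -offset;
    }

    *out = (time_t)(days_from_civil(year, month, day) * 86400LL
                    + hour * 3600L + min * 60L + sec - offset);
    return 0;
}
```

With something along these lines, "28 Mar 2007 10:28:18 GMT" parses the same way under any LANG, while "02 maj 2007 12:34:19 GMT" is reported as a failure instead of silently depending on the user's locale.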
http://podcast.dr.dk/p3/rssfeed/desortespejdere.xml
is broken, the dates are in completely the wrong format.

http://www.symantec.com/content/en/us/about/rss/ent/ent.xml
gives back random dates using the ISO8601 parser from GLib, filed as bug 503029.

http://podcast.dr.dk/p4/rssfeed/bornholmaktuelt.xml
is parsed properly.

Both podcasts from podcast.dr.dk contain BOMs at the start of the file, which we don't handle yet. Filed as bug 503031.

2007-12-11  Bastien Nocera  <hadess@hadess.net>

	* configure.in:
	* totem-plparser-uninstalled.pc.in:
	* totem-plparser.pc.in: Depend on camel to parse RFC 2822 date
	strings (as used in RSS feeds), use GLib to parse RFC 3339/ISO 8601
	dates (as used in Atom feeds)
	* plparse/test-parser.c: (test_date_real), (test_date), (main):
	Add tests for a few RSS and Atom date parsing
	* plparse/totem-pl-parser.c: (totem_pl_parser_parse_date): use
	Camel and GLib for date parsing (Closes: #379932)
#4
> http://podcast.dr.dk/p3/rssfeed/desortespejdere.xml
> is broken, the dates are in completely the wrong format.

You know, this very bug report wasn't much about a bug in the date parser, but merely a request that unparsable dates should default to current date rather than 1970.
(In reply to comment #5)
> #4
> http://podcast.dr.dk/p3/rssfeed/desortespejdere.xml
> is broken, the dates are in completely the wrong format.
>
> You know, this very bug report wasn't much about a bug in the date parser, but
> merely a request that unparsable dates should default to current date rather
> than 1970.

Right. I'll get that fixed in Rhythmbox directly then. Filed as bug 503273.
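For reference, the behaviour requested in this report could look roughly like the sketch below. It is only an illustration, not the fix that went into Rhythmbox: podcast_item_timestamp() is a hypothetical name, and it assumes the date parser signals failure by returning 0 (which is what ends up displayed as 1970-01-01).

```c
#include <time.h>

/* Hypothetical helper, not Rhythmbox code.
 * parsed: result of the date parser, assumed to be 0 (or negative) on failure.
 * now:    the time the feed is being imported, e.g. time(NULL).
 * Returns the timestamp to store for the podcast item. */
time_t podcast_item_timestamp(time_t parsed, time_t now)
{
    if (parsed <= 0)
        return now;     /* unparsable date: fall back to the current time */
    return parsed;      /* valid date: keep what the feed said */
}
```

A caller would pass time(NULL) as the second argument, so items with unparsable dates sort next to other newly imported items rather than at 1970.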
Mass-move from totem to totem-pl-parser. You can remove all messages by searching for this comment.