Bug 331691 - Podcast feeds don't handle timezones in <pubDate> tag
Status: RESOLVED FIXED
Product: rhythmbox
Classification: Other
Component: Podcast
Version: 0.9.3
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: RhythmBox Maintainers
QA Contact: RhythmBox Maintainers
Depends on:
Blocks:
 
 
Reported: 2006-02-18 17:42 UTC by Sebastien Bacher
Modified: 2006-02-21 10:04 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Non-working patch (644 bytes, patch)
2006-02-18 18:38 UTC, Alex Lancaster
Updated patch to get around lousy timezone handling in strptime (1.08 KB, patch)
2006-02-21 09:27 UTC, Alex Lancaster
Better patch which also handles timezone offsets (1.26 KB, patch)
2006-02-21 09:34 UTC, Alex Lancaster
Status: committed

Description Sebastien Bacher 2006-02-18 17:42:04 UTC
This bug was originally opened at https://launchpad.net/distros/ubuntu/+source/rhythmbox/+bug/29631

"Rhythmbox doesn't recognize the date in the podcasts.
It doesn't recognize the PubDate tag.
I use the last dapper cvs release 20060124.

I experience the problem with the following feed:
http://internetradio.vrt.be/podcast/StuBru/rss-41_spod.xml"
Comment 1 Alex Lancaster 2006-02-18 18:37:08 UTC
I can confirm this problem.  It uses a bogus date of 1969-12-31.  The reason is that the <pubDate> tag uses a timezone which the parser doesn't recognize.  I'm attaching a patch that *should* fix it, but doesn't quite.
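
A date of 1969-12-31 is typical of a timestamp that was never set properly. The sketch below is not taken from Rhythmbox; it only assumes that the failed parse leaves a timestamp of -1 (one second before the Unix epoch), and `format_utc_date` is a hypothetical helper. A timestamp of 0 shown in a local timezone west of UTC would give the same calendar date.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical helper: format a Unix timestamp as YYYY-MM-DD in UTC.
 * Returns the number of characters written, or 0 on failure. */
static size_t format_utc_date(time_t t, char *buf, size_t buflen)
{
    struct tm *tm = gmtime(&t);   /* UTC, so the result ignores the local TZ */
    if (tm == NULL)
        return 0;
    return strftime(buf, buflen, "%Y-%m-%d", tm);
}
```

Calling `format_utc_date((time_t)-1, buf, sizeof buf)` produces "1969-12-31", which matches the bogus date reported here.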
Comment 2 Alex Lancaster 2006-02-18 18:38:43 UTC
Created attachment 59662
Non-working patch

This patch doesn't quite work.  It has the right date format according to the manual for strptime(), but it doesn't seem to recognize the timezone "%Z" part.
Comment 3 Alex Lancaster 2006-02-18 18:59:21 UTC
Retitling to reflect underlying bug.
Comment 4 James "Doc" Livingston 2006-02-19 10:27:21 UTC
That is because "CET" isn't a timezone. Timezones are like "+1100" or "-0730".
Comment 5 Alex Lancaster 2006-02-19 10:34:36 UTC
CET is a timezone name, whereas values like +1100 are offsets from GMT. According to the strptime() documentation, offsets are handled by %z (lowercase). It does say that timezone names (%Z) aren't fully handled, but it also says that they are consumed (presumably read, but not stored in the tm structure), so it's weird that parsing fails.
Comment 6 Alex Lancaster 2006-02-19 10:35:54 UTC
From:

info strptime
    `%z'
          The offset from GMT in ISO 8601/RFC822 format.

    `%Z'
          The timezone name.

          _Note:_ Currently, this is not fully implemented.  The format
          is recognized, input is consumed but no field in TM is set.
Comment 7 James "Doc" Livingston 2006-02-19 10:49:06 UTC
From the ISO time standard:

"There exists no international standard that specifies abbreviations for civil time zones like CET, EST, etc. and sometimes the same abbreviation is even used for two very different time zones. In addition, politicians enjoy modifying the rules for civil time zones, especially for daylight saving times, every few years, so the only really reliable way of describing a local time zone is to specify numerically the difference of local time to UTC."

The CET timezone could mean different things depending on what country you are in, so isn't really a valid timezone specifier.
Comment 8 Alex Lancaster 2006-02-19 11:03:08 UTC
I understand (and the manual implies as much). I'm just saying that the description of the library function implies it expects a *name* (i.e. text) for the "%Z" option (%z is used for the GMT offset) and will parse it as free text. Later on, the info manual mentions "EDT" explicitly.

So I would expect it to consume text with no whitespace, but not necessarily *use* it (which we don't need anyway).
Comment 9 Alex Lancaster 2006-02-19 11:05:05 UTC
(In reply to comment #8)

> So I would expect it to consume text with no whitespace, but not necessarily
> *use* it (which we don't need anyway).

It says of %Z:  "The format is recognized, input is consumed".  Clearly it just doesn't work, so we need another way to parse this.

Comment 10 Alex Lancaster 2006-02-21 09:27:27 UTC
Created attachment 59829
Updated patch to get around lousy timezone handling in strptime

Updated patch to get around the handling of timezone names.  This matches all the characters up to the timezone name, then looks for a timezone, which is a non-zero-length run of capitalised alphabetical characters like CET or AEST, then checks the remainder as a year.  Not so pretty, but it works for this file, and I have checked it against made-up <pubDate> tags.

e.g.: 

Fri Feb 17 16:34:06 CET 2006

will succeed but both:

Fri Feb 17 16:00:00 1999 aaa
Fri Feb 10 18:00:00 cet 2006

will fail.
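
The approach described in this comment can be sketched roughly as follows. This is not the attached patch, only a minimal illustration assuming the date layout shown above. The function name `parse_pubdate_with_tz_name` is hypothetical, and the timezone name is validated and skipped rather than interpreted.

```c
#define _XOPEN_SOURCE 700   /* asks glibc to declare strptime() */
#include <ctype.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>

/* Harmless duplicate declaration in case the macro above is seen too late. */
char *strptime(const char *s, const char *format, struct tm *tm);

/* Hypothetical helper: parse e.g. "Fri Feb 17 16:34:06 CET 2006" into *out. */
static bool parse_pubdate_with_tz_name(const char *s, struct tm *out)
{
    struct tm tm;
    memset(&tm, 0, sizeof tm);

    /* Step 1: match everything up to, but not including, the timezone name. */
    const char *p = strptime(s, "%a %b %d %H:%M:%S", &tm);
    if (p == NULL)
        return false;
    while (*p == ' ')
        p++;

    /* Step 2: require a non-empty run of upper-case letters, then a space. */
    const char *name = p;
    while (isupper((unsigned char)*p))
        p++;
    if (p == name || *p != ' ')
        return false;

    /* Step 3: the remainder must be a year and nothing else. */
    p = strptime(p, " %Y", &tm);
    if (p == NULL || *p != '\0')
        return false;

    *out = tm;
    return true;
}
```

With this sketch the first example string is accepted, while the two strings listed as failing are rejected at step 2, because the text after the time is not an upper-case name.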
Comment 11 Alex Lancaster 2006-02-21 09:34:44 UTC
Created attachment 59831
Better patch which also handles timezone offsets

Better patch which also handles timezone offsets like "-1100" and "-0730" as per comment #4.
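
To illustrate accepting both forms, a timezone token could be classified as either a name or a numeric offset. The sketch below is an assumption about how such a check might look, not the committed code; `tz_kind` and `classify_tz_token` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

typedef enum { TZ_INVALID, TZ_NAME, TZ_OFFSET } tz_kind;

/* Hypothetical helper: classify a timezone token such as "CET" or "-0730".
 * For offsets, *offset_seconds receives the signed offset from UTC;
 * for names it is set to 0 because the name is not interpreted. */
static tz_kind classify_tz_token(const char *tok, long *offset_seconds)
{
    size_t len = strlen(tok);
    if (len == 0)
        return TZ_INVALID;

    if (tok[0] == '+' || tok[0] == '-') {
        /* Numeric offset: a sign followed by exactly four digits (HHMM). */
        if (len != 5)
            return TZ_INVALID;
        for (size_t i = 1; i < 5; i++)
            if (tok[i] < '0' || tok[i] > '9')
                return TZ_INVALID;
        long hours = (tok[1] - '0') * 10 + (tok[2] - '0');
        long minutes = (tok[3] - '0') * 10 + (tok[4] - '0');
        if (minutes > 59)
            return TZ_INVALID;
        long secs = hours * 3600L + minutes * 60L;
        *offset_seconds = (tok[0] == '-') ? -secs : secs;
        return TZ_OFFSET;
    }

    /* Timezone name: upper-case ASCII letters only. */
    for (size_t i = 0; i < len; i++)
        if (tok[i] < 'A' || tok[i] > 'Z')
            return TZ_INVALID;
    *offset_seconds = 0;
    return TZ_NAME;
}
```

Keeping the name case separate from the offset case means an abbreviation like CET is accepted without guessing what offset it stands for, which sidesteps the ambiguity raised in comment #7.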
Comment 12 James "Doc" Livingston 2006-02-21 10:04:43 UTC
Looks good, and works fine for me. Committed to cvs, thanks.