After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 753459 - GDateTime: Add conversion functions from/to ISO 8601 strings
GDateTime: Add conversion functions from/to ISO 8601 strings
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: general
unspecified
Other Linux
: Normal enhancement
: ---
Assigned To: gtkdev
gtkdev
: 662060 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2015-08-10 12:41 UTC by Sebastian Dröge (slomo)
Modified: 2017-09-11 19:53 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
GTimeZone: Support the Unicode minus character (2.72 KB, patch)
2016-08-25 00:00 UTC, Robert Ancell
none Details | Review
GDateTime: Support parsing ISO 8601 strings. (31.64 KB, patch)
2016-08-25 00:01 UTC, Robert Ancell
none Details | Review
GDateTime: Support parsing ISO 8601 strings. (17.99 KB, patch)
2016-08-29 03:38 UTC, Robert Ancell
none Details | Review
GDateTime: Support parsing ISO 8601 strings. (20.96 KB, patch)
2017-06-24 09:16 UTC, Robert Ancell
none Details | Review
GDateTime: Support parsing ISO 8601 strings. (22.16 KB, patch)
2017-07-05 04:53 UTC, Robert Ancell
none Details | Review
GDateTime: Reject days outside of month limits (3.69 KB, patch)
2017-08-03 23:03 UTC, Robert Ancell
committed Details | Review
GDateTime: Support parsing ISO 8601 strings. (25.31 KB, patch)
2017-08-03 23:03 UTC, Robert Ancell
none Details | Review
GDateTime: Support parsing ISO 8601 strings. (25.32 KB, patch)
2017-08-03 23:09 UTC, Robert Ancell
none Details | Review
GDateTime: Support parsing ISO 8601 strings. (3.69 KB, patch)
2017-08-15 23:55 UTC, Robert Ancell
none Details | Review
GDateTime: Support parsing ISO 8601 strings. (25.42 KB, patch)
2017-08-21 00:01 UTC, Robert Ancell
committed Details | Review

Description Sebastian Dröge (slomo) 2015-08-10 12:41:20 UTC
It could currently be implemented with g_time_val_from_iso8601() and g_date_time_new_from_timeval_utc(), but that's not very observable from the docs. Same g_date_time_*() API would be good to have.
Comment 1 Matthias Clasen 2015-08-10 15:02:37 UTC
Whats wrong with g_date_time_format (dt, "%F") ?

iso 8601 is even mentioned in the documentation of g_date_time_format.
I don't think adding more api is really required here.
Comment 2 Sebastian Dröge (slomo) 2015-08-10 15:12:38 UTC
That would be in one direction, but what about creating a GDateTime from a ISO 8601 string?

I didn't think of the %F format, good point :)
Comment 3 Robert Ancell 2016-08-25 00:00:57 UTC
Created attachment 334109 [details] [review]
GTimeZone: Support the Unicode minus character
Comment 4 Robert Ancell 2016-08-25 00:01:49 UTC
Created attachment 334110 [details] [review]
GDateTime: Support parsing ISO 8601 strings.
Comment 5 Robert Ancell 2016-08-25 00:02:20 UTC
Both patches are in the wip/rancell/iso8601 branch.
Comment 6 Robert Ancell 2016-08-25 00:21:34 UTC
This may help resolving bug 651168, bug 662060 and bug 760983
Comment 7 Allison Karlitskaya (desrt) 2016-08-25 16:15:58 UTC
I'm not super-impressed about the prospect of creating a partial GDateTime and using GDateTimeParseFlags to indicate which parts of it are actually usable.

If we get this API, I want a fully-baked date/time string, or an error.

The "have some extra data" approach that you have (with specifying the date) might be a better direction to go here.  I can imagine, at the very least, it would be useful for the timezone specification...

As for ISO dates (time missing), a long-desired feature has been a GDay (or overloaded GDate) API with the same look and feel as GDateTime, but without the time (or timezone).

Let's chat on IRC.
Comment 8 Robert Ancell 2016-08-26 04:55:47 UTC
(didn't manage to catch you on IRC today)

This parser only generates valid GDateTime objects - the flags are only there to give you an idea of the accuracy of the date-time. The cases where you need to provide more information requires you to use g_date_time_new_from_iso8601_with_date().

For example you can have ISO 8601 strings like:

2016-08

or

2016-08-26T16:51

or

2016-08-26T16:51:03

or

2016-08-26T16:51.03.14159

they're all acceptable and only differ in accuracy. If you need to know this accuracy then you can check the flags. I'd expect most programs would not need to as the accuracy is probably known in advance (are these from a log with second or subsecond accuracy?)

I was thinking perhaps the functions should be:

g_date_time_new_from_iso8601 (text)

and

g_date_time_new_from_iso8601_full (text, year, month, day, flags);

i.e. the former is the common case and the latter the complex one for the corner cases.
Comment 9 Allison Karlitskaya (desrt) 2016-08-29 01:06:35 UTC
Review of attachment 334109 [details] [review]:

No.  Unicode minus is a not a reasonable feature of ISO8601 (which I am increasingly regarding as a completely unreasonable specification to implement fully).
Comment 10 Allison Karlitskaya (desrt) 2016-08-29 01:10:37 UTC
Review of attachment 334110 [details] [review]:

I think we should forget handling ISO8601 in its general case.

I think what would be more useful is to parse RFC3339 strings.

We could allow a couple of exceptions there to make the format a bit friendlier (since it is not always super-readable).  Specifically, relating to timezone, we could allow it to be specified with whitespace before it, or excluding the ":" character.

The biggest point here, however: whatever we accept should always be a fully specified *date* and *time*.  We should never fill in fields with 0, and thus we will never need a flags enum to report which fields we have.  The only thing that we should consider allowing to be blank is (as per RFC) the microseconds.
Comment 11 Robert Ancell 2016-08-29 03:38:03 UTC
Created attachment 334326 [details] [review]
GDateTime: Support parsing ISO 8601 strings.

Simplified parser as per review
Comment 12 Allison Karlitskaya (desrt) 2016-08-29 12:53:08 UTC
Review of attachment 334326 [details] [review]:

This looks great.  Thanks very much.  Some work is still needed, though.

I am concerned about the name of the function, since we are no longer parsing all of 8601.  Maybe we should name it for rfc3339?  People might be reasonably annoyed if a function claiming to support 8601 doesn't, but they can't be too upset if a function claiming to support 3339 does more than that.

On the other hand, we really do have a reasonable subset of 8601 behaviour here, and have gone far beyond 3339.  8601 is also more widely known.  Do you have any thoughts on this?

A better commit message (ie: anything at all) would be nice.

The addition to glib-sections.txt is missing.

::: glib/gdatetime.c
@@ +964,3 @@
+  gint days, week_offset;
+
+    {

Week 53 is (sometimes) valid.  2015 had 53 ISO8601 weeks.

@@ +973,3 @@
+    week_offset += 7;
+
+      v += (c - '0') * multiplier;

Wow... it *seems* like this will work correctly, but I'd really appreciate some testcases for year-before and next-year week numbering days...

Good examples: 2015-W01-1 is actually 2014-12-28, and 2015-W53-7 is 2016-01-03.

@@ +981,3 @@
+                    gint *year, gint *month, gint *day, gint *offset)
+{
+  *value = v;

On first brush, I find this "try all the cases" approach to parsing the date to be very strange, but due to the relatively few cases, it actually seems to be quite reasonable, and likely to be extremely fast.

@@ +1084,3 @@
+
+  /* Check for timezone suffix */
+             get_iso8601_int (text + 8, 2, day);

Ouch.  Isn't there some better way to do this?  In particular, you're looping over the full range of the length of the string, calling into a function that will check that length against a bunch of acceptable lengths (which will be false the first ~dozen times you call it) before returning immediately.

A couple of possible other approaches:
 - directly check from the end (ie: "Z" at offset -1?  '+' or '-' at offset -2, -4, or -5?)
 - scan for 'Z', '+' or '-', none of which can appear in times

also: although it's not part of 8601, maybe we consider allowing a space to appear before the timezone.

@@ +1120,3 @@
+ * <date>T<time> or <date>t<time> or <date> <time>
+ *
+             get_iso8601_int (text + 9, 1, &week_day) &&

After running through gtk-doc, this will end up as a single wall of text paragraph with no formatting (or even line breaks).  Can you please add some markdown, as appropriate?

@@ +1175,3 @@
+
+  if (tz == NULL)
+             convert_from_iso8601_ordinal (*year, ordinal_day, month, day);

I think it's definitely wrong to default to the local timezone in case the timezone is not given in the string.  That makes this function non-deterministic in a non-obvious way.

The string should either be forced to contain a timezone, or the user should have to explicitly pass one in, as a default.

@@ +1178,3 @@
+  datetime = g_date_time_new (tz, year, month, day, hour, minute, seconds);
+  if (datetime != NULL && offset != 0)
+  else

Although I understand why you do it, this offset business is pretty messy.  Faking like it's Jan 1 when you create the datetime and then tweaking it after is a bit evil.

Why not just have the date portion produce an ordinal day, and do the same for time, then construct the GDateTime for yourself using g_date_time_alloc()?  The internal helper ymd_to_days() can help you here.

Upon closer reading of that helper, however, it has become clear that we apparently think that February 30 is a valid day.... and it looks like your code will accept that as well :(  This also impacts all of the other public APIs, like g_date_time_new().  Some extra error checking there could be nice... :/

::: glib/tests/gdatetime.c
@@ +382,2 @@
 static void
+test_GDateTime_new_from_iso8601 (void)

Thanks for the tests.
Comment 13 Philip Withnall 2017-05-02 22:40:23 UTC
*** Bug 662060 has been marked as a duplicate of this bug. ***
Comment 14 Robert Ancell 2017-06-24 09:16:17 UTC
Created attachment 354350 [details] [review]
GDateTime: Support parsing ISO 8601 strings.

Updated patch (brought to use by airport lounges and long distance travel).

Remaining issues:
- Commit message - I'm not sure what sort of thing you want me to add here.
- GTK-Doc formatting not updated.
- Didn't change it to allow whitespace before timezone.

Possible enhancements:
- Since we actually parse the timezone information ourself this could be simpler if we create a g_time_zone_new_offset(hours, minutes). Does that seem useful?
- I refactored and made a private g_date_time_new_week() and g_date_time_new_week(). These could be useful to make public functions.
Comment 15 Robert Ancell 2017-06-24 09:20:08 UTC
I think we should probably just call the function g_date_time_from_string(). Yeah, I agree putting the spec name in it is confusing since it doesn't strictly conform to either specs. We'll have to rely on people reading the docs to determine exactly what strings we support.

There might be some risk of bugs stating "it doesn't parse this random string", but we should probably assert it only supports standardised strings. If other formats are required (unlikely?) then make a new function for them.
Comment 16 Philip Withnall 2017-07-04 10:25:03 UTC
Review of attachment 354350 [details] [review]:

Thanks for this. Allison’s on an extended holiday, so I’ll do a bit of review.

I haven’t re-checked the comments from Allison’s most recent review, and I haven’t re-read the ISO 8601 spec in detail, so I think this will need another round of review.

::: glib/gdatetime.c
@@ +905,3 @@
 
+static gboolean
+get_iso8601_int (const gchar *text, gint length, gint *value)

length should be gsize, and there should be a
g_return_val_if_fail (length > 9, FALSE)
to protect *value from overflow.

@@ +911,3 @@
+  for (i = 0; i < length; i++)
+    {
+      gchar c = text[i];

const

@@ +922,3 @@
+
+static gboolean
+get_iso8601_seconds (const gchar *text, gint length, gdouble *value)

`length` should be gsize.

Should there be a limit imposed on `length` so *value doesn’t overflow?

@@ +940,3 @@
+    }
+
+  for (; i < length; i++)

This function will accept `123.` (note the trailing dot) as valid input. Is that OK?

@@ +954,3 @@
+
+static GDateTime *
+g_date_time_new_ordinal (GTimeZone *tz, gint year, gint ordinal_day, gint hour, gint minute, gint seconds)

`gdouble seconds`?

@@ +968,3 @@
+
+static GDateTime *
+g_date_time_new_week (GTimeZone *tz, gint year, gint week, gint week_day, gint hour, gint minute, gint seconds)

`gdouble seconds`?

@@ +998,3 @@
+
+static GDateTime *
+parse_iso8601_date (const gchar *text, gint length,

`gsize length`

@@ +999,3 @@
+static GDateTime *
+parse_iso8601_date (const gchar *text, gint length,
+                    gint hour, gint minute, gint seconds, GTimeZone *tz)

`gdouble seconds`?

@@ +1064,3 @@
+
+static GTimeZone *
+parse_iso8601_timezone (const gchar *text, gint length, gint *tz_offset)

`gsize length`, `gsize *tz_offset`


(I realise that in reality all these lengths and offsets are bounded as very small numbers, but I think the use of gsize instead of gint for lengths and offsets helps distinguish them from the parsed time values, which should remain as gints.)

@@ +1088,3 @@
+      if (!get_iso8601_int (text + i + 1, 2, &offset_hours) ||
+          !get_iso8601_int (text + i + 4, 2, &offset_minutes))
+          return NULL;

Nitpick: Indentation’s off for this return statement (and the two in the other cases below).

@@ +1102,3 @@
+      if (!get_iso8601_int (text + i + 1, 2, &offset_hours))
+          return NULL;
+      offset_minutes = 0;

Instead of discarding the values of `offset_hours` and `offset_minutes` when returning a timezone from this function, you could check them against g_time_zone_get_offset() and bail (g_return_val_if_fail()) if they don’t match.

Or factor the parsing code for these three formats out of g_time_zone_new() so that the work isn’t done twice (once here, once in g_time_zone_new()).

@@ +1112,3 @@
+
+static gboolean
+parse_iso8601_time (const gchar *text, gint length,

`gsize length`

@@ +1143,3 @@
+ * g_date_time_new_from_iso8601:
+ * @text: an ISO 8601 formatted time string.
+ * @default_tz: (nullable): a #GTimeZone to use if the text doesn't contain a timezone or %NULL.

Nitpick: Put a comma before ‘or %NULL’.

@@ +1145,3 @@
+ * @default_tz: (nullable): a #GTimeZone to use if the text doesn't contain a timezone or %NULL.
+ *
+ * Creates a #GDateTime corresponding to the given ISO 8601 formatted string

Would be good to link to the Wikipedia page on ISO 8601 (since I don’t think there’s a freely available version of the actual standard?):

[ISO 8601 formatted string](https://en.wikipedia.org/wiki/ISO_8601)

@@ +1146,3 @@
+ *
+ * Creates a #GDateTime corresponding to the given ISO 8601 formatted string
+ * @text. Only the following subset of ISO8601 is supported:

s/ISO8601/ISO 8601/

@@ +1166,3 @@
+ * Z                - UTC.
+ * +hh:mm or -hh:mm - Offset from UTC in hours and minutes, e.g. +12:00.
+ * +hh or -hh       - Offset from UTC in hours, e.g. +12.

Not sure how the formatting of the above will come out in the formatted gtk-doc output. Might need to be formatted as lists and with `monospace` for the literal examples.

@@ +1174,3 @@
+ * when you are done with it.
+ *
+ * Returns: a new #GDateTime, or %NULL

(transfer full) (nullable)

@@ +1176,3 @@
+ * Returns: a new #GDateTime, or %NULL
+ *
+ * Since: 2.50

2.54

@@ +1177,3 @@
+ *
+ * Since: 2.50
+ **/

Nitpick: gtk-doc comments are *supposed* to be terminated with a `*/` rather than `**/` (which is an old style). Not that it really matters.

@@ +1181,3 @@
+g_date_time_new_from_iso8601 (const gchar *text, GTimeZone *default_tz)
+{
+  gint length, date_length = -1;

gsize

@@ +1189,3 @@
+  g_return_val_if_fail (text != NULL, NULL);
+
+  /* Date and time is separated by 'T', 't', or ' '*/

Nitpick: Missing space before the `*/`

@@ +1193,3 @@
+    {
+      if (date_length < 0 && (text[length] == 'T' || text[length] == 't' || text[length] == ' '))
+        date_length = length;

Do you want to break after setting the date_length for the first time? An input string containing multiple `T`, `t` or ` ` characters is invalid, isn’t it?

@@ +1205,3 @@
+    return NULL;
+
+  datetime = parse_iso8601_date (text, date_length, hour, minute, seconds, tz ? tz : default_tz);

Passing `gdouble seconds` here to a `gint seconds` parameter will truncate it. Is that what you want? Do we want to round instead? Does a test cover this behaviour?

::: glib/gdatetime.h
@@ +119,3 @@
 GDateTime *             g_date_time_new_from_timeval_utc                (const GTimeVal *tv);
 
+GLIB_AVAILABLE_IN_2_50

GLIB_AVAILABLE_IN_2_54 now.

::: glib/tests/gdatetime.c
@@ +27,2 @@
 #define ASSERT_DATE(dt,y,m,d) G_STMT_START { \
+  g_assert ((dt)); \

Nitpick: You can use g_assert_nonnull() to make this (and the one below) a little clearer (improved error message, etc.).

@@ +480,2 @@
 static void
+test_GDateTime_new_from_iso8601 (void)

This is a decent set of tests, but it’s hard to work out if it’s complete. Is there a canonical ISO 8601 test suite / conformance suite somewhere? Some quick googling didn’t find one, but you might know of one.

@@ +486,3 @@
+  /* Need non-empty string */
+  dt = g_date_time_new_from_iso8601 ("", NULL);
+  g_assert (dt == NULL);

Nitpick: You can use g_assert_null() to make this a little clearer (improved error message, etc.).
Comment 17 Robert Ancell 2017-07-05 04:53:27 UTC
Created attachment 354906 [details] [review]
GDateTime: Support parsing ISO 8601 strings.

Updated as per Philip's review:
- gsize used over gint
- more safety checks to avoid overflows
- indentation fixes
- docstring fixes
- test fixes

Specific responses to review:

- I looked at factoring out the GTimeZone code but it seemed to require a string internally anyway so I just added the check that it matched what I expected.

- There was no break once 'T' is hit because the code was also counting the length of the string. I've commented this so it's more clear.

- I also didn't find a comprehensive set of tests, this has just been me trying to get as many corner cases as I thought of. I'm sure there could be potentially more.
Comment 18 Philip Withnall 2017-07-07 11:11:28 UTC
Review of attachment 354906 [details] [review]:

Thanks for the updates. A commit message would be useful, describing what’s been added, what it does and does not support, and the motivation behind it.

(In reply to Allison Lortie (desrt) (extended vacation) from comment #12)
> Review of attachment 334326 [details] [review] [review]:
> 
> Upon closer reading of that helper, however, it has become clear that we
> apparently think that February 30 is a valid day.... and it looks like your
> code will accept that as well :(  This also impacts all of the other public
> APIs, like g_date_time_new().  Some extra error checking there could be
> nice... :/

Allison pointed this out, and I can’t immediately see a test case which covers February 30th. Would be good to have one (in each applicable date format).

::: glib/gdatetime.c
@@ +1125,3 @@
+    {
+      g_time_zone_unref (tz);
+      return NULL;

I was thinking this should cause an assertion failure if the check fails. Do you think there are situations where this could fail but not be a programming error?

@@ +1174,3 @@
+ * <date> is in the form:
+ *
+ * - YYYY-MM-DD - Year/month/day, e.g. 2016-08-24.

I think it would be clearer if you added Markdown backticks to format the examples as monospace. For example:
 * - `YYYY-MM-DD` — Year/Month/Day, for example 2016-08-24.
 * - `YYYYMMDD` — Same as above without dividers.

@@ +1202,3 @@
+ * when you are done with it.
+ *
+ * Returns: (transfer full): (nullable): a new #GDateTime, or %NULL

Drop the colon between `(transfer full)` and `(nullable)`; that’s not how annotations are formatted (https://wiki.gnome.org/Projects/GObjectIntrospection/Annotations).

@@ +1221,3 @@
+    {
+      if (date_length < 0 && (text[length] == 'T' || text[length] == 't' || text[length] == ' '))
+        date_length = length;

Ah, I see that this isn’t going to set date_length multiple times (due to the (date_length < 0) check), but it will still iterate over the rest of the string after setting it, which seems a bit pointless. You could move the (date_length < 0) check to the `for` loop condition to fix this:
    for (length = 0; text[length] != '\0' && date_length < 0; length++)
Comment 19 Robert Ancell 2017-08-03 05:00:19 UTC
> ::: glib/gdatetime.c
> @@ +1125,3 @@
> +    {
> +      g_time_zone_unref (tz);
> +      return NULL;
> 
> I was thinking this should cause an assertion failure if the check fails. Do
> you think there are situations where this could fail but not be a
> programming error?

Since this is parsing external data that is not under the control of the developer if the timezone is wrong then we shouldn't assert failure.
 
> @@ +1221,3 @@
> +    {
> +      if (date_length < 0 && (text[length] == 'T' || text[length] == 't' ||
> text[length] == ' '))
> +        date_length = length;
> 
> Ah, I see that this isn’t going to set date_length multiple times (due to
> the (date_length < 0) check), but it will still iterate over the rest of the
> string after setting it, which seems a bit pointless. You could move the
> (date_length < 0) check to the `for` loop condition to fix this:
>     for (length = 0; text[length] != '\0' && date_length < 0; length++)

The important point is we use 'length' later in the function. This is essentially just avoiding running strlen(). It's probably a pretty minor optimisation so I'm happy to just iterate over the string twice if it's still too confusing.
Comment 20 Robert Ancell 2017-08-03 23:03:13 UTC
Created attachment 356901 [details] [review]
GDateTime: Reject days outside of month limits
Comment 21 Robert Ancell 2017-08-03 23:03:39 UTC
Created attachment 356902 [details] [review]
GDateTime: Support parsing ISO 8601 strings.
Comment 22 Robert Ancell 2017-08-03 23:04:25 UTC
Patch updates courtesy of Dubai and Auckland international airports.
Comment 23 Robert Ancell 2017-08-03 23:09:33 UTC
Created attachment 356903 [details] [review]
GDateTime: Support parsing ISO 8601 strings.

Fix commit message.
Comment 24 Philip Withnall 2017-08-14 11:46:21 UTC
Review of attachment 356901 [details] [review]:

Excellent.
Comment 25 Philip Withnall 2017-08-14 11:54:37 UTC
(In reply to Robert Ancell from comment #19)
> > ::: glib/gdatetime.c
> > @@ +1125,3 @@
> > +    {
> > +      g_time_zone_unref (tz);
> > +      return NULL;
> > 
> > I was thinking this should cause an assertion failure if the check fails. Do
> > you think there are situations where this could fail but not be a
> > programming error?
> 
> Since this is parsing external data that is not under the control of the
> developer if the timezone is wrong then we shouldn't assert failure.

Except the code here is checking that the result g_time_zone_new() comes up with when parsing `text` is the same as the result your code has come up with by parsing `text` a different way. So as far as I can see, any discrepancy between those two offsets indicates a programming error in g_time_zone_new() or in your code. No?

> > @@ +1221,3 @@
> > +    {
> > +      if (date_length < 0 && (text[length] == 'T' || text[length] == 't' ||
> > text[length] == ' '))
> > +        date_length = length;
> > 
> > Ah, I see that this isn’t going to set date_length multiple times (due to
> > the (date_length < 0) check), but it will still iterate over the rest of the
> > string after setting it, which seems a bit pointless. You could move the
> > (date_length < 0) check to the `for` loop condition to fix this:
> >     for (length = 0; text[length] != '\0' && date_length < 0; length++)
> 
> The important point is we use 'length' later in the function. This is
> essentially just avoiding running strlen(). It's probably a pretty minor
> optimisation so I'm happy to just iterate over the string twice if it's
> still too confusing.

Got it. Looks good to me.
Comment 26 Philip Withnall 2017-08-14 11:56:05 UTC
Review of attachment 356903 [details] [review]:

::: glib/gdatetime.c
@@ +1122,3 @@
+  *tz_offset = i;
+  tz = g_time_zone_new (text + i);
+  if (g_time_zone_get_offset (tz, 0) != offset_sign * (offset_hours * 3600 + offset_minutes * 60))

Marking as needs-work pending resolution of whether this should be an assertion or as it is currently.
Comment 27 Robert Ancell 2017-08-15 00:58:04 UTC
(In reply to Philip Withnall from comment #25)
> (In reply to Robert Ancell from comment #19)
> > > ::: glib/gdatetime.c
> > > @@ +1125,3 @@
> > > +    {
> > > +      g_time_zone_unref (tz);
> > > +      return NULL;
> > > 
> > > I was thinking this should cause an assertion failure if the check fails. Do
> > > you think there are situations where this could fail but not be a
> > > programming error?
> > 
> > Since this is parsing external data that is not under the control of the
> > developer if the timezone is wrong then we shouldn't assert failure.
> 
> Except the code here is checking that the result g_time_zone_new() comes up
> with when parsing `text` is the same as the result your code has come up
> with by parsing `text` a different way. So as far as I can see, any
> discrepancy between those two offsets indicates a programming error in
> g_time_zone_new() or in your code. No?
>

Correct. I think the important point is it doesn't indicate an error in the input data, so it shouldn't be considered an error by the glib consumer.

If a program feeds in a date string (which it probably got from some external data) and their program crashes, I don't think they'd see that as reasonable as they probably didn't enter the date themselves. Returning NULL would be unexpected, but should follow their normal "bad data inputted" path.

I think we should be able to assume that either a code bug in the gdatetime or the gtimezone code is a bad bug that should be fixed asap, so if we ultimately do use an assertion I think that's probably fine given this is likely to be a low risk event.
Comment 28 Philip Withnall 2017-08-15 08:39:23 UTC
(In reply to Robert Ancell from comment #27)
> (In reply to Philip Withnall from comment #25)
> > (In reply to Robert Ancell from comment #19)
> > > > ::: glib/gdatetime.c
> > > > @@ +1125,3 @@
> > > > +    {
> > > > +      g_time_zone_unref (tz);
> > > > +      return NULL;
> > > > 
> > > > I was thinking this should cause an assertion failure if the check fails. Do
> > > > you think there are situations where this could fail but not be a
> > > > programming error?
> > > 
> > > Since this is parsing external data that is not under the control of the
> > > developer if the timezone is wrong then we shouldn't assert failure.
> > 
> > Except the code here is checking that the result g_time_zone_new() comes up
> > with when parsing `text` is the same as the result your code has come up
> > with by parsing `text` a different way. So as far as I can see, any
> > discrepancy between those two offsets indicates a programming error in
> > g_time_zone_new() or in your code. No?
> >
> 
> Correct. I think the important point is it doesn't indicate an error in the
> input data, so it shouldn't be considered an error by the glib consumer.
> 
> If a program feeds in a date string (which it probably got from some
> external data) and their program crashes, I don't think they'd see that as
> reasonable as they probably didn't enter the date themselves. Returning NULL
> would be unexpected, but should follow their normal "bad data inputted" path.
> 
> I think we should be able to assume that either a code bug in the gdatetime
> or the gtimezone code is a bad bug that should be fixed asap, so if we
> ultimately do use an assertion I think that's probably fine given this is
> likely to be a low risk event.

Exactly. Please change it to an assertion. If there’s a bug in either of those code paths, we want to find it. If we return NULL as in a normal error in that case, such bugs will almost certainly never be found, and might be producing incorrect results in other places without being detected.
Comment 29 Robert Ancell 2017-08-15 23:55:22 UTC
Created attachment 357679 [details] [review]
GDateTime: Support parsing ISO 8601 strings.

Changed to use an assertion for the timezone validation.
Comment 30 Philip Withnall 2017-08-16 09:13:10 UTC
(In reply to Robert Ancell from comment #29)
> Created attachment 357679 [details] [review] [review]
> GDateTime: Support parsing ISO 8601 strings.
> 
> Changed to use an assertion for the timezone validation.

Seems like the wrong patch was uploaded. This one is actually a copy of the one to fix the month length limits!
Comment 31 Philip Withnall 2017-08-16 09:48:51 UTC
Before I forget: this has missed the API freeze for 2.54, so will have to go into master once we’ve branched. Sorry about that.
Comment 32 Robert Ancell 2017-08-21 00:01:05 UTC
Created attachment 358037 [details] [review]
GDateTime: Support parsing ISO 8601 strings.

Re-uploaded patch correctly, sorry!
Comment 33 Philip Withnall 2017-08-21 09:41:21 UTC
Review of attachment 358037 [details] [review]:

Cool, this looks good to go in once we’ve branched. Thanks for all your work on this!
Comment 34 Philip Withnall 2017-09-11 18:38:00 UTC
Now that GLib 2.54 has been released, we’ve branched, and I’ve updated and pushed this patch. I changed the symbol versioning, and dropped a trailing full-stop from the commit subject. Thanks for all your work on this!
Comment 35 Robert Ancell 2017-09-11 19:53:18 UTC
I'd been checking every day for the branch :) Thanks for all the reviews!