After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 756026 - gtimezone: Add sanity checks on tzdata file format
gtimezone: Add sanity checks on tzdata file format
Status: RESOLVED OBSOLETE
Product: glib
Classification: Platform
Component: datetime
unspecified
Other All
: Normal normal
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks:
 
 
Reported: 2015-10-03 23:18 UTC by Philip Withnall
Modified: 2018-05-24 18:15 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
gtimezone: Add sanity checks on tzdata file format (2.28 KB, patch)
2015-10-03 23:19 UTC, Philip Withnall
none Details | Review
0001-gtimezone-Add-sanity-checks-on-tzdata-file-format.patch (2.67 KB, patch)
2015-10-05 20:08 UTC, Colin Walters
needs-work Details | Review
gtimezone: Add sanity checks on tzdata file format (7.60 KB, patch)
2015-10-19 15:52 UTC, Philip Withnall
needs-work Details | Review

Description Philip Withnall 2015-10-03 23:18:56 UTC
Patch attached to add a few g_return_if_fail()s after the offset calculations in the tzdata loading function, to bail if the tzdata file produces invalid offsets. It does not do further validation on the data (for example, it does not check any of the transitions). The idea is that this will prevent GLib roaming off into odd parts of memory if a corrupt tzdata file is loaded, and will cause it to abort gracefully instead.
Comment 1 Philip Withnall 2015-10-03 23:19:00 UTC
Created attachment 312613 [details] [review]
gtimezone: Add sanity checks on tzdata file format

If the tzdata file is somehow corrupt, it could cause dereferences of
arbitrary memory and loops with arbitrary termination conditions. Avoid
that with a couple of sanity checks on the data.

Coverity CID: 1325338
Comment 2 Colin Walters 2015-10-05 14:58:59 UTC
Review of attachment 312613 [details] [review]:

It might be easier to sanity check and maintain this if the bounds check is immediately after the real pointer computation, but before the dereference.  Something like:

diff --git a/glib/gtimezone.c b/glib/gtimezone.c
index b61529f..6b4ab02 100644
--- a/glib/gtimezone.c
+++ b/glib/gtimezone.c
@@ -467,9 +467,13 @@ init_zone_from_iana_info (GTimeZone *gtz, GBytes *zoneinfo)
   type_count = guint32_from_be(header->tzh_typecnt);
 
   tz_transitions = ((guint8 *) (header) + sizeof (*header));
+  g_return_if_fail (tz_transitions - header < size, FALSE);
   tz_type_index = tz_transitions + timesize * time_count;
+  g_return_if_fail (tz_type_index - header < size, FALSE);
   tz_ttinfo = tz_type_index + time_count;
+  g_return_if_fail (tz_ttinfo - header < size, FALSE);
   tz_abbrs = tz_ttinfo + sizeof (struct ttinfo) * type_count;
+  g_return_if_fail (tz_abbrs - header < size, FALSE);
 
   gtz->t_info = g_array_sized_new (FALSE, TRUE, sizeof (TransitionInfo),
                                    type_count);
?
Comment 3 Philip Withnall 2015-10-05 15:50:36 UTC
(In reply to Colin Walters from comment #2)
> Review of attachment 312613 [details] [review] [review]:
> 
> It might be easier to sanity check and maintain this if the bounds check is
> immediately after the real pointer computation, but before the dereference. 
> Something like:

Either works for me. Generally I think that more checks is worse because that means more possibilities for bugs, but I agree that having them closer to the individual calculations is good.

Up to you.
Comment 4 Dan Winship 2015-10-05 15:52:33 UTC
"tz file is corrupt" isn't a programmer error, so we shouldn't use g_return_if_fail (which can potentially get compiled out). Though it looks like g_time_zone_new() can't actually return an error (it just returns UTC on error), so we might want a g_warning() in addition to bailing out early.
Comment 5 Colin Walters 2015-10-05 20:08:32 UTC
Created attachment 312689 [details] [review]
0001-gtimezone-Add-sanity-checks-on-tzdata-file-format.patch

How about this?  (compile tested only)
Comment 6 Philip Withnall 2015-10-06 07:21:32 UTC
Review of attachment 312689 [details] [review]:

One check I missed out of my patch (and hence which is missing here): the calculation of t_info.abbrev in the first loop does `&tz_abbrs[info.tt_abbrind]`. `tz_abbrs` is unbounded at the top (there are no fields after it, unlike `tz_ttinfo` which is bounded on top by `tz_abbrs`); and `info.tt_abbrind` is untrusted.

::: glib/gtimezone.c
@@ +443,3 @@
+                   ptrdiff_t            offset)
+{
+  gconstpointer ret = ((const guint8*)base) + offset;

Couldn’t this potentially overflow?

@@ +447,3 @@
+  if (G_UNLIKELY (computed_offset >= size))
+    g_error ("Offset %" G_GUINT64_FORMAT " is outside timezone data size %" G_GUINT64_FORMAT "; corrupted timezone database?",
+             computed_offset, size);

Could we include the tzinfo filename in the error message to make things a little easier to investigate?

@@ +482,3 @@
 
+  tz_transitions = offset_boundcheck (header, size, header, sizeof (*header));
+  tz_type_index = offset_boundcheck (header, size, tz_transitions, timesize * time_count);

On 32-bit systems, I think `timesize * time_count` could potentially overflow (if `time_count` was sufficiently corrupt/big (for example, 0xffffffff).

@@ +484,3 @@
+  tz_type_index = offset_boundcheck (header, size, tz_transitions, timesize * time_count);
+  tz_ttinfo = offset_boundcheck (header, size, tz_type_index, time_count);
+  tz_abbrs = offset_boundcheck (header, size, tz_ttinfo, sizeof (struct ttinfo) * type_count);

Same here.
Comment 7 Colin Walters 2015-10-06 13:22:07 UTC
I found https://www.securecoding.cert.org/.../3524/pointer+arithmetic+v2.pdf
which says that because ptrdiff_t is signed, we're entering UB if it overflows.  We'd need to use ptrdiff_t or so.

I'm sure there's some prior art here to look at.
Comment 8 Philip Withnall 2015-10-19 15:52:43 UTC
Created attachment 313671 [details] [review]
gtimezone: Add sanity checks on tzdata file format

If the tzdata file is somehow corrupt, it could cause dereferences of
arbitrary memory and loops with arbitrary termination conditions. Avoid
that with a couple of sanity checks on the data.

Coverity CID: 1325338

---

Another attempt, which adds a warning message, removes some g_asserts(), falls back to loading UTC data, and includes the filename in the error message. The only thing I'd still like to do with this would be to add some unit tests, but that's a bit tricky when it's (almost) hard-coded to use files in /etc. I could either expose a public g_time_zone_new_from_file() method, or keep it internal in a way similar to gvariant-internal.h. Preferences? I'm leaning towards the latter.
Comment 9 Allison Karlitskaya (desrt) 2015-10-21 09:13:54 UTC
Review of attachment 313671 [details] [review]:

I think I'm going to try my hand at g_bytes_get_pointer().

::: glib/gtimezone.c
@@ +456,3 @@
+  gsize bytes_size;
+
+  bytes_base = g_bytes_get_data (bytes, &bytes_size);

I'm concerned about performance here.  Why not just pass in the base/offset directly instead of fetching it out of the GBytes over and over and over?

@@ +458,3 @@
+  bytes_base = g_bytes_get_data (bytes, &bytes_size);
+
+  if ((guintptr) bytes_base > G_MAXSIZE - bytes_size)

This seems to be a sanity check on the GBytes itself, independent of the passed-in parameters.  No sense in repeatedly checking this, either -- I think it is safe to assume that the kernel will not return us a mmap() region that wraps around in memory.

@@ +460,3 @@
+  if ((guintptr) bytes_base > G_MAXSIZE - bytes_size)
+    return FALSE;
+  if (base < bytes_base || base - bytes_base >= bytes_size)

This should be a > to allow for zero-sized regions, in my opinion.

@@ +462,3 @@
+  if (base < bytes_base || base - bytes_base >= bytes_size)
+    return FALSE;
+  if (n_elements == 0 || element_size > G_MAXSIZE / n_elements)

And indeed, we should allow n_elements == 0 specifically to deal with this case "zero-sized region" case.  imho it makes more sense to disallow element_size == 0 and I would do that via an assertion.

Put another way: it makes sense (and is valid) to have an empty array of integers, but it doesn't make sense to have a non-empty array of nothingness.

@@ +467,3 @@
+    return FALSE;
+
+  return TRUE;

This code seems generally useful.  If we're going to do this then why don't we add something like it on GBytes itself:


gconstpointer  g_bytes_get_pointer (GBytes *bytes,
                                    gsize   offset,
                                    gsize   element_size,
                                    gsize   n_elements);

and use that from here?  Of course, this flies in the face of my "performance" concerns above, but now we'd have some properly useful reusable code, which is a good reason for making this tradeoff.

This would check (with proper overflow handling) that offset+(n_elements*element_size) is within the range of the bytes and return the pointer, or NULL in case of failure.

Note: in case of a zero-sized region, we would still want to return a non-NULL pointer so that NULL always means "out of bounds".
Comment 10 Allison Karlitskaya (desrt) 2015-10-21 12:16:04 UTC
Okay.  See bug 756903 and bug 756906 about that.
Comment 11 GNOME Infrastructure Team 2018-05-24 18:15:46 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/1088.