GNOME Bugzilla – Bug 756026
gtimezone: Add sanity checks on tzdata file format
Last modified: 2018-05-24 18:15:46 UTC
Patch attached to add a few g_return_if_fail()s after the offset calculations in the tzdata loading function, to bail if the tzdata file produces invalid offsets. It does not do further validation on the data (for example, it does not check any of the transitions). The idea is that this will prevent GLib roaming off into odd parts of memory if a corrupt tzdata file is loaded, and will cause it to abort gracefully instead.
Created attachment 312613 [details] [review] gtimezone: Add sanity checks on tzdata file format If the tzdata file is somehow corrupt, it could cause dereferences of arbitrary memory and loops with arbitrary termination conditions. Avoid that with a couple of sanity checks on the data. Coverity CID: 1325338
Review of attachment 312613 [details] [review]: It might be easier to sanity check and maintain this if the bounds check is immediately after the real pointer computation, but before the dereference. Something like: diff --git a/glib/gtimezone.c b/glib/gtimezone.c index b61529f..6b4ab02 100644 --- a/glib/gtimezone.c +++ b/glib/gtimezone.c @@ -467,9 +467,13 @@ init_zone_from_iana_info (GTimeZone *gtz, GBytes *zoneinfo) type_count = guint32_from_be(header->tzh_typecnt); tz_transitions = ((guint8 *) (header) + sizeof (*header)); + g_return_if_fail (tz_transitions - header < size, FALSE); tz_type_index = tz_transitions + timesize * time_count; + g_return_if_fail (tz_type_index - header < size, FALSE); tz_ttinfo = tz_type_index + time_count; + g_return_if_fail (tz_ttinfo - header < size, FALSE); tz_abbrs = tz_ttinfo + sizeof (struct ttinfo) * type_count; + g_return_if_fail (tz_abbrs - header < size, FALSE); gtz->t_info = g_array_sized_new (FALSE, TRUE, sizeof (TransitionInfo), type_count); ?
(In reply to Colin Walters from comment #2) > Review of attachment 312613 [details] [review] [review]: > > It might be easier to sanity check and maintain this if the bounds check is > immediately after the real pointer computation, but before the dereference. > Something like: Either works for me. Generally I think that more checks is worse because that means more possibilities for bugs, but I agree that having them closer to the individual calculations is good. Up to you.
"tz file is corrupt" isn't a programmer error, so we shouldn't use g_return_if_fail (which can potentially get compiled out). Though it looks like g_time_zone_new() can't actually return an error (it just returns UTC on error), so we might want a g_warning() in addition to bailing out early.
Created attachment 312689 [details] [review] 0001-gtimezone-Add-sanity-checks-on-tzdata-file-format.patch How about this? (compile tested only)
Review of attachment 312689 [details] [review]: One check I missed out of my patch (and hence which is missing here): the calculation of t_info.abbrev in the first loop does `&tz_abbrs[info.tt_abbrind]`. `tz_abbrs` is unbounded at the top (there are no fields after it, unlike `tz_ttinfo` which is bounded on top by `tz_abbrs`); and `info.tt_abbrind` is untrusted. ::: glib/gtimezone.c @@ +443,3 @@ + ptrdiff_t offset) +{ + gconstpointer ret = ((const guint8*)base) + offset; Couldn’t this potentially overflow? @@ +447,3 @@ + if (G_UNLIKELY (computed_offset >= size)) + g_error ("Offset %" G_GUINT64_FORMAT " is outside timezone data size %" G_GUINT64_FORMAT "; corrupted timezone database?", + computed_offset, size); Could we include the tzinfo filename in the error message to make things a little easier to investigate? @@ +482,3 @@ + tz_transitions = offset_boundcheck (header, size, header, sizeof (*header)); + tz_type_index = offset_boundcheck (header, size, tz_transitions, timesize * time_count); On 32-bit systems, I think `timesize * time_count` could potentially overflow (if `time_count` was sufficiently corrupt/big (for example, 0xffffffff). @@ +484,3 @@ + tz_type_index = offset_boundcheck (header, size, tz_transitions, timesize * time_count); + tz_ttinfo = offset_boundcheck (header, size, tz_type_index, time_count); + tz_abbrs = offset_boundcheck (header, size, tz_ttinfo, sizeof (struct ttinfo) * type_count); Same here.
I found https://www.securecoding.cert.org/.../3524/pointer+arithmetic+v2.pdf which says that because ptrdiff_t is signed, we're entering UB if it overflows. We'd need to use ptrdiff_t or so. I'm sure there's some prior art here to look at.
Created attachment 313671 [details] [review] gtimezone: Add sanity checks on tzdata file format If the tzdata file is somehow corrupt, it could cause dereferences of arbitrary memory and loops with arbitrary termination conditions. Avoid that with a couple of sanity checks on the data. Coverity CID: 1325338 --- Another attempt, which adds a warning message, removes some g_asserts(), falls back to loading UTC data, and includes the filename in the error message. The only thing I'd still like to do with this would be to add some unit tests, but that's a bit tricky when it's (almost) hard-coded to use files in /etc. I could either expose a public g_time_zone_new_from_file() method, or keep it internal in a way similar to gvariant-internal.h. Preferences? I'm leaning towards the latter.
Review of attachment 313671 [details] [review]: I think I'm going to try my hand at g_bytes_get_pointer(). ::: glib/gtimezone.c @@ +456,3 @@ + gsize bytes_size; + + bytes_base = g_bytes_get_data (bytes, &bytes_size); I'm concerned about performance here. Why not just pass in the base/offset directly instead of fetching it out of the GBytes over and over and over? @@ +458,3 @@ + bytes_base = g_bytes_get_data (bytes, &bytes_size); + + if ((guintptr) bytes_base > G_MAXSIZE - bytes_size) This seems to be a sanity check on the GBytes itself, independent of the passed-in parameters. No sense in repeatedly checking this, either -- I think it is safe to assume that the kernel will not return us a mmap() region that wraps around in memory. @@ +460,3 @@ + if ((guintptr) bytes_base > G_MAXSIZE - bytes_size) + return FALSE; + if (base < bytes_base || base - bytes_base >= bytes_size) This should be a > to allow for zero-sized regions, in my opinion. @@ +462,3 @@ + if (base < bytes_base || base - bytes_base >= bytes_size) + return FALSE; + if (n_elements == 0 || element_size > G_MAXSIZE / n_elements) And indeed, we should allow n_elements == 0 specifically to deal with this case "zero-sized region" case. imho it makes more sense to disallow element_size == 0 and I would do that via an assertion. Put another way: it makes sense (and is valid) to have an empty array of integers, but it doesn't make sense to have a non-empty array of nothingness. @@ +467,3 @@ + return FALSE; + + return TRUE; This code seems generally useful. If we're going to do this then why don't we add something like it on GBytes itself: gconstpointer g_bytes_get_pointer (GBytes *bytes, gsize offset, gsize element_size, gsize n_elements); and use that from here? Of course, this flies in the face of my "performance" concerns above, but now we'd have some properly useful reusable code, which is a good reason for making this tradeoff. This would check (with proper overflow handling) that offset+(n_elements*element_size) is within the range of the bytes and return the pointer, or NULL in case of failure. Note: in case of a zero-sized region, we would still want to return a non-NULL pointer so that NULL always means "out of bounds".
Okay. See bug 756903 and bug 756906 about that.
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/1088.