After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 709898 - Expose thumbnail validity in GFile attributes
Expose thumbnail validity in GFile attributes
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: gio
2.38.x
Other All
: Normal enhancement
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks: 709900
 
 
Reported: 2013-10-11 10:21 UTC by Philip Withnall
Modified: 2013-10-24 17:56 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
file-info: Add a G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute (9.78 KB, patch)
2013-10-11 10:25 UTC, Philip Withnall
none Details | Review
file-info: Add a G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute (15.40 KB, patch)
2013-10-18 19:01 UTC, Allison Karlitskaya (desrt)
reviewed Details | Review
fixup! file-info: Add a G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute (205.08 KB, patch)
2013-10-22 09:28 UTC, Philip Withnall
rejected Details | Review
fixup! file-info: Add a G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute (3.32 KB, patch)
2013-10-22 09:28 UTC, Philip Withnall
none Details | Review
file-info: Add a G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute (15.69 KB, patch)
2013-10-22 15:28 UTC, Allison Karlitskaya (desrt)
reviewed Details | Review
file-info: Add a G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute (19.63 KB, patch)
2013-10-23 04:54 UTC, Allison Karlitskaya (desrt)
committed Details | Review
tests: Add tests for the thumbnail verification code in GIO (986.88 KB, patch)
2013-10-24 11:36 UTC, Philip Withnall
needs-work Details | Review
tests: Add tests for the thumbnail verification code in GIO (19.80 KB, patch)
2013-10-24 16:16 UTC, Philip Withnall
committed Details | Review

Description Philip Withnall 2013-10-11 10:21:32 UTC
The G_FILE_ATTRIBUTE_THUMBNAIL_PATH and G_FILE_ATTRIBUTE_THUMBNAILING_FAILED attributes don’t expose any concept of thumbnail validity, so programs which use them exclusively (without also using GnomeDesktopThumbnailFactory) can easily end up using outdated thumbnails which no longer represent the file, due to it being updated in the meantime.

Patch coming up to add a new G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute, which will indicate whether the thumbnail is valid. It can’t be 100% accurate without adding a dependency on gdk-pixbuf (to check the tEXt:Thumb::Mtime metadata). Instead, it compares the mtime of the thumbnail and the original file, and marks the thumbnail as invalid if the original file has been modified more recently. This will fail in situations where the original file has been reverted to an older version with an older mtime, but should otherwise be OK in practice.
Comment 1 Philip Withnall 2013-10-11 10:25:51 UTC
Created attachment 256988 [details] [review]
file-info: Add a G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute

This indicates whether the thumbnail (given by
G_FILE_ATTRIBUTE_THUMBNAIL_PATH) is likely to be valid — i.e. to represent
the file in its current state. If G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID is
FALSE (for a normal _or_ failed thumbnail) it means the file has changed since
the thumbnail was generated, and the thumbnail is out of date.

This attribute cannot be 100% accurate without adding a dependency on
gdk-pixbuf, so may report that a thumbnail is valid in some cases where it
isn’t. For example, if the file is reverted to an older version with an older
modification time, the thumbnail will still be marked as valid. Thumbnail
validity is calculated by comparing the modification times of the file and
the thumbnail.
Comment 2 Allison Karlitskaya (desrt) 2013-10-18 14:47:44 UTC
I wonder if we could have some very small piece of code for reading metadata from PNG files.  That seems like a problem that's reasonably well-contained.
Comment 3 Allison Karlitskaya (desrt) 2013-10-18 15:41:28 UTC
OK.  Turns out PNG is an extremely simple file format -- I'm going to write a (very) small piece of code to extract the headers and then we can do this properly.
Comment 4 Allison Karlitskaya (desrt) 2013-10-18 19:01:23 UTC
Created attachment 257659 [details] [review]
file-info: Add a G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute

This indicates whether the thumbnail (given by G_FILE_ATTRIBUTE_THUMBNAIL_PATH)
is valid — i.e. to represent the file in its current state. If
G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID is FALSE (for a normal _or_ failed
thumbnail) it means the file has changed since the thumbnail was generated, and
the thumbnail is out of date.

Part of checking thumbnail validity (by the spec) involves parsing headers out
of the thumbnail .png so we include some (small) code to do that.

Heavily based on a patch from Philip Withnall <philip.withnall@collabora.co.uk>
who suggested the feature and designed the API.
Comment 5 Colin Walters 2013-10-18 19:54:35 UTC
Review of attachment 257659 [details] [review]:

This looks to be quite well written, I looked for integer overflow flaws and the like, couldn't find any.

But: It'd be nice to have a test case for this with a static small .png file or two.

One other comment:

::: gio/glocalfileinfo.c
@@ +1284,3 @@
+ * is relatively easy.
+ */
+struct expected_info

Don't we normally use CamelCase for structs?
Comment 6 Philip Withnall 2013-10-22 09:23:06 UTC
Review of attachment 257659 [details] [review]:

A couple of minor comments. I also couldn’t find any integer overflows. I’ve got an updated patch with a couple of tests and fixes for the comments so far.

::: gio/glocalfileinfo.c
@@ +1361,3 @@
+    {
+      if (!check_integer_match (expected_info->size, value, value_size))
+        return FALSE;

Might want to document that Thumb::Size is not _required_ for a match, but if it’s set and mismatched to the stat()ed size, matching fails.

@@ +1378,3 @@
+
+  if (memcmp (contents, "\x89PNG\r\n\x1a\n", 8) != 0)
+    return FALSE;

Might be useful to add a comment pointing to the PNG documentation you used to write these functions.

@@ +1403,3 @@
+       */
+      if (size < chunk_size || size < chunk_size + 8)
+        goto out;

You should probably document here that (chunk_size + 8) is guaranteed not to overflow if (size >= chunk_size) because size is at most (G_MAXSIZE - 12) at this point.
Comment 7 Philip Withnall 2013-10-22 09:28:11 UTC
Created attachment 257824 [details] [review]
fixup! file-info: Add a G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute
Comment 8 Philip Withnall 2013-10-22 09:28:14 UTC
Created attachment 257825 [details] [review]
fixup! file-info: Add a G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute
Comment 9 Allison Karlitskaya (desrt) 2013-10-22 14:54:44 UTC
Review of attachment 257825 [details] [review]:

::: gio/glocalfileinfo.c
@@ +1284,3 @@
  * is relatively easy.
  */
+struct ExpectedInfo

i guess this should actually be a typedef.  I'll fix that up.

@@ +1407,3 @@
+       *
+       * The (chunk_size + 8) addition is guaranteed not to wrap if
+       * (size >= chunk_size) because size is at most (G_MAXSIZE - 12) at

I disagree with the logic here.

I check size < chunk_size specifically because (chunk_size+8) could very well wrap.  We just read chunk_size from the file, so it could be anything at all.

If chunk_size were '(guint)-8' for example and 'size' was 0 then we would see

  size < chunk_size + 8

  0    < ((uint)-8) + 8

  0    < 0

  false

and assume that everything is OK, even though 'chunk_size' is very much larger than 'size'.

We have to do the size<chunk_size check precisely to catch these cases that 'chunk_size' is very large.  It has nothing to do with 'size'.
Comment 10 Allison Karlitskaya (desrt) 2013-10-22 15:09:16 UTC
In retrospect, there is actually a bug here, but it's an extremely unlikely (in fact, impossible) one but since our threat model here appears to be that we have a hostile thumbnail directory, let's entertain it.

Consider the case that we somehow ended up with a thumbnail that was 4294967292 bytes.  That's ((guint)-4).  Now consider that this thumbnail is also corrupted and contains a chunk_size of 4294967290 ((guint)-6).

We check:

  size < chunk_size || size < chunk_size + 8

The first check will be false because:

  size < chunk_size

  4294967292 < 4294967290

  false

The second check will also fail:

  size < chunk_size + 8

  4294967292 < (4294967290 + 8)

  4294967292 < 2

  false


So I'll just rewrite this to flatly reject implausibly large chunks.

Thanks for all the attention to this suspicious line :)
Comment 11 Allison Karlitskaya (desrt) 2013-10-22 15:28:22 UTC
Created attachment 257853 [details] [review]
file-info: Add a G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute

This indicates whether the thumbnail (given by G_FILE_ATTRIBUTE_THUMBNAIL_PATH)
is valid — i.e. to represent the file in its current state. If
G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID is FALSE (for a normal _or_ failed
thumbnail) it means the file has changed since the thumbnail was generated, and
the thumbnail is out of date.

Part of checking thumbnail validity (by the spec) involves parsing headers out
of the thumbnail .png so we include some (small) code to do that.

Heavily based on a patch from Philip Withnall <philip.withnall@collabora.co.uk>
who suggested the feature and designed the API.
Comment 12 Allison Karlitskaya (desrt) 2013-10-22 15:30:46 UTC
Review of attachment 257824 [details] [review]:

This test isn't working for me.  The hash values in the thumbnail filenames are md5s of absolute file URIs and you are probably using ones from your system (ie; including the full path of where you happen to be building glib).

In addition to that, changing the XDG_* environment variables inside of a testcase like that (particularly when your test runs after another one) is extremely fragile and should be avoided.  In this case it's usually better to spawn a subprocess with the environment already set from the outside (ie: using the envp argument to spawn/gsubprocess).
Comment 13 Matthias Clasen 2013-10-22 16:43:35 UTC
Review of attachment 257853 [details] [review]:

Even if it is small, it might be nice to move the png parsing code into its own file ? Just an idea...
Comment 14 Matthias Clasen 2013-10-22 16:54:22 UTC
Review of attachment 257853 [details] [review]:

looks good

::: gio/glocalfileinfo.c
@@ +1458,3 @@
+           * The common case is that the tEXt chunks come at the start
+           * of the file before any of the image data.  This trick means
+           * that we will only fault in a single page (4k) whereas many

Should we just guarantee this, by capping size at 4k ?
Comment 15 Allison Karlitskaya (desrt) 2013-10-22 20:40:27 UTC
Review of attachment 257853 [details] [review]:

Splitting it out into a separate file is fine by me and makes sense, even -- we will want to duplicate this in GVfs :/

::: gio/glocalfileinfo.c
@@ +1458,3 @@
+           * The common case is that the tEXt chunks come at the start
+           * of the file before any of the image data.  This trick means
+           * that we will only fault in a single page (4k) whereas many

No.

It's quite possible that we could have a thumbnailer that writes PNG files with the metadata chunks at the end.  This hack doesn't assume that they are in any particular location -- only that they are all together.
Comment 16 Allison Karlitskaya (desrt) 2013-10-23 04:54:00 UTC
Created attachment 257889 [details] [review]
file-info: Add a G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute

This indicates whether the thumbnail (given by G_FILE_ATTRIBUTE_THUMBNAIL_PATH)
is valid — i.e. to represent the file in its current state. If
G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID is FALSE (for a normal _or_ failed
thumbnail) it means the file has changed since the thumbnail was generated, and
the thumbnail is out of date.

Part of checking thumbnail validity (by the spec) involves parsing
headers out of the thumbnail .png so we include some (small) code to do
that in a separate file.  We will likely want to copy this code to gvfs
to do the same for GVfsFile.

Heavily based on a patch from Philip Withnall <philip.withnall@collabora.co.uk>
who suggested the feature and designed the API.
Comment 17 Philip Withnall 2013-10-23 08:19:05 UTC
(In reply to comment #9)
> Review of attachment 257825 [details] [review]:
> @@ +1407,3 @@
> +       *
> +       * The (chunk_size + 8) addition is guaranteed not to wrap if
> +       * (size >= chunk_size) because size is at most (G_MAXSIZE - 12) at
> 
> We have to do the size<chunk_size check precisely to catch these cases that
> 'chunk_size' is very large.  It has nothing to do with 'size'.

Yes, and that’s why my logic was qualified with “if (size >= chunk_size)”.

(In reply to comment #10)
> Consider the case that we somehow ended up with a thumbnail that was 4294967292
> bytes.  That's ((guint)-4).  Now consider that this thumbnail is also corrupted
> and contains a chunk_size of 4294967290 ((guint)-6).

I don’t think it is possible to hit that situation, since at the point of the (size < chunk_size || size < chunk_size + 8) check, size is at most (G_MAXSIZE - 12), which is less than (G_MAXSIZE - 4). This is due to the subtraction of 8 and 4 bytes for the header and stored chunk size.

Anyway, you’ve rewritten it now so this is moot and I’ll take a look at the new version. :-)

(In reply to comment #12)
> Review of attachment 257824 [details] [review]:
> 
> This test isn't working for me.  The hash values in the thumbnail filenames are
> md5s of absolute file URIs and you are probably using ones from your system
> (ie; including the full path of where you happen to be building glib).

The hash values should be of paths in /tmp — that’s why I copy the image files to /tmp before querying their thumbnail information. I agree that this is ugly, but I couldn’t think of a better way of doing it.

Perhaps it would be possible to write a mock GFile implementation which returns a different URI from that of the file actually backing it? If you think that could work, let me know and I’ll put together a patch (but not otherwise, because it’s not a small amount of work).

> In addition to that, changing the XDG_* environment variables inside of a
> testcase like that (particularly when your test runs after another one) is
> extremely fragile and should be avoided.  In this case it's usually better to
> spawn a subprocess with the environment already set from the outside (ie: using
> the envp argument to spawn/gsubprocess).

That’s why I deliberately changed them at the start of main(). Agreed, it is fragile. I’ll change it in the next iteration of the patch.

Note that other test cases in gio/tests use g_setenv() similarly.
Comment 18 Philip Withnall 2013-10-23 08:31:42 UTC
Review of attachment 257889 [details] [review]:

::: gio/thumbnail-verify.c
@@ +156,3 @@
+       * Make sure it won't wrap when we add 8 to it.
+       */
+      if (chunk_size + 8 < chunk_size || size < chunk_size + 8)

Isn’t it possible for some compilers to optimise out checks like (chunk_size + 8 < chunk_size)?

The CERT-recommended way of performing this check would be:
    if (G_MAXUINT32 - chunk_size < 8 || size < chunk_size + 8)

As per https://www.securecoding.cert.org/confluence/display/seccode/INT30-C.+Ensure+that+unsigned+integer+operations+do+not+wrap.
Comment 19 Allison Karlitskaya (desrt) 2013-10-23 13:36:30 UTC
(In reply to comment #17)
> The hash values should be of paths in /tmp — that’s why I copy the image files
> to /tmp before querying their thumbnail information. I agree that this is ugly,
> but I couldn’t think of a better way of doing it.

Any approach here would rely on a fixed path in /tmp which is a bit of a non-starter.  A new approach would indeed be appreciated.  Maybe a hacked up GFile really is the best way...  I leave that to you to decide.

> That’s why I deliberately changed them at the start of main(). Agreed, it is
> fragile. I’ll change it in the next iteration of the patch.

Thanks for handling this.  Testing this is proving to be not-very-fun.

An idea for a different approach though: why not dynamically generate the thumbnails?  You could have fun testing 'corruption' cases this way, too.
Comment 20 Allison Karlitskaya (desrt) 2013-10-23 13:39:47 UTC
(In reply to comment #18)
> Review of attachment 257889 [details] [review]:
> Isn’t it possible for some compilers to optimise out checks like (chunk_size +
> 8 < chunk_size)?

Only if they are buggy.  Wrapping of unsigned ints is a very-clearly-defined part of the C specification.

> The CERT-recommended way of performing this check would be:
>     if (G_MAXUINT32 - chunk_size < 8 || size < chunk_size + 8)

I'm happy to do this as well -- it's essentially equivalent.
Comment 21 Allison Karlitskaya (desrt) 2013-10-23 15:57:41 UTC
Comment on attachment 257889 [details] [review]
file-info: Add a G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute

Attachment 257889 [details] pushed as fe70697 - file-info: Add a G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID attribute


Leaving open for the promised testcases.
Comment 22 Philip Withnall 2013-10-24 11:36:40 UTC
Created attachment 258007 [details] [review]
tests: Add tests for the thumbnail verification code in GIO

This code was added for use by the G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID
file attribute, but may end up being used elsewhere (e.g. in GVfs) as well.
As it’s dealing with untrusted external files, and the non-trivial PNG file
format, this commit adds several test cases to cover valid and invalid PNG
files.

The security model for the thumbnail verification code is that the user’s
cache directory is untrusted, and potentially any PNG file which is passed
to the verifier has been manipulated arbitrarily by an attacker.

This is a follow-up to commit fe7069749fe39a006985ec266260a3c02ee8c855.
Comment 23 Allison Karlitskaya (desrt) 2013-10-24 14:09:43 UTC
Review of attachment 258007 [details] [review]:

I'm concerned about the size of the png files here -- we'd more or less be permanently adding 1MB to the size of a glib git download.  I think the best approach here would be to replace the image data in the examples with all-black.  That should compress nicely.

I also think that the private-library thing worked out a bit more complicated than it's worth.  If you can include the .c file directly instead I think it would be cleaner (and probably easier in the end).
Comment 24 Philip Withnall 2013-10-24 16:16:46 UTC
Created attachment 258036 [details] [review]
tests: Add tests for the thumbnail verification code in GIO

This code was added for use by the G_FILE_ATTRIBUTE_THUMBNAIL_IS_VALID
file attribute, but may end up being used elsewhere (e.g. in GVfs) as well.
As it’s dealing with untrusted external files, and the non-trivial PNG file
format, this commit adds several test cases to cover valid and invalid PNG
files.

The security model for the thumbnail verification code is that the user’s
cache directory is untrusted, and potentially any PNG file which is passed
to the verifier has been manipulated arbitrarily by an attacker.

This is a follow-up to commit fe7069749fe39a006985ec266260a3c02ee8c855.
Comment 25 Philip Withnall 2013-10-24 16:18:45 UTC
(In reply to comment #23)
> Review of attachment 258007 [details] [review]:
> 
> I'm concerned about the size of the png files here -- we'd more or less be

Good point, I completely forgot about that when doing the work. I’ve truncated the files down to 512B, which means they’re no longer valid PNG images (missing the IEND chunk), but we don’t care because the headers are all in-tact.

> I also think that the private-library thing worked out a bit more complicated
> than it's worth.  If you can include the .c file directly instead I think it
> would be cleaner (and probably easier in the end).

I think having it as a noinst library makes it easier to copy the code and Makefile around as a self-contained blob. It’s more effort than it’s worth to change to including the files directly, so I haven’t.
Comment 26 Allison Karlitskaya (desrt) 2013-10-24 17:56:45 UTC
Attachment 258036 [details] pushed as fcd2f7e - tests: Add tests for the thumbnail verification code in GIO

I trimmed the patch down to just add the testcase and avoid refactoring
the build system at the same time.

Thanks!