Bug 711241 - Broken or unknown metadata tag should not cancel the whole metadata loading
Status: RESOLVED FIXED
Product: GIMP
Classification: Other
Component: libgimp
Version: git master
Hardware: Other All
Importance: Normal normal
Target Milestone: 2.10
Assigned To: GIMP Bugs
QA Contact: GIMP Bugs
Depends on:
Blocks:
 
 
Reported: 2013-11-01 01:15 UTC by Jehan
Modified: 2013-11-11 00:02 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Jehan 2013-11-01 01:15:17 UTC
In bug 710937, a user provided a JPEG straight out of a Canon camera.

When I open it in master, I get the following popup error:

-----------------------------------------------------------------
Calling error for procedure 'gimp-image-set-metadata':
Procedure 'gimp-image-set-metadata' has been called with value '<?xml version='1.0' encoding='UTF-8'?>
<metadata>
  <tag name="Exif.Canon.0x0003">0 0 0 0</tag>
  <tag name="Exif.Canon.0x0019">1</tag>
  <tag name="Exif.Canon.0x0035">0 0 0 0</tag>
  <tag name="Exif.Canon.0x0098">1551 0 1 0</tag>
  <tag name="Exif.Canon.0x009a">23790592 67109417 33162240 67109255 46924800</tag>
  <tag name="Exif.Canon.0x4008">0 0 0</tag>
  <tag name="Exif.Canon.0x4009">0 0 0</tag>
  <tag name="Exif.Canon.0x4010"></tag>
  <tag name="Exif.Canon.0x4011">0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 44 0 0 0 0 0 0 0 0 0 0 0 10 0 0 0 255 255 255 255 0 0 0 0 10 0 0 0 0 0 0 0 10 0 0 0 0 0 0 0 10 0 0 0 0 32 196 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 204 16 64 20</tag>
  <tag name="Exif.Canon.0x4012"> (invalid UTF-8 string)
-----------------------------------------------------------------

The image still loads fine, but clicking the "Image Metadata" item fails, saying that there is no metadata attached to this image.

I've searched and still don't really know what this Exif.Canon.0x4012 tag is. In any case, the XML encoding is advertised as UTF-8, so there is definitely an issue in the source (is this XML taken from the image, or is it generated by (G)Exiv2?).

Either way, even if the broken metadata is in the image itself, GIMP should still load the other tags somehow.
A call to `exiftool -a -u -g1 IMG_4779.JPG` works fine and prints a whole bunch of information (Canon 0x4012 is also there, just empty), so it is definitely possible to recover the rest of the metadata.
Comment 1 Jehan 2013-11-01 01:18:10 UTC
Damn, the attachment is too big.
The user provided an upload there: https://app.box.com/s/3j2u9nu7hcr9fgpz1two

Please download IMG_4779.JPG.
Comment 2 Michael Natterer 2013-11-01 11:51:41 UTC
That can only mean that gexiv2_metadata_get_tag_string() does not
always return UTF-8. We feed it into g_markup_escape_text(), which
expects UTF-8 input.
Comment 3 Michael Natterer 2013-11-01 13:19:32 UTC
This makes metadata serialization robust no matter what comes out
of gexiv2, or how we misinterpret it. Leaving open until we figure
out the root of the problem.

commit 798c62a54486916c69141463980a4497aea14b98
Author: Michael Natterer <mitch@gimp.org>
Date:   Fri Nov 1 14:15:15 2013 +0100

    Bug 711241 - Broken or unknown metadata tag should not cancel...
    
    ...the whole metadata loading
    
    Don't serialize a value that does not UTF-8-validate to XML. This is
    not a real fix, but no matter what we do here in the future, UTF-8
    validation should always be part of the serialization, in order to
    avoid passing broken data into the core.

 libgimpbase/gimpmetadata.c | 61 ++++++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 40 insertions(+), 21 deletions(-)
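For readers following along, the idea in this commit (validate each value as UTF-8 before serializing, and skip the ones that fail) can be sketched roughly as below. This is an illustration in Python, not the C code from libgimpbase/gimpmetadata.c; the function names and the byte values in the usage note are hypothetical, and Python's strict UTF-8 decoder stands in for whatever validation GIMP actually performs.

```python
import sys
from xml.sax.saxutils import escape, quoteattr


def is_valid_utf8(raw):
    """Return True if the byte string decodes as strict UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def serialize_tags_skipping_invalid(tags):
    """Serialize {tag name: raw bytes} to XML, dropping non-UTF-8 values.

    Hypothetical helper: one broken value is skipped with a warning
    instead of aborting the whole serialization.
    """
    lines = ["<?xml version='1.0' encoding='UTF-8'?>", "<metadata>"]
    for name, raw in tags.items():
        if not is_valid_utf8(raw):
            # Warn on stderr and move on to the next tag.
            print("warning: skipping tag %s: value is not valid UTF-8" % name,
                  file=sys.stderr)
            continue
        value = escape(raw.decode("utf-8"))
        lines.append("  <tag name=%s>%s</tag>" % (quoteattr(name), value))
    lines.append("</metadata>")
    return "\n".join(lines)
```

With made-up input such as `{"Exif.Canon.0x0019": b"1", "Exif.Canon.0x4012": b"\xff\xfe"}`, the first tag is serialized and the second is only reported as a warning, which matches the behaviour described in the next comment.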
Comment 4 Jehan 2013-11-01 21:09:29 UTC
Cool. I tested it and can confirm that the image now returns its metadata (with a nice warning in the console for the invalid field).

Now, about gexiv2_metadata_get_tag_string() not always returning UTF-8: is that expected? I can't see any kind of encoding step in the GExiv2 code, nor anything about it in the Exiv2 documentation. Does it mean that we are supposed to know the meaning of each key and do the appropriate conversion ourselves depending on it?
Comment 5 Michael Natterer 2013-11-11 00:02:12 UTC
Don't drop the tags; encode them as base64 instead. This should be
able to handle whatever comes out of gexiv2, so closing as FIXED.

commit 33a8d68117a1ade59279102935ca128a25ec04d3
Author: Michael Natterer <mitch@gimp.org>
Date:   Mon Nov 11 00:11:43 2013 +0100

    Bug 711241 - Broken or unknown metadata tag should not cancel...
    
    ...the whole metadata loading
    
    Don't drop non-utf8 values from gexiv2 when serializing to XML,
    instead, base64 encode them. This should be robust against whatever
    garbage data is in tags.

 libgimpbase/gimpmetadata.c | 98 +++++++++++++++++++++++++++++++++-------------
 1 file changed, 71 insertions(+), 27 deletions(-)
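
As a rough illustration of the approach in this second commit (keep non-UTF-8 values by base64-encoding them rather than dropping them), here is a minimal Python sketch. It is not the code from libgimpbase/gimpmetadata.c: the function names are hypothetical, and the `encoding="base64"` attribute used to mark encoded values is an assumption made for this example. It also does not handle values that are valid UTF-8 but contain characters XML forbids, such as NUL.

```python
import base64
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def serialize_tags(tags):
    """Serialize {tag name: raw bytes} to XML without losing any tag."""
    lines = ["<?xml version='1.0' encoding='UTF-8'?>", "<metadata>"]
    for name, raw in tags.items():
        try:
            text = raw.decode("utf-8")
        except UnicodeDecodeError:
            # Not UTF-8: store base64 and mark the tag (assumed attribute).
            encoded = base64.b64encode(raw).decode("ascii")
            lines.append('  <tag name=%s encoding="base64">%s</tag>'
                         % (quoteattr(name), encoded))
        else:
            lines.append("  <tag name=%s>%s</tag>"
                         % (quoteattr(name), escape(text)))
    lines.append("</metadata>")
    return "\n".join(lines)


def deserialize_tags(xml_text):
    """Inverse of serialize_tags: recover the raw bytes of every tag."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    tags = {}
    for elem in root.findall("tag"):
        text = elem.text or ""  # empty elements have text None
        if elem.get("encoding") == "base64":
            tags[elem.get("name")] = base64.b64decode(text)
        else:
            tags[elem.get("name")] = text.encode("utf-8")
    return tags
```

Round-tripping a made-up dictionary such as `{"Exif.Canon.0x0019": b"1", "Exif.Canon.0x4012": b"\xff\xfe\x00"}` through `serialize_tags()` and `deserialize_tags()` gives back the same bytes, so the broken tag survives alongside the valid ones.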