Bug 486659 - xmp/exif metadata handling
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-bad
Version: git master
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: 0.10.21
Assigned To: GStreamer Maintainers
QA Contact: GStreamer Maintainers
Depends on:
Blocks: 513182
 
 
Reported: 2007-10-14 19:24 UTC by Edgard Lima
Modified: 2010-09-15 15:30 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Edgard Lima 2007-10-14 19:24:51 UTC
Some info first

- - - info related to Exif - - -

EXIF can be embedded in JPEG or TIFF 6.0 images

JPEG starts with 0xFFD8 (SOI - Start Of Image)

TIFF starts with 0x49492A00 or 0x4D4D002A, depending on byte order

JPEG images can be found in JFIF file format or EXIF file format

JFIF starts with 0xFFD8 (jpeg SOI) 0xFFE0 (app mark 0) 0xxxxx (size) 0x4A464946 ('JFIF')

EXIF starts with 0xFFD8 (jpeg SOI) 0xFFE1 (app mark 1) 0xxxxx (size) 0x45786966 ('Exif')

...so JFIF and EXIF are not compatible

JPEG files are divided into segments, and sometimes we can find an EXIF segment (which is app mark 1 + size + 'Exif') somewhere after the JFIF segment. That means it is a JFIF file, but some EXIF libs search for such segments anyway. In our implementation we could decide not to extract metadata from those files if we don't want to.

jpegdec can render both EXIF and JFIF

Once we find the EXIF inside the TIFF or JPEG file, we can extract the info in the same way
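
As a rough illustration of the signatures listed above, here is a minimal C sketch that classifies a buffer by its first bytes. The function name `classify_image_header` and the `ImageKind` enum are hypothetical (they are not part of GStreamer or of any library mentioned in this bug), and the sketch only looks at the first segment after SOI, so it does not detect an EXIF segment that comes after a JFIF segment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type, for illustration only. */
typedef enum {
  IMG_UNKNOWN,
  IMG_TIFF_LE,     /* 0x49492A00, "II" little-endian */
  IMG_TIFF_BE,     /* 0x4D4D002A, "MM" big-endian */
  IMG_JPEG_JFIF,   /* SOI + APP0 + size + "JFIF" */
  IMG_JPEG_EXIF,   /* SOI + APP1 + size + "Exif" */
  IMG_JPEG_OTHER   /* SOI followed by something else */
} ImageKind;

/* Hypothetical helper: classify a buffer by the signatures listed above. */
static ImageKind
classify_image_header (const unsigned char *d, size_t len)
{
  assert (d != NULL);

  if (len >= 4 && memcmp (d, "II*\0", 4) == 0)
    return IMG_TIFF_LE;
  if (len >= 4 && memcmp (d, "MM\0*", 4) == 0)
    return IMG_TIFF_BE;

  /* Everything else must start with the JPEG SOI marker 0xFFD8. */
  if (len < 2 || d[0] != 0xFF || d[1] != 0xD8)
    return IMG_UNKNOWN;

  /* Layout: SOI (2) + marker (2) + size (2) + identifier (4). */
  if (len >= 10 && d[2] == 0xFF) {
    if (d[3] == 0xE0 && memcmp (d + 6, "JFIF", 4) == 0)
      return IMG_JPEG_JFIF;
    if (d[3] == 0xE1 && memcmp (d + 6, "Exif", 4) == 0)
      return IMG_JPEG_EXIF;
  }
  return IMG_JPEG_OTHER;
}
```

A demuxer/decoder could call such a check once on the first bytes of the stream to decide where to start looking for the metadata chunk.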

- - - info related to IPTC - - -

IPTC metadata can be embedded in JFIF (Photoshop segment 0xFFED (APP13)), EXIF (exif segment) and TIFF files

* it is already possible to have EXIF and IPTC on PSD files

- - - XMP - - -

can be inside PDF, JPEG, JPEG 2000, GIF, PNG, HTML, TIFF, Adobe Illustrator, PSD, SVG/XML, DNG, PostScript and Encapsulated PostScript

In PDF documents, XMP can not only be used to describe the document as a whole, but can also be attached to parts of the document, such as pages, included images, and tags defining structural divisions of the document

In the case of JPEG it is inside an APP1 marker (so it is compatible with JFIF, and also with EXIF if it is in a second APP1 marker)
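
To make the segment layout concrete, the following C sketch walks the JPEG segments that precede the image data and returns the payload of the first APP1 segment carrying either Exif or XMP. The helper name `find_app1_chunk` and the `ChunkKind` enum are hypothetical. The sketch assumes the usual APP1 identifiers ("Exif" followed by two NUL bytes, and the XMP namespace string followed by one NUL byte), and it ignores fill bytes and standalone markers, so treat it as a simplification rather than a complete parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* XMP APP1 identifier; the string's terminating NUL is part of it. */
#define XMP_ID "http://ns.adobe.com/xap/1.0/"

typedef enum { CHUNK_EXIF, CHUNK_XMP } ChunkKind;   /* hypothetical */

/* Hypothetical helper: return a pointer to the APP1 payload that follows
 * the identifier and store its size in *out_len, or NULL if not found. */
static const unsigned char *
find_app1_chunk (const unsigned char *d, size_t len, ChunkKind kind,
    size_t *out_len)
{
  /* "Exif\0" plus the literal's own terminator gives "Exif\0\0" (6 bytes). */
  const char *id = (kind == CHUNK_EXIF) ? "Exif\0" : XMP_ID;
  size_t id_len = (kind == CHUNK_EXIF) ? 6 : sizeof (XMP_ID);
  size_t pos = 2;

  assert (d != NULL && out_len != NULL);

  if (len < 2 || d[0] != 0xFF || d[1] != 0xD8)
    return NULL;                /* no SOI, not a JPEG */

  while (pos + 4 <= len) {
    unsigned marker;
    size_t seg_len;

    if (d[pos] != 0xFF)
      return NULL;              /* lost marker sync */
    marker = d[pos + 1];
    if (marker == 0xDA || marker == 0xD9)
      break;                    /* SOS or EOI: no more header segments */

    /* The 16-bit big-endian size counts its own two bytes. */
    seg_len = ((size_t) d[pos + 2] << 8) | d[pos + 3];
    if (seg_len < 2 || seg_len > len - pos - 2)
      return NULL;              /* truncated or malformed segment */

    if (marker == 0xE1 && seg_len >= 2 + id_len
        && memcmp (d + pos + 4, id, id_len) == 0) {
      *out_len = seg_len - 2 - id_len;
      return d + pos + 4 + id_len;
    }
    pos += 2 + seg_len;         /* skip marker + segment */
  }
  return NULL;
}
```

Because the loop does not stop at the first segment, it also finds an Exif APP1 segment that comes after a JFIF APP0 segment, which is the case described earlier.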

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - 

So it seems to be too generic...

my idea is to have the following design

we should create, in -base, helper functions to generate a tag list

GstTagList *gst_tag_list_from_metadata_chunk(const GstBuffer * buffer);

each image demuxer/decoder has to find out the metadata chunk and then call this method.

on the other side, muxer/encoder should call

GstBuffer *gst_exif_chunk_from_tag_list(GstTagList *);
GstBuffer *gst_iptc_chunk_from_tag_list(GstTagList *);
GstBuffer *gst_xmp_chunk_from_tag_list(GstTagList *);

and then the muxer/encoder has to know where to write this chunk in the file

....so, it would need modifications to each decoder/demuxer/encoder/muxer that wants to extract/inject metadata info

ps: we could implement it step-by-step, starting for example with jpegdec and exif

BR,
Edgard
Comment 1 Stefan Sauer (gstreamer, gtkdoc dev) 2007-10-15 06:41:32 UTC
Right, so there should be a helper-library under e.g.
  gst-plugins-base/gst-libs/gst/imagetags/
that provides support for each of the formats a la:

GstTagList *gst_tag_list_from_exif_chunk(const GstBuffer * buffer);
GstBuffer *gst_exif_chunk_from_tag_list(GstTagList *tag_list);

Then jpeg{enc,dec}, tiff{enc,dec}, png{enc/dec} can make use of those.

gst_XXX_from_tag_list() should return NULL if the taglist does not contain suitable tags. For the future we should think about a useful interface that allows applications to select what metadata-formats should be produced.
Comment 2 Tim-Philipp Müller 2007-10-15 09:33:40 UTC
> Right, so there should be a helper-library under e.g.
>   gst-plugins-base/gst-libs/gst/imagetags/
> that provides support for each of the formats a la:

Umm, why not just put it into the existing libgsttag?
Comment 3 Edgard Lima 2007-10-15 11:31:18 UTC
I think it should be in

gst-plugins-base/gst-libs/ext/tags/

"tags" instead of "imagetags"
and
"ext" instead of "gst" 'cause I would like to use the following libs:

Exif:
 - http://libexif.sourceforge.net/

IPTC:
 - http://libiptcdata.sourceforge.net/

XMP:
 - http://libopenraw.freedesktop.org/wiki/Exempi

BR,
Edgard
Comment 4 Edgard Lima 2007-10-15 11:35:59 UTC
Do you think we should create a new tag lib like libgsttagext or move the libgsttag from gst to ext?

BR,
Comment 5 Tim-Philipp Müller 2007-10-15 11:45:49 UTC
I didn't realise you were planning on using external libraries (is that really needed? Are those tag formats so complicated?).  In that case libgsttag is not an option.
Comment 6 Stefan Sauer (gstreamer, gtkdoc dev) 2007-10-15 14:13:44 UTC
EXIF, XMP, IPTC are different from id3 or vorbiscomments. The standards for metadata we are talking about here describe how to format it and how to embed it in various formats. In most cases it will be stored inside the container (and that is the level of support we would like to address).

Now basically every format in which we would like to support it would need to parse the chunk and emit tags when reading, and format the tags into a chunk when writing. How the chunk is streamlined with the container-format is specific, but the content of the chunk is not.

So basically when reading the app will do:
tags = gst_tag_list_from_exif_chunk (buffer)
when it finds an exif chunk. If the proposed utility library has exif support, gst_tag_list_from_exif_chunk will parse the block and generate individual tags; if not, it could emit GST_TAG_EXIF with the exif binary blob.

When writing the app will do:
buffer = gst_exif_chunk_from_tag_list (tags)
If there is exif support in the lib, it will check whether there are suitable tags in the taglist and return a buffer if so, else NULL. If there is no exif support, it will check whether there is GST_TAG_EXIF and return that.

This way we don't clutter several elements with #ifdef HAVE_LIBEXIF and we preserve the metadata for e.g. filesrc ! pngdec ! jpegenc ! filesink. If the libs are available one can even change the metadata.
Comment 7 Edgard Lima 2007-10-16 11:19:20 UTC

1. Use helper libraries to create tag lists from metadata chunks

* let's think just about the design, no matter if it is ext or gst

GstTagList *gst_tag_list_from_exifimage_chunk(const GstBuffer * buffer);
GstTagList *gst_tag_list_from_iptc_chunk(const GstBuffer * buffer);
GstTagList *gst_tag_list_from_xmp_chunk(const GstBuffer * buffer);

GstBuffer *gst_exifimage_chunk_from_tag_list(GstTagList *tag_list);
GstBuffer *gst_iptc_chunk_from_tag_list(GstTagList *tag_list);
GstBuffer *gst_xmp_chunk_from_tag_list(GstTagList *tag_list);

1.1 Those helper libraries would be called by decoders (tag_list_from_metadata_chunk) and the resulting tags sent as a message. On the other side, encoders receive tag messages and write them as chunks inside the file being converted to (metadata_chunk_from_tag_list)

1.1.1- Advantages
i - autopluggable
ii- there is only one central code (library) for all file formats. The only thing the file format has to do is to find the metadata chunk inside it.

1.1.2- Disadvantages
i - The encoder doesn't have any idea about what kind of metadata it is, so it will have to write duplicate information to all the metadata chunks: exif, iptc, xmp and future ones.

1.1.3- Open issues
i- Applications would need some extra Gst-Interface to encoders if they want to decide whether to write exif and/or iptc and/or xmp

examples:

-----------    -----------    ------------
| v4l2src | -> | jpegenc | -> | filesink |
-----------    -----------    ------------

The application could send metadata (as could any other element in the pipeline). The resulting file will have 3 metadata chunks (exif, iptc, xmp) with duplicated info.
From the application's point of view, there is no problem because the metadata is merged. Unless the application has 4 sidebars: one for general tags, one for exif tags, one for iptc tags and one for xmp tags. In this case the application can't identify which metadata is from exif, iptc, and so on.

-----------    -----------    ----------    ------------
| filesrc | -> | jpegdec | -> | pngenc | -> | filesink |
-----------    -----------    ----------    ------------

The same problem, the application doesn't have control of what metadata will be created

-----------    -----------    ---------------
| filesrc | -> | jpegdec | -> | xvimagesink |
-----------    -----------    ---------------

The same problem, if the application wants to show tags separately it can't (let's just think of an app like Eye of Gnome for example) !!!

So, to solve these limitations
i- app can't know what kind of metadata it is
ii - encoders don't know what kind of metadata to write

I propose the following changes to GstTagList

1- A tag list has one id, name and description, e.g. GST_TAG_CATEGORY_EXIF, "Exif" and "Exif metadata for images"
1.1 - Ids starting from some value, for example 4000, are reserved for application-specific use
1.2 - there is a GST_TAG_CATEGORY_GENERAL

2- A tag value can contain not only single values, like INT, STRING and so on, but can also be of type GROUP
2.1-  GROUP tags have a name and description, e.g. "Rights" "Information regarding the legal restrictions", and also a list of other tags

With such a thing, the Application could show metadata all together or make it friendly like this:

Exif
  Camera
    Make
    Model
    XResolution
    YResolution
  Image Data
    Orientation
    DateTime
    Compression
  MakerNote
    Object Distance
    Time Zone
XMP
  Basic
    Advisory
    BaseURL
    CreateDate
    CreatorTool
  Rights
    Certificate
    Marked
    Owner
    UsageTerms

Comment 8 Edgard Lima 2007-10-16 12:58:47 UTC
please comment on it....I would like to start implementing ASAP (maybe tomorrow)

We have discussed on IRC and decided not to create a list of tags (group) type for a tag.

Indeed grouping is important and will be done in the following way (easier implementation)

strings with a separator will create the concept of groups, like below:

"Exif"
"Exif:Camera"
"Exif:Camera:Make"
"Exif:Camera:Model"

what is a good separator? I think ':' is a good one

....to help applications, a helper function should be created

GstTagList * gst_new_sorted_by_group_tag_list_from_tag_list(const GstTagList *);

also a define must be exposed to applications

#define GST_TAG_GROUP_CHAR ':'
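
To show what the separator convention would mean for an application, here is a small C sketch that derives the nesting depth, the leaf name and group membership from a flat tag name. All three helper names are hypothetical, and `TAG_GROUP_CHAR` only mirrors the define proposed above; none of this is existing GStreamer API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TAG_GROUP_CHAR ':'      /* mirrors the proposed GST_TAG_GROUP_CHAR */

/* Hypothetical helper: number of group levels above the leaf,
 * e.g. 2 for "Exif:Camera:Make" and 0 for "Exif". */
static size_t
tag_name_depth (const char *name)
{
  size_t depth = 0;

  assert (name != NULL);
  for (; *name != '\0'; name++)
    if (*name == TAG_GROUP_CHAR)
      depth++;
  return depth;
}

/* Hypothetical helper: the text after the last separator,
 * or the whole name if there is no separator. */
static const char *
tag_name_leaf (const char *name)
{
  const char *sep;

  assert (name != NULL);
  sep = strrchr (name, TAG_GROUP_CHAR);
  return sep ? sep + 1 : name;
}

/* Hypothetical helper: non-zero if `name` lies strictly inside `group`,
 * e.g. "Exif:Camera:Make" is inside "Exif:Camera" and inside "Exif". */
static int
tag_name_in_group (const char *name, const char *group)
{
  size_t n;

  assert (name != NULL && group != NULL);
  n = strlen (group);
  /* The prefix must end exactly at a separator, so that
   * "Exif:CameraX" is not treated as a member of "Exif:Camera". */
  return strncmp (name, group, n) == 0 && name[n] == TAG_GROUP_CHAR;
}
```

With helpers like these an application could sort or indent a flat tag list by group without any change to the tag value types.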

Comment 9 Wim Taymans 2007-10-16 13:04:50 UTC
Uhm. that's completely the opposite of what we discussed on IRC. We were talking about making a field of type GstStructure inside GstStructures to get a real hierarchy. We don't need a separator then.
Comment 10 Edgard Lima 2007-10-16 19:05:17 UTC
....so, ok, fine, I'm feeling better with a real hierarchy (GstStructure)....tomorrow I will start implementing it and then attach the patch here (hopefully by Friday) before committing

BR,
Edgard
Comment 11 Stefan Sauer (gstreamer, gtkdoc dev) 2007-10-18 13:44:12 UTC
Wim, the idea is to have a convenience API that hides the nested structures. So one can set "Video:Encoder" and it would automatically create the Video sub-structure if not there and create an Encoder element inside.

Too bad that GstStructure has no flags. If it did, a flag could signal the existence of substructures. This way gst_structure_{set|get} could avoid scanning for ":" in the name. If GstStructure had been a GObject I would have suggested using the ChildProxy Iface. Of course we could add something like gst_structure_deep_{set|get} instead.
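
The deep set/get idea can be sketched without GStreamer at all. The C code below uses a hypothetical `Node` tree as a stand-in for nested GstStructures; `node_deep_set` and `node_deep_get` are made-up names that only illustrate how a path such as "Video:Encoder" could create missing sub-structures on the way down. It is not how GstStructure is implemented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a nested structure. */
typedef struct Node {
  char *name;
  char *value;             /* NULL for pure group nodes */
  struct Node *children;   /* first child */
  struct Node *next;       /* next sibling */
} Node;

static Node *
node_new (const char *name, size_t n)
{
  Node *nd = calloc (1, sizeof *nd);

  assert (nd != NULL);
  nd->name = malloc (n + 1);
  assert (nd->name != NULL);
  memcpy (nd->name, name, n);
  nd->name[n] = '\0';
  return nd;
}

/* Find the child with the given name, optionally creating it. */
static Node *
node_child (Node *parent, const char *name, size_t n, int create)
{
  Node *c;

  for (c = parent->children; c != NULL; c = c->next)
    if (strlen (c->name) == n && memcmp (c->name, name, n) == 0)
      return c;
  if (!create)
    return NULL;
  c = node_new (name, n);
  c->next = parent->children;
  parent->children = c;
  return c;
}

/* Follow a ':'-separated path from root, one component at a time. */
static Node *
node_walk (Node *root, const char *path, int create)
{
  Node *cur = root;

  while (cur != NULL && *path != '\0') {
    size_t n = strcspn (path, ":");

    cur = node_child (cur, path, n, create);
    path += n;
    if (*path == ':')
      path++;
  }
  return cur;
}

/* Set a value, creating intermediate group nodes as needed. */
static void
node_deep_set (Node *root, const char *path, const char *value)
{
  Node *leaf = node_walk (root, path, 1);

  free (leaf->value);
  leaf->value = malloc (strlen (value) + 1);
  assert (leaf->value != NULL);
  strcpy (leaf->value, value);
}

/* Get a value, or NULL if the path does not exist or holds no value. */
static const char *
node_deep_get (Node *root, const char *path)
{
  Node *leaf = node_walk (root, path, 0);

  return leaf ? leaf->value : NULL;
}

/* Free a node, its siblings and all their descendants. */
static void
node_free (Node *nd)
{
  while (nd != NULL) {
    Node *next = nd->next;

    node_free (nd->children);
    free (nd->name);
    free (nd->value);
    free (nd);
    nd = next;
  }
}
```

The scan for ":" happens on every call here, which is exactly the cost the flag idea above would avoid for names that contain no separator.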
Comment 12 Edgard Lima 2007-10-24 08:48:53 UTC
I will not implement anymore in base/gst-libs/ext/tags

I will follow MikeS's suggestion on #gstreamer and implement it in the following way:

Create a new element, called 'metadataparser', that accepts image/jpeg , image/tiff, etc. as input

the 'metadataparser' element has higher priority than jpegdec, tiffdec and so on.

then an auto-plugged pipeline would look like this:

filesrc -(img/jpeg)-> metadataparser -(img/jpeg-metadata)-> jpegdec -(video/x-raw-yuv)-> xvimagesink

The 'jpegdec' still handles 'image/jpeg' but has lower priority than 'metadataparser'. In addition 'jpegdec' also handles 'image/jpeg-metadata'.

So, the 'metadataparser' element has knowledge about each metadata type (Exif, Iptc, Xmp) and also about how the metadata is embedded into files. I will try to code it as modular as possible.

The 'metadataparser' element doesn't change the stream, it just looks into the stream and extracts metadata, sending tags.

For encoding, the pipeline would look like this

videotestsrc -> jpegenc -> metadataenc -> filesink

Different from 'metadataparser', 'metadataenc' changes the stream, embedding the metadata in it.

So, in the same way, 'metadataenc' has knowledge about all metadata types and file formats involved.

The metadata embedded are tags sent as events, by application or upstream elements, and mapped to metadata.

The 'metadataenc' element will have three properties, 'exif', 'iptc' and 'xmp'.
By default only exif will be 'on'; the application can decide which of those options to turn 'on' or 'off'.

btw: I hope to commit a first version to plugins-bad by Friday (only metadataparser with jpeg)






Comment 13 Edgard Lima 2007-10-30 13:01:02 UTC
just committed the first version in gst-plugins-bad/ext/metadata

to run it try:

$ GST_DEBUG=*metadata*:5 gst-launch-0.10 filesrc location=BlueSquare57.jpg ! metadataparse ! fakesink silent=true -v

this version still doesn't send metadata tags, but it GST_LOGs them.

this version only handles

jpeg (exif, iptc)

...my plan for this week is to:
1- also find xmp chunk inside jpeg file.
2- send the tag messages. . .  . . . . . . . ***

if you want to test it like this (filesrc ! jpegdec ! xvimagesink)

just add "image/jpeg-metadata" to the jpegdec sink pad

BR
Edgard

*** please let's discuss bug #482947
Comment 14 Edgard Lima 2007-10-31 21:26:15 UTC
What about the plugin name?

someone suggested

plugin: imagetag
element parser: imagetag
element writer:??????

Comment 15 Edgard Lima 2007-10-31 21:41:22 UTC
today I have committed the following changes to the element:

it sends the whole IPTC (or Exif, or XMP) chunk in just one tag,
#define GST_TAG_IPTC "iptc"

this way, a pipeline like this works fine:

filesrc ! metadataparse ! jpegdec ! image-processing ! jpegenc ! metadatamux ! filesink

'cause the metadatamux element will receive the tag event and write to the image file.

....now it would be good to create some default tags (bug #482947) related to images. And those tags could be mapped to/from exif,iptc and xmp metadata.

for example:

v4l2src ! jpegenc ! metadatamux ! filesink

the v4l2src element wants to send just an "EXPOSURE_TIME" tag (no matter if it is iptc, exif or whatever else)...then the metadatamux-exif could just map this general image tag onto one of its own.

...if we don't have such new default tags to be mapped...the only thing we can do is: the application receives the tag message, e.g. "EXPOSURE_TIME", and then sends it back to the pipeline like this: "Exif::ExposureTime" (or something like this using nested structures or whatever else)..so, in this second case, the mapping is up to the application

comments pls !!

BR
Edgard


Comment 16 Edgard Lima 2007-11-12 17:33:21 UTC
Hi, I have just created a new doc to describe how parse and mux should operate.

please comment on it

http://webcvs.freedesktop.org/gstreamer/gst-plugins-bad/ext/metadata/README?revision=1.1&view=markup

thanks,
Edgard
Comment 17 Wouter Cloetens 2007-12-14 11:59:06 UTC
Apart from emitting tags, it would also be useful to let an image metadata parser change the caps. Specifically, width and height are always present.

A use case for this is multiplexing of motion JPEG. The source of the JPEG data may not provide width and height properties in the caps, but multiplexers like avimux and matroskamux demand these properties on their source pads:
souphttpsrc location="http://webcam/mjpeg" do-timestamp=true ! multipartdemux ! metadataparse ! matroskamux ! filesink location="webcam.mkv"
Comment 18 Stefan Sauer (gstreamer, gtkdoc dev) 2008-11-18 12:24:28 UTC
Some more thinking (and discussion with tim) about the structure:
EXIF: can be in jfif (jpeg) and in tiff
  JFIF:
    APP1 (segment marker 0xFFE1), holds an entire TIFF file within
  TIFF: 
    Private Tag 0x8769 holds the Exif specified TIFF Tags,
    Private Tag 0x8825 holds GPS sub-IFD

XMP: can be in many filetypes
  XMP is most commonly serialized and stored using a subset of RDF,
  which is in turn expressed in XML

IPTC: ignore for now
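
The TIFF side of this can be illustrated with a short C sketch that reads the TIFF header, scans the first IFD and returns the offset stored under private tag 0x8769 (Exif IFD) or 0x8825 (GPS IFD). The function name `tiff_find_sub_ifd` is hypothetical. The sketch assumes the pointer entries hold a single LONG whose value field is the offset, and it only looks at IFD0. The same routine could be applied to the payload of a JPEG APP1 Exif segment, since that payload is itself a TIFF structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TIFF_TAG_EXIF_IFD 0x8769   /* pointer to the Exif IFD */
#define TIFF_TAG_GPS_IFD  0x8825   /* pointer to the GPS sub-IFD */

/* Read a 16-bit value in the byte order given by the TIFF header. */
static uint16_t
rd16 (const unsigned char *p, int le)
{
  return le ? (uint16_t) (p[0] | (p[1] << 8))
            : (uint16_t) ((p[0] << 8) | p[1]);
}

/* Read a 32-bit value in the byte order given by the TIFF header. */
static uint32_t
rd32 (const unsigned char *p, int le)
{
  return le
      ? ((uint32_t) p[0] | (uint32_t) p[1] << 8 |
         (uint32_t) p[2] << 16 | (uint32_t) p[3] << 24)
      : ((uint32_t) p[0] << 24 | (uint32_t) p[1] << 16 |
         (uint32_t) p[2] << 8 | (uint32_t) p[3]);
}

/* Hypothetical helper: return the offset (from the start of the TIFF
 * data) stored under `tag` in IFD0, or 0 if absent or malformed. */
static uint32_t
tiff_find_sub_ifd (const unsigned char *d, size_t len, uint16_t tag)
{
  int le;
  uint32_t ifd0;
  uint16_t count, i;

  assert (d != NULL);
  if (len < 8)
    return 0;
  if (d[0] == 'I' && d[1] == 'I')
    le = 1;                     /* 0x4949: little-endian */
  else if (d[0] == 'M' && d[1] == 'M')
    le = 0;                     /* 0x4D4D: big-endian */
  else
    return 0;
  if (rd16 (d + 2, le) != 42)   /* TIFF magic number */
    return 0;

  ifd0 = rd32 (d + 4, le);
  if (ifd0 > len || len - ifd0 < 2)
    return 0;
  count = rd16 (d + ifd0, le);
  if ((len - ifd0 - 2) / 12 < count)
    return 0;                   /* entries would run past the buffer */

  /* Each entry is 12 bytes: tag (2), type (2), count (4), value (4). */
  for (i = 0; i < count; i++) {
    const unsigned char *e = d + ifd0 + 2 + 12u * i;

    if (rd16 (e, le) == tag)
      return rd32 (e + 8, le);
  }
  return 0;
}
```

A gstexiftag-style helper would then parse the IFD found at the returned offset in the same way, which is why no external dependency is strictly required for the container part.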

What we should do is to have in gst-plugins-base/gst-libs/gst/tag
gstexiftag.{c,h}
gstxmptag.{c,h}
Those would, if feasible, implement the standard without the external dependencies.

Two new elements:
jfifparse: takes jfif, parses exif, xmp, outputs jpeg
jfifformat: takes jpeg, adds jfif framing including exif and xmp, outputs jfif

pngdec/enc, asfmux/demux, avimux/demux, flvmux/demux, qtmux/demux, wavparse/wavenc could use libgsttag to add xmp support.
Comment 20 Edward Hervey 2009-03-11 21:29:43 UTC
We really need 'proper' support for exif/xmp at least.

The problem with the current metadata implementation and based on the last comments and discussion on IRC is that the metadata(de)mux elements are trying to do something which other elements can do (much better) and are failing abysmally at that.

It would be better to provide a convenience library to give the exif/xmp blobs to and get back tags/structures (and vice-versa), and leave it to existing elements that know how to parse/mux those blobs in a given format to handle that.

Ex : add it to jpegdec/jpegenc.

It would then be also much more trivial to make (if needed) parsers for specific formats.

Ex : within the jpeg plugin, you could share the code from gstjpeg{enc|dec} and make a jpegparse element which can extract/insert those tags without having to encode/decode anything.
Comment 21 Stefan Sauer (gstreamer, gtkdoc dev) 2009-03-12 09:00:16 UTC
Some links for JPEG in JFIF or EXIF format:
http://www.fileformat.info/format/jpeg/egff.htm
http://en.wikipedia.org/wiki/JPEG
http://en.wikipedia.org/wiki/JFIF
Comment 22 Stefan Sauer (gstreamer, gtkdoc dev) 2009-03-24 18:13:04 UTC
Regarding my comment #18, especially the jfif-elements - I was reading the spec wrong, the format is called jpeg too, jfif is just an app marker like exif and xmp. It would still be nice to have the app-marker handling as separate elements, so that it's reusable together with basic 3rd party jpeg-codecs (e.g. a dsp based jpeg encoder/decoder should not be bothered with exif parsing).

jpegparse: parses jfif, exif, xmp app markers, outputs jpeg with app markers stripped
jpegformat: takes jpeg, adds app markers like jfif, exif and xmp
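
As a sketch of what the stripping half of such a jpegparse element might do, the C function below copies a JPEG buffer while dropping every APPn segment (markers 0xFFE0 to 0xFFEF) found before the start of scan. The name `jpeg_strip_app_markers` is hypothetical and this is not the code of any existing element. It assumes the caller provides an output buffer at least as large as the input, and it returns 0 on malformed input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `in` to `out` without APPn segments.
 * `out` must have room for `len` bytes. Returns the number of bytes
 * written, or 0 if the input is not a well-formed JPEG header. */
static size_t
jpeg_strip_app_markers (const unsigned char *in, size_t len,
    unsigned char *out)
{
  size_t pos = 2, n = 0;

  assert (in != NULL && out != NULL);

  if (len < 2 || in[0] != 0xFF || in[1] != 0xD8)
    return 0;                   /* no SOI */
  out[n++] = 0xFF;
  out[n++] = 0xD8;

  while (pos + 4 <= len) {
    unsigned marker;
    size_t seg_len;

    if (in[pos] != 0xFF)
      return 0;                 /* lost marker sync */
    marker = in[pos + 1];
    if (marker == 0xDA || marker == 0xD9)
      break;                    /* SOS or EOI: copy the rest verbatim */

    /* The 16-bit big-endian size counts its own two bytes. */
    seg_len = ((size_t) in[pos + 2] << 8) | in[pos + 3];
    if (seg_len < 2 || seg_len > len - pos - 2)
      return 0;                 /* truncated segment */

    if (marker < 0xE0 || marker > 0xEF) {
      /* Not an application marker: keep marker and segment. */
      memcpy (out + n, in + pos, 2 + seg_len);
      n += 2 + seg_len;
    }
    pos += 2 + seg_len;
  }

  /* Entropy-coded data and trailing markers are passed through. */
  memcpy (out + n, in + pos, len - pos);
  n += len - pos;
  return n;
}
```

The matching jpegformat step would do the inverse: insert freshly built APPn segments right after SOI and pass everything else through.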
Comment 23 Stefan Sauer (gstreamer, gtkdoc dev) 2010-01-22 13:39:04 UTC
For xmp, we probably don't want to rewrite exempi (http://libopenraw.freedesktop.org/wiki/Exempi)

SLOC	Directory	SLOC-by-Language (Sorted)
34656   source          cpp=34656
1757    exempi          cpp=1737,sh=20

exempi is BSD.


Kind of similar story for libexif.
http://libexif.sourceforge.net/
and this one is LGPL.

(removed some irrelevant parts). So this would be the first case of gst-plugins-base/gst-libs/ext/tag/gst{xmp,exif}tag.{c,h}

What are the opinions on tag support libraries in base that have external deps? Basically plugins would either need some code like if gst_tag_is_exif_supported(), or we could maybe even handle that in the tag library to have stubs if the dependency is missing. Any preference here?
Comment 24 Stefan Sauer (gstreamer, gtkdoc dev) 2010-03-22 13:59:19 UTC
xmp is now implemented in gst-plugins-base/gst-libs/gst/tag/.

Exif should probably be done the same way, as besides jpeg exif can be in wav and avi files (http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/RIFF.html#Exif).
Comment 25 Thiago Sousa Santos 2010-04-01 17:39:27 UTC
FYI, I'm working on exif implementation.
Comment 26 Stefan Sauer (gstreamer, gtkdoc dev) 2010-07-03 19:34:20 UTC
Exif code has been committed. See Bug #614872. So we could close this. Wonder if we want to also kill the metadata plugin at the same time?
Comment 27 Edward Hervey 2010-07-07 14:50:01 UTC
+1 on killing metadata plugin.
Comment 28 Stefan Sauer (gstreamer, gtkdoc dev) 2010-09-07 12:38:40 UTC
Does anyone see benefits in rescuing tests/icles/metadata_editor ?

I think having a generic metadata editor would be nice, but is maybe a bit out of scope for tests/icles/.