GNOME Bugzilla – Bug 752850
matroskademux: Does not send user-supplied metadata tags from streamable files
Last modified: 2015-08-16 13:40:39 UTC
User-supplied metadata tags are not reported by matroskademux: they are not available on the bus. Easily reproducible with gst-launch. First, create an mkv with some metadata tags (note: setting streamable to true seems to be required, see Bug 752847 ): gst-launch-1.0 -e videotestsrc ! taginject tags=title=Hello,comment=world,artist=sally ! matroskamux streamable=true ! filesink location=x.mkv The title, artist, and comment tags are clearly added to the mkv, as shown by mkvinfo: mkvinfo x.mkv + EBML head |+ Doc type: matroska |+ Doc type version: 2 |+ Doc type read version: 2 + Segment, size unknown |+ Tags | + Tag | + Targets | + TrackUID: 9875838076303193444 | + Simple | + Name: TITLE | + String: Hello | + Simple | + Name: COMMENTS | + String: world | + Simple | + Name: ARTIST | + String: sally |+ Segment information | + Segment UID: 0x39 0x55 0xb0 0xa1 0x60 0x57 0x97 0xe7 0xef 0x50 0x28 0x36 0x27 0x29 0x51 0x24 | + Timecode scale: 1000000 | + Muxing application: GStreamer matroskamux version 1.5.2.1 | + Writing application: GStreamer Matroska muxer | + Date: Sat Jul 25 00:13:52 2015 UTC |+ Segment tracks | + A track | + Track number: 1 (track ID for mkvmerge & mkvextract: 0) | + Track type: video | + Track UID: 9875838076303193444 | + Default duration: 33.333ms (30.000 frames/fields per second for a video track) | + Name: Video | + Video track | + Pixel width: 320 | + Pixel height: 240 | + Colour space: length 4, data: 0x49 0x34 0x32 0x30 | + Codec ID: V_UNCOMPRESSED |+ Cluster ...avplay sees the tags: $ avplay x.mkv avplay version 9.16-6:9.16-0ubuntu0.14.04.1+fdkaac, Copyright (c) 2003-2014 the Libav developers built on Aug 11 2014 23:12:18 with gcc 4.8 (Ubuntu 4.8.2-19ubuntu1) [matroska,webm @ 0x7f8868005c60] Estimating duration from bitrate, this may be inaccurate Input #0, matroska,webm, from 'x.mkv': Duration: N/A, start: 0.000000, bitrate: N/A Stream #0.0(eng): Video: rawvideo, yuv420p, 320x240, PAR 1:1 DAR 4:3, 30 fps, 30 tbr, 1k tbn (default) Metadata: TITLE : Hello COMMENTS : world ARTIST : sally ...but matroskademux, at least according go 'gst-launch -t', does not: gst-launch-1.0 -t filesrc location=./x.mkv ! matroskademux ! fakesink Setting pipeline to PAUSED ... Pipeline is PREROLLING ... FOUND TAG : found by element "fakesink0". container format: Matroska FOUND TAG : found by element "fakesink0". video codec: Uncompressed planar YUV 4:4:4 Pipeline is PREROLLED ... Setting pipeline to PLAYING ... New clock: GstSystemClock Got EOS from element "pipeline0". Execution ended after 0:00:00.286344133 Setting pipeline to PAUSED ... Setting pipeline to READY ... Setting pipeline to NULL ... Freeing pipeline ... ...Same results using playbin: $ gst-launch-1.0 -t playbin uri=file:///tmp/x.mkv Setting pipeline to PAUSED ... Pipeline is PREROLLING ... FOUND TAG : found by element "videosink-actual-sink-xvimage". container format: Matroska FOUND TAG : found by element "videosink-actual-sink-xvimage". video codec: Uncompressed planar YUV 4:4:4 Pipeline is PREROLLED ... Setting pipeline to PLAYING ... New clock: GstSystemClock
Works for me, for non-streamable matroska. As reported, your pipeline where not closing the matroska file correctly. http://fpaste.org/248053/78755314/ I confirm tags are not reported with streamable matroska.
Created attachment 308316 [details] [review] Path to preserve forward-referencing track tacks. GObject neophyte, so I'm sure code needs improvement.
A little more info: when streamable=true, track tags injected before gst_matroska_mux_start are written out by the muxer, but they are not reported by matroska-demux. To illustrate: $ gst-launch-1.0 -c videotestsrc ! taginject tags=comment=hello ! matroskamux streamable=true ! filesink location=x.mkv C-c ....The 'comment' tag (suitably renamed 'COMMENTS') is clearly in the mkv file: $ mkvinfo x.mkv + EBML head |+ Doc type: matroska |+ Doc type version: 2 |+ Doc type read version: 2 + Segment, size unknown |+ Tags | + Tag | + Targets | + TrackUID: 1380911230444452577 | + Simple | + Name: COMMENTS | + String: hello |+ Segment information | + Segment UID: 0x5d 0x2e 0xc6 0xc5 0xb6 0x53 0x66 0x35 0x20 0x8f 0xb6 0x30 0x19 0xad 0x0c 0x4b | + Timecode scale: 1000000 | + Muxing application: GStreamer matroskamux version 1.5.2.1 | + Writing application: GStreamer Matroska muxer | + Date: Tue Jul 28 00:43:27 2015 UTC |+ Segment tracks | + A track | + Track number: 1 (track ID for mkvmerge & mkvextract: 0) | + Track type: video | + Track UID: 1380911230444452577 | + Default duration: 33.333ms (30.000 frames/fields per second for a video track) | + Name: Video | + Video track | + Pixel width: 320 | + Pixel height: 240 | + Colour space: length 4, data: 0x49 0x34 0x32 0x30 | + Codec ID: V_UNCOMPRESSED |+ Cluster ....however, this Tag appears in the .mkv before the track it belongs to...its like a forward reference. The matroska-demux reads the tag, but simply throws it away (with a FIXME log), because it does not (yet) have the corresponding track: $ export GST_DEBUG=matroska-demux:3 gst-launch-1.0 -t filesrc location=x.mkv ! matroskademux ! fakesink Setting pipeline to PAUSED ... Pipeline is PREROLLING ... 0:00:00.030284747 22173 0x145f0a0 FIXME matroskareadcommon matroska-read-common.c:2395:gst_matroska_read_common_parse_metadata_id_tag:<matroskademux0:sink> Found track-specific tag(s), but track 1380911230444452577 is not known (yet?) FOUND TAG : found by element "fakesink0". container format: Matroska FOUND TAG : found by element "fakesink0". video codec: Uncompressed planar YUV 4:4:4 Pipeline is PREROLLED ... ... ....there is no COMMENTS tag reported. Seems that matroska-demux needs code to cache the track taglist in matroska-read-common, then merge it into the taglist when the track is later parsed. Not a bug, just code thats not implemented (yet). Agove is a patch that attempts to address the issue. It adds a field to matroska-read-common.h:GstMatroskaReadCommon called "cached_track_taglists". This GstStructure that acts as a hashtable mapping track_id->taglist. If/when gst_matroska_read_common_parse_metadata_id_tag(...) finds tags for an unknown track, it adds those tags to cached_track_taglists (merging them if the unknown track has been seen before). Later, if/when the stream is found in the mkv, matroska-demux.c:gst_matroska_demux_add_stream() will throw the cached tags back into the mix. Disclaimer: while I think the logic in the patch is ok, and it tests well for me, this is the first time I've written any GObject code, so I'm sure it can be done better. In particular, the GstStructure that maps track_id->taglist converts everything to Strings and maps String->String. This is simply because I didn't find a way to use have GstStructure map the actual datatypes (guint64->GstTagList).
Review of attachment 308316 [details] [review]: ::: gst/matroska/matroska-read-common.h @@ +105,3 @@ + + /* cache for track tags that forward-reference their tracks */ + GstStructure *cached_track_taglists ; What about a GHashTable ?
Thanks, that's much nicer. Patch follows.
Created attachment 308425 [details] [review] Patch to preserve forward-referencing track tags. Rewritten using GHashTable
Created attachment 308427 [details] [review] Patch to preserve forward-referencing track tags. Rewritten using GHashTable
Review of attachment 308425 [details] [review]: About the commit message, ideally put your full name. The commit format we prefer would be something like: matroskademux: Preserve forward referenced track tags Longer description here bug_URI ::: gst/matroska/matroska-demux.c @@ +1132,3 @@ + /* check for a cached track taglist */ + cached_taglist = + (GstTagList *) g_hash_table_lookup (demux->common.cached_track_taglists, I'm not sure, if this tag list is ever going to be reused ? If not, maybe call gst_hash_table_remove() (after merging), so we don't keep it in memory. ::: gst/matroska/matroska-read-common.c @@ +2877,3 @@ g_object_unref (ctx->adapter); + g_hash_table_remove_all (ctx->cached_track_taglists); + g_hash_table_unref (ctx->cached_track_taglists); Unref shall be sufficient.
Would be nice to double check with the spec, so we make sure matroskamux is not simply generating an invalid stream (just to back this up).
Also, I wonder if there is a point where we know all stream have been created, hence the remaining cache could be removed.
Question: what, and where, is the matroska spec? I have primarily used this src: http://matroska.org/technical/specs/index.html However, the above states: " This document is not the real format specification. It's a simple draft to work. (For a simplified diagram of the layout of a Matroska file, see the Diagram page.) But since it's quite complete it is used as a reference for the development of libmatroska. An alternate version of the specification can be found here (PDF doc maintained by Alexander Noé -- may be outdated)." The content of the link to the alternate specification mentioned in this quote (http://matroska.org/files/matroska.pdf, titled "Matroska File Format (under construction!)" is,as noted out of date, i.e.. There, in a footnote, it refers to the "official documentation"...but this link in turns points back to the previously mentioned doc. Is this is the closest to an "official spec" that we can have, or is there another source out there somewhere? FYIW, my reading of the doc suggests that there is not, in fact, any prescribed order in which elements need to occur, but there are "recommended" orders: "While in EBML there is no particular order for elements, in Matroska it is necessary to make sure some elements are placed in a certain order for better playback, seeking and editing efficiency". For the placement of tags, the doc suggests that they be placed either early or at the end of the stream, depending on the use case. gstreamer's matroska mux places the tags before the segment track, if run with streamable=true, and it places the tags after the segment track element (in fact, at the very end of the file, after all the Clusters and Cues) if run with streamable=false. From my reading of the docs, both are valid, so in either case, I think that matroskamux is generating a valid stream.
Great, thanks for the research, it seems it confirms this patch is correct.
Comment on attachment 308427 [details] [review] Patch to preserve forward-referencing track tags. commit cd57697a2cb81cd04fd1b32e9b91032b6c7d8012 Author: Glen Diener <grd@loganmill.net> Date: Wed Jul 29 18:54:35 2015 -0600 matroskademux: Preserve forward referenced track tags https://bugzilla.gnome.org/show_bug.cgi?id=752850
Thanks for your contribution.