GNOME Bugzilla – Bug 615064
oggdemux: Strips free-form name in language tags
Last modified: 2018-11-03 11:16:44 UTC
There are some free-form strings associated with the audio tracks in:
http://www.gnome.org/~hadess/ogg-crasher-subtitles-languages.ogg

Currently, we see in Totem "English" and "English #2". We should still be able to access the free-form strings ("English" and "Commentary by Nicholas Meyer").

$ ogginfo `locate ogg-crasher-subtitles-languages.ogg`
Processing file "/home/data/test-files/movies/ogg-crasher-subtitles-languages.ogg"...
New logical stream (#1, serial: 00000001): type unknown
New logical stream (#2, serial: 00000002): type vorbis
New logical stream (#3, serial: 00000003): type vorbis
Note: Stream 4 has serial number 0, which is legal but may cause problems with some tools.
New logical stream (#4, serial: 00000000): type unknown
Vorbis headers parsed for stream 2, information follows...
Version: 0
Vendor: ogmtools v1.0.3
Channels: 2
Rate: 48000
Nominal bitrate: 160.003000 kb/s
Upper bitrate not set
Lower bitrate not set
User comments section follows...
  LANGUAGE=English [eng]
Vorbis headers parsed for stream 3, information follows...
Version: 0
Vendor: ogmtools v1.0.3
Channels: 2
Rate: 48000
Nominal bitrate: 112.001000 kb/s
Upper bitrate not set
Lower bitrate not set
User comments section follows...
  LANGUAGE=Commentary by Nicholas Meyer [eng]
Warning: EOS not set on stream 1
Warning: EOS not set on stream 2
Vorbis stream 2:
  Total data length: 407155 bytes
  Playback length: 0m:23.121s
  Average bitrate: 140.875959 kb/s
Warning: EOS not set on stream 3
Vorbis stream 3:
  Total data length: 286017 bytes
  Playback length: 0m:23.294s
  Average bitrate: 98.225746 kb/s
Warning: EOS not set on stream 4

gst-launch-0.10 -t uridecodebin uri=file://`locate ogg-crasher-subtitles-languages.ogg` ! fakesink
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
FOUND TAG : found by element "oggdemux0".
  container format: Ogg
FOUND TAG : found by element "vorbisdec0".
  language code: eng
  encoder: ogmtools v1.0.3
  encoder version: 0
  audio codec: Vorbis
  nominal bitrate: 160003
  bitrate: 160003
FOUND TAG : found by element "vorbisdec1".
  language code: eng
  encoder: ogmtools v1.0.3
  encoder version: 0
  audio codec: Vorbis
  nominal bitrate: 112001
  bitrate: 112001
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
Got EOS from element "pipeline0".
Execution ended after 1018537460 ns.
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...
Putting this information into the LANGUAGE tag feels wrong; shouldn't this be in DESCRIPTION or something like that? How would you propose this information be made available? We could probably add another DESCRIPTION tag for the non-language-code part of LANGUAGE tags if that's what you need ;)
It's not really a language tag to start with tbh. It's a description of the audio track.
Well, it's the LANGUAGE tag :) Unfortunately it isn't standardised, it seems... Would you be fine with having something like the following?

GST_TAG_LANGUAGE=eng
GST_TAG_DESCRIPTION=Commentary by Nicholas Meyer
GST_TAG_DESCRIPTION=otherdescription tag
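For illustration only, here is a minimal Python sketch of the split suggested above. It assumes the LANGUAGE value follows the "free-form text [code]" shape seen in the ogginfo output for this file; that shape is not standardised, and split_language_tag is a hypothetical helper, not a GStreamer function.

```python
import re

# Assumed convention, inferred from the ogginfo output in this report:
# "<free-form text> [<language code>]", e.g. "English [eng]".
_LANGUAGE_RE = re.compile(r"^\s*(?P<text>.*?)\s*\[(?P<code>[A-Za-z]{2,3})\]\s*$")


def split_language_tag(value):
    """Split a LANGUAGE comment into (free_form_text, language_code).

    Either element is None when it is absent from the value.
    """
    match = _LANGUAGE_RE.match(value)
    if match is None:
        # No bracketed code found: keep the whole value as free-form text.
        text = value.strip()
        return (text or None, None)
    text = match.group("text")
    return (text or None, match.group("code").lower())


print(split_language_tag("Commentary by Nicholas Meyer [eng]"))
print(split_language_tag("English [eng]"))
```

The first element would map to a description-style tag and the second to the language code. A value without a bracketed code is returned entirely as free-form text, so a bare "eng" is not recognised as a code by this sketch.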
Ideally, if there is a TITLE, that's what should be displayed, as DESCRIPTION will likely be a bit long, though even TITLE might. Maybe something like (with title, language, and category being present or not):

title  language  category  display
  Y       Y         Y      title (language category)
  Y       Y         -      title (language)
  Y       -         Y      title (category)
  Y       -         -      title
  -       Y         Y      language category
  -       Y         -      language
  -       -         Y      category
  -       -         -      hmm, just "audio #1", "subtitles #1", etc ?

So in the above example, you get (or would get if the language code was ISO 639-1 as it's supposed to be):

  Commentary by Nicholas Meyer (English)

If you don't have a title, it'd just give:

  English
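The table above can be sketched as a small helper. This is only an illustration in Python: display_label and its kind/index parameters are hypothetical names, and the fallback text follows the "audio #1" idea from the last row.

```python
def display_label(title=None, language=None, category=None,
                  kind="audio", index=1):
    """Build a track label following the title/language/category table."""
    # Language and category, when present, are joined with a space.
    details = " ".join(part for part in (language, category) if part)
    if title:
        # With a title, the details go in parentheses after it.
        return f"{title} ({details})" if details else title
    if details:
        return details
    # Nothing usable: fall back to a numbered generic label.
    return f"{kind} #{index}"


print(display_label(title="Commentary by Nicholas Meyer", language="English"))
print(display_label(language="English"))
print(display_label(kind="subtitles", index=2))
```

Passing language names rather than codes is left to the caller, so mapping "eng" to "English" would have to happen before this step.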
This is still current with the latest 0.10 releases.
Still current in 1.0. I don't think that GStreamer should modify the language tag, other than helping parse it. So Sebastian's suggestion in comment 3 would be fine by me. The UIs can then act upon those (and you can offer a separate helper to transform the information into comment 4).
In 1.0 we now have GST_TAG_LANGUAGE_CODE and GST_TAG_LANGUAGE_NAME. However, it seems wrong to put this free-form info into LANGUAGE_NAME. But then the problem is that we have no way of telling whether that free-form string is actually a language name at all or whether it's descriptive. What Sebastian suggested in comment #3 would work for me, but I don't know if we can actually do it right, unless we always put any free-form strings into TAG_DESCRIPTION (but then, if it's a language name, people will file a bug saying it should've been in TAG_LANGUAGE_NAME, with it clearly being labelled as LANGUAGE=xyz after all...)
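One partial heuristic, which nobody in this report has proposed and which cannot settle the question in general, is to compare the free-form string against known names for the accompanying language code. The Python sketch below uses a tiny made-up table; KNOWN_LANGUAGE_NAMES and classify_free_form are hypothetical and are not taken from GStreamer.

```python
# Tiny illustrative table; a real implementation would need a full
# ISO 639 name list (hypothetical data, not taken from GStreamer).
KNOWN_LANGUAGE_NAMES = {
    "eng": {"english"},
    "fra": {"french", "français"},
    "deu": {"german", "deutsch"},
}


def classify_free_form(text, code):
    """Return "language-name" or "description" for a free-form string."""
    names = KNOWN_LANGUAGE_NAMES.get(code.lower(), set())
    # Case-insensitive exact match against the known names for this code.
    if text.strip().casefold() in names:
        return "language-name"
    return "description"


print(classify_free_form("English", "eng"))
print(classify_free_form("Commentary by Nicholas Meyer", "eng"))
```

Anything not in the table is treated as a description, so localised or misspelled language names would be misclassified. That is the ambiguity described above.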
Maybe we should just add a new LANGUAGE_DESCRIPTION tag and deprecate LANGUAGE_NAME or so?
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/issues/33.