GNOME Bugzilla – Bug 447000
[id3demux] add support for reading license URL from WCOP tag
Last modified: 2007-10-11 18:10:10 UTC
Check for WZZZ tag in id3demux_id3v2_parse_frame and parse appropriately. According to the id3v2 spec, these tags (with the exception of WXXX) are ISO-8859-1 encoded and there may only be one frame per tag. Also includes a simple modification to gst-plugins-base mapping the WCOP tag to GST_LICENSE_TAG.
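As a rough illustration of the frame check described above, here is a minimal sketch in plain C. The function name is_plain_url_link_frame is hypothetical and is not the code from the attached patch; it only assumes the ID range quoted from the spec in this report ("W000" to "WZZZ", excluding "WXXX").

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not the actual id3demux code): returns true if
 * frame_id names a plain URL link frame, i.e. a four-character ID that
 * starts with 'W', uses only A-Z and 0-9, and is not the user-defined
 * "WXXX" frame, which has its own layout with an encoding byte. */
bool
is_plain_url_link_frame (const char *frame_id)
{
  size_t i;

  if (frame_id == NULL || strlen (frame_id) != 4)
    return false;
  if (frame_id[0] != 'W')
    return false;
  if (strcmp (frame_id, "WXXX") == 0)
    return false;

  for (i = 1; i < 4; i++) {
    char c = frame_id[i];

    if (!((c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')))
      return false;
  }
  return true;
}
```

With this sketch, "WCOP" and "WOAF" are accepted, while "WXXX" and text frames such as "TCOP" are rejected.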
Created attachment 89867 [details] [review] Appropriately check for and parse URL link frames
Created attachment 89868 [details] [review] Map WCOP id3v2 tag to GST_LICENSE_TAG
Thanks for the patch. Do you happen to have a sample file/tag with such a frame? (use 'head --bytes=100k file.mp3 > tag.id3v2' to get the beginning of a file) Couple of things I wonder about:
- the 'is always encoded in ISO-8859-1' only applies to the WXXX frame, which we are explicitly not handling; it doesn't seem to apply to all the other 'W...' frames, if I read id3v2.4.0-frames.txt correctly
- but even if you were right, the code still needs to do ISO-8859-1 to UTF-8 conversion, if I'm not mistaken (not that a URI should contain anything but ASCII, but still; also, I wouldn't be surprised if people actually did encode URIs in UTF-16 or whatever, regardless of the spec)
- I wonder if it makes sense to introduce a new GST_TAG_LICENSE_URI for that kind of thing, not sure. Need to check how we handle this for vorbis/flac etc.
- Regarding the encoding, I could be wrong, but what led me to believe this was the lack of an encoding specification in the frame format: "All URL link frames have the following format: <Header for 'URL link frame', ID: "W000" - "WZZZ", excluding "WXXX" described in 4.3.2.> URL <text string>" as opposed to "<Header for 'User defined URL link frame', ID: "WXXX"> Text encoding $xx Description <text string according to encoding> $00 (00) URL <text string>" and several other specs which do specify text encoding. If I only need to convert to UTF-8, assuming ISO-8859-1, is this how it'd be done: tag_str = g_convert (data, data_size, "UTF-8", "ISO-8859-1", NULL, NULL, NULL);
- In Vorbis, the spec says: LICENSE License information, eg, 'All Rights Reserved', 'Any Use Permitted', a URL to a license such as a Creative Commons license ("www.creativecommons.org/blahblah/license.html") or the EFF Open Audio License ('distributed under the terms of the Open Audio License. see http://www.eff.org/IP/Open_licenses/eff_oal.html for details'), etc. Currently the vorbis tag parser puts this as the license.
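To make the conversion question concrete, here is a small GLib-free sketch of what an ISO-8859-1 to UTF-8 conversion does to the frame payload. It is not the patch itself, which proposes g_convert; the name latin1_to_utf8 is hypothetical, and stopping at the first NUL byte is an assumption about how the URL field is terminated.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: converts up to len bytes of ISO-8859-1 data to a
 * newly allocated, NUL-terminated UTF-8 string. Conversion stops early
 * at the first NUL byte (an assumption about the payload). Returns NULL
 * if allocation fails. The caller frees the result with free(). */
char *
latin1_to_utf8 (const unsigned char *data, size_t len)
{
  char *out, *p;
  size_t i;

  /* Every ISO-8859-1 byte becomes at most two UTF-8 bytes. */
  out = malloc (2 * len + 1);
  if (out == NULL)
    return NULL;

  p = out;
  for (i = 0; i < len && data[i] != 0; i++) {
    if (data[i] < 0x80) {
      /* ASCII range is identical in both encodings. */
      *p++ = (char) data[i];
    } else {
      /* U+0080..U+00FF encode as two bytes: 110xxxxx 10xxxxxx. */
      *p++ = (char) (0xC0 | (data[i] >> 6));
      *p++ = (char) (0x80 | (data[i] & 0x3F));
    }
  }
  *p = '\0';
  return out;
}
```

For a pure-ASCII URL the output is byte-for-byte identical to the input, which is why the conversion only matters for tags that ignore the "URIs are ASCII" convention.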
Created attachment 89897 [details] id3v2 tag of a file with WCOP tag
Created attachment 89898 [details] [review] Assume ISO-8859-1 and convert to UTF-8
You're probably right about the encoding, I read too quickly there. I'm still a bit undecided about what to do here. I don't think it's technically correct to map WCOP to GST_TAG_LICENSE, since the link might not link to a license, and we also map TCOP to GST_TAG_COPYRIGHT, so IMHO we should map WCOP also to GST_TAG_COPYRIGHT or introduce a new GST_TAG_COPYRIGHT_URI or so. The W... frames seem rather useless and ill-specified as a whole, I wish creative-commons had just decided to go for custom frames (TXXX and WXXX). But since they've adopted the ill-specified ID3 scheme, there isn't much we can do about that now, I guess. What we could do (in addition to the above or instead of the above) is to add a helper function or two to libgsttag to recognise a set of white-listed well-known license URIs and then set the GST_TAG_LICENSE* if we find one of those. Not very elegant though. Comments?
GST_TAG_COPYRIGHT_URI makes sense to me. If you want to alias in GST_TAG_LICENSE_URI when a value looks appropriate I'd match strings beginning with http://creativecommons.org/licenses/
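The prefix match suggested here could look roughly like the following sketch. The function name looks_like_cc_license_uri and the macro are hypothetical and not part of libgsttag; only the prefix string comes from the comment above.

```c
#include <stdbool.h>
#include <string.h>

/* Prefix suggested in the comment above for recognising CC license URIs. */
#define CC_LICENSE_PREFIX "http://creativecommons.org/licenses/"

/* Hypothetical helper: returns true if uri begins with the Creative
 * Commons license prefix. The comparison is case-sensitive and makes no
 * attempt to validate the rest of the URI. */
bool
looks_like_cc_license_uri (const char *uri)
{
  if (uri == NULL)
    return false;
  return strncmp (uri, CC_LICENSE_PREFIX, strlen (CC_LICENSE_PREFIX)) == 0;
}
```

A caller could use such a check to decide whether a WCOP value should additionally be exposed as a license URI, while still always setting the copyright URI tag.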
GST_TAG_COPYRIGHT_URI sounds good. Also related to reading the WCOP tag is reading the WOAF (official audio file webpage) tag. Creative Commons uses this tag to specify a webpage that verifies the license specified in WCOP[1]. While we probably don't want a new GST_TAG_* for this, could it at least be added to GST_TAG_EXTENDED_COMMENTS (I notice that all unrecognized Vorbis comments are also placed in this tag)? This would be great for GStreamer-based media players as they could automatically validate MP3s[2]. [1] http://wiki.creativecommons.org/MP3 [2] http://wiki.creativecommons.org/Embedded_Metadata
> Also related to reading the WCOP tag is reading the WOAF (official audio file
> webpage) tag. Creative Commons uses this tag to specify a webpage that
> verifies the license specified in WCOP[1]. While we probably don't want a new
> GST_TAG_* for this, could it at least be added to GST_TAG_EXTENDED_COMMENTS (I
> notice that all unrecognized Vorbis comments are also placed in this tag)?
> This would be great for GStreamer-based media players as they could
> automatically validate MP3s[2].
What's the WOAF-equivalent vorbis comment key commonly used? CONTACT?
Created attachment 90820 [details] [review] Map WCOP id3v2 tag to GST_COPYRIGHT_URI_TAG
Just for the record, I'm waiting on the CC patches to implement CC functionality in Banshee and there have been talks about Rhythmbox.
Would be good to get an opinion from media-player developers here on whether the proposal suits them.
Sorry, I completely forgot about this one. IIRC the patch is okay in general, but needs a few changes (which I can't remember right now). It also needs to be implemented in id3v2mux, since you'd otherwise lose the license information when retagging. I'll look into it, feel free to post a follow-up comment in a few days if I don't get around to it.
Here is a nice overview of copyright metadata in several file formats: http://wiki.creativecommons.org/Tracker_CC_Indexing
2007-10-11  Tim-Philipp Müller  <tim at centricular dot net>

        Patch by: Jason Kivlighn  <jkivlighn gmail com>

        * gst-libs/gst/tag/gstid3tag.c:
        * tests/check/libs/tag.c:
          Map ID3v2 WCOP frame to GST_TAG_COPYRIGHT_URI (#447000).

2007-10-11  Tim-Philipp Müller  <tim at centricular dot net>

        Based on patch by: Jason Kivlighn  <jkivlighn gmail com>

        * gst/id3demux/id3v2frames.c:
          Extract license/copyright URIs from ID3v2 WCOP frames
          (Fixes #447000).

        * tests/check/elements/id3demux.c:
        * tests/files/Makefile.am:
        * tests/files/id3-447000-wcop.tag:
          Add simple unit test.

2007-10-11  Tim-Philipp Müller  <tim at centricular dot net>

        * ext/taglib/gstid3v2mux.cc:
          Add support for license/copyright URI tags (ID3v2 WCOP frame).
          Prerequisite for #447000.
Also: sorry it took so long.