Bug 447000 - [id3demux] add support for reading license URL from WCOP tag
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
Version: git master
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: 0.10.7
Assigned To: GStreamer Maintainers
QA Contact: GStreamer Maintainers
Depends on: 451939
Blocks:
 
 
Reported: 2007-06-13 05:48 UTC by Jason Kivlighn
Modified: 2007-10-11 18:10 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Appropriately check for and parse URL link frames (1.42 KB, patch)
2007-06-13 05:49 UTC, Jason Kivlighn
needs-work
Map WCOP id3v2 tag to GST_LICENSE_TAG (585 bytes, patch)
2007-06-13 05:49 UTC, Jason Kivlighn
none
id3v2 tag of a file with WCOP tag (100.00 KB, text/plain)
2007-06-13 18:26 UTC, Jason Kivlighn
Assume ISO-8859-1 and convert to UTF-8 (1.48 KB, patch)
2007-06-13 18:37 UTC, Jason Kivlighn
committed
Map WCOP id3v2 tag to GST_COPYRIGHT_URI_TAG (591 bytes, patch)
2007-06-28 18:07 UTC, Jason Kivlighn
committed

Description Jason Kivlighn 2007-06-13 05:48:07 UTC
Check for WZZZ tag in id3demux_id3v2_parse_frame and parse appropriately.

According to the id3v2 spec, these tags (with the exception of WXXX) are ISO-8859-1 encoded and there may only be one frame per tag.

Also included is a simple modification to gst-plugins-base that maps the WCOP tag to GST_LICENSE_TAG.
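
For illustration, a minimal sketch of the frame ID check described above, assuming the ID is available as four raw characters. The function name is_url_link_frame is hypothetical and is not taken from the attached patch or from id3demux.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if the 4-character frame ID names a plain URL link frame:
 * 'W' followed by three characters from A-Z or 0-9, excluding "WXXX". */
bool
is_url_link_frame (const char id[4])
{
  if (id[0] != 'W')
    return false;

  for (int i = 1; i < 4; i++) {
    char c = id[i];
    if (!((c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')))
      return false;
  }

  /* WXXX carries its own encoding byte and description, so it needs
   * a different parser. */
  return memcmp (id, "WXXX", 4) != 0;
}
```

With this check, "WCOP" and "WOAF" are accepted, while "WXXX" and text frames such as "TCOP" are rejected.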
Comment 1 Jason Kivlighn 2007-06-13 05:49:02 UTC
Created attachment 89867
Appropriately check for and parse URL link frames
Comment 2 Jason Kivlighn 2007-06-13 05:49:36 UTC
Created attachment 89868
Map WCOP id3v2 tag to GST_LICENSE_TAG
Comment 3 Tim-Philipp Müller 2007-06-13 09:37:57 UTC
Thanks for the patch.

Do you happen to have a sample file/tag with such a frame? (use 'head --bytes=100k file.mp3 > tag.id3v2' to get the beginning of a file)
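
The 100k figure is just a generous guess at the tag size. As a rough sketch, the exact size can be read from the 10-byte ID3v2 header, whose last four bytes hold a "syncsafe" integer (7 significant bits per byte). The function names below are hypothetical, and the footer handling assumes the ID3v2.4 flag layout.

```c
#include <stddef.h>
#include <string.h>

/* Decodes a 4-byte syncsafe integer (7 significant bits per byte). */
unsigned long
syncsafe_to_uint (const unsigned char b[4])
{
  return ((unsigned long) (b[0] & 0x7f) << 21) |
         ((unsigned long) (b[1] & 0x7f) << 14) |
         ((unsigned long) (b[2] & 0x7f) << 7) |
         (unsigned long) (b[3] & 0x7f);
}

/* Returns the total size in bytes of the ID3v2 tag that starts at hdr
 * (header + body + optional footer), or 0 if hdr is not a valid header. */
size_t
id3v2_total_size (const unsigned char *hdr, size_t len)
{
  size_t size;

  if (len < 10 || memcmp (hdr, "ID3", 3) != 0)
    return 0;

  /* Every size byte must have its high bit clear. */
  for (int i = 6; i < 10; i++) {
    if (hdr[i] & 0x80)
      return 0;
  }

  size = 10 + (size_t) syncsafe_to_uint (hdr + 6);

  /* ID3v2.4 only: flag bit 0x10 signals a 10-byte footer. */
  if (hdr[3] == 4 && (hdr[5] & 0x10))
    size += 10;

  return size;
}
```

The result could then be passed to head --bytes in place of 100k.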


Couple of things I wonder about:

 - the 'is always encoded in ISO-8859-1' only applies to the WXXX frame,
   which we are explicitly not handling; it doesn't seem to apply to all
   the other 'W...' frames, if I read id3v2.4.0-frames.txt correctly   

 - but even if you were right, the code still needs to do ISO-8859-1 to
   UTF8 conversion, if I'm not mistaken (not that an URI should contain
   anything but ASCII, but still; also, I wouldn't be surprised if people
   actually did encode URIs in UTF-16 or whatever, regardless of the spec)

 - I wonder if it makes sense to introduce a new GST_TAG_LICENSE_URI for
   that kind of thing, not sure. Need to check how we handle this for
   vorbis/flac etc.
Comment 4 Jason Kivlighn 2007-06-13 18:24:49 UTC
-Regarding the encoding, I could be wrong, but what led me to believe this was the lack of any encoding specification in the frame format.

"All URL link frames have the
   following format:

     <Header for 'URL link frame', ID: "W000" - "WZZZ", excluding "WXXX"
     described in 4.3.2.>
     URL              <text string>"

as opposed to

"<Header for 'User defined URL link frame', ID: "WXXX">
     Text encoding     $xx
     Description       <text string according to encoding> $00 (00)
     URL               <text string>"

and several other specs which do specify text encoding.
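
To make the difference between the two layouts quoted above concrete, here is a minimal sketch that locates the URL bytes inside a frame payload. It assumes the frame header has already been stripped, and the function name find_frame_url is hypothetical. For WXXX it skips the encoding byte and the terminated description. For every other W frame the URL starts at offset 0. In both cases the URL is taken to end at the first NUL byte or at the end of the frame.

```c
#include <stddef.h>
#include <string.h>

/* Locates the URL inside an ID3v2 'W' frame payload.
 * On success stores the offset and length of the URL and returns 0.
 * Returns -1 if a WXXX payload is malformed. */
int
find_frame_url (const char id[4], const unsigned char *payload, size_t size,
    size_t *url_off, size_t *url_len)
{
  size_t pos = 0;

  if (memcmp (id, "WXXX", 4) == 0) {
    /* WXXX: encoding byte, description, terminator, then the URL. */
    if (size < 1 || payload[0] > 3)
      return -1;

    /* Encodings 1 and 2 are UTF-16 variants, terminated by $00 00. */
    size_t term = (payload[0] == 1 || payload[0] == 2) ? 2 : 1;

    pos = 1;
    for (;;) {
      if (pos + term > size)
        return -1;              /* description is not terminated */
      if (payload[pos] == 0 && (term == 1 || payload[pos + 1] == 0))
        break;
      pos += term;
    }
    pos += term;
  }

  *url_off = pos;
  if (pos == size) {
    *url_len = 0;
    return 0;
  }

  /* The URL ends at the first NUL byte or at the end of the frame. */
  const unsigned char *nul = memchr (payload + pos, 0, size - pos);
  *url_len = nul ? (size_t) (nul - (payload + pos)) : size - pos;
  return 0;
}
```

The returned span still holds ISO-8859-1 bytes, so any conversion to UTF-8 would happen afterwards.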

If I do only need to convert to UTF-8, assuming ISO-8859-1, is this how it'd be done:

    tag_str = g_convert (data, data_size, "UTF-8", "ISO-8859-1", NULL, NULL, NULL);
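
For readers without GLib at hand, the conversion itself is small enough to sketch in plain C. This is a stand-in for illustration only, not what the patch does: every ISO-8859-1 byte below 0x80 maps to itself, and every byte from 0x80 upwards becomes a two-byte UTF-8 sequence. The function name latin1_to_utf8 is hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Converts len bytes of ISO-8859-1 into a newly allocated, NUL-terminated
 * UTF-8 string. The caller frees the result. Returns NULL if the
 * allocation fails. */
char *
latin1_to_utf8 (const unsigned char *in, size_t len)
{
  /* Worst case: every input byte expands to two output bytes. */
  char *out = malloc (2 * len + 1);
  size_t o = 0;

  if (out == NULL)
    return NULL;

  for (size_t i = 0; i < len; i++) {
    unsigned char c = in[i];

    if (c < 0x80) {
      out[o++] = (char) c;
    } else {
      /* U+0080..U+00FF encode as 110000xx 10xxxxxx. */
      out[o++] = (char) (0xc0 | (c >> 6));
      out[o++] = (char) (0x80 | (c & 0x3f));
    }
  }

  out[o] = '\0';
  return out;
}
```

Unlike g_convert, this sketch reports no conversion errors, which is acceptable here because every byte value is valid ISO-8859-1.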

-In Vorbis, the spec says:

LICENSE
    License information, eg, 'All Rights Reserved', 'Any Use Permitted', a URL to a license such as a Creative Commons license ("www.creativecommons.org/blahblah/license.html") or the EFF Open Audio License ('distributed under the terms of the Open Audio License. see http://www.eff.org/IP/Open_licenses/eff_oal.html for details'), etc.

Currently the vorbis tag parser puts this as the license.
Comment 5 Jason Kivlighn 2007-06-13 18:26:00 UTC
Created attachment 89897
id3v2 tag of a file with WCOP tag
Comment 6 Jason Kivlighn 2007-06-13 18:37:40 UTC
Created attachment 89898
Assume ISO-8859-1 and convert to UTF-8
Comment 7 Tim-Philipp Müller 2007-06-19 18:41:03 UTC
You're probably right about the encoding, I read too quickly there.

I'm still a bit undecided about what to do here. I don't think it's technically correct to map WCOP to GST_TAG_LICENSE, since the link might not link to a license, and we also map TCOP to GST_TAG_COPYRIGHT, so IMHO we should map WCOP also to GST_TAG_COPYRIGHT or introduce a new GST_TAG_COPYRIGHT_URI or so.

The W... frames seem rather useless and ill-specified as a whole, I wish creative-commons had just decided to go for custom frames (TXXX and WXXX). But since they've adopted the ill-specified ID3 scheme, there isn't much we can do about that now, I guess.

What we could do (in addition to the above or instead of the above) is to add a helper function or two to libgsttag to recognise a set of white-listed well-known license URIs and then set the GST_TAG_LICENSE* if we find one of those. Not very elegant though.

Comments?
Comment 8 Mike Linksvayer 2007-06-19 20:27:33 UTC
GST_TAG_COPYRIGHT_URI makes sense to me.

If you want to alias in GST_TAG_LICENSE_URI when a value looks appropriate I'd match strings beginning with http://creativecommons.org/licenses/
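
A minimal sketch of the prefix match suggested above. The helper name looks_like_cc_license_uri is hypothetical and no such function is claimed to exist in libgsttag. Only the prefix string comes from this comment.

```c
#include <stdbool.h>
#include <string.h>

static const char CC_LICENSE_PREFIX[] = "http://creativecommons.org/licenses/";

/* Returns true if uri starts with the Creative Commons license prefix. */
bool
looks_like_cc_license_uri (const char *uri)
{
  return uri != NULL &&
      strncmp (uri, CC_LICENSE_PREFIX, sizeof (CC_LICENSE_PREFIX) - 1) == 0;
}
```

A caller could use this to decide whether a WCOP value should also be exposed as a license URI.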
Comment 9 Jason Kivlighn 2007-06-20 23:58:51 UTC
GST_TAG_COPYRIGHT_URI sounds good.

Also related to reading the WCOP tag is reading the WOAF (official audio file webpage) tag.  Creative Commons uses this tag to specify a webpage that verifies the license specified in WCOP[1].  While we probably don't want a new GST_TAG_* for this, could it at least be added to GST_TAG_EXTENDED_COMMENTS (I notice that all unrecognized Vorbis comments are also placed in this tag)?  This would be great for GStreamer-based media players as they could automatically validate MP3s[2].

[1] http://wiki.creativecommons.org/MP3
[2] http://wiki.creativecommons.org/Embedded_Metadata
Comment 10 Tim-Philipp Müller 2007-06-27 14:58:45 UTC
> Also related to reading the WCOP tag is reading the WOAF (official audio file
> webpage) tag.  Creative Commons uses this tag to specify a webpage that
> verifies the license specified in WCOP[1].  While we probably don't want a new
> GST_TAG_* for this, could it at least be added to GST_TAG_EXTENDED_COMMENTS (I
> notice that all unrecognized Vorbis comments are also placed in this tag)? 
> This would be great for GStreamer-based media players as they could
> automatically validate MP3s[2].

What's the WOAF-equivalent vorbis comment key commonly used? CONTACT?
Comment 11 Jason Kivlighn 2007-06-28 18:07:48 UTC
Created attachment 90820
Map WCOP id3v2 tag to GST_COPYRIGHT_URI_TAG
Comment 12 Luke Hoersten 2007-08-14 03:30:49 UTC
Just for the record, I'm waiting on the CC patches to implement CC functionality in Banshee and there have been talks about Rhythmbox.
Comment 13 Stefan Sauer (gstreamer, gtkdoc dev) 2007-09-07 18:50:54 UTC
Would be good to get an opinion from media-player developers here, on whether the proposal suits them.
Comment 14 Tim-Philipp Müller 2007-09-07 19:51:52 UTC
Sorry, I completely forgot about this one.  IIRC the patch is okay in general, but needs a few changes (which I can't remember right now).  It also needs to be implemented in id3v2mux, since you'd otherwise lose the license information when retagging.  I'll look into it, feel free to post a follow-up comment in a few days if I don't get around to it.
Comment 15 Stefan Sauer (gstreamer, gtkdoc dev) 2007-10-08 20:31:43 UTC
Here is a nice overview of copyright metadata in several file formats:
http://wiki.creativecommons.org/Tracker_CC_Indexing
Comment 16 Tim-Philipp Müller 2007-10-11 18:09:34 UTC
2007-10-11  Tim-Philipp Müller  <tim at centricular dot net>

        Patch by: Jason Kivlighn  <jkivlighn gmail com>

        * gst-libs/gst/tag/gstid3tag.c:
        * tests/check/libs/tag.c:
          Map ID3v2 WCOP frame to GST_TAG_COPYRIGHT_URI (#447000).


2007-10-11  Tim-Philipp Müller  <tim at centricular dot net>

        Based on patch by: Jason Kivlighn  <jkivlighn gmail com>

        * gst/id3demux/id3v2frames.c:
          Extract license/copyright URIs from ID3v2 WCOP frames
          (Fixes #447000).

        * tests/check/elements/id3demux.c:
        * tests/files/Makefile.am:
        * tests/files/id3-447000-wcop.tag:
          Add simple unit test.

2007-10-11  Tim-Philipp Müller  <tim at centricular dot net>

        * ext/taglib/gstid3v2mux.cc:
          Add support for license/copyright URI tags (ID3v2 WCOP frame).
          Prerequisite for #447000.
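
For illustration only, the mapping described in these changelog entries can be pictured as a small lookup table. This is a standalone sketch, not the code in gstid3tag.c. The tag names are written as string literals on the assumption that GST_TAG_COPYRIGHT is "copyright" and GST_TAG_COPYRIGHT_URI is "copyright-uri". The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

struct frame_tag_map {
  const char *frame_id;   /* 4-character ID3v2 frame ID */
  const char *tag_name;   /* assumed GStreamer tag name */
};

/* TCOP is the copyright text frame, WCOP the copyright URL frame. */
static const struct frame_tag_map frame_tag_table[] = {
  { "TCOP", "copyright" },
  { "WCOP", "copyright-uri" },
};

/* Returns the tag name mapped to frame_id, or NULL if there is none. */
const char *
tag_name_for_frame (const char *frame_id)
{
  size_t n = sizeof (frame_tag_table) / sizeof (frame_tag_table[0]);

  for (size_t i = 0; i < n; i++) {
    if (strncmp (frame_id, frame_tag_table[i].frame_id, 4) == 0)
      return frame_tag_table[i].tag_name;
  }
  return NULL;
}
```

A frame such as WOAF, which this table does not list, yields NULL and would need separate handling.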

Comment 17 Tim-Philipp Müller 2007-10-11 18:10:10 UTC
Also: sorry it took so long.