Bug 351309 - New Element: gst-puid
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-bad
Version: git master
Hardware: Other Linux
Importance: Normal enhancement
Target Milestone: 0.10.7
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2006-08-14 15:37 UTC by Milosz Derezynski
Modified: 2008-03-19 18:14 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
fingerprint bugfix, exposed fingerprint property (2.47 KB, patch)
2008-01-02 22:18 UTC, Eric Buehl
none Details | Review
fingerprint bugfix, exposed fingerprint property (2.47 KB, patch)
2008-01-02 22:18 UTC, Eric Buehl
none Details | Review
bugfixes; cumulative (4.49 KB, patch)
2008-01-03 19:42 UTC, Eric Buehl
none Details | Review
CVS patch for gstofa inclusion (18.46 KB, patch)
2008-01-14 23:13 UTC, Eric Buehl
committed Details | Review

Description Milosz Derezynski 2006-08-14 15:37:54 UTC
This element finds the PUID for a particular track using MusicDNS/MusicIP and MusicBrainz. It is the replacement for the TRM audio fingerprinting system currently used by MusicBrainz. For more information on both, see http://wiki.musicbrainz.org/HowPUIDsWork and http://wiki.musicbrainz.org/TRM; this page explains why TRM is being phased out: http://wiki.musicbrainz.org/GettingRidOfTRM.

Short description of the element: libofa (Open Fingerprint Architecture, http://musicdns.org) requires 135 seconds of a file's raw PCM data to create a fingerprint. The element requires the data in a specific format, which is why its sink caps requirements are so strict. You can plug it into a pipeline like filesrc ! decodebin ! audioconvert ! puid ! fakesink (preferably with sync=0). It will automatically EOS after 135 seconds, create the fingerprint, and request the metadata from MusicDNS using libneon.

A practical, real world example usage is:

filesrc location=<file> ! decodebin ! audioconvert ! puid musicdns-id=<musicdns-id> ! fakesink sync=0

You can obtain a free musicdns-id right here without any kind of application form or anything: http://musicdns.org/get_client_key/prd_clientid/congrats.

After it is done running, the metadata is available as the properties 'artist', 'title', and 'puid', and the whole XML document is available as the property 'xmldoc'.

It is _not_ suitable for plugging into arbitrary pipelines, even if the automatic EOS restriction were removed, because it needs insane amounts of memory. There is no way to feed the data to libofa in chunks, so the element has to buffer all 135 seconds of raw PCM data before it can hand them to libofa. In addition, libofa uses libfftw3, which does some kind of FT/FA, so this uses even more memory.
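To put a rough number on the buffering cost described above, here is a small back-of-the-envelope sketch. The thread does not state the exact sample format the element requires, so the 44.1 kHz, 16-bit, stereo figures below are assumptions chosen for illustration, and pcm_buffer_bytes is a hypothetical helper, not part of the element.

```python
# Rough size of the raw PCM buffer the element has to hold before it can
# call libofa. The 135-second figure comes from the description above;
# the sample format is an ASSUMPTION (the thread only says the sink caps
# are strict, not what they are).

FINGERPRINT_SECONDS = 135


def pcm_buffer_bytes(rate_hz, channels, bytes_per_sample,
                     seconds=FINGERPRINT_SECONDS):
    """Bytes needed to hold `seconds` of interleaved raw PCM audio."""
    return rate_hz * channels * bytes_per_sample * seconds


# Assumed format: 44.1 kHz, stereo, 16-bit (2 bytes per sample).
size = pcm_buffer_bytes(rate_hz=44100, channels=2, bytes_per_sample=2)
print(f"{size} bytes = {size / 2**20:.1f} MiB")
```

Under these assumptions the PCM buffer alone is a little under 23 MiB per stream, before counting whatever working memory libofa and libfftw3 allocate on top of it.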

About my code: I'd appreciate constructive comments on how to improve what's wrong. Aside from potential general cleanliness issues, there are a few things that are probably implemented badly or incorrectly, mostly:

- How do I know when I should stop and clean up all my local data (when the parent state changes to NULL?)

- For now I decided not to make the resulting metadata available as tags sent in a bus message: first, because GStreamer has no tag for PUIDs, and second, because an artist and title tag might, and most likely will, clash with existing artist and title tags. Instead, the XML document with the received metadata is available through a property, as are the individual parsed fields artist, title, and PUID (parsed from the XML using the GMarkup parser).

(- Code formatting?)

- I'm fairly sure that the place in the code that checks whether caps are already negotiated and present (in transform_ip) is wrong. What would be a more appropriate place?
Comment 1 Milosz Derezynski 2006-08-14 15:40:46 UTC
http://musicdns.org/get_client_key/

This is the correct page to obtain a client key.
Comment 2 Aurélien Mino 2007-05-30 16:50:01 UTC
This is really interesting since MusicBrainz is going to shut down its TRM audio fingerprinting service really soon (in a month or two).

However you've forgotten to attach your patch...
I think the code is available here: http://freshmeat.net/projects/gstpuid/
Comment 3 Milosz Derezynski 2007-07-17 00:09:16 UTC
OK, the up-to-date code is in the current BMP SVN; the code in the separate repository and release isn't up to date yet. Furthermore, the plugin needs a rewrite: it's based on basetransform, which doesn't exactly make sense.

I'll try to rewrite it as soon as possible, given the now impending doom of TRM.

Current code (not rewritten yet, but more up to date than on Freshmeat):

http://websvn.beep-media-player.org/filedetails.php?repname=bmpx&path=%2Ftrunk%2Fsrc%2Fgstpuid.h
http://websvn.beep-media-player.org/filedetails.php?repname=bmpx&path=%2Ftrunk%2Fsrc%2Fgstpuid.c

Despite being somewhat weirdly written, it works very well and can calculate a working fingerprint for all stream types GStreamer can decode.
Comment 4 Milosz Derezynski 2007-10-29 21:04:55 UTC
I'd like to bring to attention that TRM (the old fingerprinting system used by MusicBrainz) is going to be phased out (well, more like turned off for good) in 90 days (as of this writing; the source is the topic of the #musicbrainz channel on Freenode, I couldn't find a reference on the website yet).

The inclusion of this element here (gst-puid) should really be considered now (it has served its duties very well inside BMPx for over a year; I am not an experienced GStreamer element developer, so the code could most likely be improved, but generally it does work).
Comment 5 Eric Buehl 2008-01-01 18:58:02 UTC
I have found this plugin to be EXTREMELY useful.  The biggest problem I ran into when using any sort of libofa connection was the decoding of arbitrary audio files.  Of course, GStreamer is the logical solution to this problem.  Kudos!  I look forward to future developments and hope to see gst-puid officially included.  It is a logical replacement for the currently included TRM element, which is no longer used or maintained.

As was already noted, it is unclear what this plugin should do at the end of streams.  I have noticed gstreamer crashing when the location parameter of this pipeline is changed:

filesrc location=<file> ! decodebin ! audioconvert ! puid musicdns-id=<musicdns-id> ! fakesink sync=0

(Even when setting the pipeline state to NULL at EOS)
Comment 6 Eric Buehl 2008-01-02 19:17:00 UTC
Just one comment on the general design:

It seems as though it would be nice to be able to query more metadata from the MusicIP servers on fingerprint submission.  I think the best way to achieve this is probably to remove all lookup code and return only the fingerprint (not the PUID), then rely on the client application to do that extra processing.  That would also make it easier to include metadata that is already known (derived from tags), which the MusicIP TOS requires when it is known.
Comment 7 Eric Buehl 2008-01-02 22:18:18 UTC
Created attachment 102017 [details] [review]
fingerprint bugfix, exposed fingerprint property

Fixed bug preventing multiple stream fingerprinting without reconstructing pipeline.  Added property to expose the fingerprint generated by libofa for external querying.  If no musicdns_id is provided, no request is made.
Comment 8 Eric Buehl 2008-01-02 22:18:29 UTC
Created attachment 102018 [details] [review]
fingerprint bugfix, exposed fingerprint property

Fixed bug preventing multiple stream fingerprinting without reconstructing pipeline.  Added property to expose the fingerprint generated by libofa for external querying.  If no musicdns_id is provided, no request is made.
Comment 9 Tim-Philipp Müller 2008-01-02 23:55:04 UTC
I looked at the code in SVN a while ago (link doesn't work any longer it seems, even with s/.c/.cc/), but IIRC there were quite a few (IMHO completely unneeded, in this case) c++ dependencies like boost and things like the internal minisoup stuff.

If someone was to strip this down a bit and to make a patch against gst-plugins-bad, that'd be awesome (bonus points for C or boost-less c++ and deriving from GstAudioFilter or GstBaseTransform). Just returning the fingerprint as a first step seems like a good idea to me too (any http interaction could be added at a later time).
Comment 10 Eric Buehl 2008-01-03 19:42:55 UTC
Created attachment 102061 [details] [review]
bugfixes; cumulative

Fixed a bug where fingerprints were not being created for streams with a duration shorter than the libofa-imposed 135 second limit.  This exposed another bug in which an incorrect parameter was being passed to libofa (num samples).  I have "fixed" this problem; however, there may be a cleaner solution.
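The two problems described above (short streams, and a wrong sample count) can be illustrated with a minimal sketch. This is not the code from the attached patch: samples_to_fingerprint is a hypothetical helper, and 16-bit samples are an assumption. The point is only that the count has to be derived from what was actually buffered and expressed in samples rather than bytes.

```python
# Sketch: how many samples to hand to the fingerprinting library.
# ASSUMPTIONS: 16-bit samples (2 bytes each), interleaved channels, and a
# count measured in individual samples. None of this is taken from the patch.

LIMIT_SECONDS = 135


def samples_to_fingerprint(buffered_bytes, rate_hz, channels,
                           bytes_per_sample=2):
    """Return the number of samples available, capped at the 135 s limit."""
    available = buffered_bytes // bytes_per_sample   # bytes -> samples
    limit = LIMIT_SECONDS * rate_hz * channels       # samples in 135 s
    return min(available, limit)


# A 30-second stereo stream at 44.1 kHz: fewer samples than the limit, so
# the whole stream is used instead of waiting for data that never arrives.
short_stream = samples_to_fingerprint(30 * 44100 * 2 * 2, 44100, 2)

# A 200-second stream: capped at 135 seconds' worth of samples.
long_stream = samples_to_fingerprint(200 * 44100 * 2 * 2, 44100, 2)
```

With a cap like this, a stream that reaches EOS early can still be fingerprinted from the data it did deliver.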

This patch is cumulative
Comment 11 Eric Buehl 2008-01-14 23:13:30 UTC
Created attachment 102863 [details] [review]
CVS patch for gstofa inclusion

I believe this should correctly add fingerprinting capability to the GStreamer CVS tree.  I have stripped the HTTP functionality, which obviates the need for neon.  The only required library is now libofa.  Since this plugin no longer generates PUIDs (as retrieved from MusicDNS) but only the fingerprint generated by libofa, I have renamed it to gstofa.  At its simplest, it is merely a wrapper for libofa.  The client application would now be responsible for any MusicDNS interactions.
Comment 12 Sebastian Dröge (slomo) 2008-03-19 18:14:11 UTC
Ok, I've greatly cleaned this up now and ported it to GstAudioFilter, removed the useless identity stuff (sync to clock, perfect timestamp detection), used GstAdapter for keeping the data, flush data on NEWSEGMENT/FLUSH_STOP events, etc.
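The buffering behaviour described above can be sketched in a few lines. This is a plain-Python stand-in rather than GStreamer code: FingerprintBuffer and its methods are hypothetical names, the 16-bit sample size is an assumption, and a real element would use GstAdapter and its event handler instead.

```python
# Stand-in for the accumulate-and-flush logic: collect raw PCM until enough
# is buffered for a fingerprint, and drop everything when the stream is
# interrupted (the NEWSEGMENT/FLUSH_STOP case mentioned above).

class FingerprintBuffer:
    def __init__(self, rate_hz, channels, bytes_per_sample=2, seconds=135):
        # Total bytes needed before a fingerprint can be computed.
        self.needed = rate_hz * channels * bytes_per_sample * seconds
        self._data = bytearray()

    def push(self, chunk):
        """Append a chunk of PCM; return True once enough data is buffered."""
        room = self.needed - len(self._data)
        self._data += chunk[:room]          # ignore anything past the limit
        return len(self._data) >= self.needed

    def flush(self):
        """Discard buffered data, e.g. after a seek or a new segment."""
        self._data.clear()

    def take(self):
        """Hand the buffered bytes over for fingerprinting and reset."""
        data = bytes(self._data)
        self._data.clear()
        return data


# Tiny example with an artificially small target (1 second, 10 Hz, mono).
buf = FingerprintBuffer(rate_hz=10, channels=1, seconds=1)
first = buf.push(b"\x00" * 8)     # not enough data yet
buf.flush()                       # stream was interrupted, start over
second = buf.push(b"\x00" * 30)   # enough data; the excess is ignored
pcm = buf.take()
```

Flushing on stream discontinuities matters because a fingerprint computed over data from two different positions, or two different files, would be meaningless.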

Ah, and I output the fingerprint as a tag too now, in addition to the property.

Everything is now in gst-plugins-bad CVS. Thanks for your work :)

Please test with latest CVS if everything works as expected... at least it's valgrind clean and gives fingerprints ;)



2008-03-19  Sebastian Dröge  <slomo@circular-chaos.org>

	Based on a patch by: Eric Buehl <eric dot buehl at gmail dot com>

	* configure.ac:
	* ext/ofa/Makefile.am:
	* ext/ofa/gstofa.c: (gst_ofa_base_init), (gst_ofa_finalize),
	(gst_ofa_class_init), (create_fingerprint), (gst_ofa_event),
	(gst_ofa_init), (gst_ofa_transform_ip), (gst_ofa_get_property),
	(plugin_init):
	* ext/ofa/gstofa.h:
	Add an OFA element, the successor of MusicBrainz TRM fingerprinting.
	Fixes bug #351309.