GNOME Bugzilla – Bug 657183
Cue sheet parsing for GStreamer extractor
Last modified: 2011-10-03 10:34:22 UTC
Hi! Here's a patch set with the first draft of my cue sheet extraction code. In short: - it works for the GStreamer extractor only - adds a new optional dependency on libcue[1]. - supports both external and embedded (as Vorbis comment) cuesheets I took a guess how to model an audio file that contains multiple tracks: basically the file itself is stored as nfo:Audio only while each track is stored as nmm:MusicPiece. The nie:url of the tracks has a fragment identifier marking where the particular track starts in the audio. Let me know if this is a good approach or if there's a better way. Another concern is that the time values aren't going to be sample-accurate; the offset specified in the track nie:url is a double seconds value (which isn't too bad) but the nie:duration is for some reason an int64 that holds seconds; for an audio player this is of no real use, and it seems a shame to require them to parse the cuesheet again just because we lack precision. I don't really have any ideas how to fix this (or any direct need to, it just worries me). The patches are mostly refactoring the existing gstreamer extractor to prepare for the new code, only the last two actually add functionality. It adds an API which should be relatively compatible with that proposed in #540890. I plan to add some functional tests, I'll push these in a feature branch at a later date since there'll be binary data etc. Thanks! [1] http://www.sourceforge.net/projects/libcue
Created attachment 194509 [details] [review] tracker-extract-gstreamer: Remove 'needs_audio' flag Refactor away this variable, it's not needed
Created attachment 194510 [details] [review] tracker-extract-gstreamer: Remove check for GST_VERSION > 0.10.20 Tracker now requires at least 0.10.30 in any case.
Created attachment 194511 [details] [review] tracker-extract-gstreamer: Use gst_tag_list_insert() Remove add_tags() function which needlessly reimplements gst_tag_list_insert().
Created attachment 194512 [details] [review] tracker-extract-gstreamer: Don't pass file URL needlessly Also some changes to avoid confusion: * Rename 'uri' to 'file_url' when there are lots of URIs around * Rename add_date_time_gst_tag() (which still uses the file URL to guarantee metadata) to add_date_time_gst_tag_with_mtime_fallback()
Created attachment 194513 [details] [review] tracker-extract-gstreamer: Rework album info reading * Use GST_TAG_ALBUM_ARTIST for album artist if available (otherwise fall back on performer/artist) * Be consistant with variable names * Remove unused 'scount' parameter
Created attachment 194514 [details] [review] tracker-extract-gstreamer: Avoid duplicate nmm:Artist inserts Maintain an internal list to avoid creating more than one INSERT per nmm:Artist. Once CUE reading lands, a single file may contain 15 songs by one artist and previously this would have resulted in 15 identical INSERT statements in the preupdate.
Created attachment 194515 [details] [review] tracker-extract-gstreamer: Break up extract_metadata() into functions
Created attachment 194516 [details] [review] tracker-extract-gstreamer: Factor out type setting code
Created attachment 194517 [details] [review] tracker-extract-gstreamer: Pass tag list explicitly This allows smarter processing in future using information from cue sheets and other sources.
Created attachment 194518 [details] [review] libtracker-common: Add tracker_case_match_filename_without_extension() This is used by the cue sheet parser to match audio files which are listed with incorrect extensions.
Created attachment 194519 [details] [review] tracker-extract: Add libcue-based CUE sheet parser This is currently only usable by the GStreamer extractor due to the use of GstTagList.
Created attachment 194520 [details] [review] tracker-extract-gstreamer: Use ToC information Fixes GB#657183
Created attachment 194577 [details] [review] tracker-extract-gstreamer: Simplify tagcache behaviour Avoid having a NULL tagcache, which allows us to use gst_tag_list_insert() to merge new tags.
Created attachment 194578 [details] [review] tracker-extract-gstreamer: Don't pass file URL needlessly Also some changes to avoid confusion: * Rename 'uri' to 'file_url' when there are lots of URIs around * Rename add_date_time_gst_tag() (which still uses the file URL to guarantee metadata) to add_date_time_gst_tag_with_mtime_fallback()
Created attachment 194579 [details] [review] tracker-extract-gstreamer: Rework album info reading * Use GST_TAG_ALBUM_ARTIST for album artist if available (otherwise fall back on performer/artist) * Be consistant with variable names * Remove unused 'scount' parameter
Created attachment 194580 [details] [review] tracker-extract-gstreamer: Avoid duplicate nmm:Artist inserts Maintain an internal list to avoid creating more than one INSERT per nmm:Artist. Once CUE reading lands, a single file may contain 15 songs by one artist and previously this would have resulted in 15 identical INSERT statements in the preupdate.
Created attachment 194581 [details] [review] tracker-extract-gstreamer: Break up extract_metadata() into functions
Created attachment 194582 [details] [review] tracker-extract-gstreamer: Factor out type setting code
Created attachment 194583 [details] [review] tracker-extract-gstreamer: Pass tag list explicitly This allows smarter processing in future using information from cue sheets and other sources.
Created attachment 194584 [details] [review] libtracker-common: Add tracker_case_match_filename_without_extension() This is used by the cue sheet parser to match audio files which are listed with incorrect extensions.
Created attachment 194585 [details] [review] tracker-extract: Add libcue-based CUE sheet parser This is currently only usable by the GStreamer extractor due to the use of GstTagList.
Created attachment 194586 [details] [review] tracker-extract-gstreamer: Use ToC information Fixes GB#657183
Created attachment 194641 [details] [review] tracker-extract: Add libcue-based CUE sheet parser This is currently only usable by the GStreamer extractor due to the use of GstTagList.
Created attachment 194642 [details] [review] tracker-extract-gstreamer: Use ToC information Fixes GB#657183
(Note these patches are also available in the cuesheets branch of Tracker git)
I've rebased this branch on top of 0.12 and pushed it as cuesheets-0.12 I have also found and fixed a couple of bugs
Branch is updated, hopefully this is the final version. I submitted a bug to Nepomuk upstream about the nfo:containerStartTime property: https://sourceforge.net/apps/trac/oscaf/ticket/123
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report. This will be in 0.12.4.