Bug 726419 - tracker-extract reports sparql update errors for jpeg with dc:indenfitier
Status: RESOLVED FIXED
Product: tracker
Classification: Core
Component: Extractor
Version: 0.17.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: tracker-extractor
Depends on:
Blocks:
 
 
Reported: 2014-03-15 14:47 UTC by Torsten Scholak
Modified: 2014-03-19 17:16 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Torsten Scholak 2014-03-15 14:47:47 UTC
I am running tracker-extract manually and in debug mode within gdb to find out why a lot of my files don't get indexed. I discovered problems with small gifs that produce crashes (Bug 726355) and now it seems tracker gets stuck indefinitely after some unsuccessful database transactions. (According to the man page of tracker-extract, if called with no arguments, tracker-extract will eventually quit and not poll forever. I therefore assume that what I see is not the expected behaviour.)

The log is redonkulously long and I have no idea what exactly is important. I won't post the whole thing here, because it contains private stuff. But have a look at this excerpt:

--

$ gdb /usr/libexec/tracker-extract
GNU gdb (Gentoo 7.6.2 p1) 7.6.2
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>...
Reading symbols from /usr/libexec/tracker-extract...Reading symbols from /usr/lib64/debug/usr/libexec/tracker-extract.debug...done.
done.
(gdb) run -v 3
Starting program: /usr/libexec/tracker-extract -v 3
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffea01d700 (LWP 29402)]
[New Thread 0x7fffe981c700 (LWP 29403)]
[New Thread 0x7fffe901b700 (LWP 29404)]
Tracker-Message: Starting tracker-extract 0.17.5
Tracker-Message: General options:
Tracker-Message:   Verbosity  ............................  3
Tracker-Message:   Sched Idle  ...........................  2
Tracker-Message:   Max bytes (per file)  .................  1048576
Tracker-Message: Setting priority nice level to 19
Tracker-Message: Setting memory limitations: total is 33.4 GB, minimum is 256 MB, recommended is ~1 GB
Tracker-Message:   Virtual/Heap set to 16.7 GB (50% of total or MAXLONG)
Tracker-Message: Loading extractor rules... (/usr/share/tracker/extract-rules)
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-abw.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-dvi.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-flac.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-gif.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-html.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-ico.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-jpeg.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-mp3.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-pdf.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-png.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-ps.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-svg.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-tiff.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-vorbis.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '10-xmp.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '11-iso.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '15-gstreamer-guess.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '90-gstreamer-audio-generic.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '90-gstreamer-image-generic.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '90-gstreamer-video-generic.rule'
(tracker-extract:29398): Tracker-DEBUG:   Loaded rule '90-text-generic.rule'
Tracker-Message: Extractor rules loaded
[New Thread 0x7fffe881a700 (LWP 29405)]
[New Thread 0x7fffdbfff700 (LWP 29406)]
[New Thread 0x7fffdb7fe700 (LWP 29407)]
[New Thread 0x7fffdaffd700 (LWP 29408)]
[New Thread 0x7fffda7fc700 (LWP 29409)]
[New Thread 0x7fffd9ffb700 (LWP 29410)]
[New Thread 0x7fffd97fa700 (LWP 29411)]
[New Thread 0x7fffd8ff9700 (LWP 29412)]
[New Thread 0x7fffd87f8700 (LWP 29413)]
[New Thread 0x7fffd7ff7700 (LWP 29414)]
Tracker-Message: Initializing Storage...
Tracker-Message: Mount monitors set up for to watch for added, removed and pre-unmounts...
Tracker-Message: No mounts found to iterate
(tracker-extract:29398): Tracker-DEBUG: tracker-backend.vala:33: Waiting for service to become available...
(tracker-extract:29398): Tracker-DEBUG: tracker-backend.vala:41: Service is ready
(tracker-extract:29398): Tracker-DEBUG: tracker-backend.vala:43: Constructing connection
(tracker-extract:29398): Tracker-DEBUG: tracker-backend.vala:183: Using backend = 'AUTO'
(tracker-extract:29398): Tracker-DEBUG: Locale 'TRACKER_LOCALE_LANGUAGE' was set to 'en_CA.utf8'
(tracker-extract:29398): Tracker-DEBUG: Locale 'TRACKER_LOCALE_TIME' was set to 'en_CA.utf8'
(tracker-extract:29398): Tracker-DEBUG: Locale 'TRACKER_LOCALE_COLLATE' was set to 'en_CA.utf8'
(tracker-extract:29398): Tracker-DEBUG: Locale 'TRACKER_LOCALE_NUMERIC' was set to 'en_CA.utf8'
(tracker-extract:29398): Tracker-DEBUG: Locale 'TRACKER_LOCALE_MONETARY' was set to 'en_CA.utf8'
Tracker-Message: Current and DB locales match: 'en_CA.utf8'
Tracker-Message: Setting database locations
Tracker-Message: Checking database files exist
Tracker-Message: Opened sqlite3 database:'/home/tscholak/.cache/tracker/meta.db'
(tracker-extract:29398): Tracker-DEBUG: Resetting collator in db interface 0x62ac30
(tracker-extract:29398): Tracker-DEBUG: [ICU collation] Initializing collator for locale 'en_CA.utf8'
(tracker-extract:29398): Tracker-DEBUG: Preparing query: 'PRAGMA journal_mode = WAL;'
Tracker-Message:   Setting page size to 8192
Tracker-Message:   Setting cache size to 250
(tracker-extract:29398): Tracker-DEBUG: tracker-backend.vala:45: Backend is ready
Tracker-Message: Registering D-Bus object...
Tracker-Message:   Path:'/org/freedesktop/Tracker1/Miner/Extract'
Tracker-Message:   Object Type:'TrackerExtractDecorator'
[New Thread 0x7fffd77f6700 (LWP 29415)]
(tracker-extract:29398): Tracker-DEBUG: tracker-backend.vala:75: Tracker.Sparql.Backend.query(): 'select tracker:id (rdf:type) { }'
(tracker-extract:29398): Tracker-DEBUG: Preparing query: 'SELECT COALESCE((SELECT ID FROM Resource WHERE Uri = ?), 0) FROM (SELECT 1)'
(tracker-extract:29398): Tracker-DEBUG: tracker-backend.vala:75: Tracker.Sparql.Backend.query(): 'select tracker:id (nie:dataSource) { }'
(tracker-extract:29398): Tracker-DEBUG: tracker-backend.vala:75: Tracker.Sparql.Backend.query(): 'select tracker:id (<http://www.tracker-project.org/ontologies/tracker#extractor-data-source>) { }'
(tracker-extract:29398): Tracker-DEBUG: tracker-backend.vala:75: Tracker.Sparql.Backend.query(): 'select tracker:id (nfo:Document) { }'
(tracker-extract:29398): Tracker-DEBUG: tracker-backend.vala:75: Tracker.Sparql.Backend.query(): 'select tracker:id (nfo:Audio) { }'
(tracker-extract:29398): Tracker-DEBUG: tracker-backend.vala:75: Tracker.Sparql.Backend.query(): 'select tracker:id (nfo:Image) { }'
(tracker-extract:29398): Tracker-DEBUG: tracker-backend.vala:75: Tracker.Sparql.Backend.query(): 'select tracker:id (nfo:Video) { }'
(tracker-extract:29398): Tracker-DEBUG: tracker-backend.vala:75: Tracker.Sparql.Backend.query(): 'select tracker:id (nfo:FilesystemImage) { }'
libmediaart-Message: Initializing Storage...
libmediaart-Message: Mount monitors set up for to watch for added, removed and pre-unmounts...
libmediaart-Message: No mounts found to iterate
(tracker-extract:29398): Tracker-DEBUG: tracker-backend.vala:84: Tracker.Sparql.Backend.query_async(): 'SELECT tracker:id(?urn) tracker:id(?type) {   ?urn a rdfs:Resource ;        a ?type. FILTER (! EXISTS { ?urn nie:dataSource <http://www.tracker-project.org/ontologies/tracker#extractor-data-source> } && ?type IN (nfo:Document,nfo:Audio,nfo:Image,nfo:Video,nfo:FilesystemImage) && BOUND(tracker:available(?urn)))}'
(tracker-extract:29398): Tracker-DEBUG: Preparing query: 'SELECT "1_u", "2_u" FROM (SELECT "rdfs:Resource1"."ID" AS "1_u", "rdfs:Resource_rdf:type2"."rdf:type" AS "2_u" FROM "rdfs:Resource" AS "rdfs:Resource1", "rdfs:Resource_rdf:type" AS "rdfs:Resource_rdf:type2" WHERE "rdfs:Resource1"."ID" = "rdfs:Resource_rdf:type2"."ID" AND ((NOT ((EXISTS (SELECT 1 FROM "nie:DataObject_nie:dataSource" AS "nie:DataObject_nie:dataSource3" WHERE "1_u" = "nie:DataObject_nie:dataSource3"."ID" AND "nie:DataObject_nie:dataSource3"."nie:dataSource" = (SELECT ID FROM Resource WHERE Uri = ?)))) AND "2_u" IN (COALESCE((SELECT ID FROM Resource WHERE Uri = ?), 0), COALESCE((SELECT ID FROM Resource WHERE Uri = ?), 0), COALESCE((SELECT ID FROM Resource WHERE Uri = ?), 0), COALESCE((SELECT ID FROM Resource WHERE Uri = ?), 0), COALESCE((SELECT ID FROM Resource WHERE Uri = ?), 0))) AND ((SELECT "tracker:available" FROM "nie:DataObject" WHERE ID = "1_u") IS NOT NULL)))'

[... extracting ...]

(tracker-extract:29398): Tracker-WARNING **: Task 39, error: Not a ISO 8601 date string. Allowed form is [-]CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm]

(tracker-extract:29398): Tracker-WARNING **: Sparql update was:
INSERT {
GRAPH <urn:uuid:472ed0cc-40ff-4e37-9c0c-062d78656540> {
<urn:artist:Two%20Door%20Cinema%20Club> a nmm:Artist ;
	 nmm:artistName "Two Door Cinema Club" .
}
}
INSERT {
GRAPH <urn:uuid:472ed0cc-40ff-4e37-9c0c-062d78656540> {
<urn:artist:Alex%20Trimble,%20Sam%20Halliday%20&%20Kevin%20Baird> a nmm:Artist ;
	 nmm:artistName "Alex Trimble, Sam Halliday & Kevin Baird" .
}
}
INSERT {
GRAPH <urn:uuid:472ed0cc-40ff-4e37-9c0c-062d78656540> {
<urn:album:Tourist%20History:Two%20Door%20Cinema%20Club> a nmm:MusicAlbum ;
	 nmm:albumTitle "Tourist History" ;
	 nmm:albumArtist <urn:artist:Two%20Door%20Cinema%20Club> .
}
}
DELETE {
<urn:album:Tourist%20History:Two%20Door%20Cinema%20Club> nmm:albumTrackCount ?unknown .
}
WHERE {
<urn:album:Tourist%20History:Two%20Door%20Cinema%20Club> nmm:albumTrackCount ?unknown .
}
INSERT {
GRAPH <urn:uuid:472ed0cc-40ff-4e37-9c0c-062d78656540> {
<urn:album:Tourist%20History:Two%20Door%20Cinema%20Club> nmm:albumTrackCount 10 .
}
}
DELETE {
<urn:album-disc:Tourist%20History:Two%20Door%20Cinema%20Club:Disc1> nmm:setNumber ?unknown .
}
WHERE {
<urn:album-disc:Tourist%20History:Two%20Door%20Cinema%20Club:Disc1> nmm:setNumber ?unknown .
}
DELETE {
<urn:album-disc:Tourist%20History:Two%20Door%20Cinema%20Club:Disc1> nmm:albumDiscAlbum ?unknown .
}
WHERE {
<urn:album-disc:Tourist%20History:Two%20Door%20Cinema%20Club:Disc1> nmm:albumDiscAlbum ?unknown .
}
INSERT {
GRAPH <urn:uuid:472ed0cc-40ff-4e37-9c0c-062d78656540> {
<urn:album-disc:Tourist%20History:Two%20Door%20Cinema%20Club:Disc1> a nmm:MusicAlbumDisc ;
	 nmm:setNumber 1 ;
	 nmm:albumDiscAlbum <urn:album:Tourist%20History:Two%20Door%20Cinema%20Club> .
}
}

INSERT {
GRAPH <urn:uuid:472ed0cc-40ff-4e37-9c0c-062d78656540> {
<urn:uuid:bbe43046-09a7-02d6-f23b-7a7add8f1cfd> nie:dataSource <http://www.tracker-project.org/ontologies/tracker#extractor-data-source> .
<urn:uuid:bbe43046-09a7-02d6-f23b-7a7add8f1cfd> a nfo:Audio , nmm:MusicPiece ;
	 nfo:genre "Alternative" ;
	 nie:title "Undercover Martyn" ;
	 nie:contentCreated "1924-09-21T11:48:07-500" ;
	 nie:copyright "℗ 2010 Two Door Cinema Club under exclusive license to Kitsuné France, Under exclusive license to Cooperative Music for Europe. Cooperative Music is a division of V2 Records International." ;
	 nmm:trackNumber 7 ;
	 nfo:codec "MPEG-4 AAC audio" ;
	 nmm:performer <urn:artist:Two%20Door%20Cinema%20Club> ;
	 nmm:composer <urn:artist:Alex%20Trimble,%20Sam%20Halliday%20&%20Kevin%20Baird> ;
	 nmm:musicAlbum <urn:album:Tourist%20History:Two%20Door%20Cinema%20Club> ;
	 nmm:musicAlbumDisc <urn:album-disc:Tourist%20History:Two%20Door%20Cinema%20Club:Disc1> ;
	 nfo:channels 2 ;
	 nfo:sampleRate 44100 ;
	 nfo:duration 166 .
}
}



(tracker-extract:29398): Tracker-WARNING **: Task 40, error: Property 'http://purl.org/dc/elements/1.1/indentifier' not found in the ontology

(tracker-extract:29398): Tracker-WARNING **: Sparql update was:
INSERT { GRAPH <urn:uuid:472ed0cc-40ff-4e37-9c0c-062d78656540> { _:tag a nao:Tag ; nao:prefLabel "TELEVISION SERIES" }  }
WHERE { FILTER (NOT EXISTS { ?tag a nao:Tag ; nao:prefLabel "TELEVISION SERIES" }) }
INSERT { GRAPH <urn:uuid:472ed0cc-40ff-4e37-9c0c-062d78656540> { _:tag a nao:Tag ; nao:prefLabel "OUTPUT" }  }
WHERE { FILTER (NOT EXISTS { ?tag a nao:Tag ; nao:prefLabel "OUTPUT" }) }
INSERT {
GRAPH <urn:uuid:472ed0cc-40ff-4e37-9c0c-062d78656540> {
<urn:contact:ART%20STREIBER> a nco:Contact ;
	 nco:fullname "ART STREIBER" .
}
}
INSERT {
GRAPH <urn:uuid:472ed0cc-40ff-4e37-9c0c-062d78656540> {
<urn:uuid:3a6ea874-982b-4d99-86d6-e2efbbde25ce> a nco:PostalAddress ;
	 nco:country "USA" .
}
}

INSERT {
GRAPH <urn:uuid:472ed0cc-40ff-4e37-9c0c-062d78656540> {
<urn:uuid:e34faa57-7c9b-c19c-74a6-3d37f959a27b> nie:dataSource <http://www.tracker-project.org/ontologies/tracker#extractor-data-source> .
<urn:uuid:e34faa57-7c9b-c19c-74a6-3d37f959a27b> a nfo:Image ;
	 a nmm:Photo ;
	 nfo:width 2707 ;
	 nfo:height 3000 ;
	 nmm:dlnaProfile "JPEG_LRG" ;
	 nmm:dlnaMime "image/jpeg" ;
	 dc:indentifier "ASAP ENTERTAINMENT ARRESTED DEV                                " ;
	 nao:hasTag ?tag1 ;
	 nao:hasTag ?tag2 ;
	 nie:title "ARRESTED DEVELOPMENT" ;
	 nie:contentCreated "2011-10-04T19:38:48Z" ;
	 nie:description "It's OK honey, they're in heaven now. The Arrested Development cast sporting their white after Labor Day best. (AP Photo/HO/Fox Broadcasting Co./Art Streiber)" ;
	 nco:creator <urn:contact:ART%20STREIBER> ;
	 nie:comment "Generated by  IJG JPEG Library" ;
	 slo:location [ a slo:GeoLocation ;
	 slo:postalAddress <urn:uuid:3a6ea874-982b-4d99-86d6-e2efbbde25ce>] ;
	 nfo:horizontalResolution 200 ;
	 nfo:verticalResolution 200 .
}
}
WHERE {
?tag1 a nao:Tag ; nao:prefLabel "TELEVISION SERIES" .
?tag2 a nao:Tag ; nao:prefLabel "OUTPUT" .
}

[Thread 0x7fffc6faf700 (LWP 29452) exited]
^C
Program received signal SIGINT, Interrupt.
0x00007ffff60b0d3d in poll () at ../sysdeps/unix/syscall-template.S:81
81	../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) where
  • #0 poll
    at ../sysdeps/unix/syscall-template.S line 81
  • #1 g_main_context_poll
    at /var/tmp/portage/dev-libs/glib-2.39.91/work/glib-2.39.91/glib/gmain.c line 4028
  • #2 g_main_context_iterate
    at /var/tmp/portage/dev-libs/glib-2.39.91/work/glib-2.39.91/glib/gmain.c line 3729
  • #3 g_main_loop_run
    at /var/tmp/portage/dev-libs/glib-2.39.91/work/glib-2.39.91/glib/gmain.c line 3928
  • #4 main

--

As you can see, there are two problems here. First, the date string in contentCreated of an MP4 audio file is invalid. Second, a property of an image (dc:indentifier) is not found in the ontology.
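
The allowed form quoted in the Task 39 warning can be checked with a short script. The following is a minimal Python sketch, not Tracker's actual parser: the name `matches_allowed_form` and the regular expression are assumptions derived only from the form string printed in the log. It checks the shape of the string, not whether the date is a valid calendar date.

```python
import re

# Pattern transcribed from the warning text:
#   [-]CCYY-MM-DDThh:mm:ss[Z|(+|-)hh:mm]
# Assumption: this mirrors the message only, not Tracker's real parsing code.
ALLOWED_FORM = re.compile(
    r"-?\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:Z|[+-]\d{2}:\d{2})?"
)


def matches_allowed_form(value: str) -> bool:
    """Return True if `value` has the shape described in the warning."""
    return ALLOWED_FORM.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in (
        "1924-09-21T11:48:07-500",  # value from the failing audio update
        "2011-10-04T19:38:48Z",     # value from the JPEG update
    ):
        print(candidate, matches_allowed_form(candidate))
```

Under this reading, the offset `-500` is the part that does not fit: the form expects `(+|-)hh:mm`, for example `-05:00`.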

After that, nothing further happens. tracker-control says that the extractor is idle. I think it should eventually quit. The problem is exactly reproducible: each time tracker-extract is run again, it crawls through the same files.
Comment 1 Martyn Russell 2014-03-18 14:00:35 UTC
Hmm, you cut some of the debugging out so I can't tell which extractor you're using to get these errors.

The generated SPARQL is clearly a problem here.
Can you tell me:

a) which extractor you're using here.
b) provide a copy of the file breaking here so I can test.

It looks like some updates are needed to fix using broken ontology or syntax.

Thanks :)
Comment 2 Torsten Scholak 2014-03-18 19:54:32 UTC
I cannot reproduce the error with tracker 0.17.6 :(
The result for contentCreated is clearly wrong, but this I reckon is a fault of GStreamer.

BTW, the song's title is a coincidence....

$ /usr/libexec/tracker-extract -v 3 -f 07\ Undercover\ Martyn.m4a 
Locale 'TRACKER_LOCALE_LANGUAGE' was set to 'en_CA.utf8'
Locale 'TRACKER_LOCALE_TIME' was set to 'en_CA.utf8'
Locale 'TRACKER_LOCALE_COLLATE' was set to 'en_CA.utf8'
Locale 'TRACKER_LOCALE_NUMERIC' was set to 'en_CA.utf8'
Locale 'TRACKER_LOCALE_MONETARY' was set to 'en_CA.utf8'
Initializing Storage...
Mount monitors set up for to watch for added, removed and pre-unmounts...
No mounts found to iterate
Setting priority nice level to 19
Loading extractor rules... (/usr/share/tracker/extract-rules)
  Loaded rule '10-abw.rule'
  Loaded rule '10-dvi.rule'
  Loaded rule '10-flac.rule'
  Loaded rule '10-gif.rule'
  Loaded rule '10-html.rule'
  Loaded rule '10-ico.rule'
  Loaded rule '10-jpeg.rule'
  Loaded rule '10-mp3.rule'
  Loaded rule '10-pdf.rule'
  Loaded rule '10-png.rule'
  Loaded rule '10-ps.rule'
  Loaded rule '10-svg.rule'
  Loaded rule '10-tiff.rule'
  Loaded rule '10-vorbis.rule'
  Loaded rule '10-xmp.rule'
  Loaded rule '11-iso.rule'
  Loaded rule '15-gstreamer-guess.rule'
  Loaded rule '90-gstreamer-audio-generic.rule'
  Loaded rule '90-gstreamer-image-generic.rule'
  Loaded rule '90-gstreamer-video-generic.rule'
  Loaded rule '90-text-generic.rule'
Extractor rules loaded
Setting memory limitations: total is 33.4 GB, minimum is 256 MB, recommended is ~1 GB
  Virtual/Heap set to 16.7 GB (50% of total or MAXLONG)
MIME type guessed as 'audio/mp4' (from GIO)
Using /usr/lib64/tracker-1.0/extract-modules/libextract-gstreamer.so...
GStreamer backend in use:
  Discoverer/GUPnP-DLNA
Retrieving geolocation metadata...
Processing media art: artist:'Two Door Cinema Club', title:'Tourist History', type:'album', uri:'file:///home/tscholak/iTunes/iTunes%20Media/Music/Two%20Door%20Cinema%20Club/Tourist%20History/07%20Undercover%20Martyn.m4a'. Buffer is 0 bytes, mime:'(null)'
Album art already exists for uri:'file:///home/tscholak/iTunes/iTunes%20Media/Music/Two%20Door%20Cinema%20Club/Tourist%20History/07%20Undercover%20Martyn.m4a' as '/home/tscholak/.cache/media-art/album-4ba7608e444143db98bc78e6db45db19-34ea019223e9568c0ab3e503598bb09e.jpeg'
Done (15 objects added)


SPARQL pre-update:
--
INSERT {
<urn:artist:Two%20Door%20Cinema%20Club> a nmm:Artist ;
	 nmm:artistName "Two Door Cinema Club" .
}
INSERT {
<urn:artist:Alex%20Trimble,%20Sam%20Halliday%20&%20Kevin%20Baird> a nmm:Artist ;
	 nmm:artistName "Alex Trimble, Sam Halliday & Kevin Baird" .
}
INSERT {
<urn:album:Tourist%20History:Two%20Door%20Cinema%20Club> a nmm:MusicAlbum ;
	 nmm:albumTitle "Tourist History" ;
	 nmm:albumArtist <urn:artist:Two%20Door%20Cinema%20Club> .
}
DELETE {
<urn:album:Tourist%20History:Two%20Door%20Cinema%20Club> nmm:albumTrackCount ?unknown .
}
WHERE {
<urn:album:Tourist%20History:Two%20Door%20Cinema%20Club> nmm:albumTrackCount ?unknown .
}
INSERT {
<urn:album:Tourist%20History:Two%20Door%20Cinema%20Club> nmm:albumTrackCount 10 .
}
DELETE {
<urn:album-disc:Tourist%20History:Two%20Door%20Cinema%20Club:Disc1> nmm:setNumber ?unknown .
}
WHERE {
<urn:album-disc:Tourist%20History:Two%20Door%20Cinema%20Club:Disc1> nmm:setNumber ?unknown .
}
DELETE {
<urn:album-disc:Tourist%20History:Two%20Door%20Cinema%20Club:Disc1> nmm:albumDiscAlbum ?unknown .
}
WHERE {
<urn:album-disc:Tourist%20History:Two%20Door%20Cinema%20Club:Disc1> nmm:albumDiscAlbum ?unknown .
}
INSERT {
<urn:album-disc:Tourist%20History:Two%20Door%20Cinema%20Club:Disc1> a nmm:MusicAlbumDisc ;
	 nmm:setNumber 1 ;
	 nmm:albumDiscAlbum <urn:album:Tourist%20History:Two%20Door%20Cinema%20Club> .
}
--

SPARQL item:
--
 a nfo:Audio , nmm:MusicPiece ;
	 nfo:genre "Alternative" ;
	 nie:title "Undercover Martyn" ;
	 nie:contentCreated "1924-09-21T11:48:07-500" ;
	 nie:copyright "℗ 2010 Two Door Cinema Club under exclusive license to Kitsuné France, Under exclusive license to Cooperative Music for Europe. Cooperative Music is a division of V2 Records International." ;
	 nmm:trackNumber 7 ;
	 nfo:codec "MPEG-4 AAC audio" ;
	 nmm:performer <urn:artist:Two%20Door%20Cinema%20Club> ;
	 nmm:composer <urn:artist:Alex%20Trimble,%20Sam%20Halliday%20&%20Kevin%20Baird> ;
	 nmm:musicAlbum <urn:album:Tourist%20History:Two%20Door%20Cinema%20Club> ;
	 nmm:musicAlbumDisc <urn:album-disc:Tourist%20History:Two%20Door%20Cinema%20Club:Disc1> ;
	 nfo:channels 2 ;
	 nfo:sampleRate 44100 ;
	 nfo:duration 166 .
--

SPARQL where clause:
--
--

SPARQL post-update:
--
--
Comment 3 Torsten Scholak 2014-03-18 20:03:54 UTC
also:

$ /usr/libexec/tracker-extract -v 3 -f 04-arrested-development.jpg 
Locale 'TRACKER_LOCALE_LANGUAGE' was set to 'en_CA.utf8'
Locale 'TRACKER_LOCALE_TIME' was set to 'en_CA.utf8'
Locale 'TRACKER_LOCALE_COLLATE' was set to 'en_CA.utf8'
Locale 'TRACKER_LOCALE_NUMERIC' was set to 'en_CA.utf8'
Locale 'TRACKER_LOCALE_MONETARY' was set to 'en_CA.utf8'
Initializing Storage...
Mount monitors set up for to watch for added, removed and pre-unmounts...
No mounts found to iterate
Setting priority nice level to 19
Loading extractor rules... (/usr/share/tracker/extract-rules)
  Loaded rule '10-abw.rule'
  Loaded rule '10-dvi.rule'
  Loaded rule '10-flac.rule'
  Loaded rule '10-gif.rule'
  Loaded rule '10-html.rule'
  Loaded rule '10-ico.rule'
  Loaded rule '10-jpeg.rule'
  Loaded rule '10-mp3.rule'
  Loaded rule '10-pdf.rule'
  Loaded rule '10-png.rule'
  Loaded rule '10-ps.rule'
  Loaded rule '10-svg.rule'
  Loaded rule '10-tiff.rule'
  Loaded rule '10-vorbis.rule'
  Loaded rule '10-xmp.rule'
  Loaded rule '11-iso.rule'
  Loaded rule '15-gstreamer-guess.rule'
  Loaded rule '90-gstreamer-audio-generic.rule'
  Loaded rule '90-gstreamer-image-generic.rule'
  Loaded rule '90-gstreamer-video-generic.rule'
  Loaded rule '90-text-generic.rule'
Extractor rules loaded
Setting memory limitations: total is 33.4 GB, minimum is 256 MB, recommended is ~1 GB
  Virtual/Heap set to 16.7 GB (50% of total or MAXLONG)
MIME type guessed as 'image/jpeg' (from GIO)
Using /usr/lib64/tracker-1.0/extract-modules/libextract-jpeg.so...
Done (19 objects added)


SPARQL pre-update:
--
INSERT { _:tag a nao:Tag ; nao:prefLabel "TELEVISION SERIES" }
WHERE { FILTER (NOT EXISTS { ?tag a nao:Tag ; nao:prefLabel "TELEVISION SERIES" }) }
INSERT { _:tag a nao:Tag ; nao:prefLabel "OUTPUT" }
WHERE { FILTER (NOT EXISTS { ?tag a nao:Tag ; nao:prefLabel "OUTPUT" }) }
INSERT {
<urn:contact:ART%20STREIBER> a nco:Contact ;
	 nco:fullname "ART STREIBER" .
}
INSERT {
<urn:uuid:56f2a92b-da43-4540-b559-f8daed82e0ee> a nco:PostalAddress ;
	 nco:country "USA" .
}
--

SPARQL item:
--
 a nfo:Image ;
	 a nmm:Photo ;
	 nfo:width 2707 ;
	 nfo:height 3000 ;
	 nmm:dlnaProfile "JPEG_LRG" ;
	 nmm:dlnaMime "image/jpeg" ;
	 dc:indentifier "ASAP ENTERTAINMENT ARRESTED DEV                                " ;
	 nao:hasTag ?tag1 ;
	 nao:hasTag ?tag2 ;
	 nie:title "ARRESTED DEVELOPMENT" ;
	 nie:contentCreated "2014-03-18T20:00:39Z" ;
	 nie:description "It's OK honey, they're in heaven now. The Arrested Development cast sporting their white after Labor Day best. (AP Photo/HO/Fox Broadcasting Co./Art Streiber)" ;
	 nco:creator <urn:contact:ART%20STREIBER> ;
	 nie:comment "Generated by  IJG JPEG Library" ;
	 slo:location [ a slo:GeoLocation ;
	 slo:postalAddress <urn:uuid:56f2a92b-da43-4540-b559-f8daed82e0ee>] ;
	 nfo:horizontalResolution 200 ;
	 nfo:verticalResolution 200 .
--

SPARQL where clause:
--
?tag1 a nao:Tag ; nao:prefLabel "TELEVISION SERIES" .
?tag2 a nao:Tag ; nao:prefLabel "OUTPUT" .
--

SPARQL post-update:
--
--
Comment 4 Torsten Scholak 2014-03-18 20:15:15 UTC
regarding gstreamer and the mp4 file:

$ gst-discoverer-1.0 07\ Undercover\ Martyn.m4a
Analyzing file:///home/tscholak/iTunes/iTunes%20Media/Music/Two%20Door%20Cinema%20Club/Tourist%20History/07%20Undercover%20Martyn.m4a
Done discovering file:///home/tscholak/iTunes/iTunes%20Media/Music/Two%20Door%20Cinema%20Club/Tourist%20History/07%20Undercover%20Martyn.m4a

Topology:
  container: MPEG-4 AAC
    audio: MPEG-4 AAC

Properties:
  Duration: 0:02:46.579954648
  Seekable: yes
  Tags: 
      datetime: 1924-09-21T11:48:07-0500
      title: Undercover Martyn
      composer: Alex Trimble, Sam Halliday & Kevin Baird
      artist: Two Door Cinema Club
      album artist: Two Door Cinema Club
      album: Tourist History
      copyright: ℗ 2010 Two Door Cinema Club under exclusive license to Kitsuné France, Under exclusive license to Cooperative Music for Europe. Cooperative Music is a division of V2 Records International.
      date: 2010-03-01
      track number: 7
      track count: 10
      disc number: 1
      disc count: 1
      genre: Alternative
      preview image: [... hex ...]
      ... ? ...
      QT atom: [... hex ...]
      container format: ISO MP4/M4A
      audio codec: MPEG-4 AAC audio
      maximum bitrate: 326944
      bitrate: 256000
      language code: en
Comment 5 Torsten Scholak 2014-03-18 20:31:49 UTC
Argh, I really hate to spam the bug tracker, but I now reran tracker-extract without arguments (other than -v 3), and the problems reappear. Maybe I just don't understand how "Task" error warnings and Sparql update warnings are related... I thought the Sparql warning is there to allow to pinpoint the origin of the "Task" error. If this is indeed the case, then I again must conclude that the mp4 file raises the "Not a ISO 8601 date string." error, whereas the jpg gives "Property '...' not found in the ontology", in contradiction to the results obtained above from applying tracker-extract to a single file.
Comment 6 Martyn Russell 2014-03-19 09:08:18 UTC
(In reply to comment #5)
> Argh, I really hate to spam the bug tracker,

Not a problem, that's what it is for ;)

> but I now reran tracker-extract
> without arguments (other than -v 3), and the problems reappear. Maybe I just
> don't understand how "Task" error warnings and Sparql update warnings are
> related... I thought the Sparql warning is there to allow to pinpoint the
> origin of the "Task" error. If this is indeed the case, then I again must

So, running tracker-extract manually just tells you what output it generates for a file. It is not verified. It may not work. It totally depends on the ontology installed and used by the database.

The task errors occur when we actually try to insert the output of tracker-extract into the database. The errors they give are based on what the store tells us about mismatches with the ontology or tables/data in the DB.

> conclude that the mp4 file raises the "Not a ISO 8601 date string." error,
> whereas the jpg gives "Property '...' not found in the ontology" --
> in contradiction to the results obtained above from applying tracker-extract to
> a single file.

In your particular case, there are 2 problems:

1. The GStreamer extractor is not using the correct ISO 8601 time format. All times / dates inserted into Tracker must use this standard (mainly so we have consistency across timezones and queries). I think the problem there (upon initial inspection) is a missing Z or T in the string. I need to double check that.

2. The property not found error is about the extractor trying to use a property that no longer exists. This is entirely possible as the ontology (or DB schema) matures over time.

These two should be fairly easy to fix.

To be sure I have fixed them, can you attach the file creating these errors here to test with?

Thanks :)
Comment 7 Torsten Scholak 2014-03-19 13:01:30 UTC
you can get the jpg here:
http://pixel.nymag.com/content/dam/daily/vulture/2013/04/04/04-arrested-development.jpg

i'm a bit reluctant to upload the mp4, since it's drm-laden ...
Comment 8 Martyn Russell 2014-03-19 17:15:48 UTC
(In reply to comment #7)
> you can get the jpg here:
> http://pixel.nymag.com/content/dam/daily/vulture/2013/04/04/04-arrested-development.jpg
> 
> i'm a bit reluctant to upload the mp4, since it's drm-laden ...

OK, I've tested the two files...

First the MP4: This is what I get:

"""
JHBUILD: martyn@prunus: ~> tracker-search undercover
GLib-GIO-Message: Using the 'memory' GSettings backend.  Your settings will not be saved or shared with other applications.
Results:
  file:///home/martyn/Music/07%20Undercover%20Martyn.m4a
  07 Undercover Martyn.m4a


JHBUILD: martyn@prunus: ~> tracker-info file:///home/martyn/Music/07%20Undercover%20Martyn.m4a
Querying information for entity:'file:///home/martyn/Music/07%20Undercover%20Martyn.m4a'
  'urn:uuid:46c4268d-6d9b-8952-5352-1eb47114bd13'
Results:
  'http://purl.org/dc/elements/1.1/contributor' = 'urn:artist:Alex%20Trimble,%20Sam%20Halliday%20&%20Kevin%20Baird'
  'http://purl.org/dc/elements/1.1/contributor' = 'urn:artist:Two%20Door%20Cinema%20Club'
  'http://purl.org/dc/elements/1.1/date' = '1924-09-21T16:48:07Z'
  'http://purl.org/dc/elements/1.1/date' = '2014-03-19T15:32:52Z'
  'http://purl.org/dc/elements/1.1/date' = '2014-03-19T15:33:12Z'
  'http://purl.org/dc/elements/1.1/rights' = '℗ 2010 Two Door Cinema Club under exclusive license to Kitsuné France, Under exclusive license to Cooperative Music for Europe. Cooperative Music is a division of V2 Records International.'
  'http://purl.org/dc/elements/1.1/source' = 'http://www.tracker-project.org/ontologies/tracker#extractor-data-source'
  'http://purl.org/dc/elements/1.1/source' = 'urn:nepomuk:datasource:9291a450-1d49-11de-8c30-0800200c9a66'
  'http://purl.org/dc/elements/1.1/title' = 'Undercover Martyn'
  'tracker:added' = '2014-03-19T17:12:02Z'
  'tracker:modified' = '606'
  'rdf:type' = 'http://www.w3.org/2000/01/rdf-schema#Resource'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/01/19/nie#DataObject'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/01/19/nie#InformationElement'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#FileDataObject'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Media'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Audio'
  'rdf:type' = 'http://www.tracker-project.org/temp/nmm#MusicPiece'
  'nie:byteSize' = '6107668'
  'nie:dataSource' = 'http://www.tracker-project.org/ontologies/tracker#extractor-data-source'
  'nie:dataSource' = 'urn:nepomuk:datasource:9291a450-1d49-11de-8c30-0800200c9a66'
  'nie:isPartOf' = 'urn:uuid:346665e2-af93-1618-d120-ac4bafa1828b'
  'nie:url' = 'file:///home/martyn/Music/07%20Undercover%20Martyn.m4a'
  'nfo:belongsToContainer' = 'urn:uuid:346665e2-af93-1618-d120-ac4bafa1828b'
  'tracker:available' = 'true'
  'nie:contentCreated' = '1924-09-21T16:48:07Z'
  'nie:copyright' = '℗ 2010 Two Door Cinema Club under exclusive license to Kitsuné France, Under exclusive license to Cooperative Music for Europe. Cooperative Music is a division of V2 Records International.'
  'nie:informationElementDate' = '1924-09-21T16:48:07Z'
  'nie:isLogicalPartOf' = 'urn:album:Tourist%20History:Two%20Door%20Cinema%20Club'
  'nie:isLogicalPartOf' = 'urn:album-disc:Tourist%20History:Two%20Door%20Cinema%20Club:Disc1'
  'nie:isStoredAs' = 'urn:uuid:46c4268d-6d9b-8952-5352-1eb47114bd13'
  'nie:legal' = '℗ 2010 Two Door Cinema Club under exclusive license to Kitsuné France, Under exclusive license to Cooperative Music for Europe. Cooperative Music is a division of V2 Records International.'
  'nie:mimeType' = 'audio/mp4'
  'nie:title' = 'Undercover Martyn'
  'nco:contributor' = 'urn:artist:Two%20Door%20Cinema%20Club'
  'nco:contributor' = 'urn:artist:Alex%20Trimble,%20Sam%20Halliday%20&%20Kevin%20Baird'
  'nfo:fileLastAccessed' = '2014-03-19T15:33:12Z'
  'nfo:fileLastModified' = '2014-03-19T15:32:52Z'
  'nfo:fileName' = '07 Undercover Martyn.m4a'
  'nfo:fileSize' = '6107668'
  'nfo:codec' = 'MPEG-4 AAC'
  'nfo:duration' = '166'
  'nfo:genre' = 'Alternative'
  'nfo:channels' = '2'
  'nfo:sampleRate' = '44100.0'
  'nmm:composer' = 'urn:artist:Alex%20Trimble,%20Sam%20Halliday%20&%20Kevin%20Baird'
  'nmm:musicAlbum' = 'urn:album:Tourist%20History:Two%20Door%20Cinema%20Club'
  'nmm:musicAlbumDisc' = 'urn:album-disc:Tourist%20History:Two%20Door%20Cinema%20Club:Disc1'
  'nmm:performer' = 'urn:artist:Two%20Door%20Cinema%20Club'
  'nmm:trackNumber' = '7'
"""

So it's definitely working with the latest Tracker and the latest GStreamer.
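
The stored value `1924-09-21T16:48:07Z` is the same instant as the `1924-09-21T11:48:07-0500` tag that gst-discoverer printed in comment 4, just expressed in UTC. A small Python sketch of that conversion follows. It is only an illustration of the arithmetic, not the code Tracker or GStreamer use, and `to_utc_string` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def to_utc_string(value: str) -> str:
    """Parse 'YYYY-MM-DDThh:mm:ss+hhmm' style input and return it in UTC with 'Z'."""
    # %z accepts numeric offsets such as -0500.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S%z")
    # Shift to UTC, then drop tzinfo so isoformat() does not append '+00:00'.
    utc = parsed.astimezone(timezone.utc).replace(tzinfo=None)
    return utc.isoformat() + "Z"


if __name__ == "__main__":
    print(to_utc_string("1924-09-21T11:48:07-0500"))  # 1924-09-21T16:48:07Z
```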

For the JPEG, I've committed a patch. It was a wording mismatch. "dc:indentifier" should have been "dc:identifier". Fixed in master :)
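
A misspelled property name like this one can be caught before an update reaches the store by comparing the names an extractor emits with the properties the ontology defines. The sketch below is a hypothetical check in Python and is not part of Tracker: `KNOWN_PROPERTIES` is a small illustrative subset, and `unknown_properties` is a made-up helper name.

```python
import difflib

# Assumption: an illustrative subset of property names, not the full ontology.
KNOWN_PROPERTIES = {
    "dc:identifier",
    "dc:title",
    "dc:date",
    "nie:title",
    "nie:contentCreated",
    "nao:hasTag",
}


def unknown_properties(used, known=KNOWN_PROPERTIES):
    """Map each property missing from `known` to its closest known name, if any."""
    candidates = sorted(known)  # sorted for deterministic results
    return {
        name: difflib.get_close_matches(name, candidates, n=1)
        for name in used
        if name not in known
    }


if __name__ == "__main__":
    emitted = ["nie:title", "dc:indentifier", "nao:hasTag"]
    print(unknown_properties(emitted))
    # {'dc:indentifier': ['dc:identifier']}
```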

This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report.