GNOME Bugzilla – Bug 643388
Sqlite constraint violation
Last modified: 2013-03-12 17:56:04 UTC
Hi,

When plugging my USB key into my laptop, it gets mounted at /media/USB. Once indexed by tracker, here is what I get with tracker-info:

$ tracker-info file:///media/USB
Querying information for entity:'file:///media/USB'
  'urn:uuid:11de066a-310d-10e6-a6e9-88f6ac731b34'
Results:
  'http://purl.org/dc/elements/1.1/date' = '1981-06-05T02:20:00Z'
  'tracker:added' = '2011-02-27T01:03:29Z'
  'tracker:modified' = '529'
  'rdf:type' = 'http://www.w3.org/2000/01/rdf-schema#Resource'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/01/19/nie#DataObject'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/01/19/nie#InformationElement'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#FileDataObject'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#DataContainer'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#Folder'
  'nie:url' = 'file:///media/USB'
  'nie:isStoredAs' = 'urn:uuid:11de066a-310d-10e6-a6e9-88f6ac731b34'
  'nie:mimeType' = 'inode/directory'
  'nfo:fileLastModified' = '1981-06-05T02:20:00Z'

Note that nie:dataSource is missing. There is indeed a slight warning in tracker-store while it tries to add the directory to the database.

Attached are the interesting parts of the miner log and the store log.
Created attachment 182024 [details] Miner-fs log
Created attachment 182025 [details] Store log
It seems to be a cross referencing problem between the datasource's tracker:mountPoint property and the FileDataObject representing the mount point's nie:dataSource property.
ping ?
This is most likely triggered by the inverse functional property restriction of nie:url and therefore a bug in the miner, not in the store.
Isn't this problem related to the underlying use of SQL? Would a different kind of backend have handled it correctly?
Restricting nie:url was done intentionally as two resources with the same nie:url cause all kinds of problems in the miner. The error message should be better but we get that straight from SQLite.
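For readers unfamiliar with how such a restriction surfaces, here is a minimal sketch using Python's sqlite3 module. It assumes the inverse functional property ends up as a UNIQUE column; the table name `resource`, its layout, and the URNs are hypothetical and are not Tracker's actual schema.

```python
import sqlite3


def insert_resource(conn, urn, url):
    # Hypothetical one-table layout; Tracker's real schema differs.
    conn.execute('INSERT INTO resource (urn, "nie:url") VALUES (?, ?)', (urn, url))


def demo():
    conn = sqlite3.connect(":memory:")
    # UNIQUE models the inverse functional property: one resource per URL.
    conn.execute('CREATE TABLE resource (urn TEXT PRIMARY KEY, "nie:url" TEXT UNIQUE)')
    insert_resource(conn, "urn:uuid:first", "file:///media/USB")
    message = None
    try:
        # A second resource claiming the same URL violates the constraint.
        insert_resource(conn, "urn:uuid:second", "file:///media/USB")
    except sqlite3.IntegrityError as exc:
        # The text comes straight from SQLite and varies between versions.
        message = str(exc)
    rows = conn.execute("SELECT COUNT(*) FROM resource").fetchone()[0]
    return message, rows
```

Calling `demo()` returns the raw SQLite message together with the number of rows that survived (the second insert is rejected). The message text is whatever the SQLite library produces, which is why the store has little control over its wording.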
*** Bug 676154 has been marked as a duplicate of this bug. ***
I met this bug while executing a (slightly modified) tracker-tests/310-fts-indexing.py. The error can be seen in the output of `dbus-monitor --session`. For some reason I *sometimes* do not find it in the log file.

method call sender=:1.46 -> dest=org.freedesktop.Tracker1 serial=73 path=/org/freedesktop/Tracker1/Steroids; interface=org.freedesktop.Tracker1.Steroids; member=UpdateArray
(dbus-monitor too dumb to decipher arg type 'h')
method return sender=:1.48 -> dest=:1.46 reply_serial=73
   array [
      string "org.freedesktop.Tracker1.SparqlError.Internal"
      string "column nie:url is not unique (strerror of errno (not necessarily related): No such file or directory)"
   ]

Steps to reproduce:

The key is to repeat these steps: (1) remove the indexed file, (2) recreate it, (3) change its content. In my case the error is triggered when I run the steps two times. After that, the error appears every time the content of the file changes, i.e., steps (1) and (2) no longer need to be repeated.

*** Do not run the `rm -rf` inside your actual home directory! ***

# start with clean environment
$ tracker-control -t
$ rm -rf `find ~/ -iwholename '*tracker*'`

# this is how 310-fts-indexing.py sets up the tracker configuration
$ gconftool-2 /org/freedesktop/tracker/miner/files/index-recursive-directories -s --type=list --list-type=string '[/home/nemo/tracker-tests/test-monitored]'
$ gconftool-2 /org/freedesktop/tracker/miner/files/index-single-directories -s --type=list --list-type=string '[]'
$ gconftool-2 /org/freedesktop/tracker/miner/files/index-optical-discs -s --type=bool false
$ gconftool-2 /org/freedesktop/tracker/miner/files/index-removable-devices -s --type=bool false
$ gconftool-2 /org/freedesktop/tracker/miner/files/throttle -s --type=int 5

$ mkdir -p tracker-tests/test-monitored
$ tracker-control -s

# After every step wait a couple of seconds (until tracker gets idle)
$ echo automobile > tracker-tests/test-monitored/xxx.txt
$ rm tracker-tests/test-monitored/xxx.txt
$ echo automobile > tracker-tests/test-monitored/xxx.txt
$ echo autooomobile > tracker-tests/test-monitored/xxx.txt
$ rm tracker-tests/test-monitored/xxx.txt
$ echo automobile > tracker-tests/test-monitored/xxx.txt
$ echo autooomobile > tracker-tests/test-monitored/xxx.txt
It could be that when the new file arrives, the data from the old file hasn't been removed yet, so there is a conflict in nie:url (which is the same for both and must be unique in the system).

The bug is reported for 0.10. Any chance to try this with 0.14? There have been some important changes in the crawling.
(In reply to comment #10)
> Could be that when the new file arrives the data from the old file hasn't been
> removed and there is a conflict in nie:url (which is the same for both and must
> be unique in the system).
>
> The bug is reported for 0.10. Any chance to try this with 0.14? There has been
> some important changes in the crawling.

My testing, as reported in comment #9, has been done with 0.14 :-) (tracker-0.14.4-2.3.Nemo from Nemo Mobile)
Created attachment 236763 [details] Steps to reproduce with associated backtraces and sparql queries being executed by tracker-store. Comments inside
The error seems to be in the miner (or whatever calls GetMetadataFast). In the output of dbus-monitor, I can see GetMetadataFast always being called with the same resource URN, even after the resource has been destroyed and a new one created.

Example output:

method call sender=:1.234 -> dest=org.freedesktop.Tracker1.Extract serial=74 path=/org/freedesktop/Tracker1/Extract; interface=org.freedesktop.Tracker1.Extract; member=GetMetadataFast
   string "file:///home/nemo/tracker-tests/test-monitored/xxx.txt"
   string "text/plain"
   string "urn:uuid:472ed0cc-40ff-4e37-9c0c-062d78656540"
(dbus-monitor too dumb to decipher arg type 'h')
method return sender=:1.238 -> dest=:1.234 reply_serial=74
method call sender=:1.234 -> dest=org.freedesktop.Tracker1 serial=75 path=/org/freedesktop/Tracker1/Steroids; interface=org.freedesktop.Tracker1.Steroids; member=UpdateArray
(dbus-monitor too dumb to decipher arg type 'h')
method return sender=:1.236 -> dest=:1.234 reply_serial=75
   array [
      string ""
      string ""
   ]

My assumption is that the miner calls GetMetadataFast, gets the result in the form of a SPARQL query (with the wrong resource URN), and forwards the result to the store to be applied. Does the miner do some sort of URN caching?
(In reply to comment #13)
> The error seems to be in miner (or what calls GetMetadataFast). In the output
> of dbus-monitor, I can see GetMetadataFast being called always with the same
> resource URN passed - even at the time the resource is already destroyed and a
> new one created.
>

Not true -- it is the URN of the graph, not of the resource :-\
Created attachment 236903 [details] [review] Patch to forget removed regular files, not only directories.

This patch fixes the error for me. I am not sure whether it is 100% valid or free of side effects.

The problem I found is that the GFile instance is not removed from the cache when a single regular file is deleted. Certain data, such as the URN, is attached to the instance (with g_object_set_data()). When the file is recreated, it gets a new URN assigned, but the cached GFile instance still serves the old one. The stale cached URN is then used on the first update of the file.
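The stale-cache effect described above can be illustrated with a toy model. This is only a sketch of the idea, not the miner's code: `FileCache`, `Store`, and `simulate` are hypothetical names, and a plain dict stands in for the data attached to the cached GFile.

```python
import uuid


class Store:
    """Hypothetical stand-in for tracker-store: one URN per URL."""

    def __init__(self):
        self._by_url = {}

    def create(self, url):
        urn = "urn:uuid:%s" % uuid.uuid4()
        self._by_url[url] = urn
        return urn

    def delete(self, url):
        self._by_url.pop(url, None)

    def urn_for(self, url):
        return self._by_url.get(url)


class FileCache:
    """Hypothetical stand-in for the miner's cache of file objects."""

    def __init__(self, forget_regular_files):
        self.forget_regular_files = forget_regular_files
        self._urns = {}  # url -> URN remembered on the cached file object

    def remember(self, url, urn):
        self._urns[url] = urn

    def cached_urn(self, url):
        return self._urns.get(url)

    def file_deleted(self, url, is_directory):
        # Before the patch (as modelled here) only directories were forgotten.
        if is_directory or self.forget_regular_files:
            self._urns.pop(url, None)


def simulate(forget_regular_files):
    """Return True if the first update after recreation uses the right URN."""
    store = Store()
    cache = FileCache(forget_regular_files)
    url = "file:///tmp/xxx.txt"
    cache.remember(url, store.create(url))        # file indexed
    store.delete(url)                             # (1) file removed
    cache.file_deleted(url, is_directory=False)
    new_urn = store.create(url)                   # (2) file recreated
    used = cache.cached_urn(url) or new_urn       # (3) update prefers the cache
    return used == store.urn_for(url)
```

In this model, `simulate(forget_regular_files=False)` ends up updating with the old URN, while `simulate(forget_regular_files=True)` falls back to the freshly assigned one.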
Created attachment 237070 [details] [review] Patch for tracker-miner-fs to deal with data inserted by other apps

Another case of the failed UNIQUE constraint on "nie:url", this one met while executing tracker-tests/600-applications-camera.py. In this test case:

1) meta data for the test file is manually inserted
2) the test file is created
3) test file creation is notified by tracker-miner-fs
4) item_add_or_update() is passed is_new=TRUE, so it does not query the store for an existing URN (or existing meta data in general)
5) without an existing URN it tries to INSERT while it should UPDATE (i.e. remove the existing meta data and insert what it has extracted)

--> INSERT fails because there is already meta data stored for the test file.

I am not sure what performance impact the patch has.

This case might be more closely related to the original case with USB key insertion than the case described in my comment #9.
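The five steps above can be sketched with Python's sqlite3 module. This is a simplified model under stated assumptions: `open_store` and `add_or_update` are hypothetical helpers, the `resource` table is not Tracker's schema, and a plain UPDATE stands in for "remove the existing meta data and insert what was extracted".

```python
import sqlite3


def open_store():
    conn = sqlite3.connect(":memory:")
    # Hypothetical layout; UNIQUE on url models the nie:url restriction.
    conn.execute(
        "CREATE TABLE resource (urn TEXT PRIMARY KEY, url TEXT UNIQUE, mime TEXT)"
    )
    return conn


def add_or_update(conn, url, mime, is_new, new_urn):
    """Toy version of the decision described in steps 4) and 5)."""
    existing = None
    if not is_new:
        # Only consult the store when the file is not assumed to be new.
        row = conn.execute(
            "SELECT urn FROM resource WHERE url = ?", (url,)
        ).fetchone()
        existing = row[0] if row else None
    if existing is not None:
        conn.execute("UPDATE resource SET mime = ? WHERE urn = ?", (mime, existing))
        return existing
    # No known URN: plain INSERT, which fails if another app got there first.
    conn.execute(
        "INSERT INTO resource (urn, url, mime) VALUES (?, ?, ?)",
        (new_urn, url, mime),
    )
    return new_urn
```

If another application has already inserted a row for the URL, calling `add_or_update(..., is_new=True, ...)` raises `sqlite3.IntegrityError`, whereas `is_new=False` finds the existing URN and updates it. The extra SELECT is the cost the performance remark refers to.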
Review of attachment 236903 [details] [review]: Good catch. If a regular file's GFile is still accessible in the cache, that likely means a reference to the GFile is still being held by either the TrackerFileNotifier or the processing queues. It makes sense to put the file out of the cache when it is removed, as the processing queues will do the same when getting TrackerFileNotifier::file-deleted.
Review of attachment 237070 [details] [review]: There should be little impact with that change when doing regular crawling, as TrackerFileNotifier preemptively gets that data when preparing for crawling, so they'd be available during the lifetime of the crawling operation. On monitor operations this was invariantly called anyway, so I think just removing the // comments and indenting properly will do
This problem has been fixed in the development version. The fix will be available in the next major software release. Thank you for your bug report. This will be in 0.15.4 and 0.16.0. Thank you for the patch Martin!