GNOME Bugzilla – Bug 395277
tracking yielding odd results on FUSE filesystem
Last modified: 2010-05-17 13:33:38 UTC
Tracker crashes when indexing an encfs (encrypted) FUSE filesystem. However, after changing MAP_SHARED to MAP_PRIVATE in xdmmimecache.c and depot.c, Tracker seems to run through its paces and index as usual. Once indexing is done, though, tracker-search only yields results for NEW words added to the index since trackerd was last run.

Here are the results of a terminal session that demonstrates the behaviour. The corresponding debug log for this session will be attached.

francis@francis-laptop:~/temp$ cat test.txt
hello another
francis@francis-laptop:~/temp$ tracker-search another
francis@francis-laptop:~/temp$ echo 'this is yet another testing' >>test.txt
francis@francis-laptop:~/temp$ tracker-search another
francis@francis-laptop:~/temp$ tracker-search testing
/home/francis/temp/test.txt
francis@francis-laptop:~/temp$ touch test2.txt
francis@francis-laptop:~/temp$ echo 'this is yet another testing' >>test2.txt
francis@francis-laptop:~/temp$ tracker-search another
francis@francis-laptop:~/temp$ tracker-search testing
/home/francis/temp/test.txt
/home/francis/temp/test2.txt
francis@francis-laptop:~/temp$ echo 'introducing a new word: esophagus' >>test2.txt
francis@francis-laptop:~/temp$ tracker-search esophagus
/home/francis/temp/test2.txt
francis@francis-laptop:~/temp$ tracker-search word
/home/francis/temp/test2.txt
francis@francis-laptop:~/temp$ tracker-search introducing
/home/francis/temp/test2.txt
francis@francis-laptop:~/temp$ echo 'adding a word previously in index: another' >>test2.txt
francis@francis-laptop:~/temp$ tracker-search another
francis@francis-laptop:~/temp$ tracker-search previously
/home/francis/temp/test2.txt

< kill trackerd in other terminal >

francis@francis-laptop:~/temp$ killall trackerd
trackerd: no process killed

< start trackerd in other terminal >

francis@francis-laptop:~/temp$ ps ax | grep trackerd
30479 pts/0 SNl+ 0:02 trackerd --debug
30498 pts/1 S+ 0:00 grep trackerd
francis@francis-laptop:~/temp$ tracker-search previously
francis@francis-laptop:~/temp$ tracker-search introducing
francis@francis-laptop:~/temp$ echo 'add new word to index: propagate' >>test2.txt
francis@francis-laptop:~/temp$ tracker-search propagate
/home/francis/temp/test2.txt
francis@francis-laptop:~/temp$ tracker-search index
/home/francis/temp/test2.txt
francis@francis-laptop:~/temp$ tracker-search word
/home/francis/temp/test2.txt
francis@francis-laptop:~/temp$ cat test2.txt
this is yet another testing
introducing a new word: esophagus
adding a word previously in index: another
add new word to index: propagate
francis@francis-laptop:~/temp$ tracker-search esophagus
francis@francis-laptop:~/temp$ tracker-search another
francis@francis-laptop:~/temp$
Created attachment 80026
Tracker log for test session

This is the debug log for the tracker test session above.
I was able to replicate the bug. Same problem here.
Just to add some additional information: there is no problem indexing files that reside on an encfs/FUSE filesystem. You only get problems when the Tracker database files ~/.Tracker/database/* are on an encfs/FUSE filesystem. As a workaround, you could create this directory on a non-encfs/FUSE filesystem and symlink it to ~/.Tracker/database.
.Tracker databases should now be fine on FUSE mounted folders.

I could not find anything wrong, so I am setting this NEEDINFO.

If you still have problems with the latest subversion source, then please provide more details on your:

1) setup - is your home dir completely in FUSE?
2) does search work at all?
3) have you waited for tracker to finish indexing? (on first run it will say optimising database at the end)
(In reply to comment #4)
> .Tracker databases should now be fine on FUSE mounted folders
>
> I could not find anything wrong so am setting this NEEDINFO
>
> If you still have problems with latest subversion source then please provide
> more details on your:
>
> 1) setup - is your home dir completely in FUSE?
> 2) does search work at all?
> 3) have you waited for tracker to finish indexing? (on first run it will say
> optimising database at the end)
>

1. Yes.
2. Yes, with the latest svn, but with limits*
3. Yes.

*) You can only search the first time you index your files. Once you restart trackerd, you can no longer search for any previously indexed files. Only newly created files can be found if you try to search.

Also, once you've let trackerd run for the first time and let it index all files, you can no longer search after the indexing has been _completed_. You can only search during the actual indexing.

I've tried both the built-in SQLite and an external thread-safe SQLite (3.3.5). There are no differences between the two.

By the way, trackerd always complains about "Checking tracker DB version...Current version is 14 and needed version is 13" (yes, even if I've removed everything related to an external SQLite).
(In reply to comment #5)
> (In reply to comment #4)
> > .Tracker databases should now be fine on FUSE mounted folders
> >
> > I could not find anything wrong so am setting this NEEDINFO
> >
> > If you still have problems with latest subversion source then please provide
> > more details on your:
> >
> > 1) setup - is your home dir completely in FUSE?
> > 2) does search work at all?
> > 3) have you waited for tracker to finish indexing? (on first run it will say
> > optimising database at the end)
> >
>
> 1. Yes.
> 2. Yes, with the latest svn, but with limits*
> 3. Yes.
>
> *) You can only search for the first time you index your files. Once you
> restart trackerd you can no longer search for any previously indexed files.
> Only newly created files can be found if you try to search.
>
> Also, once you've let trackerd run for the first time and let it index all
> files, you can no longer search after the indexing has been _completed_. You
> can only search during the actual indexing.

I noticed exactly the same behaviour. As long as trackerd is still indexing, searching yields results. As soon as trackerd has finished indexing, tracker-search returns nothing anymore. Killing trackerd and restarting does not help.

> I've tried both using the built-in SQLite and an external thread-safe SQLite
> (3.3.5). There are no differences between the two.

Yes, I also tried both and haven't noticed a difference.

> By the way, Trackerd always complains about "Checking tracker DB
> version...Current version is 14 and needed version is 13" (yes, even if I've
> removed everything related to an external SQLite).

Michael
(In reply to comment #6)
> > Also, once you've let trackerd run for the first time and let it index all
> > files, you can no longer search after the indexing has been _completed_. You
> > can only search during the actual indexing.
>
> I noticed exactly the same behaviour. As long as trackerd is still indexing,
> searching yields results. As soon as trackerd has finished indexing,
> tracker-search returns nothing anymore. Killing trackerd and restarting does
> not help.

Can we separate what works with a partially mapped Home from what works with a fully mapped Home under FUSE?

I cannot replicate any of the above on a partially mapped Home folder.
(In reply to comment #7)
> Can we separate what works with partially mapped Home and fully mapped Home with
> FUSE?
>
> I cannot replicate any of the above on a partially mapped Home folder.

The tests I made showed that it does not matter whether the data to be indexed is on a FUSE/encfs filesystem or not. That works just fine. The problem arises when the database (everything in ~/.Tracker/) is on the encfs.

To test it, you can try the following:

1.) modprobe fuse (as root)
2.) encfs ~/.crypt ~/crypt
3.) killall trackerd
4.) rm -rf .Tracker
5.) mkdir ~/crypt/.Tracker
6.) ln -s ~/crypt/.Tracker .
7.) trackerd

With all versions prior to r398, this led to an immediate crash. Since r398, the indexing runs just fine, and during the initial index run I am also able to get search results with tracker-search. But as soon as the initial indexing run has finished, tracker-search returns nothing anymore. Restarting trackerd doesn't change that.

I tested this on an i386 platform, kernel 2.6.19.2, fuse-utils-2.5.3, tracker-0.5.4_svn399 on my Debian unstable system (but could reproduce it on a Ubuntu feisty system).

Michael
(In reply to comment #5)
>
> *) You can only search for the first time you index your files. Once you
> restart trackerd you can no longer search for any previously indexed files.
> Only newly created files can be found if you try to search.

I am experiencing the exact same behavior. My home folder is encfs encrypted. However, I have experienced the same even when creating the .Tracker folder elsewhere and having a symlink to it in ~/.Tracker.

> Also, once you've let trackerd run for the first time and let it index all
> files, you can no longer search after the indexing has been _completed_. You
> can only search during the actual indexing.

In my experience, searching yields no results even while indexing, but perhaps my testing was faulty on this. It definitely yields no results once indexing is done, except on new files created during that trackerd session.

I find it easy to recreate the problem. The folder in question is not in my encfs encrypted home folder, though I get the same behaviour with my home folder.

1. Let trackerd finish indexing.
2. $ echo "testing" >test.html (a new file)
3. $ tracker-search testing --> /data/web/test.html
4. $ killall trackerd
5. $ trackerd --debug (finishes loading/indexing)
6. $ tracker-search testing --> (no results)
7. $ echo "more testing" >>test.html (adding to file created in step 2)
8. $ tracker-search testing --> /data/web/test.html

The only thing I can find in the log that seems odd is the following entry, which appears when step 7 above is run:

File thread awoken
file extension is html
Indexing /data/web/test.html with service Documents and mime text/html (existing)
word data has old score 1 and new score 0 so updating with total score -1
word web has old score 1 and new score 0 so updating with total score -1
word data has old score 1 and new score 1 so updating with total score 0
word web has old score 1 and new score 1 so updating with total score 0
word text has old score 25 and new score 0 so updating with total score -25
word html has old score 25 and new score 0 so updating with total score -25
word text has old score 25 and new score 25 so updating with total score 0
word html has old score 25 and new score 25 so updating with total score 0
word test has old score 20 and new score 0 so updating with total score -20
word html has old score 20 and new score 0 so updating with total score -20
word test has old score 20 and new score 20 so updating with total score 0
word html has old score 20 and new score 20 so updating with total score 0
word html has old score 50 and new score 0 so updating with total score -50
word html has old score 50 and new score 50 so updating with total score 0
extracting text for /data/web/test.html using filter /usr/lib/tracker/filters/text/html_filter
word test has old score 2 and new score 3 so updating with total score 1
compressed full text size of 34 to 25

I don't know how the scoring works, but I don't understand the second-to-last line. It found the word "test" 2 times previously and now 3 times (since I added "more testing" to the file), so the final score is 1? I don't know if that's an error or just normal behaviour, but I thought I'd mention it just in case.

> I've tried both using the built-in SQLite and an external thread-safe SQLite
> (3.3.5). There are no differences between the two.

I'm running SQLite 3.3.8 on Ubuntu Edgy, compiling tracker from SVN (with all modules showing "yes" after running autogen).
> File thread awoken
> file extension is html
> Indexing /data/web/test.html with service Documents and mime text/html
> (existing)
> word data has old score 1 and new score 0 so updating with total score -1
> word web has old score 1 and new score 0 so updating with total score -1
> word data has old score 1 and new score 1 so updating with total score 0
> word web has old score 1 and new score 1 so updating with total score 0
> word text has old score 25 and new score 0 so updating with total score -25
> word html has old score 25 and new score 0 so updating with total score -25
> word text has old score 25 and new score 25 so updating with total score 0
> word html has old score 25 and new score 25 so updating with total score 0
> word test has old score 20 and new score 0 so updating with total score -20
> word html has old score 20 and new score 0 so updating with total score -20
> word test has old score 20 and new score 20 so updating with total score 0
> word html has old score 20 and new score 20 so updating with total score 0
> word html has old score 50 and new score 0 so updating with total score -50
> word html has old score 50 and new score 50 so updating with total score 0
> extracting text for /data/web/test.html using filter
> /usr/lib/tracker/filters/text/html_filter
> word test has old score 2 and new score 3 so updating with total score 1
> compressed full text size of 34 to 25
>
> I don't know how the scoring works, but I don't understand the second to last
> line. It found the word "test" 2 times previously and now 3 times (since I
> added "more testing" to the file), so the final score is 1 ? I don't know if
> that's an error or just normal behaviour but I thought to mention it just in
> case.

Tracker is a differential indexer, so an update with a total score of 1 is really +1 applied to the existing score. Likewise, a total score of -30 is -30 applied to the existing score.

A negative total score equal in magnitude to the old score means the word is deleted from the index. For example, "old score 25 and new score 0 so updating with total score -25" deletes the word from the index, because -25 applied to the old score of 25 totals zero.

Anyway, I will try to find time to investigate this further...
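
The differential update described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic in the log lines, not Tracker's actual code: the `apply_delta` function and the plain dict used as an index are hypothetical names introduced for this example.

```python
def apply_delta(index, word, old_score, new_score):
    """Apply a differential score update to a word in a toy index.

    The "total score" in the log is the difference new_score - old_score,
    which is added to whatever score the index already holds for the word.
    """
    delta = new_score - old_score
    total = index.get(word, 0) + delta
    if total <= 0:
        # A delta that cancels the existing score removes the word.
        index.pop(word, None)
    else:
        index[word] = total
    return delta


index = {"test": 2, "text": 25}
print(apply_delta(index, "test", 2, 3))   # delta applied to "test"
print(apply_delta(index, "text", 25, 0))  # delta that deletes "text"
print(index)
```

With the values from the log, "test" moves from 2 to 3 through a delta of 1, and a delta of -25 against an existing score of 25 removes "text" from the dict.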
All these bugs under FUSE also occur without it.

mmap cannot be switched from MAP_SHARED to MAP_PRIVATE, as the latter will not write anything to the file according to the man pages for mmap. The optimisation method that is called at the end of the first tracker run uses mmap, so with MAP_PRIVATE it never wrote an optimised index but simply destroyed the original index, hence no search results. Changing it back to MAP_SHARED cures this outside of FUSE.

As FUSE does not support shared mmap, I have found a define switch in our inlined copy of qdbm to disable use of mmap altogether. I have hardcoded this define in latest svn.

Please test latest svn and see if it cures this problem.
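
The difference between the two mapping modes can be seen with a small Python sketch on an ordinary (non-FUSE) Unix filesystem. It is not taken from Tracker or qdbm; the helper name `write_through_mapping` and the file contents are made up for illustration, and the `flags` and `prot` arguments of `mmap.mmap` are available on Unix only.

```python
import mmap
import os
import tempfile


def write_through_mapping(flags):
    """Overwrite the first bytes of a temp file through an mmap created
    with the given flags, then return what is actually in the file."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"OLD-INDEX")
        prot = mmap.PROT_READ | mmap.PROT_WRITE
        with mmap.mmap(fd, 0, flags=flags, prot=prot) as m:
            m[0:3] = b"NEW"
            if flags == mmap.MAP_SHARED:
                m.flush()  # push the shared pages back to the file
        os.close(fd)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)


print(write_through_mapping(mmap.MAP_SHARED))   # file contains the new bytes
print(write_through_mapping(mmap.MAP_PRIVATE))  # file is unchanged
```

With MAP_SHARED the write reaches the file, while with MAP_PRIVATE the change stays in a private copy and the file keeps its old contents. That matches the explanation above of why an index rewritten through a private mapping was never saved.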
(In reply to comment #11)
> All these bugs under Fuse also occur without it.

You are right. When I tested r401, as soon as trackerd flushed out the data, it no longer showed up in the search results.

> mmap cannot be switched from MAP_SHARED to MAP_PRIVATE as the latter will not
> write anything to the file according to the man pages for mmap (the
> optimisation method which is called at the end of the first tracker run uses
> mmap so with PRIVATE it never wrote an optimised index but simply destroyed the
> original index hence no search results). Changing it back to shared cures this
> outside of FUSE.
>
> As FUSE does not support SHARED mmap, I have found a define switch in our
> inlined copy of qdbm to disable use of mmap altogether. I have hardcoded this
> define in latest svn.
>
> Please test latest svn and see if it cures this problem.

I tested r404 (with .Tracker on encfs) and it works absolutely fine, no problems anymore.

And the speed improvements are outstanding. Great work!
Looking forward to 0.5.4.
(In reply to comment #12)
>
> I tested r404 (with .Tracker on encfs) and it works absolutely fine, no
> problems anymore.
>
> And the speed improvements are outstanding. Great work!
> Looking forward to 0.5.4.

Unfortunately, it seemed to work for me too at first, but then I noticed that I cannot search for newly added files (files added while trackerd is running). Please see if you can replicate it.
(In reply to comment #13)
> (In reply to comment #12)
> >
> > I tested r404 (with .Tracker on encfs) and it works absolutely fine, no
> > problems anymore.
> >
> > And the speed improvements are outstanding. Great work!
> > Looking forward to 0.5.4.
> >
>
> Unfortunately it seemed to work for me too at first, but then I noticed that I
> cannot search for newly added files (which are added when trackerd is
> running). Please, see if you can replicate it?

Perhaps I was a bit hasty when claiming that it didn't work. I should have let it index everything first, so please ignore my previous comment.
Yep, it seems to work just fine now. Nice. :)
marking as fixed
Was away for a few days so am late to the party, but I wanted to confirm that I tried the latest svn version and it all works great with FUSE/encfs. Thanks, Jamie. You da man!
Moving "Indexer" component bugs to "General" since "Indexer" refers to the old 0.6 architecture