Bug 395277 - tracking yielding odd results on FUSE filesystem
Status: RESOLVED FIXED
Product: tracker
Classification: Core
Component: General
Version: unspecified
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Jamie McCracken
Jamie McCracken
Depends on:
Blocks:
 
 
Reported: 2007-01-11 04:42 UTC by zerohalo
Modified: 2010-05-17 13:33 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Tracker log for test session (26.34 KB, text/x-log)
2007-01-11 14:15 UTC, zerohalo

Description zerohalo 2007-01-11 04:42:13 UTC
Tracker crashes when indexing an encfs (encrypted) FUSE filesystem. Changing MAP_SHARED to MAP_PRIVATE in xdgmimecache.c and depot.c lets Tracker run through its paces and index as usual. However, once indexing is done, tracker-search only yields results for NEW words added to the index since trackerd was last started.

Here are the results of a terminal session that demonstrates the behaviour. The corresponding debug log for this session will be attached.

francis@francis-laptop:~/temp$ cat test.txt
hello
 another
francis@francis-laptop:~/temp$ tracker-search another
francis@francis-laptop:~/temp$ echo 'this is yet another testing' >>test.txt
francis@francis-laptop:~/temp$ tracker-search another
francis@francis-laptop:~/temp$ tracker-search testing
/home/francis/temp/test.txt
francis@francis-laptop:~/temp$ touch test2.txt
francis@francis-laptop:~/temp$ echo 'this is yet another testing' >>test2.txt
francis@francis-laptop:~/temp$ tracker-search another
francis@francis-laptop:~/temp$ tracker-search testing
/home/francis/temp/test.txt
/home/francis/temp/test2.txt
francis@francis-laptop:~/temp$ echo 'introducing a new word: esophagus' >>test2.txt
francis@francis-laptop:~/temp$ tracker-search esophagus
/home/francis/temp/test2.txt
francis@francis-laptop:~/temp$ tracker-search word
/home/francis/temp/test2.txt
francis@francis-laptop:~/temp$ tracker-search introducing
/home/francis/temp/test2.txt
francis@francis-laptop:~/temp$ echo 'adding a word previously in index: another' >>test2.txt
francis@francis-laptop:~/temp$ tracker-search another
francis@francis-laptop:~/temp$ tracker-search previously
/home/francis/temp/test2.txt
< kill trackerd in other terminal >
francis@francis-laptop:~/temp$ killall trackerd
trackerd: no process killed
< start trackerd in other terminal >
francis@francis-laptop:~/temp$ ps ax | grep trackerd
30479 pts/0    SNl+   0:02 trackerd --debug
30498 pts/1    S+     0:00 grep trackerd
francis@francis-laptop:~/temp$ tracker-search previously
francis@francis-laptop:~/temp$ tracker-search introducing
francis@francis-laptop:~/temp$ echo 'add new word to index: propagate' >>test2.txt
francis@francis-laptop:~/temp$ tracker-search propagate
/home/francis/temp/test2.txt
francis@francis-laptop:~/temp$ tracker-search index
/home/francis/temp/test2.txt
francis@francis-laptop:~/temp$ tracker-search word
/home/francis/temp/test2.txt
francis@francis-laptop:~/temp$ cat test2.txt
this is yet another testing
introducing a new word: esophagus
adding a word previously in index: another
add new word to index: propagate
francis@francis-laptop:~/temp$ tracker-search esophagus
francis@francis-laptop:~/temp$ tracker-search another
francis@francis-laptop:~/temp$
Comment 1 zerohalo 2007-01-11 14:15:05 UTC
Created attachment 80026
Tracker log for test session

This is the debug log for the tracker test session above.
Comment 2 Fredrik Blom 2007-01-12 01:14:46 UTC
I was able to replicate the bug. Same problem here.
Comment 3 Michael Biebl 2007-01-13 00:43:04 UTC
Just to add some additional information:
Indexing files that reside on an encfs/fuse filesystem is not a problem.
You only get problems when the tracker database files ~/.Tracker/database/* are on an encfs/fuse filesystem.
As a workaround, you could create this directory on a non-encfs/fuse filesystem and symlink it to ~/.Tracker/database.
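
For anyone who wants to script the workaround above, here is a minimal sketch in Python. It is not part of Tracker; the function name relocate_database and both paths are hypothetical, and it assumes trackerd has been stopped before it runs.

```python
import os
import shutil


def relocate_database(db_dir, new_location):
    """Move db_dir to new_location and leave a symlink at the old path.

    Both paths are hypothetical examples; stop trackerd before calling this.
    """
    db_dir = os.path.abspath(db_dir)
    new_location = os.path.abspath(new_location)
    if os.path.islink(db_dir):
        return  # already a symlink, assume it was relocated earlier
    if os.path.exists(new_location):
        raise FileExistsError(new_location)
    os.makedirs(os.path.dirname(new_location), exist_ok=True)
    if os.path.isdir(db_dir):
        shutil.move(db_dir, new_location)  # keep the existing index files
    else:
        os.makedirs(new_location)  # nothing indexed yet, start empty
    os.makedirs(os.path.dirname(db_dir), exist_ok=True)
    os.symlink(new_location, db_dir)
```

It would be called as something like relocate_database(os.path.expanduser("~/.Tracker/database"), "/var/tmp/tracker-db"), where the second path is only a placeholder for any directory that is not on encfs/fuse.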

 
Comment 4 Jamie McCracken 2007-01-15 02:18:48 UTC
.Tracker databases should now be fine on FUSE mounted folders

I could not find anything wrong so am setting this NEEDINFO

If you still have problems with latest subversion source then please provide more details on your:

1) setup - is your home dir completely in FUSE?
2) does search work at all?
3) have you waited for tracker to finish indexing? (on first run it will say optimising database at the end)
Comment 5 Fredrik Blom 2007-01-15 19:53:39 UTC
(In reply to comment #4)
> .Tracker databases should now be fine on FUSE mounted folders
> 
> I could not find anything wrong so am setting this NEEDINFO
> 
> If you still have problems with latest subversion source then please provide
> more details on your:
> 
> 1) setup - is your home dir completely in FUSE?
> 2) does search work at all?
> 3) have you waited for tracker to finish indexing? (on first run it will say
> optimising database at the end)
> 

1. Yes.
2. Yes, with the latest svn, but with limits*
3. Yes.

*) You can only search the first time you index your files. Once you restart trackerd, you can no longer search for any previously indexed files. Only newly created files can be found.

Also, once trackerd has run for the first time and indexed all files, you can no longer search after the indexing has been _completed_. You can only search during the actual indexing.

I've tried both using the built-in SQLite and an external thread-safe SQLite (3.3.5). There are no differences between the two.

By the way, Trackerd always complains about "Checking tracker DB version...Current version is 14 and needed version is 13" (yes, even if I've removed everything related to an external SQLite).
Comment 6 Michael Biebl 2007-01-15 21:08:59 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > .Tracker databases should now be fine on FUSE mounted folders
> > 
> > I could not find anything wrong so am setting this NEEDINFO
> > 
> > If you still have problems with latest subversion source then please provide
> > more details on your:
> > 
> > 1) setup - is your home dir completely in FUSE?
> > 2) does search work at all?
> > 3) have you waited for tracker to finish indexing? (on first run it will say
> > optimising database at the end)
> > 
> 
> 1. Yes.
> 2. Yes, with the latest svn, but with limits*
> 3. Yes.
> 
> *) You can only search for the first time you index your files. Once you
> restart trackerd you can no longer search for any previously indexed files.
> Only newly created files can be found if you try to search.
> 
> Also, once you've let trackerd run for the first time and let it index all
> files, you can no longer search after the indexing has been _completed_. You
> can only search during the actual indexing.

I noticed exactly the same behaviour. As long as trackerd is still indexing, searching yields results. As soon as trackerd has finished indexing, tracker-search no longer returns anything. Killing and restarting trackerd does not help.

> 
> I've tried both using the built-in SQLite and an external thread-safe SQLite
> (3.3.5). There are no differences between the two.

Yes, I also tried both and haven't noticed a difference.

> By the way, Trackerd always complains about "Checking tracker DB
> version...Current version is 14 and needed version is 13" (yes, even if I've
> removed everything related to an external SQLite).
> 

Michael
Comment 7 Jamie McCracken 2007-01-15 22:19:38 UTC
(In reply to comment #6)

> > Also, once you've let trackerd run for the first time and let it index all
> > files, you can no longer search after the indexing has been _completed_. You
> > can only search during the actual indexing.
> 
> I noticed exactly the same behaviour. As long as trackerd is still indexing,
> searching yields results. As soon as trackerd has finished indexing,
> tracker-search returns nothing anymore. Killing trackerd and restarting does
> not help.

Can we separate what works with a partially mapped Home from what works with a Home fully mapped with FUSE?

I cannot replicate any of the above on a partially mapped Home folder.


Comment 8 Michael Biebl 2007-01-15 23:40:40 UTC
(In reply to comment #7)

> Can we separate what works with partial mapped Home and fully mapped Home with
> FUSE?
> 
> I cannot replicate any of the above on a partial mapped Home folder.
> 

My tests showed that it does not matter whether the data to be indexed is on a fuse/encfs filesystem. That works just fine. The problem appears when the database (everything in ~/.Tracker/) is on the encfs.

To test it, you can try the following:
1.) modprobe fuse (as root)
2.) encfs ~/.crypt ~/crypt
3.) killall trackerd
4.) rm -rf ~/.Tracker
5.) mkdir ~/crypt/.Tracker
6.) ln -s ~/crypt/.Tracker ~/.Tracker
7.) trackerd

With all versions prior to r398 this led to an immediate crash.
Since r398, the indexing runs just fine. And during the initial index run I'm also able to get search results with tracker-search. But as soon as the initial indexing run has finished, tracker-search returns nothing anymore. Restarting trackerd doesn't change that.

I tested this on an i386 platform, kernel 2.6.19.2, fuse-utils-2.5.3, tracker-0.5.4_svn399 on my Debian unstable system (but could reproduce it on an Ubuntu feisty system).
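
Since the deciding factor is whether the database directory sits on a FUSE mount, a quick check can help when comparing setups. The following is a minimal sketch in Python, not anything Tracker ships: the helper names are hypothetical, it reads text in the /proc/mounts format, and it assumes FUSE mounts report a filesystem type of "fuse" or "fuse.<subtype>" (for example "fuse.encfs"). Mount points containing spaces (escaped as \040 in /proc/mounts) are not handled.

```python
import os


def parse_mounts(mounts_text):
    """Return (mount_point, fs_type) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, mounts):
    """Return the fs type of the longest mount point that contains path."""
    path = os.path.normpath(path)
    best = None
    for mount_point, fs_type in mounts:
        mp = os.path.normpath(mount_point)
        if path == mp or path.startswith(mp.rstrip("/") + "/"):
            # ">=" lets a later mount on the same mount point win
            if best is None or len(mp) >= len(best[0]):
                best = (mp, fs_type)
    return best[1] if best else None


def is_on_fuse(path, mounts):
    """True if path lives on a mount whose type is fuse or fuse.<subtype>."""
    fs_type = fs_type_for(path, mounts)
    return fs_type is not None and (
        fs_type == "fuse" or fs_type.startswith("fuse.")
    )
```

On a live system the input would come from open("/proc/mounts").read(), and the path should be passed through os.path.realpath() first so that a symlinked ~/.Tracker resolves to the directory it actually points at.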

Michael 
 
Comment 9 zerohalo 2007-01-16 16:20:32 UTC
(In reply to comment #5)
> 
> *) You can only search for the first time you index your files. Once you
> restart trackerd you can no longer search for any previously indexed files.
> Only newly created files can be found if you try to search.

I am experiencing the exact same behavior. My home folder is encfs encrypted. However, I have experienced the same even when creating the .Tracker folder elsewhere and having a symlink to it in ~/.Tracker.

> 
> Also, once you've let trackerd run for the first time and let it index all
> files, you can no longer search after the indexing has been _completed_. You
> can only search during the actual indexing.

In my experience searching yields no results even while indexing, but perhaps my testing was faulty on this. It definitely yields no results once indexing is done, except for new files created during that trackerd session.

I find it easy to recreate the problem. The folder in question is not in my encfs encrypted home folder, though I get the same behaviour with my home folder.

1. Let trackerd finish indexing.
2. $ echo "testing" >test.html  (a new file)
3. $ tracker-search testing
--> /data/web/test.html
4. $ killall trackerd
5. $ trackerd --debug
(finishes loading/indexing)
6. $ tracker-search testing
--> (no results)
7. $ echo "more testing" >>test.html  (adding to file created in step 2)
8. $ tracker-search testing
--> /data/web/test.html

The only thing I can find in the log that seems odd is the following entry, which comes when step 7 is run above:

File thread awoken
file extension is html
Indexing /data/web/test.html with service Documents and mime text/html (existing)
word data has old score 1 and new score 0 so updating with total score -1
word web has old score 1 and new score 0 so updating with total score -1
word data has old score 1 and new score 1 so updating with total score 0
word web has old score 1 and new score 1 so updating with total score 0
word text has old score 25 and new score 0 so updating with total score -25
word html has old score 25 and new score 0 so updating with total score -25
word text has old score 25 and new score 25 so updating with total score 0
word html has old score 25 and new score 25 so updating with total score 0
word test has old score 20 and new score 0 so updating with total score -20
word html has old score 20 and new score 0 so updating with total score -20
word test has old score 20 and new score 20 so updating with total score 0
word html has old score 20 and new score 20 so updating with total score 0
word html has old score 50 and new score 0 so updating with total score -50
word html has old score 50 and new score 50 so updating with total score 0
extracting text for /data/web/test.html using filter /usr/lib/tracker/filters/text/html_filter
word test has old score 2 and new score 3 so updating with total score 1
compressed full text size of 34 to 25

I don't know how the scoring works, but I don't understand the second to last line. It found the word "test" 2 times previously and now 3 times (since I added "more testing" to the file), so the final score is 1? I don't know if that's an error or just normal behaviour, but I thought I would mention it just in case.

> 
> I've tried both using the built-in SQLite and an external thread-safe SQLite
> (3.3.5). There are no differences between the two.

I'm running SQLite 3.3.8 on Ubuntu Edgy. Compiling tracker from SVN (with all modules showing "yes" after running autogen). 

Comment 10 Jamie McCracken 2007-01-16 16:31:59 UTC
> File thread awoken
> file extension is html
> Indexing /data/web/test.html with service Documents and mime text/html
> (existing)
> word data has old score 1 and new score 0 so updating with total score -1
> word web has old score 1 and new score 0 so updating with total score -1
> word data has old score 1 and new score 1 so updating with total score 0
> word web has old score 1 and new score 1 so updating with total score 0
> word text has old score 25 and new score 0 so updating with total score -25
> word html has old score 25 and new score 0 so updating with total score -25
> word text has old score 25 and new score 25 so updating with total score 0
> word html has old score 25 and new score 25 so updating with total score 0
> word test has old score 20 and new score 0 so updating with total score -20
> word html has old score 20 and new score 0 so updating with total score -20
> word test has old score 20 and new score 20 so updating with total score 0
> word html has old score 20 and new score 20 so updating with total score 0
> word html has old score 50 and new score 0 so updating with total score -50
> word html has old score 50 and new score 50 so updating with total score 0
> extracting text for /data/web/test.html using filter
> /usr/lib/tracker/filters/text/html_filter
> word test has old score 2 and new score 3 so updating with total score 1
> compressed full text size of 34 to 25
> 
> I don't know how the scoring works, but I don't understand the second to last
> line. It found the word "test" 2 times previously and now 3 times (since I
> added "more testing" to the file), so the final score is 1 ? I don't know if
> that's an error or just normal behaviour but I thought to mention it just in
> case.
> 

tracker is a differential indexer, so an update with a total score of 1 is really +1 applied to the existing score. Likewise, a total score of -30 is -30 applied to the existing score.

A negative total score whose magnitude equals the old score means the word is deleted from the index.

E.g.

old score 25 and new score 0 so updating with total score -25

would delete the word from the index, as a total score of -25 applied to the old score of 25 totals zero.
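
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the differential update as described in this comment, not Tracker's actual code: the function names are hypothetical, and the index is simplified to a plain dict of word to score, whereas the real index lives in qdbm.

```python
def score_delta(old_score, new_score):
    """The "total score" from the log: the change to apply, not the result."""
    return new_score - old_score


def apply_update(index, word, delta):
    """Apply a differential update to a simplified word -> score index.

    A word whose score drops to zero (or below) is removed from the index.
    """
    total = index.get(word, 0) + delta
    if total <= 0:
        index.pop(word, None)
    else:
        index[word] = total
```

With an index of {"text": 25, "test": 2}, the log line "old score 25 and new score 0" gives a delta of -25 and removes "text", while "old score 2 and new score 3" gives a delta of 1 and leaves "test" at 3.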

Anyway I will try and find time to investigate this further...

Comment 11 Jamie McCracken 2007-01-20 00:57:14 UTC
All these bugs under Fuse also occur without it. 

mmap cannot be switched from MAP_SHARED to MAP_PRIVATE, as the latter will not write anything to the file according to the man pages for mmap. The optimisation method which is called at the end of the first tracker run uses mmap, so with PRIVATE it never wrote an optimised index but simply destroyed the original index, hence no search results. Changing it back to shared cures this outside of FUSE.

As FUSE does not support SHARED mmap, I have found a define switch in our inlined copy of qdbm to disable use of mmap altogether. I have hardcoded this define in latest svn. 
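
The difference between the two mapping modes can be reproduced outside of Tracker. The sketch below uses Python's mmap module rather than the C code in qdbm: ACCESS_WRITE corresponds to a shared, write-through mapping and ACCESS_COPY to a private, copy-on-write one. The helper name write_through_mapping is hypothetical, and the test file should live on a filesystem that supports shared mappings.

```python
import mmap


def write_through_mapping(path, data, access):
    """Overwrite the start of a file via mmap, then return the bytes on disk.

    access is mmap.ACCESS_WRITE (shared) or mmap.ACCESS_COPY (private).
    data must not be longer than the existing file.
    """
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0, access=access) as m:
            m[: len(data)] = data
            if access == mmap.ACCESS_WRITE:
                m.flush()  # push the shared mapping's changes to the file
    # Re-read through normal file I/O to see what actually reached the file.
    with open(path, "rb") as f:
        return f.read()
```

Writing through an ACCESS_COPY mapping leaves the file contents untouched, which matches the "never wrote an optimised index" behaviour described above, while the same write through ACCESS_WRITE shows up in the file. Code that has to run where shared mappings are unavailable would fall back to ordinary read and write calls instead of mmap.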

Please test latest svn and see if it cures this problem.

Comment 12 Michael Biebl 2007-01-20 05:00:08 UTC
(In reply to comment #11)
> All these bugs under Fuse also occur without it. 

You are right. When I tested r401, as soon as trackerd flushed out the data, it didn't show up in the search result anymore.

> mmap cannot be switched from MAP_SHARED to MAP_PRIVATE as the latter will not
> write anything to the file according to the man pages for mmap (the
> optimisation method which is called at the end of the first tracker run uses
> mmap so with PRIVATE it never wrote an optimised index but simply destroyed the
> original index hence no search results). Changing it back to shared cures this
> outside of FUSE.
> 
> As FUSE does not support SHARED mmap, I have found a define switch in our
> inlined copy of qdbm to disable use of mmap altogether. I have hardcoded this
> define in latest svn. 
> 
> Please test latest svn and see if it cures this problem.
> 

I tested r404 (with .Tracker on encfs) and it works absolutely fine, no problems anymore.

And the speed improvements are outstanding. Great work!
Looking forward to 0.5.4.

Comment 13 Fredrik Blom 2007-01-20 16:01:44 UTC
(In reply to comment #12)
> 
> I tested r404 (with .Tracker on encfs) and it works absolutely fine, no
> problems anymore.
> 
> And the speed improvements are outstanding. Great work!
> Looking forward to 0.5.4.
> 

Unfortunately it seemed to work for me too at first, but then I noticed that I cannot search for newly added files (which are added while trackerd is running). Please see if you can replicate it.
Comment 14 Fredrik Blom 2007-01-20 16:19:28 UTC
(In reply to comment #13)
> (In reply to comment #12)
> > 
> > I tested r404 (with .Tracker on encfs) and it works absolutely fine, no
> > problems anymore.
> > 
> > And the speed improvements are outstanding. Great work!
> > Looking forward to 0.5.4.
> > 
> 
> Unfortunately it seemed to work for me too at first, but then I noticed that I
> cannot search for newly added files (which are added when trackerd is
> running). Please, see if you can replicate it?
> 

Perhaps I was a bit hasty when claiming that it didn't work. I should let it index everything first, so please ignore my previous comment.
Comment 15 Fredrik Blom 2007-01-20 16:46:39 UTC
Yep, it seems to work just fine now. Nice. :)
Comment 16 Jamie McCracken 2007-01-20 16:47:42 UTC
marking as fixed
Comment 17 zerohalo 2007-01-23 13:29:08 UTC
Was away for a few days so am late to the party, but I wanted to confirm that I tried the latest svn version and it all works great with FUSE/encfs. Thanks, Jamie. You da man!
Comment 18 Martyn Russell 2010-05-17 13:33:38 UTC
Moving "Indexer" component bugs to "General" since "Indexer" refers to the old 0.6 architecture