GNOME Bugzilla – Bug 341841
Beagle doesn't index all files in a very large directory
Last modified: 2006-09-06 22:07:59 UTC
With a large directory containing ~5350 mp3 files, a test search for an artist shows that beagle has indexed only about 4 of the 50 or so matching files. Starting from scratch, I ran beagled --fg --debug with BEAGLE_EXERCISE_THE_DOG=1 in one terminal and beagle-status in another, and then tested the search with both the beagle-search GUI and beagle-query, which returned a total of 16 results (some outside the large directory). I then mv'ed the files with the artist name to a new test directory, at which point beagled sprang to life and indexed them, and all of the files were then found by beagle-search (64 results). I mv'ed them back, and all 64 files are still found with the files in their original location.
Just tested a different query on a different directory, with a similar result. This time the directory has 3179 mp3 files and resides in my home directory instead of outside it. mv'ing to a test directory makes beagle index the files, and mv'ing back works as expected, i.e. beagle tracks the files.
Additional info:
Distribution: Ubuntu Dapper Flight 7 + updates to 15th May
Filesystem: ReiserFS v.3 with extended attributes on
Possibly connected: mv'ing the contents of either whole directory fails with 'Argument list is too long' (ARG_MAX=131072).
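For reference, the 'Argument list is too long' error can be checked for independently of beagle. The following is a rough sketch, not an exact model of the kernel's accounting: it only sums the file names as NUL-terminated strings and ignores the environment and pointer overhead that also count against the limit. The function names are made up for illustration.

```python
import os

def argv_bytes(names):
    """Approximate bytes used when passing names as command-line arguments."""
    # Each argument is stored as a NUL-terminated byte string.
    return sum(len(os.fsencode(name)) + 1 for name in names)

def exceeds_arg_max(names, limit=None):
    """Return True if the names alone would overflow the argument limit."""
    if limit is None:
        # Ask the running system for its limit (assumes a POSIX platform).
        limit = os.sysconf("SC_ARG_MAX")
    return argv_bytes(names) > limit
```

Calling exceeds_arg_max(os.listdir(some_directory), limit=131072) on a directory with several thousand mp3 files would show whether a shell glob over it could hit the limit reported above; some_directory is a placeholder.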
Can you attach the logs from ~/.beagle/Log while indexing this directory?
I'm attaching the output of beagled --fg for both tested directories above.
Created attachment 66027 [details] Output of first directory tested
Created attachment 66028 [details] Output of second directory tested
If you run beagle-index-info, does it look like the number of mp3 files is included in the index? There are some exceptions in your logs that look a little suspicious.
Not before moving the files, no. Suspicious??
Any chance we can confirm this again and get another sample of the logs? I'm just thinking it's more a Lucene issue than anything else, but any chance you could wipe ~/.beagle and try it again?
Can't reproduce. Made a directory /mnt/media/files/mp3 with 3000 mp3 files (copies of the same file, which has the id3 tag artist "tiger ..."). Added /mnt/media/files/mp3 as the only root in the beagle config. Ran "beagled --fg --debug --backend Files" with the EXERCISE variable set. beagle-index-info reports an increasing number of files being indexed, and "beagle-query tiger" always prints the latest files getting indexed.
Possibly there is more to it than merely a directory with a large number of files. What if you remove all your roots, add your large_mp3_directory as the only root, and then do the indexing? Do you see the earlier problem, or the same behaviour as mine?
The "'Argument list is too long' (ARG_MAX=131072)" thing is a Linux kernel limitation on the length of command-line parameters. It can easily be avoided by using xargs and is unrelated to this bug, IMO.
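To illustrate the xargs approach mentioned above, here is a minimal sketch of the same batching idea: split a long list of file names into groups whose combined size stays under a byte budget, then run mv once per group. The helper names and the 100000-byte budget are assumptions chosen for illustration, and the size estimate ignores environment variables, which also count against the real limit.

```python
import os
import subprocess

def batch_args(names, max_bytes):
    """Yield lists of names whose combined argument size fits in max_bytes."""
    batch, used = [], 0
    for name in names:
        cost = len(os.fsencode(name)) + 1  # string plus terminating NUL
        if batch and used + cost > max_bytes:
            yield batch
            batch, used = [], 0
        batch.append(name)
        used += cost
    if batch:
        yield batch

def move_in_batches(paths, dest, max_bytes=100000):
    """Run mv once per batch so no single command line is too long."""
    for batch in batch_args(paths, max_bytes):
        # "--" stops option parsing, so names starting with "-" are safe.
        subprocess.run(["mv", "--", *batch, dest], check=True)
```

The paths passed to move_in_batches should be full paths (for example, built with os.path.join from os.listdir output), since mv is run without changing directory.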
This has been fixed. *** This bug has been marked as a duplicate of 354161 ***