GNOME Bugzilla – Bug 310464
beagle-search shouldn't filter out by backend, but by type
Last modified: 2007-01-13 20:17:05 UTC
Distribution/Version: Debian SID
When FilterMail goes over an email file (the log message is: Successfully filtered <emailfilename> with Beagle.Filters.FilterMail) the file is indexed correctly, but appears as a 'file' and not 'mail'. This was verified both by doing a search with 'best' (I can find the file when searching in 'files' but not when searching in 'mail') and by running beagle-index-info (the files count is up, but the mail count stays 0). This was checked on both the CVS version and 0.12.
*** This bug has been marked as a duplicate of 310462 ***
If the file system crawler indexes a mail in a maildir on your hard drive, it increases the file count, not the mail count (unlike a mail in .evolution, which is handled by the separate mail backend). This is the intended behaviour. However, the UI bug of such mails not showing up when selecting mail is valid.
I've been running some tests, and it seems that it's not a sorting problem - the emails filtered through FilterMail don't appear in the 'best' search results at all. I need to confirm this, but in the meantime - would you like to close this bug and have me open another once I have more info, or should I just edit this one with the findings?
Just edit this one. We know what's going on.
*** Bug 310992 has been marked as a duplicate of this bug. ***
Ok, just to clarify. Indexing is done well - no problem there. The only problem is in the way the results show up in best - emails that are indexed with FilterMail are shown as regular files (ugly title - the long filename instead of email subject, no keyword highlighting, etc). Clicking on the filename launches the mail viewer - as expected.
Fixed in cvs by registering the message/rfc822 mimetype in the mail tile.
Reopening this bug. While dsd's change does cause message/rfc822 files to display using the right tile, if you filter out certain types of hits using the option menu the right thing doesn't happen. That is, if you choose "in Mail" you'll only see hits from the mail backend, not the file backend. This is not what we want.
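To make the difference concrete, here is a minimal, language-neutral sketch in Python (not Beagle's actual C# code). The Hit class, the backend names, the URIs, and the assumption that hits from the mail backend also carry the message/rfc822 mimetype are all hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    uri: str        # hypothetical URI, for illustration only
    source: str     # backend that produced the hit (assumed names)
    mime_type: str  # mimetype recorded for the hit


HITS = [
    Hit("email://evolution/inbox/1", "Mail", "message/rfc822"),
    Hit("file:///home/user/Maildir/cur/msg1", "Files", "message/rfc822"),
    Hit("file:///home/user/notes.txt", "Files", "text/plain"),
]


def filter_by_backend(hits, source):
    """Keep only hits produced by the given backend (the reported behaviour)."""
    return [h for h in hits if h.source == source]


def filter_by_mime(hits, mime_type):
    """Keep only hits of the given mimetype, whatever backend found them."""
    return [h for h in hits if h.mime_type == mime_type]
```

With this data, filtering by the "Mail" backend drops the maildir message that the file backend indexed, while filtering by message/rfc822 keeps both mails.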
*** Bug 300325 has been marked as a duplicate of this bug. ***
Filtering by type would also be a step towards the "Don't show multimedia" option I desire. (I don't want multimedia clogging up a search when I know that I want something from my collection of electronic fiction.)
That would involve keeping a mapping of mimetype and document type, e.g.:
  message/rfc822 <-> mail
  audio/mpeg <-> multimedia
  application/postscript <-> document
  application/x-archive <-> archive
  etc.
Is there a reliable place to obtain the mimetype-filetype mapping?
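A mapping like the one described in this comment could be sketched as a simple lookup table. This is a Python illustration, not Beagle code: the four exact entries come from the comment, while the fallback on the top-level media type (audio, video) and the default of "file" are assumptions added for the example.

```python
# Exact entries taken from the comment above.
MIME_TO_DOCTYPE = {
    "message/rfc822": "mail",
    "audio/mpeg": "multimedia",
    "application/postscript": "document",
    "application/x-archive": "archive",
}

# Assumption: fall back on the top-level media type when there is no exact entry.
MAJOR_TYPE_FALLBACK = {
    "audio": "multimedia",
    "video": "multimedia",
}

# Assumption: anything unknown is treated as a plain file.
DEFAULT_DOCTYPE = "file"


def doctype_for(mime_type):
    """Return a coarse document type for a mimetype string."""
    normalized = mime_type.strip().lower()
    if normalized in MIME_TO_DOCTYPE:
        return MIME_TO_DOCTYPE[normalized]
    major = normalized.split("/", 1)[0]
    return MAJOR_TYPE_FALLBACK.get(major, DEFAULT_DOCTYPE)
```

For instance, doctype_for("audio/ogg") falls through to the assumed audio fallback, and an unlisted type such as text/plain ends up as the default.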
I think maintaining a property called "document type" on filters will help us get the desired mapping. We can even store them as a property during "DoPullProperties". We will have to see the "cost" of storing it in the index, though.
This will all be handled on the UI frontend in Holmes. Soon to land!
Varadhan, I don't think storing it will be costly - there is currently all kinds of trash in the index anyway. Lukas, great. Excited for the Holmes release. I think you mean Holmes will do the filtering, but the type still needs to be stored in the index - right?
Yes, right. I misread again.
On second thought, we can handle this on the UI side as well. But it will be a bit difficult to map all the types and make them work in a transparent way. For example, in beagle-search it is possible to do it like this:
    AddSupportedFlavor (new HitFlavor (null, "File", "message/rfc822"));
Adding this to the mail tile will make it handle the file like a mail message, and it will appear in the mail message group.
Yes. That would enable filtering. But that has a few shortcomings:
1) The effort needs to be duplicated in all frontends.
2) The type is information best known by the Filter that filtered the file; not storing it means discarding information. Beagle is supposed to squeeze out any juicy information available.
3) Querying by type: putting type=audio or type:audio (some fixed syntax) should return only results of type audio. This can be emulated in the frontend, but given that the daemon returns only the first 100 results, there are cases where frontend filtering will miss _the_one_ file that I seek.
Implementation proposal: let the filters handle filetypes. Filter.cs will by default add a filetype of "File" or "Document" (the default filetype) to each indexable. Backends/derived Filters will have the opportunity to override the filetype. If multiple filetypes need to be supported (can't think of any example), then a little more care is needed.
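The proposal and point 3 can be sketched together. This is a Python illustration under stated assumptions, not Beagle's implementation: the class names, the "filetype" property name, and the list-based index are hypothetical; only the default-plus-override idea and the 100-result limit come from the comment.

```python
class Filter:
    """Base filter: every indexable gets a default filetype."""
    file_type = "File"  # default filetype, as proposed above

    def properties(self, path):
        # Hypothetical property dict stored in the index for this file.
        return {"path": path, "filetype": self.file_type}


class MailFilter(Filter):
    """Derived filter overriding the default filetype."""
    file_type = "Mail"


def search(index, file_type=None, limit=100):
    """Daemon-side: apply the type constraint first, then truncate."""
    matches = [d for d in index if file_type is None or d["filetype"] == file_type]
    return matches[:limit]


def frontend_filter(index, file_type, limit=100):
    """Frontend-side emulation: the daemon truncates first, the UI filters after."""
    return [d for d in index[:limit] if d["filetype"] == file_type]


# 150 plain files followed by a single mail.
index = [Filter().properties(f"/docs/{i}") for i in range(150)]
index.append(MailFilter().properties("/Maildir/cur/msg"))
```

On this index, search(index, "Mail") still finds the one mail, whereas frontend_filter(index, "Mail") never sees it because it lies beyond the first 100 results.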
Yes, that's the most likely approach. It would also simplify many things in the frontends.
I believe beagle-search already categorizes by type and not by backend. Implementing comment #11 would make the implementation easier, but as far as the bug is concerned, it's not present in beagle-search.