GNOME Bugzilla – Bug 612725
tracker: return more meaningful results (no fulltext search)
Last modified: 2012-07-20 13:14:40 UTC
Created attachment 155985 [details] [review] proposed patch Subject: [PATCH] tracker: Don't use fulltext search We only match basename of the filename now, this should give us more precise and meaningful results. Previously, files containing the keyword in file contents, were included in the results. Also, make path matching actually work. The above applies only to tracker 0.7 series. -- Would be great to have this patch get some testing before we push it. Also note it's hard code freeze on Monday. Proposing as 2.30 target. I hope people didn't get used to fulltext search results much, this patch will give them basically the same results as with simple search engine.
Comment on attachment 155985 [details] [review] proposed patch Thanks for the patch. Did you test it? It looks broken in places. Let me explain: >We only match basename of the filename now, this should give us more >precise and meaningful results. Previously, files containing the keyword >in file contents, were included in the results. Hmm, that really depends on what you're looking for. The basename alone is no good if I am looking for all files stored in bar where the path is /foo/bar/baz.txt. In addition, as a user, I might want to find that file that has sliff somewhere in a file's content, with this, you can't do that. You should know that fts:match doesn't only match the content of the file, other items in the ontology are full text search indexed. Only important properties of course. >Also, make path matching actually work. I am not sure how useful that is, surely if you know the patch you want to match you just put that in the location bar? Unless you mean partial matching? But your patch does prefix matching, not partial matching. > s = tracker_sparql_escape (str); > >- g_string_append_c (sparql, '"'); > g_string_append (sparql, s); >- g_string_append_c (sparql, '"'); Why do this? It just adds more g_string_*() calls everywhere else (arguably they might already be needed I know, but it makes more sense to keep this here IMO). >+ sparql = g_string_new ("SELECT ?url WHERE { ?file a nfo:FileDataObject ; nie:url ?url; "); Why did you replace the nie:url(?url) here? Note that now nie:url MUST exist, previously it is optional. This is quite unlikely however that it doesn't exist. >+ if (mime_count > 0) >+ g_string_append (sparql, "nie:mimeType ?mime ; "); >+ g_string_append (sparql, "nfo:fileName ?fname . FILTER ( fn:contains(?fname, \""); >+ sparql_append_string_literal (sparql, search_text); >+ g_string_append (sparql, "\") "); >+ >+ if (location) { >+ g_string_append (sparql, " && fn:starts-with(?url, \"file://"); >+ sparql_append_string_literal (sparql, location); >+ g_string_append (sparql, "\")"); >+ } Reading the code above not included in this patch, location is a URI if 0.7 && !0.8 (that logic I don't actually understand by the way, 0.7 is 0.8 only stable): location = (tracker_07 && !tracker_08) ? g_strdup (location_uri) : g_filename_from_uri (location_uri, NULL, NULL); This means that you are doing fn:starts-with on with "file://" + location, which would be something like "file://file:///home/sliff/sloff.txt". >+ g_string_append (sparql, "\""); All these additional appends are not really necessary if you don't remove them from the _string_literal() function.
I don't get this? Why would you *not* want to do fulltext searching if you have tracker indexing going. Thats the main reason we have search at all, letting you find things on something more meaningful than filename and pathname. Perhaps the current UI is not ideal and we should allow you to specify whether to search on the filename or contents (and perhaps both, with different strings). However, just demoting tracker indexing to a fast grep seems like a bad idea.
(In reply to comment #2) > I don't get this? Why would you *not* want to do fulltext searching if you have > tracker indexing going. Thats the main reason we have search at all, letting > you find things on something more meaningful than filename and pathname. > > Perhaps the current UI is not ideal and we should allow you to specify whether > to search on the filename or contents (and perhaps both, with different > strings). However, just demoting tracker indexing to a fast grep seems like a > bad idea. Right, it's all matter of personal taste and expectation. Martyn has presented several use cases, apart of mine dumb-fast-grep usage. I think we should tune the UI a little bit and present user more options (if desired, we're Gnome after all and don't want to get user confused with too many options), such as turning the fulltext search on and off. The SPARQL query language is pretty flexible and we can do almost whatever we want.
(In reply to comment #1) > Thanks for the patch. Thanks for the review, see my answers inline. > Hmm, that really depends on what you're looking for. The basename alone is no > good if I am looking for all files stored in bar where the path is > /foo/bar/baz.txt. Yes, I get your point, please see my comment above. There are users who would prefer that way or others who would prefer the other way. Maybe we should ask usability experts... The full location of results can be displayed in the List View mode but it's hidden in Icons and Compact views. It could be difficult for users to understand why the particular file was included in the results. > I am not sure how useful that is, surely if you know the patch you want to > match you just put that in the location bar? Unless you mean partial matching? > But your patch does prefix matching, not partial matching. Yes, that's right, user can define what folder to search in, so it must be the prefix matching here. I've copied that part of the query from tracker sources after all (the compat tracker_search_metadata_by_text_and_location_async() function). > >+ sparql = g_string_new ("SELECT ?url WHERE { ?file a nfo:FileDataObject ; nie:url ?url; "); > > Why did you replace the nie:url(?url) here? Note that now nie:url MUST exist, > previously it is optional. This is quite unlikely however that it doesn't > exist. Either way we always need to get a URL of the result so I don't think making it mandatory is a problem. We use it for path (prefix) matching too. > Reading the code above not included in this patch, location is a URI if 0.7 && > !0.8 (that logic I don't actually understand by the way, 0.7 is 0.8 only > stable): > > location = (tracker_07 && !tracker_08) ? g_strdup (location_uri) : > g_filename_from_uri (location_uri, NULL, NULL); Stupid me, fixed. > >Also, make path matching actually work. There's a bug in current nautilus sources making search limitation not working: > ?file a nfo:FileDataObject ..... > fn:starts-with(?file, - that never worked for me, we need to do path string (URL) matching. Perhaps we should fix this separately for the moment.
Created attachment 156563 [details] [review] proposed patch Updated patch, fixing some of Martyn's concerns. I've also made versioning more clean, inspired by tracker support in GtkFileChooser. Queries haven't changed.
(In reply to comment #3) > (In reply to comment #2) > Right, it's all matter of personal taste and expectation. Martyn has presented > several use cases, apart of mine dumb-fast-grep usage. I think we should tune > the UI a little bit and present user more options (if desired, we're Gnome > after all and don't want to get user confused with too many options), such as > turning the fulltext search on and off. > > The SPARQL query language is pretty flexible and we can do almost whatever we > want. An interesting idea. Actually, I wanted to write an application which was as good as the Mac's spotlight search results window. But actually as a team, we think it is better to integrate with GNOME than write new applications for these things. So I would rather have this support in Nautilus. To give you an idea of what I mean, see this video on Spotlight and notice in the results window the "Contents" and "File Name" options. http://homeoffice.consumerelectronicsnet.com/articles/viewarticle.jsp?id=540649 These two options allow the switching I believe we both want. We should be able to make the nautilus integration much more powerful than it currently is.
Created attachment 158298 [details] [review] proposed patch I've committed big portion of the previous patch, leaving the queries using FTS. Attaching new patch, a leftover, changing queries to filename-based matching (see above). Not saying it's correct, just tidying up.
What about the case when you have tracker installed but want to search a location that is not indexed? I tried that today, searching a removable drive with pictures etc., and even if the drive was explicitly specified as the Place to search in, nautilus search bar just returned nothing. This is certainly not the expected behaviour. Same with beagle integration. As a side note, I'd really like to be able to specify more search criteria, most importantly date ranges. As useful as tracker and beagle are, there currently is no frontend for any of the two that allows you to do this (apart from entering query language).
I don't think there is any need for this bug to stay open.