GNOME Bugzilla – Bug 746916
File chooser search does not work over remote network shares
Last modified: 2015-05-16 14:39:18 UTC
Back in 3.14 when we still had the type-ahead search, I could easily find my files (in a remote share, such as SFTP) when trying to open them with any GTK app... Now in 3.16, the search functionality simply doesn't work. Even if the file is in the directory I'm looking at, typing its name produces no results whatsoever. For comparison, viewing the same directory and typing the same string does produce useful search results.
It's not only network shares that are the problem, btw, but any non-indexed location. Nautilus gets it right and seems to fall back to slower non-tracker search in unindexed locations. GTK should too. As a usecase example that doesn't include remote locations, tracker is not set to index $HOME recursively by default, so if you look for files in a subdirectory of your home folder which is not an XDG standard folder (meaning not Videos, Music, and so on), the file chooser search will fail completely (compared with the type-ahead search in previous releases, which just worked)
Adding Version information is welcome via the Version field.
Makes sense to me. A patch to realign gtk-search-engine* with nautilus-search-engine* feature-wise, would be very welcome.
Created attachment 302689 [details] [review] Rewrite simple search engine to use GFile and breadth order This fixes an unrelated issue (depth-first search vs breadth-first search) for me, but would also, hopefully, enable access to things like network shares
Review of attachment 302689 [details] [review]: not a detailed review, but 2 quick comments: - It seems to me that you are processing the directories in the opposite order in which they are returned by the enumerator. Not sure it makes a huge difference - A bit ugly to do all that string concatenation here and pass absolute paths around. You have the GFile for the parent dir and the child name, so you could very easily just get the GFile for the child dir and keep a list of GFiles
(In reply to Matthias Clasen from comment #5) > Review of attachment 302689 [details] [review] [review]: > > not a detailed review, but 2 quick comments: > > - It seems to me that you are processing the directories in the opposite > order in which they are returned by the enumerator. Not sure it makes a huge > difference Totally forgot to reverse the list. Sorry. > - A bit ugly to do all that string concatenation here and pass absolute > paths around. You have the GFile for the parent dir and the child name, so > you could very easily just get the GFile for the child dir and keep a list > of GFiles That was an option, but i wasn't sure how much of a performance loss/gain that would be (construction of GFiles vs construction of just filenames). Also note that breadth-first search takes progressively more memory as the number of directories on each level of the *whole* tree (not just the number of subdirectories of a particular directory) increases (which is usually the case). My hunch is that an absolute filename takes less memory than a GFile.
Created attachment 302693 [details] [review] Rewrite simple search engine to use GFile and breadth order v2: * Reverse newly-formed directory list to search in the right order * Fix iteration logic to correctly free the enumerator
Created attachment 302696 [details] [review] Rewrite simple search engine to use GFile and breadth order v3: * Check data->cancelled at least once per file instead of once per non-hidden non-symlink file * Use g_filename_display_basename() to get basename for matching This means less code and is also more correct, because data->words[] actually is UTF-8-encoded
Review of attachment 302696 [details] [review]: . ::: gtk/gtksearchenginesimple.c @@ +186,3 @@ +typedef gboolean (visitfunc) (const char *fpath, gpointer user_data); + We typically use camel case for function typedefs, so this would look less foreign as (VisitFunc) @@ +257,3 @@ + g_object_unref (direnum); + g_free (dirpath); + g_free (info); This is wrong, and I get segfaults in this g_free() call. The docs for g_file_enumerator_iterate say: @out_info: (out) (transfer none) (allow-none): Output location... Even if it was not transfer none, you'd have to call g_object_unref.
Review of attachment 302696 [details] [review]: I would still prefer to keep a list of GFile objects, like nautilus does
Created attachment 302712 [details] [review] Rewrite simple search engine to use GFile and breadth order v4: * visitfunc -> VisitFunc * Keep a list of GFiles instead of absolute directory paths * don't g_free() the info object owned by the enumerator * don't leak dirpath when g_file_enumerate_children() returns NULL
Review of attachment 302712 [details] [review]: ::: gtk/gtksearchenginesimple.c @@ +227,3 @@ + } + + g_list_free (root_dirs); I would prefer to move the list freeing right after the loop over the list, and also change it to g_list_free_full (root_dirs, g_object_unref) seeing the list eleements freed in the loop without removing them from the list makes me itchy. @@ +272,3 @@ + fullname = g_build_filename (dirpath, basename, NULL); + + keep_going = func ((const char *) fullname, user_data); It seems a bit silly that you are concatenating strings here to get a full path... @@ +273,3 @@ + + keep_going = func ((const char *) fullname, user_data); + It is also confusing that the VisitFunc returns a 'keep_going' value here - the only reason it can return FALSE is if data->cancelled got set, and we already check that in this function anyway. I would suggest to make the VisitFunc return void. @@ +308,3 @@ + display_basename = g_filename_display_basename (fpath); + lower_name = g_utf8_casefold (display_basename, -1); + g_free (display_basename); ...only to pull out the basename again here. I think we should pass both strings in and avoid a bunch of allocation noise.
(In reply to Matthias Clasen from comment #12) > Review of attachment 302712 [details] [review] [review]: > > ::: gtk/gtksearchenginesimple.c > @@ +227,3 @@ > + } > + > + g_list_free (root_dirs); > > I would prefer to move the list freeing right after the loop over the list, > and also change it to > > g_list_free_full (root_dirs, g_object_unref) > > seeing the list eleements freed in the loop without removing them from the > list makes me itchy. Is it safe to g_object_unref (NULL)? I unref objects inside the loop for efficiency - to free their memory as soon as they are not needed anymore. That leaves NULLs in the list in place of unreffed objects, hence the unref-of-NULL question. > > @@ +272,3 @@ > + fullname = g_build_filename (dirpath, basename, NULL); > + > + keep_going = func ((const char *) fullname, user_data); > > It seems a bit silly that you are concatenating strings here to get a full > path... What are the alternatives? Make a child GFile first, then g_file_get_path()? > > @@ +273,3 @@ > + > + keep_going = func ((const char *) fullname, user_data); > + > > It is also confusing that the VisitFunc returns a 'keep_going' value here - > the only reason it can return FALSE is if data->cancelled got set, and we > already check that in this function anyway. I would suggest to make the > VisitFunc return void. I considered this. There's one difference between data->cancelled and keep_going. data->cancelled is set when search is cancelled, keep_going is set (by the visit function via return value) when search is cancelled *or* by the directory scanner when child enumeration fails. Unless it's ok to consider these two cases identical, i'd rather keep and check two different variables. > > @@ +308,3 @@ > + display_basename = g_filename_display_basename (fpath); > + lower_name = g_utf8_casefold (display_basename, -1); > + g_free (display_basename); > > ...only to pull out the basename again here. I think we should pass both > strings in and avoid a bunch of allocation noise. There's a subtle difference here. g_filename_display_basename() returns UTF-8-encoded display name, g_file_info_get_name() returns filename-encoding-encoded basename. They are not interchangeable.
(In reply to LRN from comment #13) > (In reply to Matthias Clasen from comment #12) > > Review of attachment 302712 [details] [review] [review] [review]: > > > > ::: gtk/gtksearchenginesimple.c > > @@ +227,3 @@ > > + } > > + > > + g_list_free (root_dirs); > > > > I would prefer to move the list freeing right after the loop over the list, > > and also change it to > > > > g_list_free_full (root_dirs, g_object_unref) > > > > seeing the list eleements freed in the loop without removing them from the > > list makes me itchy. > > Is it safe to g_object_unref (NULL)? > > I unref objects inside the loop for efficiency - to free their memory as > soon as they are not needed anymore. That leaves NULLs in the list in place > of unreffed objects, hence the unref-of-NULL question. But it doesn't leave NULLs in the list - it leaves dangling pointers. Thus my itchyness. > > > > @@ +272,3 @@ > > + fullname = g_build_filename (dirpath, basename, NULL); > > + > > + keep_going = func ((const char *) fullname, user_data); > > > > It seems a bit silly that you are concatenating strings here to get a full > > path... > > What are the alternatives? Make a child GFile first, then g_file_get_path()? I was thinking func (dirpath, basename, user_data) > > @@ +273,3 @@ > > + > > + keep_going = func ((const char *) fullname, user_data); > > + > > > > It is also confusing that the VisitFunc returns a 'keep_going' value here - > > the only reason it can return FALSE is if data->cancelled got set, and we > > already check that in this function anyway. I would suggest to make the > > VisitFunc return void. > > I considered this. There's one difference between data->cancelled and > keep_going. data->cancelled is set when search is cancelled, keep_going is > set (by the visit function via return value) when search is cancelled *or* > by the directory scanner when child enumeration fails. Unless it's ok to > consider these two cases identical, i'd rather keep and check two different > variables. Well, after keep_going = func (....) The meaning of keep_going is exactly what func returns, which is !data->cancelled. > > > > @@ +308,3 @@ > > + display_basename = g_filename_display_basename (fpath); > > + lower_name = g_utf8_casefold (display_basename, -1); > > + g_free (display_basename); > > > > ...only to pull out the basename again here. I think we should pass both > > strings in and avoid a bunch of allocation noise. > > There's a subtle difference here. g_filename_display_basename() returns > UTF-8-encoded display name, g_file_info_get_name() returns > filename-encoding-encoded basename. They are not interchangeable. Well, just pass the correctly encode one down - I'm just complaining about repeated concatenating and pulling apart of the same strings.
(In reply to Matthias Clasen from comment #14) > (In reply to LRN from comment #13) > > (In reply to Matthias Clasen from comment #12) > > > Review of attachment 302712 [details] [review] [review] [review] [review]: > > > > > > ::: gtk/gtksearchenginesimple.c > > > @@ +227,3 @@ > > > + } > > > + > > > + g_list_free (root_dirs); > > > > > > I would prefer to move the list freeing right after the loop over the list, > > > and also change it to > > > > > > g_list_free_full (root_dirs, g_object_unref) > > > > > > seeing the list eleements freed in the loop without removing them from the > > > list makes me itchy. > > > > Is it safe to g_object_unref (NULL)? > > > > I unref objects inside the loop for efficiency - to free their memory as > > soon as they are not needed anymore. That leaves NULLs in the list in place > > of unreffed objects, hence the unref-of-NULL question. > > But it doesn't leave NULLs in the list - it leaves dangling pointers. Thus > my itchyness. Yeah, i'm a bit lagging here. Previous version did set root->data to NULL explicitly. This version does not, as it doesn't do g_list_free_full (root_dirs, <cleaner_func>) for the reasons stated above. > > > > > > @@ +272,3 @@ > > > + fullname = g_build_filename (dirpath, basename, NULL); > > > + > > > + keep_going = func ((const char *) fullname, user_data); > > > > > > It seems a bit silly that you are concatenating strings here to get a full > > > path... > > > > What are the alternatives? Make a child GFile first, then g_file_get_path()? > > I was thinking > > func (dirpath, basename, user_data) > > > > > > > @@ +308,3 @@ > > > + display_basename = g_filename_display_basename (fpath); > > > + lower_name = g_utf8_casefold (display_basename, -1); > > > + g_free (display_basename); > > > > > > ...only to pull out the basename again here. I think we should pass both > > > strings in and avoid a bunch of allocation noise. > > > > There's a subtle difference here. g_filename_display_basename() returns > > UTF-8-encoded display name, g_file_info_get_name() returns > > filename-encoding-encoded basename. They are not interchangeable. > > Well, just pass the correctly encode one down - I'm just complaining about > repeated concatenating and pulling apart of the same strings. There's no "correctly encoded one". We need display basename (specifically basename and specifically UTF-8-encoded) for matching *and* we need filesystem-encoding-encoded fullname for g_filename_to_uri(). To get filesystem-encoding-encoded fullname we need to either do what my code does, or construct a GFile via g_file_get_child (dir, basename) (we get basename practically for free from the info object), which would do the same thing internally, and then g_file_get_path(). To get UTF-8-encoded name we need to g_filename_display_basename() on a filesystem-encoding-encoded full path (see above) or g_file_get_display_name() on a gfile (see above), which would just call g_filename_display_basename() or its equivalent internally.
Review of attachment 302712 [details] [review]: lets push this as-is for now
Attachment 302712 [details] pushed as 4acbcf9 - Rewrite simple search engine to use GFile and breadth order
This is not fixed. Searching while in a remote share now searches my home directory, and not the remote share :(
The bug is RESOLVED FIXED, in any case. I you disagree with that, the first step would be to reopen it.
Reopening.
search in remote shared fixed for real now. I hope