After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 639345 - [patch] [filesystem] implement search
[patch] [filesystem] implement search
Status: RESOLVED FIXED
Product: grilo
Classification: Other
Component: plugins
git master
Other All
: Normal enhancement
: ---
Assigned To: grilo-maint
grilo-maint
Depends on:
Blocks:
 
 
Reported: 2011-01-12 19:02 UTC by Guillaume Emont (guijemont)
Modified: 2011-01-18 11:26 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
filesystem: implemented search (9.11 KB, patch)
2011-01-12 19:02 UTC, Guillaume Emont (guijemont)
needs-work Details | Review
filesystem: implemented search (9.23 KB, patch)
2011-01-14 11:48 UTC, Guillaume Emont (guijemont)
none Details | Review
filesystem: implemented search (9.53 KB, patch)
2011-01-17 20:48 UTC, Guillaume Emont (guijemont)
none Details | Review
filesystem: added depth limit to search (9.79 KB, patch)
2011-01-17 20:49 UTC, Guillaume Emont (guijemont)
none Details | Review

Description Guillaume Emont (guijemont) 2011-01-12 19:02:59 UTC
Created attachment 178170 [details] [review]
filesystem: implemented search

The filesystem plugin would be better off with the search functionality implemented.
Comment 1 Guillaume Emont (guijemont) 2011-01-12 19:06:17 UTC
Attached a first version of the patch, with the following known issues:
 - skip is ignored
 - use create_content() which is blocking, but that's a general issue in the fs plugin, since this is used in other places
 - it might be cleaner to use a GCancellable, but its doc says it cannot be used in multiple concurrent operations, which I think is the case here
Comment 2 Guillaume Emont (guijemont) 2011-01-12 22:48:42 UTC
Other notable limitation: does not handle symlinks
Comment 3 Iago Toral 2011-01-13 11:28:37 UTC
Review of attachment 178170 [details] [review]:

A few comments:

1) A search operation, must *always* be finished by invoking the user callback with remaining_count=0. This means:
  - When the operation is stopped because of an error, we must invoke the user callback with the error and remaining_count=0.
  - When the operation is cancelled, we must stop the operation and invoke the user callback with NULL error, reamining_count=0, and NULL media.

2) I think you are not closing some files properly or at least there is some problem with that, after a few tests I got lots of these (even after restarting test-ui).

"grl-filesystem.c:737: Got error while searching: Too many files opened"

3) The plugin does not filter media files and returns all kind of regular files (see file_is_valid_content).

4) Cancellation support is missing. We really need this because search operations on the filesystem plugin can be pretty expensive, but this can be added later on.

5) Maybe we should do better chunking of the result emission to improve responsiveness (see for example produce_from_path), but we could investigate this later.

I think we should fix 1-3 before commiting the patch, then we can commit and investigate 4-5.
Comment 4 Guillaume Emont (guijemont) 2011-01-13 11:58:09 UTC
Regarding 1):
When the search operation is finished because we've crawled the whole fs hierarchy we had to crawl, is it acceptable to have the same behaviour as a cancelled operation?
In other words: can we return "-1" as "remaining" for each call to the callback with a media, and then return "remaining=0" with a NULL media when we've gone through the whole hierarchy?
I'm asking that because the documentation does not seem to talk about these kind of cases, I think it should (but maybe we should move that discussion to the ml)
Comment 5 Iago Toral 2011-01-13 12:06:18 UTC
(In reply to comment #4)
> Regarding 1):
> When the search operation is finished because we've crawled the whole fs
> hierarchy we had to crawl, is it acceptable to have the same behaviour as a
> cancelled operation?
> In other words: can we return "-1" as "remaining" for each call to the callback
> with a media, and then return "remaining=0" with a NULL media when we've gone
> through the whole hierarchy?

That would be ok.

> I'm asking that because the documentation does not seem to talk about these
> kind of cases, I think it should (but maybe we should move that discussion to
> the ml)

Yes, we should update the docs to clarify these situations.
Comment 6 Guillaume Emont (guijemont) 2011-01-13 17:38:57 UTC
About 2), it's an algorithmic issue: we should scan the fs tree depth-first, so that the number of open files for that scan is at most the height of the tree at a given moment.
Comment 7 Iago Toral 2011-01-14 09:11:54 UTC
> About 2), it's an algorithmic issue: we should scan the fs tree depth-first, so
> that the number of open files for that scan is at most the height of the tree
> at a given moment.

As discussed on IRC, let's try to use a FIFO to queue directories and process them one by one to avoid the problem.

Also, for the record, let's have a configuration option for the plugin to limit the depth of the search. I propose a integer value so that:
-1  => No limit
 0  => only search the root directoy
 1  => one search first-level subdirs
 ...

We could set the default to some sane value, maybe something between 5-10? I guess that would be better than setting the default to -1.
Comment 8 Guillaume Emont (guijemont) 2011-01-14 11:48:00 UTC
Created attachment 178308 [details] [review]
filesystem: implemented search

Here's a new version, using a FIFO of directories.
It makes the code much simpler (especially for memory management) and avoids having too many open files.
Also noteworthy: I've removed the cancelling and error members of SearchOperation. error was never really used (none of the error we get is individually fatal to the crawling). cancelling is not needed any more because of the change of architecture. Support for cancel() should be put in search_next_directory(), probably using a GCancellable (now possible since we don't use concurrent operations).
Also note that, depending on circumstances/hardware, the FIFO search as is might be slower. We could make it faster by having several search_next_directory() calls running in parallel, but that could strain the system too much, and would prevent us from using GCancellable.

File type filtering and depth limit still need to be implemented, will have a look next week
Comment 9 Iago Toral 2011-01-14 12:41:51 UTC
Review of attachment 178308 [details] [review]:

I think we should call search_next_directory from got_children in case the call to g_file_enumerate_children_finish raises an error. Otherwise, if the first directory fails, for example, the search is aborted (search_next_directory is never invoked again) and no user callback is ever called.

I would wait until you add at least the support for filtering media content before pushing the patch, but other than that the patch looks good to me.

BTW, using a FIFO for crawling makes the system way more responsive while the search process is on-going :)
Comment 10 Guillaume Emont (guijemont) 2011-01-17 20:48:47 UTC
Created attachment 178559 [details] [review]
filesystem: implemented search
Comment 11 Guillaume Emont (guijemont) 2011-01-17 20:49:12 UTC
Created attachment 178560 [details] [review]
filesystem: added depth limit to search
Comment 12 Guillaume Emont (guijemont) 2011-01-17 20:52:54 UTC
Updated patch, now handle file filtering.
Added another patch that adds depth limit.

There still one bug I need to investigage in it: after doing a filesystem search with grilo-test-ui, when quitting I get the following warnings:


(lt-grilo-test-ui:30683): GLib-GObject-WARNING **: invalid uninstantiatable type `(null)' in cast to `GObject'

(lt-grilo-test-ui:30683): GLib-GObject-CRITICAL **: g_object_get_data: assertion `G_IS_OBJECT (object)' failed


This makes me think there should be one unref too many somewhere...
Comment 13 Iago Toral 2011-01-18 07:51:35 UTC
Pushed both patches.

For the record, cancellation support is still missing, should be added in the future.

---

commit ce984d37dad6f36b01b7d6d300b7e4e52889f671
Author: Guillaume Emont <gemont@igalia.com>
Date:   Mon Jan 17 18:11:57 2011 +0100

    filesystem: added depth limit to search
    
    It is controlled by a "maximum-search-depth" configuration option.
    
    Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>

commit 8b6c7a053f728737ae6595b8862ca2b6ef998999
Author: Guillaume Emont <gemont@igalia.com>
Date:   Wed Jan 12 18:47:09 2011 +0100

    filesystem: implemented search
    
    Fixes https://bugzilla.gnome.org/show_bug.cgi?id=639345
    
    Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Comment 14 Guillaume Emont (guijemont) 2011-01-18 11:26:52 UTC
On the refcount bug I was mentioning, it seems to reproduce when doing search
with other plugins, so it's either a bug common to many plugins, or a bug in
grilo-test-ui (more likely).
Therefore, this bug is definitely fixed. For the cancellation thing, I think it
deserves a separate bug, especially since it's an issue for all operations in
the filesystem plugin, not only search.