Bug 319412 - Archive filter
Status: RESOLVED FIXED
Product: beagle
Classification: Other
Component: General
Version: unspecified
Hardware: Other All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Beagle Bugs
QA Contact: Beagle Bugs
Duplicates: 364706
Depends on:
Blocks: 319259
 
 
Reported: 2005-10-21 13:40 UTC by Veerapuram Varadhan
Modified: 2006-11-24 21:49 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Archive filter + FilterFactory fix suggested by Jon. (13.67 KB, patch)
2005-10-21 13:53 UTC, Veerapuram Varadhan
Archive filter + FilterFactory fix suggested by Jon. (14.28 KB, patch)
2005-10-27 09:25 UTC, Veerapuram Varadhan
FilterArchive.cs Revival (13.67 KB, patch)
2006-06-19 16:55 UTC, Kevin Kubasik

Description Veerapuram Varadhan 2005-10-21 13:40:22 UTC
Beagle should be able to filter "files inside archives".
Comment 1 Veerapuram Varadhan 2005-10-21 13:53:11 UTC
Created attachment 53730
Archive filter + FilterFactory fix suggested by Jon.

Archive filter, plus a fix to FilterFactory that lets transient files be
filtered using their "path".  With this fix, as suggested by Jon, transient
files can be indexed using their path.

NOTE: Transient files should set their "indexable.Timestamp" to that of their
parent to keep the consistency.  In archive filter, the temporary files that
are created will actually have the "Parent's" timestamp.
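
To illustrate the timestamp note above, here is a minimal Python sketch (not Beagle's C# code). It extracts each member of a zip archive to a temporary file and gives that file the archive's modification time. The function name `extract_with_parent_timestamp` is hypothetical, and the file mtime is only an assumed stand-in for "indexable.Timestamp".

```python
import os
import tempfile
import zipfile


def extract_with_parent_timestamp(archive_path, out_dir):
    """Extract each file in a zip archive and stamp it with the archive's mtime.

    Returns a list of (member_name, extracted_path) pairs.
    """
    parent_stat = os.stat(archive_path)
    extracted = []
    with zipfile.ZipFile(archive_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Write the member to a temporary file.
            fd, tmp_path = tempfile.mkstemp(dir=out_dir)
            with os.fdopen(fd, "wb") as tmp, archive.open(info) as src:
                tmp.write(src.read())
            # Assumed stand-in for setting indexable.Timestamp:
            # the temporary file inherits the parent's access and
            # modification times.
            os.utime(tmp_path, ns=(parent_stat.st_atime_ns, parent_stat.st_mtime_ns))
            extracted.append((info.filename, tmp_path))
    return extracted
```

Returning the member name next to the temporary path keeps the entry's real name available to the caller, so the temporary file name never has to be treated as meaningful data.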
Comment 2 Veerapuram Varadhan 2005-10-21 13:58:13 UTC
You may also want to look at <a href="show_bug.cgi?id=315056">this</a> fix,
which is a must, when operating on reader/stream of the indexable.
Comment 3 Veerapuram Varadhan 2005-10-21 15:03:58 UTC
Ok, here is the correct link.. http://bugzilla.gnome.org/show_bug.cgi?id=315056
Comment 4 Joe Shaw 2005-10-24 18:36:25 UTC
So I tested out the patch, and it looks like the files are being decompressed
fine, but they're not being indexed at all.
Comment 5 Joe Shaw 2005-10-24 18:54:39 UTC
Hmm, they work with beagle-extract-content, but not in the daemon.  I'll try to
track it down further.
Comment 6 Joe Shaw 2005-10-24 19:01:18 UTC
Ok, found it.  The FIXME in the FSQ isn't addressed.  Look at PreChildAddHook()
in FileSystemQueryable.cs.  Maybe trow can comment on this.

Also, based on beagle-extract-content, the fixme:name property of the child
files use the tmpfile instead of the real file name.  The tmpfile isn't very
useful data to index. :)

Lastly, it'd be nice to clean up the camelCase to conform to under_scores that
we're using as our coding convention now.
Comment 7 Veerapuram Varadhan 2005-10-25 09:27:04 UTC
Joe: Thanks for your comments.  I am going to post a new archive filter that
keeps Archive.cs, since FilterOpenOffice and FilterK(word/spread/*) will use
the archive filter instead of calling the SharpZipLib APIs directly.  In the
meantime, you may want to look at the patch Daniel posted at
http://bugzilla.gnome.org/show_bug.cgi?id=315056, which enables
child-indexables in the FSQ.
Comment 8 Veerapuram Varadhan 2005-10-27 09:25:58 UTC
Created attachment 53941
Archive filter + FilterFactory fix suggested by Jon.

Hmm.. I tried overriding the ZipFile.GetEntry () stuff using GetNextEntry ()
for tar as well, and it didn't work as expected.  So I think it's better that we
remove Archive.cs from the Util/ directory.  Also, I have corrected
FilterArchive.cs according to Joe's comments.  Here is the updated filter.
Comment 9 Joe Shaw 2005-10-27 20:23:08 UTC
You have FilterMusic.cs listed twice in the Makefile.am.  Other than that, looks
fine to me.  We need the changes to the FSQ before this can be checked in, though.
Comment 10 Kevin Kubasik 2006-06-19 11:35:24 UTC
Definitely something we should work on.
Comment 11 Joe Shaw 2006-06-19 14:49:50 UTC
(reopening; verified means that the fix has been verified)
Comment 12 Debajyoti Bera 2006-06-19 15:01:04 UTC
Verified means the fix has been verified. "New" is the status you most likely meant.

http://bugzilla.gnome.org/page.cgi?id=bug-status.html#status

And yes, this bug is in the pipeline. It's sort of mostly done (rather, was sort of mostly done some months back) and is awaiting the FSQ changes (which are rather non-trivial).

I'd suggest you read the list of comments before marking/adding comments to bugs. That'll give you a clear idea of the current status of a bug. And thanks for periodically scanning the bugs; it definitely reduces the number of bugs, keeps everybody on their toes and makes everyone aware of what needs to be fixed.
Comment 13 Kevin Kubasik 2006-06-19 16:55:07 UTC
Created attachment 67643
FilterArchive.cs Revival

Ok, I'm not entirely sure why, but massive amounts of ChildIndexable stuff seem to have seeped into the FSQ over time. I dunno if it was supposed to be there or not or what, but here is a somewhat updated and, I think, mildly working patch.

There is nothing on the UI side, but it does work fine. Let me know what you think.
Comment 14 Kevin Kubasik 2006-06-19 16:56:39 UTC
Almost forgot, didn't diff the Makefile.am in Util, just keep that in mind before building. You need to add FilterArchive.cs.
Comment 15 Daniel Drake 2006-07-02 16:13:09 UTC
There are some issues which remain to be resolved with your patch (which seems to have been derived from mine in bug 315056), such as handling of children inside children and the definition of the lucene fields. Jon also raised some more important issues which I have forgotten, something related to handling of children when the parent gets modified or goes away..?
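
As a rough illustration of the "children inside children" case, the following Python sketch (not Beagle code) walks a zip archive in memory and descends into members that are themselves zip archives. The function name `walk_archive`, the `#` separator between archive levels, and the `max_depth` guard are all made up for this example.

```python
import io
import zipfile


def walk_archive(data, prefix="", max_depth=5):
    """Yield (path, bytes) for every file in a zip, descending into nested zips."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            payload = archive.read(info)
            path = prefix + info.filename
            if max_depth > 0 and zipfile.is_zipfile(io.BytesIO(payload)):
                # A child that is itself an archive: recurse, using '#' as a
                # made-up separator between archive levels.
                yield from walk_archive(payload, path + "#", max_depth - 1)
            else:
                # A plain file, or an archive below the depth limit.
                yield path, payload
```

The depth limit is one possible way to bound the work done on deeply nested archives; whether a real filter should cap depth, and at what value, depends on the use case.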
Comment 16 Kevin Kubasik 2006-10-16 03:42:00 UTC
(In reply to comment #15)
> There are some issues which remain to be resolved with your patch (which seems
> to have been derived from mine in bug 315056), 
Yeah, just an update to keep it applying cleanly so we might be able to encourage people to hack on it.
> such as handling of children
> inside children and the definition of the lucene fields. Jon also raised some
> more important issues which I have forgotten, something related to handling of
> children when the parent gets modified or goes away..?
Yeah, we don't seem to do anything really intelligent when this happens. In a perfect world, we could just query for all the children, then iterate through them and update their properties (or even better, just change the properties in a metadata store), but at the moment we lack the metadata support to make this really feasible (I think).

My guess is that once we get the new metadata system in place, we should have a lot more options when it comes to dealing with this, and I hope we can make it work then, as we seem to be coming to a point where Beagle should be mature enough to handle archived files.
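
The "query for all the children, then iterate through them" idea can be sketched with a toy in-memory index in Python. This is only an illustration under assumed names (`ChildIndex`, `parent_changed`, `parent_removed`); it does not reflect how Beagle or Lucene actually store child entries.

```python
class ChildIndex:
    """Toy in-memory index mapping a child URI to its properties."""

    def __init__(self):
        self.entries = {}

    def add(self, uri, parent_uri, **props):
        # Each entry remembers which parent (e.g. an archive) it came from.
        self.entries[uri] = {"parent": parent_uri, **props}

    def children_of(self, parent_uri):
        # Linear scan; a real index would query on the parent field instead.
        return [u for u, p in self.entries.items() if p["parent"] == parent_uri]

    def parent_changed(self, parent_uri, **new_props):
        """Propagate property changes (e.g. a new timestamp) to every child."""
        for uri in self.children_of(parent_uri):
            self.entries[uri].update(new_props)

    def parent_removed(self, parent_uri):
        """Drop every child of a parent that went away."""
        for uri in self.children_of(parent_uri):
            del self.entries[uri]
```

For example, after `idx.add("a.zip#x.txt", "a.zip", timestamp=1)`, calling `idx.parent_changed("a.zip", timestamp=5)` updates the child's timestamp, and `idx.parent_removed("a.zip")` deletes it.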
Comment 17 Kevin Kubasik 2006-10-25 11:41:04 UTC
*** Bug 364706 has been marked as a duplicate of this bug. ***
Comment 18 Debajyoti Bera 2006-11-24 21:49:05 UTC
I checked in slightly modified versions of these patches and some more stuff to CVS. Marking this closed. Unfortunately the right thing to do depends somewhat on use cases - please open a new Bugzilla entry for future enhancement requests and bugs.