GNOME Bugzilla – Bug 361320
Beagle should index labyrinth notes
Last modified: 2006-10-14 01:06:31 UTC
Labyrinth is a mind-mapping/note-taking tool similar to Tomboy.
Created attachment 74452: Add labyrinth support to beagle
Let me know what you think.
Filed a bug against Labyrinth to allow opening notes via a URI scheme: http://code.google.com/p/labyrinth/issues/detail?id=3&can=2&q=
Generally looks okay, although I think that maybe doing this as a backend and a filter would make more sense.
I thought about that, and about doing something like what the Tomboy backend does (i.e. making a 'Note' object that we then pull data from in the filter), but I'm having a hard time seeing the reasoning behind that redundancy, and it has come back to bite us in the ass with the IM filters, so I wasn't really sure. If you want, I could refactor that stuff out pretty easily, but since there really aren't any properties included in the notes (just the basics), there's not much to extract. However, once we get all that sexy new metadata stuff into Beagle, this could get interesting: Labyrinth 'links' notes and can create some pretty awesome/dynamic relationships, so that might be worth looking at. At the moment we really don't have the metadata system in place for such relationships; it's very possible I could hack a way to do that, but I'd rather wait and see what comes of it.
The reason I brought it up is because of code like this:

+ if (doc.NodeType == XmlNodeType.Text) {
+ 	sb.Append (doc.ReadString ());
+ 	continue;

We essentially buffer all of the contents in memory and then send them over to the index helper process, which is wasteful of memory and can be a big problem if the text is large. By moving this into the filter we (a) move the entire process into the index helper process and (b) have access to the underlying streams, so we can feed the text straight into the indexer without buffering it all ahead of time. If the amount of text is small, this probably doesn't matter. (BTW, I think the problem with the IM filters is more that the code is crappy than anything to do with the split between the backend and the filter. The mail split is an example of one that works pretty well.)
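To illustrate, here is a rough sketch of what the filter side could look like; the Filter hooks used here (AddSupportedFlavor, DoOpen, DoPull, AppendText) are assumed from other Beagle filters, and the class name and mime type are placeholders until the naming is settled:

using System.IO;
using System.Xml;

using Beagle.Daemon;

namespace Beagle.Filters {

	public class FilterLabyrinthNote : Beagle.Daemon.Filter {

		private XmlTextReader doc;

		public FilterLabyrinthNote ()
		{
			// Placeholder mime type; the real one is settled later in this bug.
			AddSupportedFlavor (FilterFlavor.NewFromMimeType ("x-beagle/x-labyrinth-note"));
		}

		override protected void DoOpen (FileInfo info)
		{
			doc = new XmlTextReader (info.FullName);
		}

		override protected void DoPull ()
		{
			// Feed each text node straight to the indexer instead of
			// accumulating the whole note in a StringBuilder first.
			// (Error handling is discussed further down in this bug.)
			while (doc.Read ()) {
				if (doc.NodeType == XmlNodeType.Text)
					AppendText (doc.ReadString ());
			}

			doc.Close ();
			Finished ();
		}
	}
}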
Ok, sounds good, I can make the split happen. There's probably not a measurable performance hit in 99.9% of the use cases for something like this (since almost no one is going to have pages and pages of text in a simple one-pane note-taking application), but as we have all learned, there will always be those three users who push our app to the limit and beyond. Besides, anything that keeps memory usage down is good in my book. My only concern originally was that I was adding unneeded code complexity/memory use, but I guess it's really the other way around. I'll attach a patch later tonight.
Created attachment 74626: Updated Labyrinth Patch
Ok, this one has a separate filter and backend; let me know what you think.
Another thing I didn't notice the first time through:

+ foreach (FileInfo file in dir.GetFiles ()) {
+ 	if (file.Extension == ".map") {
+ 		IndexNote (file, Scheduler.Priority.Delayed);
+ 		++count;
+ 	}
+ }

We should stop using this construct and start using DirectoryWalker and IndexableGenerators, because it doesn't scale: once you have more than 50 or so files to index, this starts to become a performance bottleneck. See the Gaim log backend for an example (it previously worked this way and was recently switched to being an indexable generator).

The mime type should probably be "x-beagle/x-labyrinth-note".

The filter leaks a file descriptor if it encounters an exception; I suggest moving the doc.Close() out into a finally block.

I'd also like there to be naming consistency between the backend and the filter; I suggest renaming the filter to FilterLabyrinth (or FilterLabyrinthNote if there are other types of Labyrinth data to index).
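To make the indexable-generator idea concrete, here is a rough sketch loosely modeled on the Gaim log backend. The IIndexableGenerator members, DirectoryWalker.GetFileInfos() and UriFu.PathToFileUri() are assumed from other Beagle backends, and the class name is a placeholder:

using System.Collections;
using System.IO;

using Beagle;
using Beagle.Daemon;
using Beagle.Util;

public class NoteIndexableGenerator : IIndexableGenerator {

	private IEnumerator files;
	private FileInfo current;

	public NoteIndexableGenerator (string notes_dir)
	{
		files = DirectoryWalker.GetFileInfos (notes_dir).GetEnumerator ();
	}

	public bool HasNextIndexable ()
	{
		// Walk the directory lazily; no indexables or scheduler
		// tasks are created up front for every file.
		while (files.MoveNext ()) {
			current = (FileInfo) files.Current;
			if (current.Extension == ".map")
				return true;
		}
		return false;
	}

	public Indexable GetNextIndexable ()
	{
		Indexable indexable = new Indexable (UriFu.PathToFileUri (current.FullName));
		indexable.MimeType = "x-beagle/x-labyrinth-note";
		indexable.Timestamp = current.LastWriteTimeUtc;
		return indexable;
	}

	public string StatusName {
		get { return "Labyrinth notes"; }
	}

	public void PostFlushHook () { }
}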
Created attachment 74651: Update, indexable generator and closing the streams
Ok, I think I took care of everything; let me know what you think!
The "FilterLabNote.cs" comment at the top of the file needs fixing. :) You don't want to call Finished() in the finally block, because that block will *always* be executed regardless of whether or not you hit an exception, which means that you would call both Error() and Finished() in the error case. Move that back into the try block. The crawling stuff doesn't make sense in the backend. The reason why it's all there in the Gaim backend is to build up a list of directories and attach inotify watches to them. You don't need any of that for these notes since they're all in one directory. If you want to keep crawling around in the non-inotify case, that's fine, but this code isn't useful for that. The old code probably made more sense there. IndexNoteNow() doesn't make sense as the method name, since all it does is basically convert a note to an indexable; it doesn't index it at all. I suggest changing it to NoteToIndexable(). The HitIsValid() method probably isn't necessary.
Created attachment 74662: More Updates
Think I got all that. ;)
You need to move Finished() into the try block; that's the only way to guarantee it runs only in the non-error case. And now that I look at it, the finally block isn't necessary: you can just move doc.Close() outside the try/catch blocks entirely (i.e. where Finished() is now).

+ foreach (FileInfo file in DirectoryWalker.GetFileInfos (lab_dir))
+ 	if (file.Name.EndsWith ("map"))
+ 		IndexNote (file, Scheduler.Priority.Delayed);

You want to remove this block completely; the entire point of the indexable generator is that you don't walk the list of files creating indexables and scheduler tasks for every file up front. :) You can then move the inotify watch code back into StartWorker(), too.

You don't need IsUpToDate(), as it's defined on the parent LuceneFileQueryable class. And since IndexNote() isn't used outside of the inotify function, you might want to just fold that code in.
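Roughly, this is the StartWorker() shape I have in mind; Inotify.Subscribe(), NewAddTask() and ThisScheduler are assumed from other backends (e.g. Tomboy and Gaim), and the generator class, event mask and OnInotifyEvent handler names are illustrative rather than taken from the patch:

private void StartWorker ()
{
	if (Inotify.Enabled)
		Inotify.Subscribe (lab_dir, OnInotifyEvent,
				   Inotify.EventType.Create | Inotify.EventType.Modify);

	// A single scheduler task drives the generator; no up-front walk over every note.
	Scheduler.Task task = NewAddTask (new NoteIndexableGenerator (lab_dir));
	task.Tag = "Crawling Labyrinth notes";
	ThisScheduler.Add (task);
}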
Created attachment 74664: More Updates
Just one more update! ;) Let me know how it looks. I'm thinking that once we get this into working order, I'll go back and refactor the Tomboy backend to act similarly.
Finished() still needs to be moved into the try block. In the queryable, polling_interval_in_seconds isn't used. You probably want to multiply it by 1000 and pass it into GLib.Timeout.Add(). Once those are fixed, it's fine to go in.
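In case it's useful, a small sketch of both fixes; apart from Finished(), Error(), doc.Close() and polling_interval_in_seconds, the names here (CrawlNotes in particular) are placeholders:

// In the filter: Finished() only runs when parsing succeeds, Error() only
// on the exception path, and the reader is closed either way.
try {
	while (doc.Read ()) {
		if (doc.NodeType == XmlNodeType.Text)
			AppendText (doc.ReadString ());
	}
	Finished ();
} catch (XmlException) {
	Error ();
}
doc.Close ();

// In the queryable: GLib.Timeout.Add() takes milliseconds, so convert the
// polling interval from seconds before handing it over.
GLib.Timeout.Add ((uint) (polling_interval_in_seconds * 1000),
		  new GLib.TimeoutHandler (CrawlNotes));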
Ok, fixed that and committed; let me know if there's anything I missed, but at least it's working here.