GNOME Bugzilla – Bug 561742
Thunderbird indexing only for headers
Last modified: 2018-07-03 09:50:03 UTC
In addition to the fact that Beagle is only indexing the headers for me in Thunderbird, I notice these symptoms:

1. Thunderbird displays a Beagle icon for error messages. Clicking on it, I learn that there are 8779 [!] unread warnings. These all seem to be of the form:

An error occurred while indexing. Error description: [Exception... "Component returned failure code: 0x80004003 (NS_ERROR_INVALID_POINTER) [nsIMsgDBHdr.author]" nsresult: "0x80004003 (NS_ERROR_INVALID_POINTER)" location: "JS frame :: chrome://beagle/content/beagleindexer.js :: anonymous :: line 343" data: no]

2. Clicking on a link in Beagle to a mail item generates a message from Kmail saying that the relevant Thunderbird imap inbox doesn't exist. It does exist, and Kmail isn't my default email application.

I'm using beagle 0.3.3-2ubuntu1, thunderbird 2.0.0.17+nobinonly, kerry 1:0.2.1-0ubuntu5, thunderbird-beagle 0.3.3-2ubuntu1.

Any suggestions as to what's wrong?

Gordon
I am experiencing the same problem. The inability to index e-mail content is pretty serious. It cripples Beagle.

On Ubuntu 9.10, I've got Thunderbird 2.0.0.23, Beagle 0.3.9 and Beagle Indexer (TB Add-on) 0.1.3.

Per Gordon's note 1: I noticed dozens of unread warnings shortly after I compacted folders. I am not sure whether it was a serious problem, but it was easy enough to clear.

My logs are showing exceptions. I do not know if they are related.

0091218 17:25:49.2565 02909 Beagle DEBUG: New Thunderbird indexable generator launched for /home/ronl/.beagle/Indexes/ThunderbirdIndex/ToIndex
20091218 17:25:49.2589 02909 Beagle DEBUG: Caught ResponseMessageException: Connection refused
20091218 17:25:49.2590 02909 Beagle DEBUG: InnerException is SocketException -- we probably need to launch a helper
20091218 17:25:49.2591 02909 Beagle DEBUG: Launching helper process
20091218 17:25:49.2901 02909 Beagle DEBUG: IndexHelper PID is 2809
20091218 17:25:50.2920 02909 Beagle DEBUG: Found IndexHelper (2809) in 1.00s
20091218 19:07:42.6534 02909 Beagle DEBUG: Caught ResponseMessageException: Connection refused
20091218 19:07:42.6535 02909 Beagle DEBUG: InnerException is SocketException -- we probably need to launch a helper
20091218 19:07:42.6536 02909 Beagle DEBUG: Launching helper process
20091218 19:07:42.6809 02909 Beagle DEBUG: IndexHelper PID is 8268
20091218 19:07:43.1815 02909 Beagle DEBUG: Found IndexHelper (8268) in .50s

Also, plenty of non-exceptions:

20091218 10:35:23.9123 02909 Beagle DEBUG: New Thunderbird indexable generator launched for /home/ronl/.beagle/Indexes/ThunderbirdIndex/ToIndex

The Index Helper log has lots of lines like:

20091218 08:26:41.5053 03297 IndexH DEBUG: +file:///home/ronl/.mozilla-thunderbird/abcdehn2s.default/Mail/Local Folders/Sent/?id=11095762 (file:///tmp/tmp7e01194c.tmp)
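For anyone who wants to sift a larger log for the same pattern, here is a minimal Python sketch that picks out the exception lines. The field layout (date, time, PID, component, level, message) is only inferred from the lines quoted above and is an assumption, not a documented Beagle log format; the names `LOG_LINE`, `LogEntry`, `parse_log` and `exception_entries` are made up for illustration.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple

# Assumed layout, inferred from the quoted lines only:
# "YYYYMMDD HH:MM:SS.ffff PID COMPONENT LEVEL: message"
LOG_LINE = re.compile(
    r"^(?P<date>\d{8}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<pid>\d+) (?P<component>\S+) (?P<level>[A-Z]+): (?P<message>.*)$"
)


class LogEntry(NamedTuple):
    date: str
    time: str
    pid: str
    component: str
    level: str
    message: str


def parse_log(lines: Iterable[str]) -> Iterator[LogEntry]:
    """Yield one LogEntry per line that matches the assumed layout."""
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:  # lines that do not fit the layout are skipped
            yield LogEntry(**match.groupdict())


def exception_entries(lines: Iterable[str]) -> List[LogEntry]:
    """Return entries whose message mentions an exception."""
    return [e for e in parse_log(lines) if "Exception" in e.message]


# Sample lines copied from the log excerpts above.
SAMPLE = [
    "20091218 10:35:23.9123 02909 Beagle DEBUG: New Thunderbird indexable "
    "generator launched for /home/ronl/.beagle/Indexes/ThunderbirdIndex/ToIndex",
    "20091218 17:25:49.2589 02909 Beagle DEBUG: Caught "
    "ResponseMessageException: Connection refused",
    "20091218 17:25:49.2590 02909 Beagle DEBUG: InnerException is "
    "SocketException -- we probably need to launch a helper",
    "20091218 08:26:41.5053 03297 IndexH DEBUG: +file:///home/ronl/"
    ".mozilla-thunderbird/abcdehn2s.default/Mail/Local Folders/Sent/"
    "?id=11095762 (file:///tmp/tmp7e01194c.tmp)",
]

if __name__ == "__main__":
    for entry in exception_entries(SAMPLE):
        print(entry.time, entry.message)
```

To run it against a real log, pass an open file object to `exception_entries` instead of `SAMPLE`. Where the log files live depends on the installation, so no path is assumed here.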
The Thunderbird add-on is writing only header information -- not e-mail content -- to temporary files in .beagle/Indexes/ThunderbirdIndex/ToIndex , where the Beagle daemon picks them up for indexing.

I took a look in /usr/lib/thunderbird/extensions/{b656ef18-fd76-45e6-95cc-8043f26361e7}/chrome/beagle.jar. In that jar, /content/beagleIndexer.js dates back to November 26, 2007 -- around the time of the last major rewrite and the version 0.3.0 release. I am no expert in JavaScript, but as far as I can tell, that code was never intended to index anything but the header information (unless code has been removed since then). The Beagle-project web site might be accused of false advertising, but I wonder how much of the information there predates the rewrite.

I am running Thunderbird 2.0.0.23 under Ubuntu 9.10 on an AMD64 machine. My home directory is in an ext4 partition. The Thunderbird-Beagle Indexer extension is 0.1.3, though Synaptic numbers it 0.3.9-3ubuntu1.

Thunderbird 3.0, which is now available, boasts many improvements -- you can judge for yourself -- but it may not be compatible with the current add-on.

-------------------------

I've recently tried a couple of things, unsuccessfully, to fix the problem on my own. Make no mistake, this is flailing. I have little understanding of what I am doing, but in the absence of expert support....

1) I downloaded an older copy of the Thunderbird-Beagle add-on, version 0.1.2 . Same problem. I reinstalled thunderbird-beagle 0.1.3.

2) I edited the contents of /usr/lib/thunderbird/extensions/{b656ef18-fd76-45e6-95cc-8043f26361e7}/chrome/beagle.jar in the following way:

a) sudo file-roller
b) open the file
c) gedit beagleIndexer.js
d) Based on a discussion I read ( https://bugzilla.gnome.org/show_bug.cgi?id=530632 ), describing a possible fix to the extension for Thunderbird 3 [Note: I am still using TB 2], I replaced 3 instances of "path.unixStyleFilePath" with "nativePath" . Save. Okay to update the jar. Close.
e) This produced exceptions in the Beagle log that I did not have before. Basically, it broke the Beagle backend. I reinstalled thunderbird-beagle to undo the changes to beagleIndexer.js .
f) What we learn from this experience is that Beagle's Index Helper process apparently runs Mono to re-open Thunderbird's folders and read the contents. If this is so, the locus of the main problem (i.e., not indexing e-mail content) is the Beagle backend rather than the Thunderbird add-on. That is a body of code that I dare not touch. It's too bad, though, that the add-on does not just dump the e-mail content while it has the chance.

-----------------------------

I've done a little more digging, in case anyone else cares to jump in.

Source code for the Beagle Thunderbird backend is here: http://vbox4.gnome.org/browse/beagle/tree/beagled/ThunderbirdQueryable/ThunderbirdQueryable.cs?h=beagle-tbird-soc07 . The action seems to be around lines 343-347 (called from lines 381-383 and carried forward at line 398).

A partial explanation of what is going on can be found here: http://beagle-project.org/Filter_Tutorial . (See the section "DoOpen() and DoClose().")

As I understand it, the file is just a big buffer. Mail messages are laid end-to-end in the file, so the only way to retrieve a particular message is to specify its offset position and its size. This stream is handed off to parser.ConstructMessage, etc.

I am hoping that some reader will have enough experience to spot a potential weakness with respect to Ubuntu 9.X, AMD64, etc.
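To make the offset-and-size idea concrete, here is a small Python sketch of reading one message out of a file in which messages are laid end-to-end. It is not Beagle's code (the backend is C# running on Mono) and it does not reproduce ThunderbirdQueryable.cs. It assumes an mbox-style file in which each message starts with a line beginning "From ", and it ignores details of real mbox files such as escaped ">From " lines in message bodies. The function names `scan_offsets`, `read_message_at` and `demo` are hypothetical.

```python
import email
import os
import tempfile
from email.message import Message
from typing import List, Tuple


def scan_offsets(path: str) -> List[Tuple[int, int]]:
    """Return (offset, size) in bytes for each message in the file.

    Simplification: any line starting with b"From " begins a new message.
    """
    spans: List[Tuple[int, int]] = []
    start = None
    pos = 0
    with open(path, "rb") as f:
        for line in f:
            if line.startswith(b"From "):
                if start is not None:
                    spans.append((start, pos - start))
                start = pos
            pos += len(line)
    if start is not None:
        spans.append((start, pos - start))
    return spans


def read_message_at(path: str, offset: int, size: int) -> Message:
    """Seek to offset, read size bytes, and parse them as one message."""
    with open(path, "rb") as f:
        f.seek(offset)
        raw = f.read(size)
    return email.message_from_bytes(raw)


def demo() -> List[str]:
    """Build a tiny two-message file and read each subject back."""
    data = (
        b"From alice@example.com Fri Dec 18 08:26:41 2009\n"
        b"Subject: first\n\nbody one\n\n"
        b"From bob@example.com Fri Dec 18 08:27:00 2009\n"
        b"Subject: second\n\nbody two\n"
    )
    fd, path = tempfile.mkstemp(suffix=".mbox")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        return [
            read_message_at(path, off, size)["Subject"]
            for off, size in scan_offsets(path)
        ]
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is that a reader depends entirely on the offset and size it is given. If either value is wrong, the parser receives a slice that starts or ends in the middle of a message.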
The Beagle community (including Novell) seems unable to offer any further support. Also, I found no evidence that Canonical has changed its support for desktop search apps since introducing Ubuntu 10.04 yesterday. So after reviewing all of the information I could find comparing strengths and weaknesses, I shut down the Beagle daemon and installed Recoll.

So far, I am very pleased. Documentation is good. Installation was a snap. Minor customization (e.g., setting a parm to enable Recoll to work with the Firefox Beagle add-on) was easy. The whole conversion was very uneventful. Creation of the initial index was reasonably fast (under an hour for about 250,000 files.) The GUI is clean and clear. All of the stuff that Beagle caught was there. And all of the stuff that Beagle missed -- like Thunderbird message content -- showed up.

Compare Recoll news...

* 2010-01-05 : a 1.13.04 is out. It fixes a nasty bug (broken stemming) in 1.13.02.
* 2010-01-29 : the full Recoll source repository is now hosted on Bitbucket, along with a Wiki and an issues tracking system. Hopefully, this new channel for reporting bugs and make suggestions will increase the feedback rate...
* 2010-01-05 : a 1.13.02 is out. It brings some nice improvements and new functions. Please try it and report any problems.
* 2009-12-10 : 1.12.4 is out. It fixes a problem in the preview window search function (qt4 only).
* 2008-05-22 : we now have a mailing list:
  o Subscription management
  o Archives

... to Beagle news:

21 Jan 2010 Beagle status: Beagle isn't in active development. It is getting some occasional maintenance done by Novell.
26 Jan 2009 Beagle 0.3.9 released.
15 July 2008 Beagle 0.3.8 released.
7 Jun 2008 Added initial support for indexing removable medium.
21 May 2008 Added read-only RDF overlay on Beagle index. This RDF store can handle RDF queries and supports different query formats like SPARQL, N3 etc.
14 May 2008 Kio-beagle ported to KDE4.

The future is Recoll.
Beagle is not under active development anymore and had its last code changes in early 2011. Its codebase has been archived (see bug 796735): https://gitlab.gnome.org/Archive/beagle/commits/master "tracker" is an available alternative. Closing this report as WONTFIX as part of Bugzilla Housekeeping to reflect reality. Please feel free to reopen this ticket (or rather transfer the project to GNOME Gitlab, as GNOME Bugzilla is deprecated) if anyone takes the responsibility for active development again.