GNOME Bugzilla – Bug 317997
indexer crashes on spam html, maybe
Last modified: 2005-10-06 15:53:21 UTC
Version details: 0.1.1
Distribution/Version: Fedora Core 3

Index the attached file, maybe.
Created attachment 53045 [details] File which crashes the indexer
Here's the backtrace:

DEBUG: +file:///disk/b/mail/Mail/group-2002-02/351

Unhandled Exception: System.ArgumentNullException: null key
Parameter name: key
in [0x000c7] System.Collections.Hashtable:Find (System.Object key)
in [0x00002] (at /tmp/scratch/BUILD/mono-1.1.8.3/mcs/class/corlib/System.Collections/Hashtable.cs:395) System.Collections.Hashtable:Contains (System.Object key)
in <0x00231> Beagle.Util.Scheduler:Worker ()
in (wrapper delegate-invoke) System.MulticastDelegate:invoke_void ()
I am not at all convinced that the file shown in the DEBUG log as the most recently indexed is the problem -- I've seen another half-dozen of these crashes, some of the files look very innocuous, and in at least one case there was no indexing going on at all, i.e. the log looks like this:

DEBUG: Done crawling '/disk/b/mail/Mail/pers-2002-05'
DEBUG: Done crawling '/disk/b/mail/Mail/pers-2002-06'
DEBUG: Done crawling '/disk/b/mail/Mail/pers-2002-07'
DEBUG: Done crawling '/disk/b/mail/Mail/pers-2002-08'
DEBUG: Done crawling '/disk/b/mail/Mail/pers-2002-09'

Unhandled Exception: System.ArgumentNullException: null key
Parameter name: key
in [0x000c7] System.Collections.Hashtable:Find (System.Object key)
in [0x00002] (at /tmp/scratch/BUILD/mono-1.1.8.3/mcs/class/corlib/System.Collections/Hashtable.cs:395) System.Collections.Hashtable:Contains (System.Object key)
in <0x00231> Beagle.Util.Scheduler:Worker ()
in (wrapper delegate-invoke) System.MulticastDelegate:invoke_void ()

INFO: NetBeagleConfigurationChanged EventHandler invoked
INFO: WebServicesConfigurationChanged EventHandler invoked
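For reference, the exception in both backtraces is what System.Collections.Hashtable raises whenever Contains is handed a null key, regardless of what the table holds. The following is a minimal sketch that reproduces only that exception, outside Beagle; the variable `key` is a stand-in for whatever Scheduler:Worker passed, which the backtrace does not show.

```csharp
using System;
using System.Collections;

class NullKeyDemo
{
    static void Main ()
    {
        Hashtable table = new Hashtable ();

        // Stand-in for the value passed by the caller; the backtrace
        // only tells us that it was null.
        object key = null;

        try {
            table.Contains (key);
        } catch (ArgumentNullException e) {
            // ParamName matches the "Parameter name: key" line in the log.
            Console.WriteLine ("Caught {0}, parameter '{1}'",
                               e.GetType ().Name, e.ParamName);
        }
    }
}
```

Since the exception depends only on the key being null, it is consistent with the crash appearing both during indexing and when no indexing is going on.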
Created attachment 53091 [details] [review] Patch to catch the null Source problem
So it now seems clear the problem is not with the data. I have now been running for 12 hours with the attached patch, which has fired 20 times (Tag always empty also) in the course of indexing 65000 files in 250 directories. Any suggestions about more information to log when the null Source is detected would be welcome.
Created attachment 53108 [details] [review] Always enforce non-null task source

Thanks for investigating that. I tracked it down to KopeteQueryable running without inotify, and raddy on IRC happened to report this at the same time. This is now fixed in CVS. I'm attaching another patch for review which would prevent this problem from recurring so subtly in future.
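To illustrate the idea of enforcing a non-null source, here is a rough sketch, not the attached patch. SketchTask and SketchScheduler are hypothetical stand-ins, and keying a Hashtable by Source is an assumption based on the backtrace rather than on Beagle's code. The point is that the check runs when the task is added, so the failure surfaces in the caller that forgot to set Source rather than later on the worker thread.

```csharp
using System;
using System.Collections;

// Hypothetical stand-ins for illustration; not Beagle's actual classes.
class SketchTask
{
    public string Tag;
    public object Source; // the object that created the task
}

class SketchScheduler
{
    // Assumption: tasks are grouped in a Hashtable keyed by their Source.
    private Hashtable tasksBySource = new Hashtable ();

    public void Add (SketchTask task)
    {
        if (task == null)
            throw new ArgumentNullException ("task");

        // Reject the task here, while the caller is still on the stack,
        // instead of failing later on a worker thread.
        if (task.Source == null)
            throw new InvalidOperationException (
                String.Format ("Task '{0}' has no Source", task.Tag));

        ArrayList tasks = (ArrayList) tasksBySource [task.Source];
        if (tasks == null) {
            tasks = new ArrayList ();
            tasksBySource [task.Source] = tasks;
        }
        tasks.Add (task);
    }

    public int CountFor (object source)
    {
        ArrayList tasks = (ArrayList) tasksBySource [source];
        return tasks == null ? 0 : tasks.Count;
    }
}

class EnforceSourceDemo
{
    static void Main ()
    {
        SketchScheduler scheduler = new SketchScheduler ();
        object source = new object ();

        SketchTask good = new SketchTask ();
        good.Tag = "good";
        good.Source = source;
        scheduler.Add (good);

        SketchTask bad = new SketchTask ();
        bad.Tag = "bad"; // Source left null on purpose

        try {
            scheduler.Add (bad);
        } catch (InvalidOperationException e) {
            Console.WriteLine ("Rejected: " + e.Message);
        }

        Console.WriteLine ("Tasks for source: " + scheduler.CountFor (source));
    }
}
```

With a check like this, the stack trace of the rejection points at the code that created the task, which is the information the worker-thread backtrace was missing.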
Created attachment 53109 [details] [review] IndexSynchronization fix

This fixes another source of tasks without a Source. This one is different, as the task is created in a static context. The patch sets the source to the lock object, which should do the trick..?
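As a rough sketch of the static-context case (not the attached patch; SyncTask, IndexSyncSketch and sync_lock are made-up names): a static method has no `this` to use as the source, so a static object that already exists for locking can serve as a stable, non-null stand-in.

```csharp
using System;

// Hypothetical stand-in for illustration; not Beagle's actual task class.
class SyncTask
{
    public string Tag;
    public object Source;
}

static class IndexSyncSketch
{
    // Static object assumed to already exist for locking purposes.
    private static readonly object sync_lock = new object ();

    public static SyncTask NewTask (string tag)
    {
        SyncTask task = new SyncTask ();
        task.Tag = tag;
        // No instance is available in a static method, so the lock
        // object doubles as the task's Source.
        task.Source = sync_lock;
        return task;
    }
}

class StaticSourceDemo
{
    static void Main ()
    {
        SyncTask a = IndexSyncSketch.NewTask ("first");
        SyncTask b = IndexSyncSketch.NewTask ("second");

        Console.WriteLine ("Source set: " + (a.Source != null));
        Console.WriteLine ("Same source: " + Object.ReferenceEquals (a.Source, b.Source));
    }
}
```

One consequence of this choice is that every task created this way shares a single source object, so anything that groups or cancels tasks by source treats them as one group.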
Yeah, both of these patches are good. Please commit.