GNOME Bugzilla – Bug 338165
TermBuffer class is not thread safe but appears to be used concurrently
Last modified: 2006-04-26 17:52:00 UTC
Please describe the problem:
I am getting exceptions like the following:

060411 1431437994 07987 Beagle WARN EX: System.ArgumentException: length
060411 1431437994 07987 Beagle WARN EX: in [0x00247] System.Array:Copy (System.Array sourceArray, Int32 sourceIndex, System.Array destinationArray, Int32 destinationIndex, Int32 length)
060411 1431437994 07987 Beagle WARN EX: in [0x00166] (at /var/tmp/portage/beagle-999/work/beagle/beagled/Lucene.Net/Index/TermBuffer.cs:70) Lucene.Net.Index.TermBuffer:SetTextLength (Int32 newLength)
060411 1431437994 07987 Beagle WARN EX: in [0x0001b] (at /var/tmp/portage/beagle-999/work/beagle/beagled/Lucene.Net/Index/TermBuffer.cs:82) Lucene.Net.Index.TermBuffer:Read (Lucene.Net.Store.IndexInput input, Lucene.Net.Index.FieldInfos fieldInfos)
060411 1431437994 07987 Beagle WARN EX: in [0x00050] (at /var/tmp/portage/beagle-999/work/beagle/beagled/Lucene.Net/Index/SegmentTermEnum.cs:130) Lucene.Net.Index.SegmentTermEnum:Next ()

From reviewing the TermBuffer class, this should never happen. I added some debugging code, and it appears the fields are being updated from multiple threads. See this log (I will attach the patch):

060411 1431437905 07987 Beagle DEBUG: *** Remove '/home/double/src/kerry-0.09' 'config.log' (file)
060411 1431437943 07987 Beagle DEBUG: *** Remove '/home/double/src/kerry-0.09' 'subdirs' (file)
060411 1431437950 07987 Beagle ERROR: textLength (179) > newLength (124) (text.Length = 179)
060411 1431437957 07987 Beagle WARN: Caught exception calling DoQuery on 'Files'
060411 1431437980 07987 Beagle WARN: Could resolve unique id of 'subdirs' in '/home/double/src/kerry-0.09' for removal, it is probably already gone
060411 1431437983 07987 Beagle DEBUG: *** Remove '/home/double/src/kerry-0.09' 'config.h' (file)
060411 1431437994 07987 Beagle WARN EX: System.ArgumentException: length
060411 1431437994 07987 Beagle WARN EX: in [0x00247] System.Array:Copy (System.Array sourceArray, Int32 sourceIndex, System.Array destinationArray, Int32 destinationIndex, Int32 length)
060411 1431437994 07987 Beagle WARN EX: in [0x00166] (at /var/tmp/portage/beagle-999/work/beagle/beagled/Lucene.Net/Index/TermBuffer.cs:70) Lucene.Net.Index.TermBuffer:SetTextLength (Int32 newLength)
060411 1431437994 07987 Beagle WARN EX: in [0x0001b] (at /var/tmp/portage/beagle-999/work/beagle/beagled/Lucene.Net/Index/TermBuffer.cs:82) Lucene.Net.Index.TermBuffer:Read (Lucene.Net.Store.IndexInput input, Lucene.Net.Index.FieldInfos fieldInfos)
060411 1431437994 07987 Beagle WARN EX: in [0x00050] (at /var/tmp/portage/beagle-999/work/beagle/beagled/Lucene.Net/Index/SegmentTermEnum.cs:130) Lucene.Net.Index.SegmentTermEnum:Next ()

This appears to be possible only if multiple threads are using the same TermBuffer instance, and this class is not thread safe. This is on an SMP machine.

Steps to reproduce:
1. Wait until crawling is complete
2. Extract a tarball in your home directory
3. Perform a search (I'm using kerry) while beagle is processing the new files

Actual results:
The exceptions are logged. Some results may be missing; I'm not sure about the result list, as I do get some of them.

Expected results:
Nice clean log file, no exceptions :)

Does this happen every time?
Yes

Other information:
I also modified the code behind a "make a copy" method because it was not actually making a copy. This reduced the number of exceptions, I assume because string.ToCharArray is thread safe: my copy implementation gets this char array and then sets the internal vars.
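
The remark about a "make a copy" method that does not really copy can be illustrated with a small sketch. The class below is hypothetical: CharBuffer, AliasFrom and CopyFrom are names invented for this note and are not Lucene.Net APIs. It only shows the difference between copying an array reference and copying the array contents, and it makes no attempt to be thread safe.

```csharp
using System;

// Hypothetical stand-in for a term buffer; NOT the Lucene.Net TermBuffer class.
public sealed class CharBuffer
{
    private char[] text = new char[0];
    private int textLength;

    public void Set(string value)
    {
        // string.ToCharArray allocates a fresh array on every call,
        // so the result is never shared with another buffer.
        text = value.ToCharArray();
        textLength = text.Length;
    }

    // Copies only the reference: both buffers now use the same array.
    public void AliasFrom(CharBuffer other)
    {
        text = other.text;
        textLength = other.textLength;
    }

    // Copies the characters: later writes to 'other' do not show up here.
    public void CopyFrom(CharBuffer other)
    {
        int length = other.textLength;
        char[] copy = new char[length];
        Array.Copy(other.text, 0, copy, 0, length);
        text = copy;
        textLength = length;
    }

    public void Overwrite(int index, char c)
    {
        text[index] = c;
    }

    public override string ToString()
    {
        return new string(text, 0, textLength);
    }
}
```

If one buffer is filled with "subdirs", a second is set up with AliasFrom and a third with CopyFrom, then overwriting the first character of the original changes what the aliased buffer returns but not what the copied buffer returns.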
Created attachment 63278: TermBuffer.cs.patch
Adds some debug info and changes some code that apparently was supposed to copy a value.
This is (or at least, was) what was causing the memory explosion bug. You should take a look at 19_no_thread_local_storage.patch, which is the patch I committed to work around a leak in Mono in the 1.1.13.x timeframe. The initial patch was broken because it didn't handle concurrency: the length exception was a lot more widespread. I updated it, though, and I'm not entirely sure why it doesn't work. The data being stored should be static per-thread.
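
For readers unfamiliar with per-thread statics, here is a minimal sketch of what "static per-thread" means in C#. It uses the standard [ThreadStatic] attribute; the class and method names (PerThreadScratch, PerThreadDemo, WorkerSeesOwnInstance) are made up for illustration, and this is not the code from 19_no_thread_local_storage.patch.

```csharp
using System;
using System.Threading;

public static class PerThreadScratch
{
    // Each thread gets its own slot for this field. A field initializer
    // would only run on one thread, so the array is created lazily.
    [ThreadStatic]
    private static char[] scratch;

    public static char[] Get()
    {
        if (scratch == null)
            scratch = new char[16];
        return scratch;
    }
}

public static class PerThreadDemo
{
    // Returns true when a worker thread receives a different array
    // instance than the calling thread did.
    public static bool WorkerSeesOwnInstance()
    {
        char[] mine = PerThreadScratch.Get();
        char[] theirs = null;

        Thread worker = new Thread(() => { theirs = PerThreadScratch.Get(); });
        worker.Start();
        worker.Join();

        return !object.ReferenceEquals(mine, theirs);
    }
}
```

The isolation only holds while the object stays on the thread that fetched it. Once the reference returned by Get() is handed to code running on another thread, both threads operate on the same instance again.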
Created attachment 63328: beagle log
I was redirected here from bug 335178. Joe Shaw told me that my last log shows the problem described in this bug. The attachment contains my beagle log; I hope it is useful.
*** Bug 338672 has been marked as a duplicate of this bug. ***
I am thinking that maybe the values are retrieved using a thread static (previously thread local), but are then passed to a method that in turn uses them in another thread. I think that, to be safe, making TermBuffer thread safe would be a good way to go, although I'm not sure about the locking overhead. I will try making TermBuffer thread safe and see what happens.
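
A rough sketch of the locking approach, under the assumption that a single lock around every read and write of the buffer state is acceptable. LockedCharBuffer is a hypothetical class written for this note, not the actual TermBuffer and not a proposed patch.

```csharp
using System;

// Hypothetical buffer guarded by one lock; NOT the Lucene.Net TermBuffer class.
public sealed class LockedCharBuffer
{
    private readonly object sync = new object();
    private char[] text = new char[0];
    private int textLength;

    public void Set(string value)
    {
        lock (sync)
        {
            // Grow the array only when needed. The array and the length
            // are updated under the same lock, so no other thread can
            // observe one without the other.
            if (text.Length < value.Length)
                text = new char[value.Length];
            value.CopyTo(0, text, 0, value.Length);
            textLength = value.Length;
        }
    }

    public string Snapshot()
    {
        lock (sync)
        {
            return new string(text, 0, textLength);
        }
    }
}
```

Every call pays for acquiring the lock, which is the overhead in question. The lock also only protects individual calls: a caller that reads the buffer and then acts on the result in separate steps can still see another thread's update in between.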
At this point I think the right thing to do is to just back out that patch altogether and just require Mono 1.1.13.5, which fixes the problem.
Requiring mono-1.1.13.5 and removing the patch for broken mono sounds like the right thing to do. Life is really difficult when one has to work around bugs in dependencies.
Sounds good to me. I have reverted the patch, updated from CVS, and am testing now.
Tested and no exceptions which I could easily reproduce before. Looks good from here.
I second the no errors, looks good. :)
Is this patch in 0.2.5, or is it going into the next release?
This patch is not in 0.2.5, but I checked it into CVS and it will be in for 0.2.6.