GNOME Bugzilla – Bug 312348
Evolution crashes when EDS terminates due to memory corruption
Last modified: 2013-09-10 14:04:39 UTC
Evolution stack traces:
Backtrace was generated from '/opt/gnome/bin/evolution'
Using host libthread_db library "/lib/tls/libthread_db.so.1".
[Thread debugging using libthread_db enabled]
[New Thread 1097776768 (LWP 25492)]
[New Thread 1132825520 (LWP 25540)]
[Thread debugging using libthread_db enabled]
[New Thread 1097776768 (LWP 25492)]
[New Thread 1132825520 (LWP 25540)]
[Thread debugging using libthread_db enabled]
[New Thread 1097776768 (LWP 25492)]
[New Thread 1132825520 (LWP 25540)]
[New Thread 1124420528 (LWP 25522)]
[New Thread 1121995696 (LWP 25518)]
[New Thread 1119894448 (LWP 25517)]
[New Thread 1117793200 (LWP 25516)]
[New Thread 1115691952 (LWP 25510)]
[New Thread 1113590704 (LWP 25509)]
[New Thread 1109625776 (LWP 25495)]
[New Thread 1107524528 (LWP 25494)]
0xffffe410 in ?? ()
+ Trace 62143
Thread 1 (Thread 1097776768 (LWP 25492))
EDS traces at terminal:
POST /soap HTTP/1.1
SOAP-Debug: 0x81085c0 @ 1122987004
Host: 164.99.169.177
Connection: Keep-Alive
User-Agent: Evolution/1.3.6
Content-Type: text/xml
SOAPAction: createCursorRequest
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsd="http://www.w3.org/1999/XMLSchema" xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance"><SOAP-ENV:Header SOAP-ENV:encodingStyle=""><session>gzHsMcL09kCUhbYi</session></SOAP-ENV:Header><SOAP-ENV:Body xmlns:types="http://schemas.novell.com/2003/10/NCSP/types.xsd" SOAP-ENV:encodingStyle=""><createCursorRequest><container>A.dell.net.100.0.1.0.1@19</container><view>id iCalId recurrenceKey</view></createCursorRequest></SOAP-ENV:Body></SOAP-ENV:Envelope>
*** glibc detected *** double free or corruption (fasttop): 0x424209e8 ***
Aborted
Evolution groupwise debug traces:
<?xml version="1.0" encoding="UTF-8"?><SOAP-ENV:Envelope xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance" xmlns:xsd="http://www.w3.org/1999/XMLSchema" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"><SOAP-ENV:Header SOAP-ENV:encodingStyle=""/><SOAP-ENV:Body SOAP-ENV:encodingStyle="" xmlns:types="http://schemas.novell.com/2003/10/NCSP/types.xsd"><markReadResponse><status><code>0</code></status></markReadResponse></SOAP-ENV:Body></SOAP-ENV:Envelope>
(evolution:25492): libecal-WARNING **: e-cal.c:2431: Unable to contact backend
(evolution:25492): libecal-WARNING **: e-cal.c:2943: Unable to contact backend
[the two warnings above repeat, alternating, many more times]
(evolution:25492): libecal-WARNING **: e-cal.c:1698: Unable to contact backend
GThread-ERROR **: file gthread-posix.c: line 160 (): error 'Device or resource busy' during 'pthread_mutex_destroy ((pthread_mutex_t *) mutex)'
aborting...
Same scenario, but the traces are different; in this one the segv handler is called:
Backtrace was generated from '/opt/gnome/bin/evolution'
Using host libthread_db library "/lib/tls/libthread_db.so.1".
[Thread debugging using libthread_db enabled]
[New Thread 1097776768 (LWP 27477)]
[New Thread 1124457392 (LWP 27564)]
[Thread debugging using libthread_db enabled]
[New Thread 1097776768 (LWP 27477)]
[New Thread 1124457392 (LWP 27564)]
[Thread debugging using libthread_db enabled]
[New Thread 1097776768 (LWP 27477)]
[New Thread 1124457392 (LWP 27564)]
[New Thread 1122356144 (LWP 27530)]
[New Thread 1120254896 (LWP 27529)]
[New Thread 1118153648 (LWP 27528)]
[New Thread 1116052400 (LWP 27519)]
[New Thread 1113951152 (LWP 27518)]
[New Thread 1111493552 (LWP 27480)]
[New Thread 1107381168 (LWP 27479)]
0xffffe410 in ?? ()
+ Trace 62147
Thread 1 (Thread 1097776768 (LWP 27477))
*** Bug 353454 has been marked as a duplicate of this bug. ***
*** Bug 351972 has been marked as a duplicate of this bug. ***
Looks like a crash in ORBit, changing product.
Unfortunately, that stack trace is not very useful in determining the cause of the crash. Can you get us one with debugging symbols? Please see http://live.gnome.org/GettingTraces for more information on how to do so.
Please install the ORBit debugging package.
*** Bug 354842 has been marked as a duplicate of this bug. ***
*** Bug 355172 has been marked as a duplicate of this bug. ***
Created attachment 72503: GDB trace #1 (from dwm@doc.ic.ac.uk)
Hi,
My original bug (Bug 355172) was marked a duplicate of this one, so I'll mail here.
I've attempted to replicate my previous crashes; however, it does seem a little random whether or not the crashing of the evolution-data-server-1.8 process will also cause the evolution process to crash.
I've tried installing several -dbg packages (I'm running current Ubuntu Eft); however, there don't appear to be any ORBit -dbg packages in this distribution. (Even with extra *verse repositories enabled.)
I've generated another trace (see attached "GDB trace #1"); if it is not useful, I can have a go at building an unstripped version of the relevant libraries.
Cheers,
David
I've just asked seb128 to put up ORBit debugging packages, you may want to check again in an hour or so
*** Bug 355075 has been marked as a duplicate of this bug. ***
*** Bug 354941 has been marked as a duplicate of this bug. ***
*** Bug 356413 has been marked as a duplicate of this bug. ***
*** Bug 353473 has been marked as a duplicate of this bug. ***
So - the trace is great David - thanks:
+ Trace 72626
I *suspect* that there is something very bad about the e_cal_view code here calling a CORBA method on an invalid object from this callback. Can you walk up the stack there and run a few commands, e.g.:
    p *_obj
    up
    p *view
    up
    p *model
    p *client_data
    up
    p *client
    up
    p *ecal
    p *gcal
and dump the output here? Of course, memory corruption is all too likely, but very unlikely in ORBit2; I see no reason to believe this is an ORBit bug.
Better: with the nice symbols, can you run:
    valgrind --tool=memcheck evolution-2.6 2>&1 | tee /tmp/val-log
and see if we can get the log file attached? That may show where the corruption lies more precisely.
Re-assigning back to evo.
*** Bug 356683 has been marked as a duplicate of this bug. ***
*** Bug 356894 has been marked as a duplicate of this bug. ***
*** Bug 357243 has been marked as a duplicate of this bug. ***
*** Bug 357293 has been marked as a duplicate of this bug. ***
*** Bug 357887 has been marked as a duplicate of this bug. ***
*** Bug 358242 has been marked as a duplicate of this bug. ***
*** Bug 358571 has been marked as a duplicate of this bug. ***
*** Bug 358699 has been marked as a duplicate of this bug. ***
*** Bug 358927 has been marked as a duplicate of this bug. ***
*** Bug 358983 has been marked as a duplicate of this bug. ***
*** Bug 359150 has been marked as a duplicate of this bug. ***
Created attachment 73974: Valgrind memcheck log #1
Valgrind memcheck log generated against evolution-2.8.1, as shipped as part of the evolution-2.8.1-0ubuntu1 package from Eft.
Hi,
Apologies for not responding more quickly -- real life has been somewhat busy of late.
Attached to this bug as attachment #73974 is the logfile generated by the Valgrind memcheck tool whilst running Evolution. (Since updated locally to 2.8.1, but still exhibiting the same behaviour.)
Unfortunately, gdb wasn't able to extract any useful information from the running process; I'll see if I can reproduce it again and print out the requested gdb information.
Cheers,
David
Created attachment 73976: GDB trace #2 (from dwm@doc.ic.ac.uk)
GDB trace and stack-inspection output from a crashed evolution-2.8.1 process, as requested.
The smoking gun for the evolution bug is here:
==16424== Invalid read of size 4
...
==16424==    by 0x4323D31: ORBit_c_stub_invoke (in /usr/lib/libORBit-2.so.0.1.0)
==16424==    by 0x4B10DEE: GNOME_Evolution_Calendar_CalView_start (Evolution-DataServer-Calendar-stubs.c:10)
==16424==    by 0x4B326DB: e_cal_view_start (e-cal-view.c:389)
==16424==    by 0x6D017B0: update_e_cal_view_for_client (e-cal-model.c:1519)
==16424==    by 0x6D025AF: add_new_client (e-cal-model.c:1590)
==16424==    by 0x6D5B7C3: client_cal_opened_cb (gnome-cal.c:2588)
==16424==    by 0x498E878: g_cclosure_marshal_VOID(i_xx_t) (gmarshal.c:216)
==16424==    by 0x498179A: g_closure_invoke (gclosure.c:490)
==16424== Address 0xE27C540 is 8 bytes inside a block of size 40 free'd
...
==16424==    by 0x43182DC: CORBA_Object_release (in /usr/lib/libORBit-2.so.0.1.0)
==16424==    by 0x4222DD2: bonobo_object_release_unref (in /usr/lib/libbonobo-2.so.0.0.0)
==16424==    by 0x4B32BFC: e_cal_view_finalize (e-cal-view.c:226)
It would be -great- to get deeper stack traces, since the default depth is not helpful: --num-callers=128 is good. But I'm confident this is an evolution cockup using an incorrectly managed ORBit2 resource after it has been freed.
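The pattern named in the comment above, using a reference after its owner has released it, can be sketched in plain C. This is only an illustration under stated assumptions, not the evolution or ORBit2 code: `RemoteView`, `ViewOwner`, and every function below are hypothetical names, and the plain `int` reference count is not thread-safe. The point it tries to show is that a caller which takes its own reference before using the object is unaffected when the owner's finalize handler drops its reference first.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a remote object reference; not ORBit2's real type. */
typedef struct {
    int refcount;
    int started;
} RemoteView;

/* Hypothetical owner, playing the role of the view wrapper object. */
typedef struct {
    RemoteView *remote;
} ViewOwner;

static int live_objects = 0; /* allocations not yet freed, for the demo only */

static RemoteView *remote_view_new(void)
{
    RemoteView *v = calloc(1, sizeof *v);
    if (v == NULL)
        abort();
    v->refcount = 1;
    live_objects++;
    return v;
}

static RemoteView *remote_view_ref(RemoteView *v)
{
    v->refcount++;
    return v;
}

static void remote_view_unref(RemoteView *v)
{
    if (--v->refcount == 0) {
        live_objects--;
        free(v);
    }
}

/* Drops the owner's reference, as a finalize handler would. */
static void owner_finalize(ViewOwner *owner)
{
    remote_view_unref(owner->remote);
    owner->remote = NULL;
}

/* Uses the view even though the owner is finalized before the use.
 * Returns 1 if the object was still valid when it was touched. */
int start_view_holding_ref(void)
{
    ViewOwner owner = { remote_view_new() };
    RemoteView *mine = remote_view_ref(owner.remote); /* our own reference */
    int ok;

    owner_finalize(&owner);   /* the owner goes away first ... */
    mine->started = 1;        /* ... but our reference keeps the block alive */
    ok = (mine->refcount == 1 && mine->started == 1);

    remote_view_unref(mine);  /* last reference: the block is freed here */
    return ok;
}

/* Number of demo objects still allocated. */
int remaining_objects(void)
{
    return live_objects;
}
```

Calling `start_view_holding_ref()` and then `remaining_objects()` should show that the object stayed valid during the call and was freed exactly once afterwards. Without the extra `remote_view_ref()`, the write to `mine->started` would touch freed memory, which is the kind of access memcheck reports as an invalid read or write inside a free'd block.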
*** Bug 359630 has been marked as a duplicate of this bug. ***
Could this be related? I've seen this in my valgrind logs:
==16988== Invalid read of size 4
==16988==    at 0xB74679: open_async (e-cal.c:1878)
==16988==    by 0x897FEE: g_thread_create_proxy (gthread.c:553)
==16988==    by 0x6EDF99: start_thread (pthread_create.c:274)
==16988==    by 0x5279AD: clone (in /lib/libc-2.4.90.so)
==16988== Address 0x4F13C40 is 8 bytes inside a block of size 28 free'd
==16988==    at 0x4004FEA: free (vg_replace_malloc.c:233)
==16988==    by 0x8815F0: g_free (gmem.c:187)
==16988==    by 0xB6A6D7: async_signal_idle_cb (e-cal.c:1867)
==16988==    by 0x8785E0: g_idle_dispatch (gmain.c:3924)
==16988==    by 0x87A341: g_main_context_dispatch (gmain.c:2043)
==16988==    by 0x87D31E: g_main_context_iterate (gmain.c:2675)
==16988==    by 0x87D6C8: g_main_loop_run (gmain.c:2879)
==16988==    by 0x4C616A22: bonobo_main (bonobo-main.c:311)
==16988==    by 0x805C490: main (notify-main.c:162)
so, what's in store here? any chance to see this fixed for gnome 2.16.2?
The valgrind hit that was mentioned in comment #32 was also logged in bug #335217.
*** Bug 365832 has been marked as a duplicate of this bug. ***
*** Bug 370200 has been marked as a duplicate of this bug. ***
*** Bug 365094 has been marked as a duplicate of this bug. ***
*** Bug 369633 has been marked as a duplicate of this bug. ***
*** Bug 369054 has been marked as a duplicate of this bug. ***
Note that among all these duplicates are references to the Calendar *and* the Address Book.
+ Trace 82562
Just started looking into this, and it looks to me like there is more than one problem in the long list of duplicates, most prominently an extra unlock on that mutex. It looks like I will not be able to fix and test this before rolling out the tarballs today, but so far I have a reliable way to reproduce the problem and a few theories on hand. I will push in a fix sometime tomorrow.
Seems to be my day :-) ... I've nailed down the problem and will roll out the tarballs with the fix.
Committed the fix to HEAD.
*** Bug 371429 has been marked as a duplicate of this bug. ***
*** Bug 371442 has been marked as a duplicate of this bug. ***
*** Bug 371583 has been marked as a duplicate of this bug. ***
No patch... Fixed for all cases, I assume? See comment 40.
*** Bug 372376 has been marked as a duplicate of this bug. ***
*** Bug 373082 has been marked as a duplicate of this bug. ***
(In reply to comment #47)
> No patch... Fixed for all cases, I assume? See comment 40.
No. Just the scenario and traces originally reported in *this* bug. I will have to comb through the huge pile of (alleged) duplicates to separate the unrelated bugs and process them while I am back on Monday.
You can find my patch at http://cvs.gnome.org/viewcvs/evolution-data-server/calendar/libecal/e-cal.c?r1=1.129&r2=1.130
This was not a memory corruption as the summary claims. It was an erroneous double mutex unlock, as hinted by Michael in comment #30.
Thanks for caring.
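For readers unfamiliar with the failure mode named above, a double mutex unlock can be made visible with a POSIX error-checking mutex. The sketch below is an assumption-laden illustration and not the committed patch (that is at the CVS link above): `second_unlock_result` is a hypothetical helper, and it uses raw pthreads rather than the GLib mutex wrappers that e-cal.c uses. With `PTHREAD_MUTEX_ERRORCHECK`, the unbalanced second unlock returns an error code instead of silently corrupting the mutex state.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Locks once, unlocks twice, and returns the result of the second
 * (unbalanced) unlock, or -1 if the balanced unlock itself failed. */
int second_unlock_result(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    int first, second;

    pthread_mutexattr_init(&attr);
    /* Error-checking mutexes report misuse through return values. */
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&mutex);
    first = pthread_mutex_unlock(&mutex);   /* balanced unlock: succeeds */
    second = pthread_mutex_unlock(&mutex);  /* extra unlock: reported as an error */

    pthread_mutex_destroy(&mutex);
    return first == 0 ? second : -1;
}
```

On an error-checking mutex the extra unlock is expected to return `EPERM`. With a default mutex type the same call is undefined behaviour, which is one reason the symptoms of such a bug can look like unrelated memory corruption.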
*** Bug 373462 has been marked as a duplicate of this bug. ***
(In reply to comment #50)
> (In reply to comment #47)
> > No patch... Fixed for all cases, I assume? See comment 40.
>
> No. Just the scenario and traces originally reported in *this* bug.
Which one is "this bug"? The original description and summary? The original stacktrace that misses the entire crashing trace? Or maybe the trace in comment 2?
> I will have to comb through the huge pile of (alleged) duplicates to
> separate the unrelated bugs and process them while I am back on Monday.
Which unrelated bugs? I already took care of all the false duplicates of that Epiphany (possibly AT-SPI, see bug 351972) crasher. Although that trace seems rather similar.
See comment 40. There are 2 almost identical stacktraces here, both Evolution. They just happen to be in different Components. Frankly, and without having a look at the code, this feels like a copy-n-paste issue, both Components sharing the same crash. Hence, there most likely is a similar issue lurking for other Evolution Components. No, I do not want to wait till a user reports that crash. I want them fixed just as well.
> This was not a memory corruption as the summary claims. It was an erroneous
> double mutex unlock as hinted by Michael on Comment #30.
Feel free to correct the summary.
> Thanks for caring.
Well -- there would have been no need for poking in the first place, and especially there would have been no need for me to go all ranty just to make that caring of mine result in anything. If, yeah if, you had read the note of mine that happened to be *right* above the textfield where you entered your comment. You can't possibly have missed that, now can you?
REOPENing, as per comment 50.
Just for reference, I left a bunch of notes already in a lot of bug reports, summarizing, providing cross references and pointing out important bits. Hope not all of them get overlooked like this one.
(In reply to comment #43)
> Committed the fix to HEAD.
What about the stable branch?
Setting Target Milestone from 2.9 to 2.8. This is a crasher and should be fixed in the stable branch. Too many duplicates anyway.
*** Bug 375576 has been marked as a duplicate of this bug. ***
*** Bug 378293 has been marked as a duplicate of this bug. ***
*** Bug 381372 has been marked as a duplicate of this bug. ***
*** Bug 382912 has been marked as a duplicate of this bug. ***
*** Bug 385203 has been marked as a duplicate of this bug. ***
*** Bug 385067 has been marked as a duplicate of this bug. ***
*** Bug 385108 has been marked as a duplicate of this bug. ***
*** Bug 390596 has been marked as a duplicate of this bug. ***
*** Bug 392380 has been marked as a duplicate of this bug. ***
*** Bug 385053 has been marked as a duplicate of this bug. ***
So, can we close this then? If it was fixed in 2.8.latest I mean.
*** Bug 414854 has been marked as a duplicate of this bug. ***
*** Bug 421929 has been marked as a duplicate of this bug. ***
*** Bug 431226 has been marked as a duplicate of this bug. ***
*** Bug 430755 has been marked as a duplicate of this bug. ***
*** Bug 432946 has been marked as a duplicate of this bug. ***
*** Bug 435233 has been marked as a duplicate of this bug. ***
*** Bug 440718 has been marked as a duplicate of this bug. ***
*** Bug 442811 has been marked as a duplicate of this bug. ***
*** Bug 444388 has been marked as a duplicate of this bug. ***
*** Bug 446411 has been marked as a duplicate of this bug. ***
bug 444388 comes from GNOME 2.18.
*** Bug 457604 has been marked as a duplicate of this bug. ***
*** Bug 464646 has been marked as a duplicate of this bug. ***
*** Bug 429108 has been marked as a duplicate of this bug. ***
Is this still an issue?
Looks like the flood of duplicates has calmed down ;-) Harish fixed a bug like this: let's close this as fixed & start aggregating new bugs elsewhere?
(In reply to comment #80)
> Harish fixed a bug like this: lets close this as fixed & start aggregating new
> bugs elsewhere ?
sounds good :-)