GNOME Bugzilla – Bug 544998
Evo hangs every so often, BT shows a g-kr read on top
Last modified: 2009-03-05 03:52:35 UTC
E-D-S 2.23.6 (trunk) and E-D-S 2.23.4, g-kr 2.23.5 with fix for bug 502603, Ubuntu Intrepid 8.10 Alpha3.
+ Trace 203921
I will grab a stacktrace from g-kr-daemon next time.
Thanks, yes a stacktrace from gnome-keyring-daemon is necessary. FWIW, UI programs should generally use asynchronous gnome-keyring functions rather than the synchronous ones (which evolution is doing in the stack trace above). This allows them to keep the UI interactive during a gnome-keyring-daemon prompt or some such.
Point taken, although Evolution's password management is already asynchronous, not to mention complex and brittle. I hesitate to pile another asynchronous layer on top. There's a case to be made that our whole password management design needs to be rethought, but my previous attempts have failed. That code just hates me.
Anyway, wanted to mention that a lot of Evolution users also report seeing this filling up their /var/log/messages during the hang:
Jun 15 09:11:55 localhost gnome-keyring-daemon[2607]: couldn't read 4 bytes from client:
Also, the hang has been reported on both 32 and 64-bit architectures.
and another comment -- these last few days I have not experienced any hangs. Before that, I would get at least one hang every day. BUT -- I *did* change my Evo account setup, raising the interval between POP mail checks from 5 to 10 minutes (I was wondering if this could be playing a role...). I will reduce it back to 5 minutes (or even lower), and try again.
The erroneous warning (i.e. 'couldn't read 4 bytes from client') has been fixed in recent versions of gnome-keyring. See bug #511285.
OK, got it again. Evo BT:
+ Trace 204608
and g-kr-daemon's bt: (gdb) thread apply all bt full
+ Trace 204609
Thread 2 (Thread 0x413b6950 (LWP 26573))
Thanks for the hard work in getting the stack trace. This is going to be a tough one to figure out. But I'll give it a shot. How often can you duplicate this problem?
Stef, usually once per day, when I had email checking every 5 minutes or so. Right now, with a longer interval, it is once per week or so. I will lower the interval again, and we will see what happens. If you would like me to do something specific, please tell me.
*** Bug 548845 has been marked as a duplicate of this bug. ***
Created attachment 117170: Patch to try
Does this patch fix the problem you're experiencing?
The patch does not apply successfully on 2.23.90, but applies successfully on trunk (revision 1260). Trunk then fails to build, but the failure is in gp11, so it has nothing to do with the patch:
libtool: compile: gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I.. -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -pthread -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -Wall -Wchar-subscripts -Wmissing-declarations -Wmissing-prototypes -Wnested-externs -Wpointer-arith -Wcast-align -Wsign-compare -Werror -g -O2 -Wno-strict-aliasing -Wno-sign-compare -MT gp11-call.lo -MD -MP -MF .deps/gp11-call.Tpo -c gp11-call.c -fPIC -DPIC -o .libs/gp11-call.o
cc1: warnings being treated as errors
gp11-call.c: In function ‘_gp11_call_sync’:
gp11-call.c:355: error: format not a string literal and no format arguments
gp11-call.c: In function ‘_gp11_call_basic_finish’:
gp11-call.c:444: error: format not a string literal and no format arguments
make[4]: *** [gp11-call.lo] Error 1
make[4]: Leaving directory `/usr/src/buildd/gnome-keyring-svn/gp11'
make[3]: *** [all-recursive] Error 1
make[3]: Leaving directory `/usr/src/buildd/gnome-keyring-svn/gp11'
make[2]: *** [all] Error 2
make[2]: Leaving directory `/usr/src/buildd/gnome-keyring-svn/gp11'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/usr/src/buildd/gnome-keyring-svn'
make: *** [all] Error 2
Base system is Ubuntu Alpha4, Gnome 2.23.90. I am unsure whether there are special instructions for building g-kr, and I could not find a site that covers it. Or, perhaps, I should not apply the patch to trunk?
Thanks for letting me know. I've fixed that build problem on trunk. Although I didn't get those build errors, the problems they were pointing to were legitimate.
Stef, some more:
cc1: warnings being treated as errors
gkr-crypto.c: In function ‘fatal_handler’:
gkr-crypto.c:63: error: format not a string literal and no format arguments
gkr-crypto.c: In function ‘gkr_crypto_sexp_dump’:
gkr-crypto.c:808: error: format not a string literal and no format arguments
make[3]: *** [libgkr_common_la-gkr-crypto.lo] Error 1
make[3]: Leaving directory `/usr/src/buildd/gnome-keyring-svn/common'
make[2]: *** [all-recursive] Error 1
Cool, your compiler found some possible bugs. What compiler are you using? Compile errors fixed:
2008-08-27 Stef Walter <stef@memberwebs.com>
* common/gkr-crypto.c: Fix build problems with string formats.
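For readers who have not met the "format not a string literal and no format arguments" diagnostic: gcc emits it when a printf-style function receives a non-literal string as its format, so any '%' in that string would be parsed as a conversion. The usual remedy is to pass a literal "%s" and supply the string as an argument. The sketch below is only an illustration of that pattern; `format_message` is a hypothetical helper and is not taken from the gnome-keyring sources or from the committed fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not from gnome-keyring): copy a message into buf
 * without letting its contents be interpreted as a format string.
 * Returns what snprintf returns: the length the full message needs. */
static int format_message(char *buf, size_t len, const char *msg)
{
    assert(buf != NULL && msg != NULL);

    /* The rejected form would be:
     *     snprintf(buf, len, msg);
     * where a '%' inside msg is treated as a conversion specifier. */
    return snprintf(buf, len, "%s", msg);
}
```

Calling `format_message(buf, sizeof buf, "100%d done")` leaves the text "100%d done" in `buf` unchanged, whereas the rejected form would try to read an int argument that was never passed.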
gcc 4.3.1, Ubuntu Intrepid. Will try it again :-)
Yet another:
cc1: warnings being treated as errors
gkr-wakeup.c: In function ‘gkr_wakeup_now’:
gkr-wakeup.c:93: error: ignoring return value of ‘write’, declared with attribute warn_unused_result
make[3]: *** [libgkr_common_la-gkr-wakeup.lo] Error 1
make[3]: Leaving directory `/usr/src/buildd/gnome-keyring-svn/common'
Notice the first line. This is a make from SVN, so the compiler options are as SVN has them set. I simply took out -Werror from the Makefile under ./common, and I was then able to build g-kr. Testing now.
That's a silly warning from gcc. I've committed a hopeful fix to silence the warning. As far as -Werror .... We use -Werror in SVN checkout versions to catch problems that we wouldn't otherwise find, as the gnome-keyring maintainers obviously don't have every compiler, OS, and distro at their disposal.
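As background on the warn_unused_result diagnostic: with gcc, casting the call to (void) is generally not enough to silence it, so code usually has to store or test the return value. One common way to do that, where the result actually matters, is a small wrapper that retries on EINTR and on short writes. This is a minimal sketch under those assumptions; `write_all` is a hypothetical name and this is not the fix that was committed to gkr-wakeup.c, which may reasonably choose to ignore a failed wakeup write.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not from gnome-keyring): write exactly len bytes
 * to fd, retrying after EINTR and after short writes.
 * Returns 0 on success, -1 on error with errno set by write(). */
static int write_all(int fd, const void *data, size_t len)
{
    const char *p = data;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;          /* real error: report to the caller */
        }
        p += n;                 /* advance past the bytes written */
        len -= (size_t)n;
    }
    return 0;
}
```

A caller that only needs to poke a wakeup pipe could call `write_all(fd, "x", 1)` and deliberately discard a -1 result after checking it, which keeps the compiler satisfied without changing behaviour.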
I have been running your patch for a while, and have not had another lock. I would guess this is a go... of course, this is a deadlock, and it might just be I did not hit the right spot so far. But, given that I would be hit by it almost every day, not having had it for about 2 weeks is a good indication. Thank you.
*** Bug 558934 has been marked as a duplicate of this bug. ***
Great, let's close this.