GNOME Bugzilla – Bug 646928
Unreachable assert reached in gkr_operation_block_and_unref
Last modified: 2019-02-22 11:46:10 UTC
In Ubuntu, we are seeing a crash in gkr_operation_block_and_unref() on line 409, the g_assert_not_reached() call. https://launchpad.net/bugs/743497 I'm going to spend a little time looking at it, but it seems the sort of thing a maintainer would be able to spot quicker. Here's a bit of the relevant stacktrace:
+ Trace 226615
Did you find anything? In neither version 2.32.0 or version 3.0.0 does line 409 of gkr-operation.c have an assertion. So obviously a maintainer would wonder if perhaps it's custom changes that have introduced the problem. I removed a spurious assertion during the development of 3.0.0: commit 84103a1d29e66ede966874e4af4b17706605cd48 Author: Stef Walter <stefw@collabora.co.uk> Date: Thu Mar 10 17:12:23 2011 +0100 This assertion is vulnerable to corner conditions. In particular if the operation completes one pending call, unrefs it and creates another pending call, and the memory allocator decides to allocate that same memory for the next pending call, then this will assert in a bogus way. https://bugzilla.redhat.com/show_bug.cgi?id=660407 So I have no idea whether the bug you're reporting involves this commit or not. In order for me to help, please at a minumum include reference the git branch for your version of the libgnome-keyring. Much better would be for you to try and reproduce this problem on libgnome-keyring-3.0.0 or libgnome-keyring master.
BTW, seb128 and I worked on this a bit, he said that different assertions occur with or without the patch I referred to above. In any case it's still hard for me to know what branch you're working with. I'd like to see your branch, thanks!
Sorry, but there is only one g_assert_not_reached call in that file; I didn't mean to be ambiguous. Ubuntu is using 2.32.0, and the only source patch is the backported patch you reference above, removing an assert. You can browse the code here: http://bazaar.launchpad.net/~ubuntu-branches/ubuntu/natty/libgnome-keyring/natty/files The assert line I'm talking about is in gkr_operation_block_and_unref(): while ((int) gkr_operation_get_result (op) == INCOMPLETE) { if (op->pending) { ... } else if (op->prompting) { ... } else { g_assert_not_reached (); } } I can't reliably reproduce even with this version, so not sure that testing 3.0 or master would help. The bug reports center around attempts to mount ssh servers via gvfsd, but that may be a coincidence.
(In reply to comment #3) > Sorry, but there is only one g_assert_not_reached call in that file; I didn't > mean to be ambiguous. Aha, okay that helps a little. > Ubuntu is using 2.32.0, and the only source patch is the backported patch you > reference above, removing an assert. You can browse the code here: > http://bazaar.launchpad.net/~ubuntu-branches/ubuntu/natty/libgnome-keyring/natty/files BTW, the code at that link above does not match the stack trace. Line 409 of gkr-operation.c does not have an assertion and is not even in the same function. I'm not trying to be testy here, it's just that with these heisenbugs there's already enough strange variables without throwing more in. > The assert line I'm talking about is in gkr_operation_block_and_unref(): > > while ((int) gkr_operation_get_result (op) == INCOMPLETE) { > if (op->pending) { > ... > } else if (op->prompting) { > ... > } else { > g_assert_not_reached (); > } > } Do you have any full stack traces with all the threads included? As we seem to be invoking strange behavior from libdbus when used in a secondary thread, so stack traces of the other threads would be helpful.
> BTW, the code at that link above does not match the stack trace. Line 409 of > gkr-operation.c does not have an assertion and is not even in the same > function. Whoops, my bad. I linked you to an out-of-date mirror of the package. I'll get that mirror updated, but the code is just 2.32.0 with http://git.gnome.org/browse/libgnome-keyring/commit/?id=84103a1d29e66ede966874e4af4b17706605cd48 applied. > Do you have any full stack traces with all the threads included? As we seem to > be invoking strange behavior from libdbus when used in a secondary thread, so > stack traces of the other threads would be helpful. Yup, from the Launchpad bug I linked: https://launchpadlibrarian.net/67492254/ThreadStacktrace.txt
Created attachment 186332 [details] [review] Better fix for libdbus threading race problem Could you test whether this patch works for you? Even if it doesn't end up fixing the problem, it does fix a corner case, and I'd like to include it. I think it's either part or all of the solution to this problem. commit 4a6151ddaa966f7d797a550835390e404ce7636b Author: Stef Walter <stefw@collabora.co.uk> Date: Wed Apr 20 06:41:14 2011 +0200 Better fix for dbus threading race condition. * Making sure that op->pending has changed is not fool proof because the memory allocator could have allocated a new pending which has the same address. https://bugzilla.gnome.org/show_bug.cgi?id=646928
This didn't fix the issue for me. I was however, able to find a routine that consistently crashed for me. Just running gvfs-mount ssh://NAME@URL gvfs-mount ssh://NAME@URL -u a few times in a row got the crash, almost always within 5 iterations. (And I had a password for NAME@URL in my keyring.)
Woot, I can reproduce :)
Created attachment 186456 [details] [review] Refactor sync/async dbus request paths. Here's a patch which fixes the problem. This completely refactors the way that blocking requests are done. We expect less good behavior of dbus, and don't have it use callbacks on blocking requests. This patch should apply to the gnome-3-0 branch. Please let me know if there's any problems with it. If I don't hear back, I'll merge it into libgnome-keyring-3.0.1
Seems to work for me! Thanks.
Merged into gnome-3-0 branch. This will be included in libgnome-keyring-3.0.1