GNOME Bugzilla – Bug 640683
Mutex deadlock on unsubscribe
Last modified: 2011-05-09 13:34:20 UTC
Another problem I hit when using my Deja Dup test suite: it freezes my app, seemingly with a mutex deadlock. Looks like maybe another threading issue?
Trace 225713
Ah, interesting. It's not a permanent deadlock. But it lasts for a few seconds, enough to make the system think the app is unresponsive.
I think this is a GDBus problem, but I'm not 100% sure. Can you get a trace of what's going on in the other threads while this lock is waiting?
Created attachment 179655: Backtraces set 1
Here's one set of backtraces. I have another from a separate instance of the same deadlock behavior that I will attach shortly. Is this the easiest format for consumption? Would core dumps be useful if you don't have my trunk-compiled copy of deja-dup that I was using for testing?
Created attachment 179656: Backtraces set 2
Hmm, is it possible you can try these two things:
1. Run your app under valgrind (to check for heap corruption).
2. Run your app with accessibility turned off and see if you can reproduce.
Thanks!
Heap corruption is certainly possible. Difficult to test with accessibility turned off though (as it's intermittent and thus most easily reproduced by running my test suite which does all sorts of things driven by the a11y layer).
OK, after some more testing, I'm very confident this is a real bug in dconf or gdbus, but not my app.

=== Begin long story ===

My app does not use GSettings very carefully. It has a lot of widgets that each have their own GSettings object and will write back to GSettings whenever they change or set themselves from GSettings when the 'changed' signal is emitted for their key. Basically a homebrewed binding system like GSettings already has.

But notably, these widgets don't make an attempt to avoid setting a value if that same value is in GSettings already. So, imagine a widget starts up, asks GSettings for a value, sets it, which emits a changed signal, and the widget pays attention to that as well. This doesn't actually cause any harm, it's just inefficient, so I haven't bothered to fix it yet.

Anyway, this deadlock bug seemed to only be getting hit when a preference dialog with a particularly complicated widget was being futzed with. So I suspected that all the reads and writes going on were causing some issues. I told my app to use a tiny wrapper of GSettings that avoided setting values when that value was already in the backend and I no longer got this bug.

=== End long story ===

Thus, my app isn't at fault (besides abusing the GSettings API) but does cause lots of simultaneous reads and writes. Again, this is not a permanent deadlock, but the UI does freeze for a few seconds (long enough for it to be greyed out by the window manager).

Note that this is with 0.7.2. I don't have a simple test case unfortunately. Do the stacktraces not help narrow this down?
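For readers who want to see the shape of the "tiny wrapper" described in the comment above, here is a minimal sketch in plain C. It is not the code from deja-dup: `FakeSettings` and the `fake_settings_*` functions are hypothetical stand-ins for a settings backend, used so the example has no GLib dependency. With real GSettings, the same guard could presumably be written by comparing `g_settings_get_value()` against the new value with `g_variant_equal()` before calling `g_settings_set_value()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a settings backend: a small key/value table
 * that counts how many "changed" notifications it would have emitted. */
#define MAX_KEYS 8
#define MAX_LEN 64

typedef struct {
    char keys[MAX_KEYS][MAX_LEN];
    char values[MAX_KEYS][MAX_LEN];
    size_t n_keys;
    unsigned changed_emissions; /* incremented on every backend write */
} FakeSettings;

/* Return the stored value for key, or NULL if the key is unset. */
const char *fake_settings_get(const FakeSettings *s, const char *key)
{
    for (size_t i = 0; i < s->n_keys; i++)
        if (strcmp(s->keys[i], key) == 0)
            return s->values[i];
    return NULL;
}

/* Unconditional write: always touches the backend and always notifies.
 * Returns 0 on success, -1 if the table is full or a string is too long. */
int fake_settings_set(FakeSettings *s, const char *key, const char *value)
{
    size_t i;

    assert(s != NULL && key != NULL && value != NULL);
    if (strlen(key) >= MAX_LEN || strlen(value) >= MAX_LEN)
        return -1;

    for (i = 0; i < s->n_keys; i++)
        if (strcmp(s->keys[i], key) == 0)
            break;
    if (i == s->n_keys) {
        if (s->n_keys == MAX_KEYS)
            return -1;
        strcpy(s->keys[i], key);
        s->n_keys++;
    }
    strcpy(s->values[i], value);
    s->changed_emissions++;
    return 0;
}

/* Guarded write: skip the backend entirely when the value is unchanged.
 * Returns 1 if a write happened, 0 if it was skipped, -1 on error. */
int fake_settings_set_if_changed(FakeSettings *s, const char *key,
                                 const char *value)
{
    const char *current = fake_settings_get(s, key);

    if (current != NULL && strcmp(current, value) == 0)
        return 0;
    return fake_settings_set(s, key, value) == 0 ? 1 : -1;
}
```

A widget that reads a key and immediately writes the same value back through `fake_settings_set_if_changed()` produces no write and no notification, which is the read/write traffic the wrapper was meant to cut down.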
I see a child watch in the backtrace here, so I should mention bug #398418. I'm not sure if it's relevant though, since GMainLoop running in a multi-threaded process should be OK.
(In reply to comment #4)
> Created an attachment (id=179656) [details]
> Backtraces set 2

This one really looks like a heap corruption problem. The deadlock occurs because the main thread is attempting to acquire a mainloop that is having a source added to it in another thread. That thread is stuck in this loop:

    while (tmp_source && tmp_source->priority <= source->priority)
      {
        last_source = tmp_source;
        tmp_source = tmp_source->next;
      }

which is a simple linked list traversal that apparently never reaches its end.

(In reply to comment #3)
> Created an attachment (id=179655) [details]
> Backtraces set 1

This one is just plain weird. There's absolutely nothing in the backtrace indicating why this lock should already be held. The only other mention of the main context in that trace is that it's dispatching, but it releases its own lock before doing so. Again, heap corruption crosses my mind...
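To illustrate why the traversal quoted in the reply to comment #4 can spin when the list is damaged, here is a small self-contained sketch. It assumes one particular kind of corruption: a `next` pointer that leads back into the list, which would keep the walk going indefinitely whenever every node on the cycle satisfies the priority test. `Node`, `list_has_cycle` and `find_insert_point` are hypothetical names for this example. This is not GLib's `GSource` layout or its actual code, and the step bound is a debugging aid rather than anything GLib does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified source node; not GLib's real GSource. */
typedef struct Node {
    int priority;
    struct Node *next;
} Node;

/* Floyd's tortoise-and-hare check: returns 1 if following next pointers
 * from head ever revisits a node, 0 if the walk reaches NULL. */
int list_has_cycle(const Node *head)
{
    const Node *slow = head;
    const Node *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return 1;
    }
    return 0;
}

/* Same walk as the quoted loop, but bounded by max_steps so a corrupted
 * list cannot hang the caller. Returns the last node whose priority is
 * <= priority (NULL means "insert at the head"). Sets *gave_up to 1 and
 * returns NULL if the bound was hit before the walk ended. */
const Node *find_insert_point(const Node *head, int priority,
                              size_t max_steps, int *gave_up)
{
    const Node *last = NULL;
    const Node *tmp = head;
    size_t steps = 0;

    assert(gave_up != NULL);
    *gave_up = 0;

    while (tmp != NULL && tmp->priority <= priority) {
        if (steps++ == max_steps) {
            *gave_up = 1;
            return NULL;
        }
        last = tmp;
        tmp = tmp->next;
    }
    return last;
}
```

On a healthy list the bounded walk behaves like the original loop. Once a cycle exists and the new source's priority is at least as large as every priority on that cycle, the unbounded version never exits, which would match a thread that holds the context lock for as long as it keeps spinning.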
Hi Michael, did you see this again?
No. I changed the app to be more careful about how it uses the GSettings API and I haven't had the problem since. I could test again on the previous code if it would help...
I'm not sure that this was ever a bug in glib or dconf, and I'm actually fairly sure it wasn't in dconf in any case, so I will close this for now.