Bug 640683 - Mutex deadlock on unsubscribe
Status: RESOLVED INCOMPLETE
Product: dconf
Classification: Core
Component: gsettings backend
Version: 0.7.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: dconf-maint
QA Contact: dconf-maint
Depends on:
Blocks:
Reported: 2011-01-27 02:02 UTC by Michael Terry
Modified: 2011-05-09 13:34 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Backtraces set 1 (9.74 KB, text/plain)
2011-01-30 22:23 UTC, Michael Terry
Backtraces set 2 (5.67 KB, text/plain)
2011-01-30 22:23 UTC, Michael Terry

Description Michael Terry 2011-01-27 02:02:51 UTC
Another problem I hit when using my Deja Dup test suite.  It freezes my app seemingly with a mutex deadlock.  Looks like maybe another threading issue?

  • #0 __lll_lock_wait
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S line 136
  • #1 _L_lock_999
    from /lib/libpthread.so.0
  • #2 __pthread_mutex_lock
    at pthread_mutex_lock.c line 61
  • #3 g_dbus_connection_send_message_with_reply
    at /build/buildd/glib2.0-2.27.92/gio/gdbusconnection.c line 1851
  • #4 g_dbus_connection_call
    at /build/buildd/glib2.0-2.27.92/gio/gdbusconnection.c line 5106
  • #5 dconf_settings_backend_send
    at dconfsettingsbackend.c line 164
  • #6 dconf_settings_backend_unsubscribe
    at dconfsettingsbackend.c line 587
  • #7 g_settings_finalize
    at /build/buildd/glib2.0-2.27.92/gio/gsettings.c line 510
  • #8 g_object_unref
    at /build/buildd/glib2.0-2.27.92/gobject/gobject.c line 2734

Comment 1 Michael Terry 2011-01-27 03:20:28 UTC
Ah, interesting.  It's not a permanent deadlock.  But it lasts for a few seconds, long enough to make the system think the app is unresponsive.
Comment 2 Allison Karlitskaya (desrt) 2011-01-27 15:52:54 UTC
I think this is a GDBus problem, but I'm not 100% sure.

Can you get a trace of what's going on in the other threads while this lock is waiting?
Comment 3 Michael Terry 2011-01-30 22:23:11 UTC
Created attachment 179655
Backtraces set 1

Here's one set of backtraces.  I have another from a separate instance of the same deadlock behavior that I will attach shortly.

Is this the easiest format for consumption?  Would core dumps be useful if you don't have my trunk-compiled copy of deja-dup that I was using for testing?
Comment 4 Michael Terry 2011-01-30 22:23:30 UTC
Created attachment 179656
Backtraces set 2
Comment 5 David Zeuthen (not reading bugmail) 2011-02-01 18:05:14 UTC
Hmm, could you try these two things:

 1. Run your app under valgrind (to check for heap corruption)

 2. Run your app with accessibility turned off

and see if you can reproduce. Thanks!
Comment 6 Michael Terry 2011-02-01 18:34:26 UTC
Heap corruption is certainly possible.  Difficult to test with accessibility turned off though (as it's intermittent and thus most easily reproduced by running my test suite which does all sorts of things driven by the a11y layer).
Comment 7 Michael Terry 2011-02-09 04:04:39 UTC
OK, after some more testing, I'm very confident this is a real bug in dconf or gdbus, but not my app.

=== Begin long story ===
My app does not use GSettings very carefully.  It has a lot of widgets that each have their own GSettings object and will write back to GSettings whenever they change, or set themselves from GSettings when the 'changed' signal is emitted for their key.  Basically a homebrewed binding system like the one GSettings already has.

But notably, these widgets don't make an attempt to avoid setting a value if that same value is in GSettings already.  So, imagine a widget starts up, asks GSettings for a value, sets it, which emits a changed signal, and the widget pays attention to that as well.  This doesn't actually cause any harm, it's just inefficient, so I haven't bothered to fix it yet.

Anyway, this deadlock bug seemed to only be getting hit when a preference dialog with a particularly complicated widget was being futzed with.  So I suspected that all the reads and writes going on were causing some issues.

I told my app to use a tiny wrapper of GSettings that avoided setting values when that value was already in the backend and I no longer got this bug.
=== End long story ===

Thus, my app isn't at fault (besides abusing the GSettings API) but does cause lots of simultaneous reads and writes.  Again, this is not a permanent deadlock, but the UI does freeze for a few seconds (long enough for it to be greyed out by the window manager).

Note that this is with 0.7.2.  I don't have a simple test case unfortunately.  Do the stacktraces not help narrow this down?
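
The "tiny wrapper" from the long story is not shown in this report, so the following is only a minimal sketch of the idea, assuming a hypothetical in-memory store (FakeSettings) in place of a real GSettings object. Every name in it is made up for illustration. It shows why the guard matters: an unconditional write always reaches the backend, while the guarded write is skipped when the stored value is already equal.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a settings backend: one string value whose
 * writes are counted. In the real app each write would go through the
 * dconf backend and be followed by a 'changed' signal. */
typedef struct {
    char value[64];
    int  writes;   /* number of writes that reached the "backend" */
} FakeSettings;

/* Unconditional write, like the original widgets: always hits the backend. */
static void fake_settings_set(FakeSettings *s, const char *value)
{
    assert(s != NULL && value != NULL);
    strncpy(s->value, value, sizeof s->value - 1);
    s->value[sizeof s->value - 1] = '\0';
    s->writes++;
}

/* Guarded write: skip the backend when the stored value is already equal.
 * Returns true if a write was actually issued. */
static bool fake_settings_set_if_changed(FakeSettings *s, const char *value)
{
    assert(s != NULL && value != NULL);
    if (strcmp(s->value, value) == 0)
        return false;
    fake_settings_set(s, value);
    return true;
}
```

With the real API, the same guard could presumably be built by comparing the result of g_settings_get_value() against the new GVariant with g_variant_equal() before calling g_settings_set_value(); whether the wrapper in Deja Dup did exactly that is an assumption.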
Comment 8 Allison Karlitskaya (desrt) 2011-02-09 17:06:21 UTC
I see a child watch in the backtrace here, so I should mention bug #398418.  I'm not sure if it's relevant though, since GMainLoop running in a multi-threaded process should be OK.
Comment 9 Allison Karlitskaya (desrt) 2011-02-09 17:21:23 UTC
(In reply to comment #4)
> Created an attachment (id=179656)
> Backtraces set 2

This one really looks like a heap corruption problem.  The deadlock occurs because the main thread is attempting to acquire a mainloop that is having a source added to it in another thread.  That thread is stuck in this loop:

      while (tmp_source && tmp_source->priority <= source->priority)
        {
          last_source = tmp_source;
          tmp_source = tmp_source->next;
        }

which is a simple linked list traversal that apparently never reaches its end.
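
To illustrate how a traversal like the one quoted above can fail to terminate, here is a small sketch assuming a hypothetical two-field Node type (GLib's real GSource has many more fields). If heap corruption makes some next pointer refer back to an earlier node, the list has no NULL end and the loop spins for as long as the priorities keep matching. The helpers below are debugging aids written for this example, not GLib code: one detects a cycle with Floyd's tortoise-and-hare method, the other runs the same kind of loop with a step limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal node, for illustration only. */
typedef struct Node {
    int          priority;
    struct Node *next;
} Node;

/* Floyd's tortoise-and-hare check: returns true if following 'next'
 * from head ever revisits a node, i.e. the list never ends. */
static bool list_has_cycle(const Node *head)
{
    const Node *slow = head;
    const Node *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return true;
    }
    return false;
}

/* Same shape as the quoted loop, but with a step limit so a corrupted
 * list is reported instead of spinning forever. Returns the number of
 * nodes skipped, or -1 if max_steps was exceeded. */
static int skip_lower_or_equal_priority(const Node *head, int priority,
                                        int max_steps)
{
    int steps = 0;
    const Node *tmp = head;

    assert(max_steps >= 0);
    while (tmp != NULL && tmp->priority <= priority) {
        if (steps++ >= max_steps)
            return -1;
        tmp = tmp->next;
    }
    return steps;
}
```

On a healthy three-node list the bounded loop returns 3. Once the last node's next pointer is pointed back at the first, the cycle check returns true and the bounded loop gives up with -1, where the unbounded version would hang.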

(In reply to comment #3)
> Created an attachment (id=179655)
> Backtraces set 1

This one is just plain weird.  There's absolutely nothing in the backtrace indicating why this lock should already be held.  The only other mention of the main context in that trace is that it's dispatching, but it releases its own lock before doing so.  Again, heap corruption crosses my mind...
Comment 10 Allison Karlitskaya (desrt) 2011-05-08 16:17:02 UTC
Hi Michael,

Did you see this again?
Comment 11 Michael Terry 2011-05-09 11:52:22 UTC
No.  I changed the app to be more careful about how it uses the GSettings API and I haven't had the problem since.  I could test again on the previous code if it would help...
Comment 12 Allison Karlitskaya (desrt) 2011-05-09 13:34:20 UTC
I'm not sure that this was ever a bug in glib or dconf, and I'm actually fairly sure it wasn't in dconf in any case, so I will close this for now.