GNOME Bugzilla – Bug 753378
Crashes in the GIO kqueue backend
Last modified: 2015-12-23 14:42:15 UTC
firefox, thunderbird, leafpad and libre-office-writer crash when I try to save. Pcmanfm, nautilus and thunar don't start anymore. It is discussed here (someone found a temporary workaround): https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=202128
(re-titling for clarity)
I forgot to mention that it happens after updating the following packages to: glib-2.44.1 cairo-1.14.2,2 gobject-introspection-1.44.0 atk-2.16.0 gdk-pixbuf2-2.31.5 adwaita-icon-theme-3.16.2.1 libsoup-gnome-2.50.0 gtk3-3.16.6 at-spi2-atk-2.16.0 at-spi2-core-2.16.0
It would be useful to have a stacktrace here
I have never worked much with a debugger. You will have to tell me how to do this.
https://wiki.gnome.org/Community/GettingInTouch/Bugzilla/GettingTraces/DistroSpecificInstructions#FreeBSD You can add WITH_DEBUG=yes to /etc/make.conf, rebuild glib from ports, and re-run the program in gdb. Type 'r' in gdb to start the program and type 'bt' to get the stacktrace after it crashes. I cannot reproduce the problem on my machine, so I cannot provide a backtrace for you.
Thanks, I had found these commands, but I only got a trace with leafpad and pcmanfm: (gdb) bt
+ Trace 235339
Unfortunately, that is not enough of a stacktrace to say much. Can you try 't a a bt' (short for 'thread apply all bt') instead of 'bt' when you are in gdb? That will produce a stacktrace with all threads.
Here it is (with leafpad):
(no debugging symbols found)...[New LWP 100461]
[New Thread 808806400 (LWP 100461/leafpad)]
^C[New Thread 808a1cc00 (LWP 101228/leafpad)]
Program received signal SIGINT, Interrupt.
[Switching to Thread 808a1cc00 (LWP 101228/leafpad)]
0x000000080359ee6a in _kevent () from /lib/libc.so.7
(gdb) t a a bt
[New Thread 808a64c00 (LWP 101232/leafpad)]
+ Trace 235354
Thread 2 (Thread 808806400 (LWP 100461/leafpad))
Hi! I've already been looking at this bug in the context of the Debian glib2.0 package for GNU/kFreeBSD: https://bugs.debian.org/bug=734290#17
As best as I could tell: "This was introduced or exposed by: https://git.gnome.org/browse/glib/commit/?id=548c165a9f8386af29e8bb8243d8923e0f315c2e"
I'm not so sure my analysis is correct: "[g_unix_mount_monitor_get] waits there for the constructor to finish (effective_state->1). But inside the constructor, it sets up a file monitor on /etc/fstab, and that hangs waiting for the_volume_monitor_mutex already held somewhere."
Debugging it should be much easier on actual FreeBSD, since Debian GNU/kFreeBSD's gdb wasn't able to see all threads.
I'd recommend someone running real FreeBSD to try this to see if it helps: "commenting out that mutex in gio/gunionvolumemonitor.c and it seem to not hang any more"
    _g_mount_get_for_mount_path (const gchar  *mount_path,
                                 GCancellable *cancellable)
    {
      ...
      if (klass->get_mount_for_mount_path)
        {
          // g_rec_mutex_lock (&the_volume_monitor_mutex);
          mount = klass->get_mount_for_mount_path (mount_path, cancellable);
          // g_rec_mutex_unlock (&the_volume_monitor_mutex);
        }
Thanks!
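For anyone trying to picture the hang described above, here is a minimal sketch of that kind of lock-order inversion in plain pthreads. It is not GLib code and it is not a confirmed reconstruction of the bug: `monitor_mutex` only stands in for the_volume_monitor_mutex, `init_lock` stands in for "the constructor has not finished yet", and `worker` and `count_blocked_threads` are hypothetical names made up for this illustration. It uses pthread_mutex_trylock() so that it reports the would-be deadlock instead of actually hanging.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins, not the real GLib objects. */
static pthread_mutex_t monitor_mutex = PTHREAD_MUTEX_INITIALIZER; /* "the_volume_monitor_mutex" */
static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;     /* "constructor still running" */

static pthread_barrier_t both_hold_first;   /* both threads own their first lock  */
static pthread_barrier_t both_tried_second; /* both threads tried the second lock */

struct step {
    pthread_mutex_t *first;   /* lock taken first and kept */
    pthread_mutex_t *second;  /* lock wanted while still holding 'first' */
    int second_result;        /* 0 on success, EBUSY if someone else holds it */
};

static void *worker(void *arg)
{
    struct step *s = arg;

    int rc = pthread_mutex_lock(s->first);
    assert(rc == 0);
    (void) rc;

    /* Wait until the other thread also owns its first lock. */
    pthread_barrier_wait(&both_hold_first);

    /* A blocking pthread_mutex_lock() here would wait forever. */
    s->second_result = pthread_mutex_trylock(s->second);

    /* Keep 'first' locked until the other thread has tried as well. */
    pthread_barrier_wait(&both_tried_second);

    if (s->second_result == 0)
        pthread_mutex_unlock(s->second);
    pthread_mutex_unlock(s->first);
    return NULL;
}

/* Returns how many of the two threads would have blocked forever. */
int count_blocked_threads(void)
{
    /* "Caller": holds the monitor mutex, then waits for the constructor. */
    struct step caller = { &monitor_mutex, &init_lock, 0 };
    /* "Constructor": is mid-initialization, then wants the monitor mutex. */
    struct step constructor = { &init_lock, &monitor_mutex, 0 };
    pthread_t a, b;

    pthread_barrier_init(&both_hold_first, NULL, 2);
    pthread_barrier_init(&both_tried_second, NULL, 2);

    pthread_create(&a, NULL, worker, &caller);
    pthread_create(&b, NULL, worker, &constructor);
    pthread_join(a, NULL);
    pthread_join(b, NULL);

    pthread_barrier_destroy(&both_hold_first);
    pthread_barrier_destroy(&both_tried_second);

    return (caller.second_result == EBUSY) + (constructor.second_result == EBUSY);
}
```

Calling count_blocked_threads() from a small main() and building with -pthread should show both threads failing to get their second lock. Note that making the mutex recursive would not help in a situation like this, because the lock is held by a different thread, not re-entered by the same one.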
Seems to work on FreeBSD-10.2 amd64.
If you're using glib20 port revision 2.44.1_1, it has disabled kqueue to avoid hitting the bug, but as far as I know the problem is still there: http://www.freshbsd.org/commit/freebsd-ports/r393663
I did not forget to comment out the line in glib20/Makefile to re-enable kqueue. It works here with kqueue re-enabled (tested with Thunderbird, Firefox and Leafpad; pcmanfm now starts normally).
Hi Walter, just to make sure I understand you... Did you comment out the mutex as I mentioned in #9 - or is it working now unmodified with FreeBSD 10.2 amd64?
I commented out the "workaround" in glib20/Makefile and commented out the mutex as in #9 above.
(In reply to Steven Chamberlain from comment #9)
> I'd recommend someone running real FreeBSD to try this to see if it helps:
Thanks, it works for me too:
- FreeBSD 10.2 stable amd64
- glib-2.44.1_1
Without this patch I had a problem with file icons in file managers such as caja/thunar/pcmanfm: I had to manually press F5 to refresh icons. Also, the trash didn't show any icons; now it seems to work OK.
Created attachment 317773
gio: drop obsoleted lock causing deadlocks on FreeBSD
I think it is a recursion from the GUnixMountMonitor constructor, to a GLocalFileMonitor on /etc/fstab, and into GUnixMountMonitor again, now with a mutex already held, so it deadlocks. https://bugzilla.gnome.org/page.cgi?id=traceparser/trace.html&trace_id=235354
That mutex in glocalfile.c:g_local_file_find_enclosing_mount() doesn't seem necessary any more IMHO. Inside it, only 'mount' is modified, but that's just a stack variable local to this function. When klass->get_mount_for_mount_path is called, it's given one const parameter and the other is unused, so they're unchanged. 'klass' doesn't seem like it could be modified inside that function either.
It doesn't recurse infinitely, but seems to work correctly and pass the testsuite after this change.
The FreeBSD project already applied my patch in their ports tree, and their users seem happy with it. See https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=712848#64 and
If FreeBSD and Debian kfreebsd are both shipping this patch, then there is no reason for us to wait any longer. Sorry for not noticing this until now.
Attachment 317773 pushed as 42b160b - gio: drop obsoleted lock causing deadlocks on FreeBSD