GNOME Bugzilla – Bug 712570
Hang in g_threaded_socket_service_func
Last modified: 2014-12-07 06:56:48 UTC
gvfs creates and frees a lot of GDBusServers. Each of these has a GThreadedSocketService which makes use of a threadpool. I noticed that sometimes threads and fds are leaked: 37 Thread 0x7fd873b96700 (LWP 21849) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 36 Thread 0x7fd871ea4700 (LWP 21887) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 35 Thread 0x7fd8716a3700 (LWP 21896) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 34 Thread 0x7fd870ea2700 (LWP 21899) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 33 Thread 0x7fd85bfff700 (LWP 21998) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 32 Thread 0x7fd85b7fe700 (LWP 22002) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 31 Thread 0x7fd85affd700 (LWP 22008) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 30 Thread 0x7fd85a7fc700 (LWP 22011) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 29 Thread 0x7fd859ffb700 (LWP 22017) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 28 Thread 0x7fd8597fa700 (LWP 22025) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 27 Thread 0x7fd858ff9700 (LWP 22234) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 26 Thread 0x7fd837fff700 (LWP 22306) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 25 Thread 0x7fd8377fe700 (LWP 22362) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 24 Thread 0x7fd836ffd700 (LWP 22381) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 23 Thread 0x7fd8367fc700 (LWP 22454) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 22 Thread 0x7fd835ffb700 (LWP 22728) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 21 Thread 0x7fd8357fa700 (LWP 22731) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 20 Thread 0x7fd834ff9700 (LWP 22765) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 19 Thread 0x7fd817fff700 (LWP 22772) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 18 Thread 0x7fd8177fe700 (LWP 22914) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 17 Thread 0x7fd816ffd700 (LWP 22919) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 16 Thread 0x7fd8167fc700 (LWP 22922) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 15 Thread 0x7fd815ffb700 (LWP 22928) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 14 Thread 0x7fd8157fa700 (LWP 22931) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 13 Thread 0x7fd814ff9700 (LWP 23279) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 12 Thread 0x7fd7f7fff700 (LWP 23372) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 11 Thread 0x7fd7f77fe700 (LWP 23491) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 10 Thread 0x7fd7f6ffd700 (LWP 23527) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 9 Thread 0x7fd7f67fc700 (LWP 23593) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 8 Thread 0x7fd7f5ffb700 (LWP 23598) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 7 Thread 0x7fd7f57fa700 (LWP 24004) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 6 Thread 0x7fd7f4ff9700 (LWP 24062) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 5 Thread 0x7fd7e7fff700 (LWP 24068) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 4 Thread 0x7fd7e77fe700 (LWP 24168) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 3 Thread 0x7fd7e6ffd700 (LWP 24280) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 2 Thread 0x7fd7e67fc700 (LWP 24311) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 A backtrace of one of the threads:
+ Trace 232787
From what I can determine, g_threaded_socket_service_func() calls "g_object_unref (data->service);" which, if it is the last reference, causes g_threaded_socket_service_finalize() to be called which hangs because it waits for all tasks to be finished, but it is called from a task... I'm not sure how best to fix this.
I just noticed this again, attaching to gvfsd-trash: (gdb) info threads Id Target Id Frame 11 Thread 0x7f06aeff2700 (LWP 1470) "gdbus" 0x00007f06b74bc9dd in poll () at ../sysdeps/unix/syscall-template.S:81 10 Thread 0x7f06adf8e700 (LWP 1472) "gmain" 0x00007f06b74bc9dd in poll () at ../sysdeps/unix/syscall-template.S:81 9 Thread 0x7f06acf8c700 (LWP 1474) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 8 Thread 0x7f069ffff700 (LWP 1475) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 7 Thread 0x7f06ad78d700 (LWP 1621) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 6 Thread 0x7f069f7fe700 (LWP 1627) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 5 Thread 0x7f069effd700 (LWP 1760) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 4 Thread 0x7f069e7fc700 (LWP 1766) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 3 Thread 0x7f069dffb700 (LWP 1776) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 * 2 Thread 0x7f069d7fa700 (LWP 1782) "pool" pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 1 Thread 0x7f06b8f5b840 (LWP 1469) "gvfsd-trash" 0x00007f06b74bc9dd in poll () at ../sysdeps/unix/syscall-template.S:81 There are 8 hung threads. (gdb) thread apply 2 3 4 bt
+ Trace 233717
Created attachment 289551 [details] [review] gio: Prevent hang when finalizing GThreadedSocketService If all users of a GThreadedSocketService release their references to the service while a connection thread is running, the thread function will release the last reference to the service which causes the finalize to deadlock waiting for all threads to finish (because it's called from the thread function). To fix this, don't wait for all threads to finish in the service's finalize method. Since the threads hold a reference to the service, finalize should only be called when all threads are finished running (or have unrefed the service and are about to finish).
Attached is a reproducer: $ gcc -Wall -g -o test test.c `pkg-config --cflags --libs glib-2.0 gio-2.0` $ ./test In another terminal, run: $ nc localhost 8888 Then wait 30 seconds and attach to the process with gdb. (gdb) bt
+ Trace 234269
Thread 2 (Thread 0x7f1419f79700 (LWP 16367))
Created attachment 289553 [details] program to reproduce the problem
Pushed your patch, along with a test case based on the reproducer. Thanks Attachment 289551 [details] pushed as 9a6e01e - gio: Prevent hang when finalizing GThreadedSocketService
Thanks!
I can make this testcase hang if I run it in a loop (initially detected by the Ubuntu CI machines where it seems more frequent; 3/6 runs of the installed-tests). bt goes:
+ Trace 234382
Thread 2 (Thread 0x7f0a4dd65700 (LWP 27257))
Created attachment 291849 [details] [review] gio/tests: Prevent hangs and aborts in socket-listener Fix two problems: 1) If g_socket_service_stop is called before the accept call is requeued, then the reference count won't decrease and this code will hang forever: while (G_OBJECT (service)->ref_count == ref_count) g_main_context_iteration (NULL, TRUE); 2) Sometimes the testcase fails (maybe 1 in 200 times for me): GLib-GIO:ERROR:socket-listener.c:73:connection_cb: assertion failed (G_OBJECT (service)->ref_count == 2): (3 == 2) Aborted (core dumped) The problem is that depending on ordering, cancellation of the async listener can require further main context iterations before it releases the reference on the socket service. Furthermore, in some cases, it requires at least one iteration.
I have tested this for 20000 iterations without a hang or abort.
Pushed to master as 7d9816934e6a233369a2df1282a49d1549d9fa0f. Please reopen if the test case hangs or fails.