GNOME Bugzilla – Bug 732615
Error sending large DBus message with file descriptors
Last modified: 2018-05-24 16:43:44 UTC
Created attachment 279743 [details] Code to reproduce Big messages which do not fit into single socket send call are not handled properly if accompanied by file descriptors and result in disconnection with inappropriate error reported: g_dbus_connection_real_closed: Remote peer vanished with error: Tried sending a file descriptor on unsupported stream of type GSocketOutputStream (g-io-error-quark, 0). Exiting. This is how the top of the backtrace would look like: #1 https://git.gnome.org/browse/glib/tree/gio/gdbusprivate.c?id=2.41.1#n1091 #2 https://git.gnome.org/browse/glib/tree/gio/gdbusprivate.c?id=2.41.1#n1083 This is because on the recursive call the (data->total_written == 0) condition [1] cannot [2] be met. Attached is a minimal example to reproduce this. Instructions inside. [1] https://git.gnome.org/browse/glib/tree/gio/gdbusprivate.c?id=2.41.1#n1008 [2] https://git.gnome.org/browse/glib/tree/gio/gdbusprivate.c?id=2.41.1#n1070
*** Bug 732796 has been marked as a duplicate of this bug. ***
Reproduced with this: gcc gdbus-fd-error.c `pkg-config glib-2.0 gio-2.0 gio-unix-2.0 --cflags --libs` pvanhoof@lenny:~/Downloads$ cat /usr/lib/i386-linux-gnu/pkgconfig/gio-unix-2.0.pc prefix=/usr exec_prefix=${prefix} libdir=${prefix}/lib/i386-linux-gnu includedir=${prefix}/include Name: GIO unix specific APIs Description: unix specific headers for glib I/O library Version: 2.40.0 Requires: gobject-2.0,gio-2.0 Libs: -L${libdir} -lgio-2.0 Cflags: -I${includedir}/gio-unix-2.0/ pvanhoof@lenny:~/Downloads$ pvanhoof@lenny:~/Downloads$ gdb ./a.out GNU gdb (GDB) 7.6.2 (Debian 7.6.2-1.1+b1) Copyright (C) 2013 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html> This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details. This GDB was configured as "i486-linux-gnu". For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>... Reading symbols from /home/pvanhoof/Downloads/a.out...(no debugging symbols found)...done. (gdb) run Starting program: /home/pvanhoof/Downloads/./a.out warning: Could not load shared library symbols for linux-gate.so.1. Do you need "set solib-search-path" or "set sysroot"? [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/i386-linux-gnu/i686/cmov/libthread_db.so.1". [New Thread 0xb7a47b40 (LWP 6869)] [New Thread 0xb70ffb40 (LWP 6870)] g_dbus_connection_real_closed: Remote peer vanished with error: Tried sending a file descriptor on unsupported stream of type GSocketOutputStream (g-io-error-quark, 0). Exiting. Program received signal SIGTERM, Terminated. 0xb7fde424 in __kernel_vsyscall () (gdb) bt
+ Trace 233914
ps. Source of the example describes that you need to do this a few times from another terminal: dbus-send --print-reply --dest=org.example.FdErrorTest /org/example/FdErrorTest org.example.FdErrorTest.Test int32:$((1024*200)) |sed 's/#\{1024\}/./g'
Some additional useful information is that on_worker_closed gets called in a thread right before that g_dbus_connection_real_closed happens on the mainloop's context.
+ Trace 233915
Thread 3071277888 (LWP 7038)
Continuing. g_dbus_connection_real_closed: Remote peer vanished with error: Tried sending a file descriptor on unsupported stream of type GSocketOutputStream (g-io-error-quark, 0). Exiting. Program received signal SIGTERM, Terminated. [Switching to Thread 0xb7a48780 (LWP 7036)] 0xb7fde424 in __kernel_vsyscall () (gdb)
So this means that we reach that TODO: handle in this situation. I wonder what the TODO means. git blame shows me David, but David is assigned so he receives this without having to put David in CC ;-) 8a3a4596 (David Zeuthen 2010-08-16 13:43:35 -0400 1340) /* TODO: handle */ static void write_message_cb (GObject *source_object, GAsyncResult *res, gpointer user_data) { MessageToWriteData *data = user_data; GError *error; g_mutex_lock (&data->worker->write_lock); g_assert (data->worker->output_pending == PENDING_WRITE); data->worker->output_pending = PENDING_NONE; error = NULL; if (!write_message_finish (res, &error)) { g_mutex_unlock (&data->worker->write_lock); /* TODO: handle */ _g_dbus_worker_emit_disconnected (data->worker, TRUE, error); g_error_free (error); g_mutex_lock (&data->worker->write_lock); } message_written_unlocked (data->worker, data); g_mutex_unlock (&data->worker->write_lock); continue_writing (data->worker); message_to_write_data_free (data); }
In the version that im debugging right now, line 1027 is this line: if (!(data->worker->capabilities & G_DBUS_CAPABILITY_FLAGS_UNIX_FD_PASSING)) of the function write_message_continue_writing Breakpoint 10, write_message_continue_writing (data=data@entry=0x8058498) at /build/glib2.0-f_gKLq/glib2.0-2.40.0/./gio/gdbusprivate.c:1027 1027 /build/glib2.0-f_gKLq/glib2.0-2.40.0/./gio/gdbusprivate.c: No such file or directory. (gdb) bt
+ Trace 233916
$4 = (MessageToWriteData *) 0x8058498 (gdb) p *data $5 = {worker = 0xb7108cf0, message = 0x8057c60, blob = 0xb688b008 "l\002\001\001\005 \003", blob_size = 204861, total_written = 0, simple = 0xb6702840} (gdb) p *data->worker $6 = {ref_count = 4, shared_thread_data = 0xb7109378, stopped = 0, frozen = 0, capabilities = G_DBUS_CAPABILITY_FLAGS_UNIX_FD_PASSING, received_messages_while_frozen = 0x80520c0, stream = 0xb7101060, cancellable = 0x8054950, message_received_callback = 0xb7f02370 <on_worker_message_received>, message_about_to_be_sent_callback = 0xb7eff990 <on_worker_message_about_to_be_sent>, disconnected_callback = 0xb7eff740 <on_worker_closed>, user_data = 0x8056848, socket = 0xb71060d0, read_lock = { p = 0xb710a650, i = {3071321680, 0}}, read_buffer = 0xb6701200 "l\001", read_buffer_allocated_size = 4096, read_buffer_cur_size = 0, read_buffer_bytes_wanted = 16, read_fd_list = 0x0, read_ancillary_messages = 0x0, read_num_ancillary_messages = 0, output_pending = PENDING_WRITE, write_lock = {p = 0xb7108ea8, i = {3071315624, 0}}, write_queue = 0x80520d0, write_num_messages_written = 2, write_num_messages_flushed = 0, write_pending_flushes = 0x0, pending_close_attempts = 0x0, close_expected = 0} (gdb) p *data->worker->capabilities Attempt to take contents of a non-pointer value. (gdb) p data->worker->capabilities $7 = G_DBUS_CAPABILITY_FLAGS_UNIX_FD_PASSING (gdb)
Sorry, last comment was a false positive, because that line is still handled without error. But I can't break in the actual error case :-( $7 = G_DBUS_CAPABILITY_FLAGS_UNIX_FD_PASSING (gdb) n 1037 in /build/glib2.0-f_gKLq/glib2.0-2.40.0/./gio/gdbusprivate.c (gdb) b Breakpoint 11 at 0xb7f16368: file /build/glib2.0-f_gKLq/glib2.0-2.40.0/./gio/gdbusprivate.c, line 1037. (gdb) d Delete all breakpoints? (y or n) y (gdb) b gdbusprivate.c:1028 Breakpoint 12 at 0xb7f161a2: file /build/glib2.0-f_gKLq/glib2.0-2.40.0/./gio/gdbusprivate.c, line 1028. (gdb) c Continuing. g_dbus_connection_real_closed: Remote peer vanished with error: Tried sending a file descriptor on unsupported stream of type GSocketOutputStream (g-io-error-quark, 0). Exiting. Program received signal SIGTERM, Terminated. [Switching to Thread 0xb7a48780 (LWP 7288)] 0xb7fde424 in __kernel_vsyscall () (gdb) b gdbusprivate.c:1029 Note: breakpoint 12 also set at pc 0xb7f161a2. Breakpoint 13 at 0xb7f161a2: file /build/glib2.0-f_gKLq/glib2.0-2.40.0/./gio/gdbusprivate.c, line 1029. (gdb) b gdbusprivate.c:1030 Breakpoint 14 at 0xb7f161c5: file /build/glib2.0-f_gKLq/glib2.0-2.40.0/./gio/gdbusprivate.c, line 1030. (gdb) b gdbusprivate.c:1031 Note: breakpoint 14 also set at pc 0xb7f161c5. Breakpoint 15 at 0xb7f161c5: file /build/glib2.0-f_gKLq/glib2.0-2.40.0/./gio/gdbusprivate.c, line 1031. (gdb) b gdbusprivate.c:1033 Note: breakpoints 14 and 15 also set at pc 0xb7f161c5. Breakpoint 16 at 0xb7f161c5: file /build/glib2.0-f_gKLq/glib2.0-2.40.0/./gio/gdbusprivate.c, line 1033. (gdb) b gdbusprivate.c:1034 Breakpoint 17 at 0xb7f161cd: file /build/glib2.0-f_gKLq/glib2.0-2.40.0/./gio/gdbusprivate.c, line 1034. (gdb) r The program being debugged has been started already. Start it from the beginning? (y or n) y Starting program: /home/pvanhoof/Downloads/./a.out warning: Could not load shared library symbols for linux-gate.so.1. Do you need "set solib-search-path" or "set sysroot"? [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/i386-linux-gnu/i686/cmov/libthread_db.so.1". [New Thread 0xb7a47b40 (LWP 7302)] [New Thread 0xb70ffb40 (LWP 7303)] g_dbus_connection_real_closed: Remote peer vanished with error: Tried sending a file descriptor on unsupported stream of type GSocketOutputStream (g-io-error-quark, 0). Exiting. Program received signal SIGTERM, Terminated. 0xb7fde424 in __kernel_vsyscall () (gdb)
Ok, no more being stupid and here is the more simple version of my debugging session that avoids that gdb can miss lines because of compiler optimizations or somethings. Simple breakpoint on g_simple_async_result_set_error :-) Program received signal SIGTERM, Terminated. 0xb7fde424 in __kernel_vsyscall () (gdb) b g_simple_async_result_set_error Breakpoint 18 at 0xb7ea1ee0: file /build/glib2.0-f_gKLq/glib2.0-2.40.0/./gio/gsimpleasyncresult.c, line 716. (gdb) r The program being debugged has been started already. Start it from the beginning? (y or n) y Starting program: /home/pvanhoof/Downloads/./a.out warning: Could not load shared library symbols for linux-gate.so.1. Do you need "set solib-search-path" or "set sysroot"? [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/i386-linux-gnu/i686/cmov/libthread_db.so.1". [New Thread 0xb7a47b40 (LWP 7308)] [New Thread 0xb70ffb40 (LWP 7309)]
+ Trace 233917
Thread 3071277888 (LWP 7309)
$9 = (MessageToWriteData *) 0x8058498 (gdb) print *data $10 = {worker = 0xb7108cf0, message = 0x8057c60, blob = 0xb688b008 "l\002\001\001\005 \003", blob_size = 204861, total_written = 183360, simple = 0xb6702840} (gdb) print *data->worker $11 = {ref_count = 4, shared_thread_data = 0xb7109378, stopped = 0, frozen = 0, capabilities = G_DBUS_CAPABILITY_FLAGS_UNIX_FD_PASSING, received_messages_while_frozen = 0x80520c0, stream = 0xb7101060, cancellable = 0x8054950, message_received_callback = 0xb7f02370 <on_worker_message_received>, message_about_to_be_sent_callback = 0xb7eff990 <on_worker_message_about_to_be_sent>, disconnected_callback = 0xb7eff740 <on_worker_closed>, user_data = 0x8056848, socket = 0xb71060d0, read_lock = { p = 0xb710a650, i = {3071321680, 0}}, read_buffer = 0xb6701200 "l\001", read_buffer_allocated_size = 4096, read_buffer_cur_size = 0, read_buffer_bytes_wanted = 16, read_fd_list = 0x0, read_ancillary_messages = 0x0, read_num_ancillary_messages = 0, output_pending = PENDING_WRITE, write_lock = {p = 0xb7108ea8, i = {3071315624, 0}}, write_queue = 0x80520d0, write_num_messages_written = 2, write_num_messages_flushed = 0, write_pending_flushes = 0x0, pending_close_attempts = 0x0, close_expected = 0} (gdb) print data->worker->capability There is no member named capability. (gdb) print data->worker->capabilities $12 = G_DBUS_CAPABILITY_FLAGS_UNIX_FD_PASSING (gdb)
This? --- a/gio/gdbusprivate.c +++ b/gio/gdbusprivate.c @@ -1024,7 +1024,7 @@ control_message = NULL; if (fd_list != NULL && g_unix_fd_list_get_length (fd_list) > 0) { - if (!(data->worker->capabilities & G_DBUS_CAPABILITY_FLAGS_UNIX_FD_PASSING)) + if (data->worker->capabilities & ~G_DBUS_CAPABILITY_FLAGS_UNIX_FD_PASSING) { g_simple_async_result_set_error (simple, G_IO_ERROR,
ps. With Comment 9 applied I can still reproduce but at a different location: [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/i386-linux-gnu/i686/cmov/libthread_db.so.1". [New Thread 0xb7a48b40 (LWP 17212)] [New Thread 0xb70ffb40 (LWP 17213)] [Thread 0xb7a48b40 (LWP 17212) exited]
+ Trace 233918
Thread 3071277888 (LWP 17213)
$18 = (MessageToWriteData *) 0x8058498 (gdb) print *data $19 = {worker = 0xb7108cf0, message = 0xb6703e30, blob = 0xb688b008 "l\002\001\001\005 \003", blob_size = 204861, total_written = 183360, simple = 0xb7101418} (gdb) list 1092 else 1093 { 1094 #ifdef G_OS_UNIX 1095 if (fd_list != NULL) 1096 { 1097 g_simple_async_result_set_error (simple, 1098 G_IO_ERROR, 1099 G_IO_ERROR_FAILED, 1100 "Tried sending a file descriptor on unsupported stream of type %s", 1101 g_type_name (G_TYPE_FROM_INSTANCE (ostream))); (gdb) print fd_list $20 = (GUnixFDList *) 0xb6705638 (gdb) print *fd_list $21 = {parent_instance = {g_type_instance = {g_class = 0x8058610}, ref_count = 1, qdata = 0x0}, priv = 0xb6705630} (gdb) print *fd_list->priv $22 = {fds = 0x804fc28, nfd = 1} (gdb) print *fd_list->priv->fds $23 = 10 (gdb)
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/897.