GNOME Bugzilla – Bug 556458
crash in IA__g_file_load_partial_contents_finish (with X forwarding or "su")
Last modified: 2009-04-20 10:09:35 UTC
Steps to reproduce: 1. ssh -Y otheruser@localhost 2. run gedit in the remote session 3. use file/open -> crash Stack trace: (firefox:15463): GLib-GIO-CRITICAL **: g_simple_async_result_set_from_error: assertion `error != NULL' failed (firefox:15463): GLib-CRITICAL **: g_error_free: assertion `error != NULL' failed (firefox:15463): GLib-GIO-CRITICAL **: g_simple_async_result_set_from_error: assertion `error != NULL' failed (firefox:15463): GLib-GIO-WARNING **: (gfile.c:5249):IA__g_file_load_partial_contents_finish: runtime check failed: (g_simple_async_result_get_source_tag (simple) == g_file_load_contents_async) Program received signal SIGSEGV, Segmentation fault. IA__g_file_load_partial_contents_finish (file=<value optimized out>, res=<value optimized out>, contents=<value optimized out>, length=<value optimized out>, etag_out=<value optimized out>, error=<value optimized out>) at gfile.c:5253 5253 if (data->error) Current language: auto; currently c Missing separate debuginfos, use: debuginfo-install alsa-lib.i386 bzip2.i386 e2fsprogs.i386 freetype.i386 gail.i386 gtk-nodoka-engine.i386 gtk2.i386 hunspell.i386 lcms.i386 libSM.i386 libXScrnSaver.i386 libcroco.i386 libgsf.i386 libpng.i386 librsvg2.i386 libselinux.i386 openssl.i686 pango.i386 popt.i386 sqlite.i386 (gdb) bt 20
+ Trace 208229
Other information: Seen on Fedora 9. Appears to be a regression in 2.6.16 Reverting to 2.6.15 fixes the bug for me. Other bug reports at: https://bugzilla.redhat.com/show_bug.cgi?id=466687 https://bugzilla.redhat.com/show_bug.cgi?id=458586 https://bugzilla.mozilla.org/show_bug.cgi?id=459909 Full stack (from firefox) at: https://bugzilla.mozilla.org/attachment.cgi?id=343098
Looks to me like g_file_load_partial_contents_finish and load_contents_open_callback have some disagreement over the exact nature of the GSimpleAsyncResult in the error case. g_file_load_partial_contents_finish tries to get the LoadContentsData by calling g_simple_async_result_get_op_res_gpointer, but load_contents_open_callback doesn't call g_simple_async_result_set_op_res_gpointer.
I'll also point out that in the error case, the g_warn_if_fail (g_simple_async_result_get_source_tag (simple) == g_file_load_contents_async); always triggers.
The real problem is that g_file_read_finish returns NULL without setting error
The reproduction instructions don't give me a crash, neither with gedit nor with firefox. Can you reproduce it under gdb and break on g_logv, to get a stacktrace from where the first critical warning is emitted: firefox:15463): GLib-GIO-CRITICAL **: g_simple_async_result_set_from_error: assertion `error != NULL' failed ?
To my surprise I'm no longer able to reproduce it. Yesterday I was always able to reproduce it, I crashed many times. Then I went back to the older glib2 release (2.6.15) and the crash was fixed for me. Some hours later I suddenly started to crash again, still having 2.6.15 installing. And eventually this was combined with a kernel oops. I decided to reboot. Then it worked again. When I saw mclasen's request to debug, I installed the newer version 2.6.16. But I'm no longer able to reproduce with that version either. So the cause of the problem might be somewhere else. I'll continue to try to reproduce. As soon as I'm able to, I'll trace as requested in comment 4.
Luckily the crash came back, attaching requested information. I'll try to keep this gdb session open if you have more questions.
Created attachment 120764 [details] first g_logv hit, stack
I decided to look around myself, hopefully it's helpful. going up the stack... 459 void 460 g_simple_async_result_set_from_error (GSimpleAsyncResult *simple, 461 GError *error) 462 { 463 g_return_if_fail (G_IS_SIMPLE_ASYNC_RESULT (simple)); 464 g_return_if_fail (error != NULL); (gdb) print simple $3 = (GSimpleAsyncResult *) 0x9412380 (gdb) print *simple $4 = {parent_instance = {g_type_instance = {g_class = 0x9382940}, ref_count = 1, qdata = 0x0}, source_object = 0x936d8b0, callback = 0x3bb320 <load_contents_open_callback>, user_data = 0x936c908, error = 0x0, failed = 0, handle_cancellation = 1, source_tag = 0x3b8160, op_res = {v_pointer = 0x0, v_boolean = 0, v_ssize = 0}, destroy_op_res = 0} up one level more open_read_async_thread ... 4033 stream = iface->read_fn (G_FILE (object), cancellable, &error); 4034 (gdb) 4035 if (stream == NULL) 4036 { 4037 g_simple_async_result_set_from_error (res, error); 4038 g_error_free (error); 4039 } 4040 else (gdb) print *object $11 = {g_type_instance = {g_class = 0x9362760}, ref_count = 3, qdata = 0x0} gdb) print cancellable $9 = (GCancellable *) 0x0 (gdb) print res $12 = (GSimpleAsyncResult *) 0x9412380 (gdb) print *res $13 = {parent_instance = {g_type_instance = {g_class = 0x9382940}, ref_count = 1, qdata = 0x0}, source_object = 0x936d8b0, callback = 0x3bb320 <load_contents_open_callback>, user_data = 0x936c908, error = 0x0, failed = 0, handle_cancellation = 1, source_tag = 0x3b8160, op_res = {v_pointer = 0x0, v_boolean = 0, v_ssize = 0}, destroy_op_res = 0} Obviously iface->read_fn() returned NULL. You expect that it should have set "error", but error is NULL.
After the first two assertions were printed, I allowed gdb to continue, and arrived at at the 3rd assertion, now switched to thread 1, and the number of threads has increased: (gdb) info thr 7 Thread 0xb31fbb90 (LWP 14810) 0x00110416 in __kernel_vsyscall () 5 Thread 0xb45fdb90 (LWP 14778) 0x00110416 in __kernel_vsyscall () 4 Thread 0xb4ffeb90 (LWP 14721) 0x003abfbc in g_object_unref@plt () from /lib/libgio-2.0.so.0 3 Thread 0xb59ffb90 (LWP 14720) 0x00110416 in __kernel_vsyscall () 2 Thread 0xb656eb90 (LWP 14719) 0x00110416 in __kernel_vsyscall () * 1 Thread 0xb80ac760 (LWP 14716) IA__g_logv (log_domain=0x3f2952 "GLib-GIO", log_level=G_LOG_LEVEL_CRITICAL, format=0x1f00bf "%s: assertion `%s' failed", args1=0xbfbe07ac "\204\231?") at gmessages.c:395 (gdb) bt
+ Trace 208288
I had to restart the process, because of a mistake in gdb, so the following variables no longer match the previous information in this bug. Used "fin" a couple of times, now at: (gdb) info thr 4 Thread 0xb4ffeb90 (LWP 14862) 0x00110416 in __kernel_vsyscall () * 1 Thread 0xb80a0760 (LWP 14855) 0x003bb3df in load_contents_open_callback (obj=0x8f6eab0, open_res=0xb5a00b80, user_data=0x8e90d20) at gfile.c:5155 5155 res = g_simple_async_result_new_from_error (G_OBJECT (data->file), 5156 data->callback, 5157 data->user_data, 5158 error); 5159 g_simple_async_result_complete (res); 5160 g_error_free (error); 5161 load_contents_data_free (data); 5162 g_object_unref (res); 5163 } 5164 } (gdb) print res $2 = <value optimized out> (gdb) print data $3 = (struct here_cg_arc_record *) 0x0 (oops, that doesn't seem true, hmm, but maybe the function deleted data somehow. we're now after function execution) trying to step over g_simple_async_result_complete(res) the breakpoint gets hit again:
+ Trace 208290
5244 simple = G_SIMPLE_ASYNC_RESULT (res); (gdb) 5245 5246 if (g_simple_async_result_propagate_error (simple, error)) 5247 return FALSE; 5248 5249 g_warn_if_fail (g_simple_async_result_get_source_tag (simple) == g_file_load_contents_async); 5250 5251 data = g_simple_async_result_get_op_res_gpointer (simple); 5252 5253 if (data->error) line 5249 produced the warning, going to 5253 (gdb) print data $7 = (LoadContentsData *) 0x0 dereferencing data->error, crash
new comment at https://bugzilla.redhat.com/show_bug.cgi?id=458586#9 suggests this crash does not depend on using remote X forwarding, but can occur in other scenarios.
No idea whether the following is a good idea, but I've applied this patch as a workaround, and it allows me to continue my work with crashing, despite the assertions. diff -up ./gio/gfile.c.556458 ./gio/gfile.c --- ./gio/gfile.c.556458 2008-10-17 06:00:13.000000000 +0200 +++ ./gio/gfile.c 2008-10-17 06:03:09.000000000 +0200 @@ -5250,10 +5250,13 @@ g_file_load_partial_contents_finish (GFi data = g_simple_async_result_get_op_res_gpointer (simple); - if (data->error) + if (!data || data->error) { - g_propagate_error (error, data->error); - data->error = NULL; + if (data) + { + g_propagate_error (error, data->error); + data->error = NULL; + } *contents = NULL; if (length) *length = 0;
In addition to using remote X forwarding, the same stack is seen by users who change to a different user account (to root) using "su", see https://bugzilla.redhat.com/show_bug.cgi?id=458586 Updating summary.
I've been running a glib2 build with the patch from comment 12 since the day I posted it, and so far I haven't noticed any side effects.
there are quite some bluefish users that reported this with the 1.3 series of bluefish (which uses asynchronous GIO calls). Situation for which this is reported: Debian 5.0 (Lenny) with Xfce as desktop, open a http or ftp URL Ubuntu 8.04 with ssh forwarding, open a http or ftp URL Fedora 9 with a Xfce desktop, open a http or ftp URL All use glib-2.16 and in all situations a remote URL was opened and there was no gnome desktop in the same X11 session. The issue is not present in Debian unstable (Sid) or Ubuntu 8.10, both use glib-2.18. Both Debian 5.0 and Ubuntu 8.04 will be supported for one or two years, so I guess the Debian and Ubuntu people will be very happy if this bug gets fixed. To reproduce the issue I wrote a very short program, I'll attach the source.
Created attachment 129555 [details] Testcase for async file loading
b.t.w. what I gather from the bluefish-dev mailinglist is that in all of the situations where you get a segfault with asynchronous GIO the synchronous GIO calls report 'Operation not supported'. for example $ gvfs-cat http://www.nu.nl/ gvfs-cat: http://www.nu.nl/: error opening file: Operation not supported this is also different behavior from glib-2.18 on which gvfs-cat from Xfce or from a ssh -X session just works.
it seems that in all crash situations there is no session dbus available. I tried on Debian Lenny, if I manually run dbus-launch and export the DBUS_SESSION_BUS_ADDRESS variable, my testcase code works fine.
The core fail is iface->read_fn() returning NULL but not setting an error in open_read_async_thread. It would be interesting to know what iface->read_fn is. Is this a local file or a remote file?
As written in comment #15 this always happens with remote files.
Oliver: Thats not really how the initial report looks though. It seems to say just opening the file selector crashes.
Perhaps there are two different bugs then? But the have GLIB and GIO warnings are identical..?
No, the crashes seem identical. I mean, it may be that just opening the file selector accesses a remote bookmark or something, but It would be nice to get this confirmed in gdb so we know for sure.
I have a very similar looking problem in Ubuntu 8.04.2 (hardy). sudo gdmsetup gdmsetup[6262]: GLib-GIO-CRITICAL: g_simple_async_result_set_from_error: assertion `error != NULL' failed gdmsetup[6262]: GLib-CRITICAL: g_error_free: assertion `error != NULL' failed gdmsetup[6262]: GLib-GIO-CRITICAL: g_simple_async_result_set_from_error: assertion `error != NULL' failed gdmsetup[6262]: GLib-GIO-WARNING: (/build/buildd/glib2.0-2.16.6/gio/gfile.c:5249):IA__g_file_load_partial_contents_finish: runtime check failed: (g_simple_async_result_get_source_tag (simple) == g_file_load_contents_async) Segmentation fault gdb gives [...] Program received signal SIGSEGV, Segmentation fault.
+ Trace 214616
Thread 3073894080 (LWP 6204)
Is this likely to be the same bug? If there is further information I can provide please let me know. Raphael
(In reply to comment #24) > I have a very similar looking problem in Ubuntu 8.04.2 (hardy). > Is this likely to be the same bug? Can't say without a useful stacktrace with symbols. Please install debug packages.
I installed the debug symbols which, in case anyone else needs to do this, involved installing gvfs-dbgsym and libglib2.0-0-dbg . I now get run Starting program: /usr/sbin/gdmsetup [Thread debugging using libthread_db enabled] [New Thread 0xb735c6c0 (LWP 7927)] gdmsetup[7927]: GLib-GIO-CRITICAL: g_simple_async_result_set_from_error: assertion `error != NULL' failed gdmsetup[7927]: GLib-CRITICAL: g_error_free: assertion `error != NULL' failed gdmsetup[7927]: GLib-GIO-CRITICAL: g_simple_async_result_set_from_error: assertion `error != NULL' failed gdmsetup[7927]: GLib-GIO-WARNING: (/build/buildd/glib2.0-2.16.6/gio/gfile.c:5249):IA__g_file_load_partial_contents_finish: runtime check failed: (g_simple_async_result_get_source_tag (simple) == g_file_load_contents_async) Program received signal SIGSEGV, Segmentation fault.
+ Trace 214629
Thread 3073754816 (LWP 7927)
Thread 1 (Thread 0xb735c6c0 (LWP 7927))
At that point its a bit late. The interesting thing is why we get this error: gdmsetup[7927]: GLib-GIO-CRITICAL: g_simple_async_result_set_from_error: assertion `error != NULL' failed Can you put a breakpoint in g_log and re-run. That should cause a break on the first critical. Something is returning a "it failed" return value, but not setting the GError, and we need to figure out what call this is and on what error condition. Getting that breakpoint would get you one step closer to this.
I am sadly lacking gdb knowledge but here is my best attempt (thanks to seb128 at #ubuntu-devel for his/her patience). b g_log Function "g_log" not defined. Make breakpoint pending on future shared library load? (y or [n]) y Breakpoint 1 (g_log) pending. (gdb) run Starting program: /usr/sbin/gdmsetup [Thread debugging using libthread_db enabled] [New Thread 0xb73436c0 (LWP 8040)]
+ Trace 214630
Thread 3073652416 (LWP 8040)
Turns out this is fixed in this commit: 2008-07-28 Matthias Clasen <mclasen@redhat.com> Bug 545203 – gfile.c: argument is different type. * gfile.c (open_read_async_thread): Pass a GError ** to g_file_set_error_literal. Reported by Kazuki Iwamoto *** This bug has been marked as a duplicate of 545203 ***