After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 771912 - Segfault when request sent using soup_session_send_async is cancelled
Segfault when request sent using soup_session_send_async is cancelled
Status: RESOLVED OBSOLETE
Product: libsoup
Classification: Core
Component: HTTP Transport
2.54.x
Other Linux
: Normal major
: ---
Assigned To: libsoup-maint@gnome.bugs
libsoup-maint@gnome.bugs
: 781879 793553 (view as bug list)
Depends on:
Blocks:
 
 
Reported: 2016-09-24 13:16 UTC by Michael Gratton
Modified: 2018-09-21 16:27 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Add NULL check to SoupCache writing that was missing on one code path (1019 bytes, patch)
2017-04-20 19:14 UTC, Sean-Der
reviewed Details | Review
g_queue_pop_head returns NULL even though g_queue_get_length returns 1 (951 bytes, patch)
2017-04-20 19:16 UTC, Sean-Der
needs-work Details | Review

Description Michael Gratton 2016-09-24 13:16:37 UTC
I'm getting frequent crashes when a request sent using soup_session_send_async() has the passed GCancellable actually cancelled, via the main loop.

The session in use is an unremarkable SoupSession that has a small single-user cache.

Representative backtrace:

Thread 1 "geary" received signal SIGSEGV, Segmentation fault.
write_ready_cb (source=<optimised out>, result=<optimised out>, 
    istream=0x642e750 [SoupCacheInputStream]) at soup-cache-input-stream.c:175
175		if (pending) {
(gdb) bt 
  • #0 write_ready_cb
    at soup-cache-input-stream.c line 175
  • #1 g_task_return_now
    at /build/glib2.0-7IO_Yw/glib2.0-2.48.1/./gio/gtask.c line 1107
  • #2 g_task_return
    at /build/glib2.0-7IO_Yw/glib2.0-2.48.1/./gio/gtask.c line 1165
  • #3 async_ready_write_callback_wrapper
    at /build/glib2.0-7IO_Yw/glib2.0-2.48.1/./gio/goutputstream.c line 752
  • #4 g_task_return_now
    at /build/glib2.0-7IO_Yw/glib2.0-2.48.1/./gio/gtask.c line 1107
  • #5 complete_in_idle_cb
    at /build/glib2.0-7IO_Yw/glib2.0-2.48.1/./gio/gtask.c line 1121
  • #6 g_main_context_dispatch
    at /build/glib2.0-7IO_Yw/glib2.0-2.48.1/./glib/gmain.c line 3154
  • #7 g_main_context_dispatch
    at /build/glib2.0-7IO_Yw/glib2.0-2.48.1/./glib/gmain.c line 3769
  • #8 g_main_context_iterate
    at /build/glib2.0-7IO_Yw/glib2.0-2.48.1/./glib/gmain.c line 3840
  • #9 g_main_context_iteration
    at /build/glib2.0-7IO_Yw/glib2.0-2.48.1/./glib/gmain.c line 3901
  • #10 g_application_run
    at /build/glib2.0-7IO_Yw/glib2.0-2.48.1/./gio/gapplication.c line 2381
  • #11 _vala_main
    at /home/mjg/Projects/GNOME/geary/src/client/application/main.vala line 25
  • #12 __libc_start_main
    at ../csu/libc-start.c line 291
  • #13 _start

(gdb) info locals 
ostream = <optimised out>
priv = 0x642e6f0
write_size = 1369
pending = <error reading variable pending (Cannot access memory at address 0x8)>
error = 0x0
Comment 1 Michael Gratton 2016-10-07 04:13:29 UTC
Another crash at the same point in the application gives a slightly different (better?) error and stack trace - libsoup 2.54.1:

libsoup:ERROR:soup-cache-input-stream.c:195:soup_cache_input_stream_write_next_buffer: assertion failed: (priv->output_stream && !g_output_stream_is_closed (priv->output_stream))

Thread 1 "geary" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:58
58	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
  • #0 __GI_raise
    at ../sysdeps/unix/sysv/linux/raise.c line 58
  • #1 __GI_abort
    at abort.c line 89
  • #2 g_assertion_message
  • #3 g_assertion_message_expr
    at ././glib/gtestutils.c line 2452
  • #4 0x00007ffff31aab9e in
  • #5 0x00007ffff31aafdf in
  • #6 g_task_return_now
    at ././gio/gtask.c line 1121
  • #7 g_task_return
    at ././gio/gtask.c line 1179
  • #8 async_ready_write_callback_wrapper
    at ././gio/goutputstream.c line 752
  • #9 g_task_return_now
    at ././gio/gtask.c line 1121
  • #10 complete_in_idle_cb
    at ././gio/gtask.c line 1135
  • #11 g_main_dispatch
    at ././glib/gmain.c line 3201
  • #12 g_main_context_dispatch
    at ././glib/gmain.c line 3854
  • #13 g_main_context_iterate
    at ././glib/gmain.c line 3927
  • #14 g_main_context_iteration
    at ././glib/gmain.c line 3988
  • #15 g_application_run
    at ././gio/gapplication.c line 2381
  • #16 _vala_main
    at /home/mjg/Projects/GNOME/geary/src/client/application/main.vala line 25
  • #17 __libc_start_main
    at ../csu/libc-start.c line 291
  • #18 _start

Comment 2 Federico Bruni 2016-12-07 08:45:16 UTC
Mike, I managed to get a crash of Geary while running gdb.
I'm using libsoup 2.56.0 on Fedora 25.

Thread 1 "geary" received signal SIGSEGV, Segmentation fault.
istream_caching_finished (istream=0x324ab20 [SoupCacheInputStream], bytes_written=495, error=0x0, 
    user_data=0x37098c0) at soup-cache.c:784
784		g_clear_object (&entry->cancellable);

(gdb) bt
  • #0 istream_caching_finished
    at soup-cache.c line 784
  • #1 ffi_call_unix64
    at ../src/x86/unix64.S line 76
  • #2 ffi_call
    at ../src/x86/ffi64.c line 525
  • #3 g_cclosure_marshal_generic_va
    at gclosure.c line 1604
  • #4 _g_closure_invoke_va
    at gclosure.c line 867
  • #5 g_signal_emit_valist
    at gsignal.c line 3300
  • #6 g_signal_emit
    at gsignal.c line 3447
  • #7 notify_and_clear
    at soup-cache-input-stream.c line 55
  • #8 try_write_next_buffer
    at soup-cache-input-stream.c line 70
  • #9 write_ready_cb
    at soup-cache-input-stream.c line 184
  • #10 g_task_return_now
    at gtask.c line 1121
  • #11 g_task_return
    at gtask.c line 1179
  • #12 async_ready_write_callback_wrapper
    at goutputstream.c line 752
  • #13 g_task_return_now
    at gtask.c line 1121
  • #14 complete_in_idle_cb
    at gtask.c line 1135
  • #15 g_idle_dispatch
    at gmain.c line 5545
  • #16 g_main_dispatch
    at gmain.c line 3203
  • #17 g_main_context_dispatch
    at gmain.c line 3856
  • #18 g_main_context_iterate
    at gmain.c line 3929
  • #19 g_main_context_iteration
    at gmain.c line 3990
  • #20 g_application_run
    at gapplication.c line 2381
  • #21 _vala_main
    at /home/fede/src/geary/src/client/application/main.vala line 25
  • #22 __libc_start_main
    at ../csu/libc-start.c line 289
  • #23 _start

Comment 3 Sean-Der 2017-03-28 22:47:42 UTC
Hey Mike/Federico

I hit this same bug as well, and am working on fixing it (GStreamer is the application that is seeing this backtrace)

Using https://github.com/Sean-Der/libsoup do you have fewer SIGSEGV ? I am currently trying to figure out the case where buffer is NULL, but I just noticed a check was missing in one place (and all other callers were performing it)

I am not super familiar with libsoup, so just trying my best to fix this one case until I can get a maintainers attention.

thanks
Comment 4 Sean-Der 2017-03-28 22:48:36 UTC
Also I added an additional assert for `buffer` being NULL, I am still trying to figure out that case and what the proper behavior is.
Comment 5 Federico Bruni 2017-04-14 13:34:59 UTC
Hi Sean

I'd be happy to test your patches if you point me to a tutorial on building libsoup and especially let Geary use this version instead of Fedora system packages installed on my computer.

If it's just about running the usual build combo (./autogen.sh. configure, etc.) and install a new binary in /usr/local, then it's pretty simple. But if it's different, as I suspect, I need some help.
Comment 6 Michael Catanzaro 2017-04-20 15:01:53 UTC
You should attach a patch here so that it appears in the unreviewed patch tracker, otherwise we have no way to track all the patches that are not getting reviews.

(In reply to Federico Bruni from comment #5)
> If it's just about running the usual build combo (./autogen.sh. configure,
> etc.) and install a new binary in /usr/local, then it's pretty simple. But
> if it's different, as I suspect, I need some help.

No it's just the usual. configure, make, make install into /usr/local should work fine.
Comment 7 Sean-Der 2017-04-20 19:14:54 UTC
Created attachment 350162 [details] [review]
Add NULL check to SoupCache writing that was missing on one  code path
Comment 8 Sean-Der 2017-04-20 19:16:09 UTC
Created attachment 350163 [details] [review]
g_queue_pop_head returns NULL even though g_queue_get_length  returns 1

I am not fully confident about this patch, I am worried there is a deeper issue here I don't fully understand.
Comment 9 Michael Gratton 2017-04-23 03:34:02 UTC
(In reply to Sean-Der from comment #3)
> Hey Mike/Federico
> 
> I hit this same bug as well, and am working on fixing it (GStreamer is the
> application that is seeing this backtrace)
> 
> Using https://github.com/Sean-Der/libsoup do you have fewer SIGSEGV ? I am
> currently trying to figure out the case where buffer is NULL, but I just
> noticed a check was missing in one place (and all other callers were
> performing it)
> 
> I am not super familiar with libsoup, so just trying my best to fix this one
> case until I can get a maintainers attention.

Hey, thanks for looking into this, I've had a few stabs but never made any progress. Am rebuilding soup with the two patches above, will let you know if it seems to help over the next few days.
Comment 10 Sean-Der 2017-04-23 03:57:51 UTC
@Federico
Sorry I missed your comment! I really appreciate you testing out patches, if you need any help getting it working I am happy to jump on IRC and walk through stuff.

@Michael
Awesome! I still have other segfaults in the cache writing process, but it is getting better. I have coredumps enabled and just keep fixing as they come.
Comment 11 Federico Bruni 2017-04-24 10:00:16 UTC
I cannot test the patch, as I cannot reproduce the bug anymore. IIRC I used a long thread (181 messages) as crash test. But now it works fine, I cannot trigger the crash.
Comment 12 Michael Gratton 2017-04-27 08:26:24 UTC
Quick update: After running Geary against 2.56.0 with the two patches above applied, I haven't seen any crashes with stacks like the ones I or Federico reported above. I'd normally have expected to see about 1-2 per day.
Comment 13 Sean-Der 2017-04-27 23:05:21 UTC
Thank you so much for testing Federico/Michael!

Federico I wouldn't be surprised if an environment change (or really anything!) changed the issue for you it is super inconsistent for me as well. 

That's awesome to hear about the segfaults, this fixed the issue for me as well. Hopefully a maintainer is able to merge/review these

thanks again
Comment 14 Dan Winship 2017-04-30 18:21:29 UTC
Comment on attachment 350163 [details] [review]
g_queue_pop_head returns NULL even though g_queue_get_length  returns 1

So clearly this shouldn't be happening... is it possible there's a thread-safety issue? If you can still reproduce the bug reliably, adding something like:

  g_print ("soup_cache_input_stream_write_next_buffer %p in %p\n", istream, g_thread_get_self ());

to the top of each function could help figure that out (and if that doesn't turn up any threading problem maybe a few more printfs would)?
Comment 15 Dan Winship 2017-04-30 18:21:53 UTC
Comment on attachment 350162 [details] [review]
Add NULL check to SoupCache writing that was missing on one  code path

SoupCacheInputStream is pretty complicated for such a small file, and it's hard to say whether this is supposed to be necessary or not...

Do you know under what circumstances this gets hit?

If this helps *without also needing the gqueue patch* (which suggests some larger problem exists) then I'd be OK committing it as-is I guess, but it feels like eventually a larger rewrite would be nice (eg, having an explicit "priv->state" field rather than testing so many different variables in so many places). Maybe if you were going to tackle the larger rewrites needed to support sync operation as well that would be something to consider.
Comment 16 Michael Gratton 2017-05-02 04:31:00 UTC
*** Bug 781879 has been marked as a duplicate of this bug. ***
Comment 17 Michael Gratton 2017-05-02 05:00:48 UTC
FWIW, this happens under Geary when displaying an email conversation, due to loading avatar icons from Gravatar for each sender. For each message in the conversation, from the main loop, a call to soup_session_send_async() is made on a common, long-lived SoupSession object that has a SoupCache instance added to it as a feature. After each async call completes, the resulting GInputStream is then passed to a pixbuf loader iff the HTTP request was successful.

The crash has been worked around on Geary master to a greater extent by never cancelling any avatar loads, but it does still occur sometimes when loading large conversations, where messages with the same sender appear more than once and hence same Gravatar HTTP request is sent multiple times.

My theory is that for these larger conversations, since the same avatar may be requested multiple times, perhaps there is a race when the initial network load has completed, or perhaps GInputStream instances are somehow being re-used, or something?

I'll try to get around to putting the g_print() trace call in, but I'm massively time poor at the moment, so that might take a while.
Comment 18 Michael Gratton 2017-05-08 22:22:31 UTC
Ah, spoke too soon. After deleting my cache and restarting geary, I just had this crash:

(gdb) bt
  • #0 istream_caching_finished
    at soup-cache.c line 784
  • #1 ffi_call_unix64
  • #2 ffi_call
  • #3 g_cclosure_marshal_generic_va
    at ../../../../gobject/gclosure.c line 1604
  • #4 _g_closure_invoke_va
    at ../../../../gobject/gclosure.c line 867
  • #5 g_signal_emit_valist
    at ../../../../gobject/gsignal.c line 3300
  • #6 g_signal_emit
    at ../../../../gobject/gsignal.c line 3447
  • #7 notify_and_clear
    at soup-cache-input-stream.c line 55
  • #8 try_write_next_buffer
    at soup-cache-input-stream.c line 70
  • #9 write_ready_cb
    at soup-cache-input-stream.c line 184
  • #10 g_task_return_now
    at ../../../../gio/gtask.c line 1145
  • #11 g_task_return
    at ../../../../gio/gtask.c line 1203
  • #12 async_ready_write_callback_wrapper
    at ../../../../gio/goutputstream.c line 752
  • #13 g_task_return_now
    at ../../../../gio/gtask.c line 1145
  • #14 complete_in_idle_cb
    at ../../../../gio/gtask.c line 1159
  • #15 g_main_dispatch
    at ../../../../glib/gmain.c line 3203
  • #16 g_main_context_dispatch
    at ../../../../glib/gmain.c line 3856
  • #17 g_main_context_iterate
    at ../../../../glib/gmain.c line 3929
  • #18 g_main_context_iteration
    at ../../../../glib/gmain.c line 3990
  • #19 g_application_run
    at ../../../../gio/gapplication.c line 2381
  • #20 _vala_main
  • #21 __libc_start_main
    at ../csu/libc-start.c line 291
  • #22 _start

Comment 19 Michael Gratton 2018-02-18 23:45:05 UTC
*** Bug 793553 has been marked as a duplicate of this bug. ***
Comment 20 GNOME Infrastructure Team 2018-09-21 16:27:07 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/libsoup/issues/97.