Bug 732783 - Segfault with concurrent async and sync operations
Status: RESOLVED FIXED
Product: libsoup
Classification: Core
Component: Misc
Version: 2.47.x
Hardware: Other
OS: Linux
Priority: Normal
Severity: normal
Target Milestone: ---
Assigned To: libsoup-maint@gnome.bugs
QA Contact: libsoup-maint@gnome.bugs
Depends on:
Blocks:
Reported: 2014-07-05 22:00 UTC by Ross Lagerwall
Modified: 2014-07-19 16:19 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments

crash reproducer (2.96 KB, text/x-csrc)
2014-07-05 22:00 UTC, Ross Lagerwall

soup-message-queue: Hold mutex when ref'ing (879 bytes, patch)
2014-07-06 10:53 UTC, Ross Lagerwall (committed)

Description Ross Lagerwall 2014-07-05 22:00:18 UTC
Created attachment 279964 [details]
crash reproducer

Since gvfs was ported to use a single SoupSession for both sync and async operations, the dav backend has sometimes segfaulted. I managed to track the crash down to libsoup with the attached program.

Doing repeated concurrent sync and async operations in separate threads causes libsoup to segfault reliably.
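
In outline, the pattern the reproducer exercises is something like the following sketch (this is not the attached program itself; the URL and loop count are placeholders):

/* Sketch of the failing pattern: one thread hammers the sync API
 * while the main loop drives async requests on the same SoupSession.
 * Build: gcc repro.c $(pkg-config --cflags --libs libsoup-2.4)
 */
#include <libsoup/soup.h>

static SoupSession *session;

static gpointer
sync_thread (gpointer data)
{
	int i;

	for (i = 0; i < 1000; i++) {
		SoupMessage *msg = soup_message_new ("GET", "http://127.0.0.1:8080/");
		soup_session_send_message (session, msg); /* blocking */
		g_object_unref (msg);
	}
	return NULL;
}

static void
async_done (SoupSession *s, SoupMessage *msg, gpointer user_data)
{
	/* soup_session_queue_message() consumes the message reference,
	 * so just keep requeueing new messages forever */
	soup_session_queue_message (s,
				    soup_message_new ("GET", "http://127.0.0.1:8080/"),
				    async_done, NULL);
}

int
main (void)
{
	session = soup_session_new ();
	g_thread_new ("sync", sync_thread, NULL);
	soup_session_queue_message (session,
				    soup_message_new ("GET", "http://127.0.0.1:8080/"),
				    async_done, NULL);
	g_main_loop_run (g_main_loop_new (NULL, FALSE));
	return 0;
}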

Example backtrace:

Thread 2 (Thread 0x7ffff56a3700 (LWP 28686)):
#0  _int_malloc () from /usr/lib/libc.so.6
#1  _int_memalign () from /usr/lib/libc.so.6
#2  _mid_memalign () from /usr/lib/libc.so.6
#3  posix_memalign () from /usr/lib/libc.so.6
#4  allocator_memalign () at gslice.c:1378
#5  allocator_add_slab () at gslice.c:1252
#6  slab_allocator_alloc_chunk () at gslice.c:1297
#7  magazine_cache_pop_magazine () at gslice.c:731
#8  thread_memory_magazine1_reload () at gslice.c:801
#9  g_slice_alloc () at gslice.c:996
#10 g_slice_alloc0 () at gslice.c:1032
#11 g_type_create_instance () at gtype.c:1850
#12 g_object_new_internal () at gobject.c:1724
#13 g_object_newv () at gobject.c:1868
#14 g_object_new () at gobject.c:1568
#15 soup_address_connectable_enumerate () at soup-address.c:1244
#16 next_enumerator () at gproxyaddressenumerator.c:168
#17 next_enumerator () at gproxyaddressenumerator.c:118
#18 g_proxy_address_enumerator_next () at gproxyaddressenumerator.c:203
#19 g_socket_client_connect () at gsocketclient.c:1011
#20 soup_socket_connect_sync_internal () at soup-socket.c:1034
#21 soup_connection_connect_sync () at soup-connection.c:479
#22 get_connection () at soup-session.c:1920
#23 soup_session_process_queue_item () at soup-session.c:1941
#24 soup_session_real_send_message () at soup-session.c:2190
#25 soup_session_send_message () at soup-session.c:2226
#26 thread_fn ()
#27 g_thread_proxy () at gthread.c:764
#28 start_thread () from /usr/lib/libpthread.so.0
#29 clone () from /usr/lib/libc.so.6

Thread 1 (Thread 0x7ffff7fc7700 (LWP 28682)):
#0  g_mutex_get_impl () at gthread-posix.c:120
#1  g_mutex_lock () at gthread-posix.c:209
#2  soup_message_queue_item_unref () at soup-message-queue.c:156
#3  soup_message_queue_next () at soup-message-queue.c:281
#4  async_run_queue () at soup-session.c:2029
#5  idle_run_queue () at soup-session.c:2073
#6  g_main_dispatch () at gmain.c:3064
#7  g_main_context_dispatch () at gmain.c:3663
#8  g_main_context_iterate () at gmain.c:3734
#9  g_main_loop_run () at gmain.c:3928
#10 main ()

Maybe an item in the queue has become corrupted due to concurrent access?

Thanks.
Comment 1 Ross Lagerwall 2014-07-05 22:13:47 UTC
I found that the webdav backend was segfaulting when displaying a location in Nautilus. It was caused by query-info calls (which make synchronous libsoup calls) running in parallel with thumbnail generation (which reads the file, which in turn makes asynchronous libsoup calls).
Comment 2 Ross Lagerwall 2014-07-06 10:53:54 UTC
Created attachment 279980 [details] [review]
soup-message-queue: Hold mutex when ref'ing

Protect access to ref_count with the queue mutex when incrementing the
reference count.
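
In spirit, the change is just the following (a sketch only; the real patch is attachment 279980, and the field access paths here are assumptions about soup-message-queue.c's structs):

/* Take the queue's mutex around the increment so that a ref taken
 * in one thread cannot race the locked unref/free path in another. */
void
soup_message_queue_item_ref (SoupMessageQueueItem *item)
{
	g_mutex_lock (&item->queue->mutex);
	item->ref_count++;
	g_mutex_unlock (&item->queue->mutex);
}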
Comment 3 Ross Lagerwall 2014-07-06 10:56:23 UTC
The above patch seems to fix the issue.

As an aside, if the sync API is used from one thread and the async API from another thread, is it necessary for both threads to have different thread-default main contexts?
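
(For concreteness, giving a thread its own thread-default context would look roughly like the sketch below, using GLib's documented g_main_context_push_thread_default(); the work loop is a placeholder.)

#include <glib.h>

/* Worker thread with its own thread-default context: async
 * operations started inside the push/pop scope dispatch their
 * callbacks on ctx, which this thread must iterate itself. */
static gpointer
worker_thread (gpointer data)
{
	GMainContext *ctx = g_main_context_new ();

	g_main_context_push_thread_default (ctx);

	/* ... start sync/async soup calls here, iterating ctx
	 * until they complete ... */
	g_main_context_iteration (ctx, TRUE);

	g_main_context_pop_thread_default (ctx);
	g_main_context_unref (ctx);
	return NULL;
}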
Comment 4 Dan Winship 2014-07-18 16:53:00 UTC
Comment on attachment 279980 [details] [review]
soup-message-queue: Hold mutex when ref'ing

ok, though we should probably make it use atomic ops instead at some point...
Comment 5 Ross Lagerwall 2014-07-19 15:26:02 UTC
(In reply to comment #4)
> (From update of attachment 279980 [details] [review])
> ok, though we should probably make it use atomic ops instead at some point...

Maybe. As far as I can see, the other uses of ref_count are protected by the queue mutex anyway so an atomic counter is probably unnecessary.
Comment 6 Ross Lagerwall 2014-07-19 15:26:40 UTC
Comment on attachment 279980 [details] [review]
soup-message-queue: Hold mutex when ref'ing

Pushed to master as 053fdb041cded88c396c98e525212819fa5fce01. Thanks for the review!
Comment 7 Dan Winship 2014-07-19 16:19:31 UTC
(In reply to comment #5)
> Maybe. As far as I can see, the other uses of ref_count are protected by the
> queue mutex anyway so an atomic counter is probably unnecessary.

yeah, but it would probably be faster than using a mutex
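
For reference, the atomic variant under discussion would look roughly like this (a sketch; it assumes ref_count is a plain gint, and the last-unref teardown, which would still need the queue mutex to unlink the item, is elided):

#include <glib.h>

static void
soup_message_queue_item_ref (SoupMessageQueueItem *item)
{
	/* lock-free increment instead of taking the queue mutex */
	g_atomic_int_inc (&item->ref_count);
}

static void
soup_message_queue_item_unref (SoupMessageQueueItem *item)
{
	if (g_atomic_int_dec_and_test (&item->ref_count)) {
		/* last reference: unlink from the queue (under the
		 * queue mutex) and free the item */
	}
}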