After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 760551 - shmsink: Deadlock in memory free
shmsink: Deadlock in memory free
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-bad
1.2.4
Other Linux
: Normal major
: 1.6.3
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2016-01-12 19:39 UTC by Matt Crane
Modified: 2016-01-16 21:57 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Patch with a fix to deadlock when freeing memory in shmsink (856 bytes, patch)
2016-01-15 16:02 UTC, Matt Crane
committed Details | Review

Description Matt Crane 2016-01-12 19:39:52 UTC
I discovered a deadlock during some stress testing of our video streaming architecture that utilizes shmsink/shmsrc to stream video buffers between two processes.  The system in question was running with GStreamer 1.2.4, the version that ships with SuSE Linux Enterprise 12. A quick examination of git seems to indicate that the problematic code still exists in the current version of gst-plugins-bad.

In our tests, a shmsrc client repeatedly connects/disconnects at various intervals to receive video from the shmsink server pipeline.  At some point during the test, the shmsink element in the server process was observed to deadlock in gst_shm_sink_allocator_free().  The stack trace of the appropriate thread from gdb:

  • #0 __lll_lock_wait
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S line 135
  • #1 _L_lock_1081
    from /lib64/libpthread.so.0
  • #2 __GI___pthread_mutex_lock
    at ../nptl/pthread_mutex_lock.c line 134
  • #3 g_mutex_lock
    from /usr/lib64/libglib-2.0.so.0
  • #4 gst_shm_sink_allocator_free
    at gstshmsink.c line 175
  • #5 _gst_memory_free
    at gstmemory.c line 97
  • #6 gst_memory_unref
    at /usr/include/gstreamer-1.0/gst/gstmemory.h line 325
  • #7 gst_shm_sink_render
    at gstshmsink.c line 725
  • #8 gst_base_sink_chain_unlocked
    at gstbasesink.c line 3378
  • #9 gst_base_sink_chain_main
    at gstbasesink.c line 3486
  • #10 gst_pad_chain_data_unchecked
    at gstpad.c line 3760
  • #11 gst_pad_push_data
    at gstpad.c line 3990
  • #12 gst_pad_push
    at gstpad.c line 4093
  • #13 gst_base_src_loop
    at gstbasesrc.c line 2779
  • #14 gst_task_func
    at gsttask.c line 316
  • #15 ??
    from /usr/lib64/libglib-2.0.so.0
  • #16 ??
    from /usr/lib64/libglib-2.0.so.0
  • #17 start_thread
    at pthread_create.c line 309
  • #18 clone
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S line 111

The lock in question is the shmsink GST_OBJECT lock.  A closer examination in gdb revealed that the same thread already held a lock on the mutex.  Looking at the code, we found the initial GST_OBJECT_LOCK() being acquired at the start of gst_shm_sink_render(), but it isn't released until *after* a call to gst_memory_unref():

lines 714-721 in gstshmsink.c (exact line #s may vary depending on version):

    while (self->wait_for_connection && !self->clients) {
      g_cond_wait (&self->cond, GST_OBJECT_GET_LOCK (self));
      if (self->unlock) {
        gst_memory_unref (memory);
        GST_OBJECT_UNLOCK (self);
        return GST_FLOW_FLUSHING;
      }
    }

As can be seen in the stack trace, this gst_memory_unref() leads to calls to _gst_memory_free() and gst_shm_sink_allocator_free().  Inside of gst_shm_sink_allocator_free(), the code tries to acquire the shmsink GST_OBJECT_LOCK() again:

static void
gst_shm_sink_allocator_free (GstAllocator * allocator, GstMemory * mem)
{
  GstShmSinkMemory *mymem = (GstShmSinkMemory *) mem;

  if (mymem->block) {
    GST_OBJECT_LOCK (mymem->sink);
    sp_writer_free_block (mymem->block);
    GST_OBJECT_UNLOCK (mymem->sink);
    gst_object_unref (mymem->sink);
  }
  gst_object_unref (mem->allocator);

  g_slice_free (GstShmSinkMemory, mymem);
}


I *think* this deadlock can be resolved by moving the gst_memory_unref() call in gst_shm_sink_render() to after the GST_OBJECT_UNLOCK() call, as in the following diff:

@@ -714,8 +722,8 @@
     while (self->wait_for_connection && !self->clients) {
       g_cond_wait (&self->cond, GST_OBJECT_GET_LOCK (self));
       if (self->unlock) {
-        gst_memory_unref (memory);
         GST_OBJECT_UNLOCK (self);
+        gst_memory_unref (memory);
         return GST_FLOW_FLUSHING;
       }
     }

We are currently testing this change in our own rebuilt version of the plugin and wanted to raise this issue with the GStreamer team to make sure they were aware of the problem and to get feedback on whether the proposed fix was appropriate.
Comment 1 Thiago Sousa Santos 2016-01-15 11:54:32 UTC
Did you get a chance to try it in a more recent version? 1.6, 1.7 or git master?

It seems like the same code path is still there, anyway.

Could you provide the patch in git format-patch format so it can be merged?

Thanks,
Comment 2 Matt Crane 2016-01-15 16:02:55 UTC
Created attachment 319115 [details] [review]
Patch with a fix to deadlock when freeing memory in shmsink

Patch made from gst_plugins_bad MASTER
Comment 3 Matt Crane 2016-01-15 16:05:14 UTC
I am unable to reproduce the problem nor test the fix on a more recent version.  Our product is built against the GStreamer included in SLES 12, so that means integrating against anything other than 1.2.4 would be a significant task.

Patch made against git master has been attached.
Comment 4 Olivier Crête 2016-01-15 23:32:40 UTC
Review of attachment 319115 [details] [review]:

That patch looks good.
Comment 5 Tim-Philipp Müller 2016-01-16 21:54:35 UTC
commit eba01f84e53329dd14508cb72891c7a8e85888e0
Author: Matt Crane <mattcrane@tycoint.com>
Date:   Fri Jan 15 10:49:12 2016 -0500

    shmsink: fix possible deadlock in _render()/ _allocator_free()
    
    Drop object lock before unrefing memory, otherwise the object
    lock might be taken again from the allocator and then things
    deadlock.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=760551