GNOME Bugzilla – Bug 361769
Deadlock in gstpad.c
Last modified: 2006-10-13 13:29:24 UTC
I found an hard to reproduce deadlock in gstreamer that occur when the pad are linking.
I also found a patch for it. The problem occur in gstreamer-0.10.6 and also in gstreamer-0.10.10
I'll start with a description of the problem:
first, we need to create an audio avi:
gst-launch-0.10 audiotestsrc num-buffers=10 ! avimux ! filesink location=testaudio.avi
and do this :
gst-launch-0.10 -f -v DUMMYsink name=sink location=out.raw filesrc location=testaudio.avi ! avidemux .audio_00 ! queue ! sink.audio_00
DUMMYsink is a sink filter (using collect data) that does nothing usefull (this is a test). The code is in attachment in folder dummysink. Also, I put in attachment a modified gstbin.c and gstpad.c that do sleep at some critical point to make the deadlock occurs faster (file from gstreamer-0.10.6). Those files are in folder sleep_bug/
When it deadlock, here's the stack trace :
I was able to find where it occur in the code :
In gstpad.c, inside function gst_pad_link_check_hierarchy when it call "gst_object_get_parent"
In gstbin.c, inside function update_degree,line 1544 when it call "gst_pad_get_peer"
what occurs is this
Thread 3 in gst_pad_link_prepare lock Filter 1 source pad and Filter 2 sink pad. later in gst_pad_link_check_hierarchy, it tries to get the grandparents of each pad using the gst_object_get_parent function. This function lock the element to do it's job.
Thread 1 lock Filter 2 element and try to do a pad_get_peer on it's sink pad. This function tries to lock the source pad of Filter 1.
The solution is not to use gst_object_get_parent inside the function gst_pad_link_check_hierarchy but GST_OBJECT_PARENT instead. The pointers are only used to evaluate if the condition fail so it seems safe to do so.
For example, using the files in folder sleep_fix/ work without deadlock.
I will post the attachment and patch next.
Created attachment 74587 [details]
Testcase source for reproducing the deadlock
Created attachment 74588 [details] [review]
patch for this bug
indeed, well spotted. This is a case of where the locking order is not respected. Locking orde must be parent first, then child.
* gst/gstpad.c: (gst_pad_link_check_hierarchy),
(handle_pad_block), (gst_pad_push_event), (gst_pad_send_event):
Patch by: Yves Lefebvre <ivanohe at abacom dot com> Fix possible deadlock
due to wrong locking order. Fixes #361769.
Remove some redundant/misplaced checks in pad_block.