GNOME Bugzilla – Bug 361769
Deadlock in gstpad.c
Last modified: 2006-10-13 13:29:24 UTC
Hi, I found an hard to reproduce deadlock in gstreamer that occur when the pad are linking. I also found a patch for it. The problem occur in gstreamer-0.10.6 and also in gstreamer-0.10.10 I'll start with a description of the problem: first, we need to create an audio avi: gst-launch-0.10 audiotestsrc num-buffers=10 ! avimux ! filesink location=testaudio.avi and do this : gst-launch-0.10 -f -v DUMMYsink name=sink location=out.raw filesrc location=testaudio.avi ! avidemux .audio_00 ! queue ! sink.audio_00 DUMMYsink is a sink filter (using collect data) that does nothing usefull (this is a test). The code is in attachment in folder dummysink. Also, I put in attachment a modified gstbin.c and gstpad.c that do sleep at some critical point to make the deadlock occurs faster (file from gstreamer-0.10.6). Those files are in folder sleep_bug/ When it deadlock, here's the stack trace : (gdb) bt
+ Trace 76027
I was able to find where it occur in the code : In gstpad.c, inside function gst_pad_link_check_hierarchy when it call "gst_object_get_parent" In gstbin.c, inside function update_degree,line 1544 when it call "gst_pad_get_peer" what occurs is this Filter1-----Filter2 Thread 3 in gst_pad_link_prepare lock Filter 1 source pad and Filter 2 sink pad. later in gst_pad_link_check_hierarchy, it tries to get the grandparents of each pad using the gst_object_get_parent function. This function lock the element to do it's job. Thread 1 lock Filter 2 element and try to do a pad_get_peer on it's sink pad. This function tries to lock the source pad of Filter 1. The solution is not to use gst_object_get_parent inside the function gst_pad_link_check_hierarchy but GST_OBJECT_PARENT instead. The pointers are only used to evaluate if the condition fail so it seems safe to do so. For example, using the files in folder sleep_fix/ work without deadlock. I will post the attachment and patch next.
Created attachment 74587 [details] testcase Testcase source for reproducing the deadlock
Created attachment 74588 [details] [review] patch for this bug
indeed, well spotted. This is a case of where the locking order is not respected. Locking orde must be parent first, then child.
* gst/gstpad.c: (gst_pad_link_check_hierarchy), (gst_pad_get_caps_unlocked), (gst_pad_save_thyself), (handle_pad_block), (gst_pad_push_event), (gst_pad_send_event): Patch by: Yves Lefebvre <ivanohe at abacom dot com> Fix possible deadlock due to wrong locking order. Fixes #361769. Remove some redundant/misplaced checks in pad_block.