GNOME Bugzilla – Bug 574838
seeking can deadlock adder
Last modified: 2009-03-17 09:56:42 UTC
See frames #4 and #67.
Created attachment 130555 [details] backtraces from all threads
Created attachment 130680 [details] [review] use adder object lock

The deadlock happens between:
adder: gst_adder_src_event()/GST_EVENT_SEEK/GST_OBJECT_LOCK (adder->collect); and
collectpads: gst_collect_pads_chain()/GST_COLLECT_PADS_WAIT (pads).

In the patch I added logging and some more return-value handling. The fix for the problem is to use GST_OBJECT_LOCK (adder) to protect adder->segment_pending in gst_adder_src_event() and gst_adder_collected(), instead of the object lock on adder->collect. Is that the right fix?
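Roughly, the idea looks like this (only a sketch of the GST_EVENT_SEEK branch in the 0.10 gstadder.c; segment_pending/segment_position are the fields named above, while the parsing locals and forward_event()/result stand in for whatever the real handler does):

  /* Sketch of the proposed locking change in gst_adder_src_event(),
   * GST_EVENT_SEEK branch. */
  case GST_EVENT_SEEK:
  {
    GstSeekFlags flags;
    GstSeekType curtype;
    gint64 cur;

    gst_event_parse_seek (event, NULL, NULL, &flags, &curtype, &cur,
        NULL, NULL);

    /* old code: GST_OBJECT_LOCK (adder->collect);
     * this blocks while the streaming thread sits in
     * gst_collect_pads_chain()/GST_COLLECT_PADS_WAIT (pads) */

    /* proposed: only guard the two fields the event handler touches,
     * using the adder's own object lock */
    GST_OBJECT_LOCK (adder);
    adder->segment_pending = TRUE;
    adder->segment_position = (curtype == GST_SEEK_TYPE_SET) ? cur : 0;
    GST_OBJECT_UNLOCK (adder);

    result = forward_event (adder, event);
    break;
  }

That way the seek handler never serializes against the streaming thread; whether that is actually safe is exactly the question.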
It's not correct; we take the collect lock to wait for the collectpads to finish processing. From the backtrace it looks like the streaming thread is blocked somewhere in a sink. Is this a flushing seek?
It's not a flushing seek; it's seamless looping.
In fact, I don't see a deadlock. I think it's just waiting for the collect function to finish before it can configure the new segment time and forward the event.
But the collect function in turn is stuck in GST_COLLECT_PADS_WAIT, and the application is definitely locked up: no more sound. If I just disable the GST_OBJECT_LOCK (adder->collect) in gst_adder_src_event()/GST_EVENT_SEEK, it does not lock up, but it does not behave correctly either. I don't see why we have to wait for gst_collect_pads_chain() to finish: it does not (and cannot) access adder->segment_pending and adder->segment_position, and those are what we modify in the event function.
After implementing the patch in bug #575598, the deadlock does not happen any more. Also, in my application I can work around the problem by using a new sequence number for each seek, so that I can ignore duplicate segment-done messages. Unfortunately that requires 0.10.22 (I am not yet sure how I can avoid it with older versions).
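A minimal sketch of that workaround, assuming core >= 0.10.22 and that the sources copy the seek's seqnum onto their SEGMENT_DONE messages (which is what the seqnum mechanism is for); do_segment_seek(), bus_cb() and pending_seek_seqnum are made-up names, not from my actual application:

  #include <gst/gst.h>

  /* Each loop seek gets a fresh sequence number; only the first
   * SEGMENT_DONE carrying that number restarts the loop. The remaining
   * segment-dones from adder's other inputs still carry the old seqnum
   * and are ignored. */
  static guint32 pending_seek_seqnum;

  static void
  do_segment_seek (GstElement * pipeline)
  {
    GstEvent *seek;

    /* non-flushing segment seek, as used for seamless looping */
    seek = gst_event_new_seek (1.0, GST_FORMAT_TIME, GST_SEEK_FLAG_SEGMENT,
        GST_SEEK_TYPE_SET, 0, GST_SEEK_TYPE_NONE, GST_CLOCK_TIME_NONE);

    pending_seek_seqnum = gst_util_seqnum_next ();
    gst_event_set_seqnum (seek, pending_seek_seqnum);

    gst_element_send_event (pipeline, seek);
  }

  static gboolean
  bus_cb (GstBus * bus, GstMessage * msg, gpointer user_data)
  {
    GstElement *pipeline = GST_ELEMENT (user_data);

    if (GST_MESSAGE_TYPE (msg) == GST_MESSAGE_SEGMENT_DONE &&
        gst_message_get_seqnum (msg) == pending_seek_seqnum) {
      /* first segment-done for the current seek: start the next loop;
       * later duplicates keep the old seqnum and no longer match */
      do_segment_seek (pipeline);
    }
    return TRUE;
  }

The callback would be attached with gst_bus_add_watch (bus, bus_cb, pipeline).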
For older versions I count my sources and ignore the first N-1 segment-done events. Closing the ticket, as the fix in bug #575598 has been accepted.
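For reference, a sketch of that pre-0.10.22 counting workaround (n_sources and the callback name are made up here; the idea is just to act only on the last of the N segment-dones):

  #include <gst/gst.h>

  /* adder has n_sources inputs, so every loop seek produces n_sources
   * SEGMENT_DONE messages: skip the first n_sources - 1 and only restart
   * the loop on the last one. */
  static guint n_sources = 2;          /* number of inputs on adder */
  static guint segment_done_count;

  static gboolean
  bus_cb_count (GstBus * bus, GstMessage * msg, gpointer user_data)
  {
    GstElement *pipeline = GST_ELEMENT (user_data);

    if (GST_MESSAGE_TYPE (msg) == GST_MESSAGE_SEGMENT_DONE &&
        ++segment_done_count == n_sources) {
      segment_done_count = 0;
      /* restart the loop with another non-flushing segment seek */
      gst_element_send_event (pipeline,
          gst_event_new_seek (1.0, GST_FORMAT_TIME, GST_SEEK_FLAG_SEGMENT,
              GST_SEEK_TYPE_SET, 0, GST_SEEK_TYPE_NONE,
              GST_CLOCK_TIME_NONE));
    }
    return TRUE;
  }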