GNOME Bugzilla – Bug 574289
[decodebin2] race in state change to PAUSED
Last modified: 2010-01-16 17:51:42 UTC
Consider a decodebin2 (in e.g. playbin2) that is set to PAUSED and builds e.g. a id3demux ! oggdemux ! multiqueue ! vorbisdec pipeline, then a race occurs between mainloop and streaming thread(s). Consider order of events: * streaming thread(s) cause the above elements to be added, and a.o. a decodegroup to be created * just before a decodepad is added to it, the mainloop fires the id3demux no-more-pads signal, which marks the group complete. Since there is no decodepad yet, the group is also considered blocked, and is therefore exposed * [streaming thread again:] a decodepad is added, and corresponding ghostpad is set to blocked * streaming continues, and blocks in the pad block. Since the group is already considered exposed, no action is taken (the pad is not unblocked). So, streaming is blocked with no hope for subsequent unblocking and no pads have been exposed.
Created attachment 130141 [details] Debug log Debug log that illustrates the scenario described above.
Created attachment 130142 [details] gdb backtrace gdb backtrace upon (deadb)lock in the above scenario Note that this "snapshot" is taken when the pipeline is set to NULL/READY again, which hangs as it fails to unblock the streaming thread (blocked in the unexposed ghostpad).
I've seen this one and was working on some patch.
A slightly related race (exposing before fully ok), although not so deadlocky, might be as follows, if I read/theorize things right. Consider a typical demuxer with 1 video and 1 audio pad. When the demuxer is added, no group exists yet. Then video or audio pad is connected, and a group is created, and taken along in the sequel. If caps are not fixed yet, then caps_notify_group_cb will be connected. This will delegate to pad_added_group_cb, and this one will start expose machinery as soon as there are no more dynamic objects. However, this does not check whether the group is complete (probably yes) or all pads have blocked (not so necessarily yes). So, group and pads may be exposed before all has blocked. (which may or may not be by design and possibly need not be problematic, but foregoes checks that are otherwise performed). In the unlikely case there are more complex elements downstream from the demuxer, e.g. no-more-pads emitting elements, nbdynamic count might loose track and grouped pad-added or caps notify might expose (and deblock) before more blocked pads will be added.
Created attachment 151491 [details] [review] Add locking for data consistency and to avoid races If I read things right with (somewhat rewritten) decodebin2, a fair share of the above scenarios is avoided. However, AFAICS, attached patch is still needed for some locking consistency on the one hand (as declared in the data structures), and to avoid some races.
Yes, that patch looks correct. Please commit :)
Well, it fixes sort of, based on all the rewriting ... commit 2482a536ac0cfb1f298ecad5afffce2a1422b4b6 Author: Mark Nauwelaerts <mark.nauwelaerts@collabora.co.uk> Date: Fri Jan 15 19:52:29 2010 +0100 decodebin2: sprinkle some more locking ... to avoid races and ensure some data structure consistency. See also #574289.