After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 574289 - [decodebin2] race in state change to PAUSED
[decodebin2] race in state change to PAUSED
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-base
git master
Other Linux
: Normal normal
: 0.10.26
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2009-03-05 17:38 UTC by Mark Nauwelaerts
Modified: 2010-01-16 17:51 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Debug log (19.41 KB, text/plain)
2009-03-05 17:40 UTC, Mark Nauwelaerts
  Details
gdb backtrace (6.96 KB, text/plain)
2009-03-05 17:44 UTC, Mark Nauwelaerts
  Details
Add locking for data consistency and to avoid races (1.78 KB, patch)
2010-01-15 19:07 UTC, Mark Nauwelaerts
committed Details | Review

Description Mark Nauwelaerts 2009-03-05 17:38:17 UTC
Consider a decodebin2 (in e.g. playbin2) that is set to PAUSED and builds e.g. a id3demux ! oggdemux ! multiqueue ! vorbisdec pipeline, then a race occurs between mainloop and streaming thread(s).

Consider order of events:
* streaming thread(s) cause the above elements to be added, and a.o. a decodegroup to be created
* just before a decodepad is added to it, the mainloop fires the id3demux no-more-pads signal, which marks the group complete.  Since there is no decodepad yet, the group is also considered blocked, and is therefore exposed
* [streaming thread again:]
a decodepad is added, and corresponding ghostpad is set to blocked
* streaming continues, and blocks in the pad block.  Since the group is already considered exposed, no action is taken (the pad is not unblocked).

So, streaming is blocked with no hope for subsequent unblocking and no pads have been exposed.
Comment 1 Mark Nauwelaerts 2009-03-05 17:40:27 UTC
Created attachment 130141 [details]
Debug log

Debug log that illustrates the scenario described above.
Comment 2 Mark Nauwelaerts 2009-03-05 17:44:03 UTC
Created attachment 130142 [details]
gdb backtrace

gdb backtrace upon (deadb)lock in the above scenario

Note that this "snapshot" is taken when the pipeline is set to NULL/READY again, which hangs as it fails to unblock the streaming thread (blocked in the unexposed ghostpad).
Comment 3 Wim Taymans 2009-03-05 19:39:59 UTC
I've seen this one and was working on some patch.
Comment 4 Mark Nauwelaerts 2009-03-09 15:56:08 UTC
A slightly related race (exposing before fully ok), although not so deadlocky, might be as follows, if I read/theorize things right.

Consider a typical demuxer with 1 video and 1 audio pad.
When the demuxer is added, no group exists yet.  Then video or audio pad is connected, and a group is created, and taken along in the sequel.
If caps are not fixed yet, then caps_notify_group_cb will be connected.
This will delegate to pad_added_group_cb, and this one will start expose machinery as soon as there are no more dynamic objects.  However, this does not check whether the group is complete (probably yes) or all pads have blocked (not so necessarily yes).  So, group and pads may be exposed before all has blocked.
(which may or may not be by design and possibly need not be problematic, but foregoes checks that are otherwise performed).

In the unlikely case there are more complex elements downstream from the demuxer, e.g. no-more-pads emitting elements, nbdynamic count might loose track and grouped pad-added or caps notify might expose (and deblock) before more blocked pads will be added.
Comment 5 Mark Nauwelaerts 2010-01-15 19:07:27 UTC
Created attachment 151491 [details] [review]
Add locking for data consistency and to avoid races

If I read things right with (somewhat rewritten) decodebin2, a fair share of the above scenarios is avoided.

However, AFAICS, attached patch is still needed for some locking consistency on the one hand (as declared in the data structures), and to avoid some races.
Comment 6 Sebastian Dröge (slomo) 2010-01-16 10:09:11 UTC
Yes, that patch looks correct. Please commit :)
Comment 7 Mark Nauwelaerts 2010-01-16 17:51:24 UTC
Well, it fixes sort of, based on all the rewriting ... 

commit 2482a536ac0cfb1f298ecad5afffce2a1422b4b6
Author: Mark Nauwelaerts <mark.nauwelaerts@collabora.co.uk>
Date:   Fri Jan 15 19:52:29 2010 +0100

    decodebin2: sprinkle some more locking

    ... to avoid races and ensure some data structure consistency.

    See also #574289.