After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 752213 - tee: Avoid race condition while forwarding sticky events
tee: Avoid race condition while forwarding sticky events
Status: RESOLVED OBSOLETE
Product: GStreamer
Classification: Platform
Component: gstreamer (core)
git master
Other Linux
: Normal normal
: git master
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2015-07-10 08:48 UTC by Jose Antonio Santos Cadenas
Modified: 2018-11-03 12:29 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
tee: Avoid race condition while forwarding sticky events (1.47 KB, patch)
2015-07-10 08:48 UTC, Jose Antonio Santos Cadenas
needs-work Details | Review
tee: Avoid race condition while forwarding sticky events (1.47 KB, patch)
2015-07-10 13:13 UTC, Jose Antonio Santos Cadenas
none Details | Review
tee: Avoid race condition while forwarding sticky events (4.19 KB, patch)
2015-07-10 13:15 UTC, Jose Antonio Santos Cadenas
none Details | Review
Test (7.29 KB, text/x-csrc)
2016-01-20 08:40 UTC, Jose Antonio Santos Cadenas
  Details
tee: fix race condition while forwarding sticky events (2.71 KB, patch)
2016-03-18 14:36 UTC, Miguel París Díaz
none Details | Review
tee: fix race condition while forwarding sticky events (2.77 KB, patch)
2016-03-21 09:13 UTC, Miguel París Díaz
none Details | Review
tee: fix race condition while forwarding sticky events (3.39 KB, patch)
2016-03-21 09:42 UTC, Miguel París Díaz
none Details | Review
tee: Set GST_PAD_FLAG_PROXY_CAPS before forwarding sticky events (1.13 KB, patch)
2016-03-29 08:03 UTC, Sebastian Dröge (slomo)
committed Details | Review
tee: Prevent race condition between adding pads and forwarding sticky events (4.87 KB, patch)
2016-03-29 08:04 UTC, Sebastian Dröge (slomo)
none Details | Review
tee: Prevent race condition between adding pads and forwarding sticky events (3.90 KB, patch)
2016-03-29 14:05 UTC, Miguel París Díaz
none Details | Review
tee: Prevent race condition between adding pads and forwarding sticky events (3.90 KB, patch)
2016-03-29 14:09 UTC, Miguel París Díaz
none Details | Review
tee: force Sticky event misordering (2.63 KB, patch)
2016-03-29 15:04 UTC, Miguel París Díaz
rejected Details | Review
tee: Prevent race condition between adding pads and forwarding sticky events (4.45 KB, patch)
2016-03-31 11:51 UTC, Miguel París Díaz
none Details | Review
tee: Prevent race condition between adding pads and forwarding sticky events (4.34 KB, patch)
2016-03-31 11:58 UTC, Miguel París Díaz
needs-work Details | Review
tee: do not override query function for sink pad (1.92 KB, patch)
2016-03-31 12:01 UTC, Miguel París Díaz
none Details | Review
tee: do not override query function for sink pad (1.97 KB, patch)
2016-06-23 14:23 UTC, Miguel París Díaz
none Details | Review
Simple program to force sticky event misordeing in tee src's peer pad (3.17 KB, text/plain)
2016-06-23 14:58 UTC, Miguel París Díaz
  Details
pad: simple patch to force race conditions (1.35 KB, patch)
2016-06-23 14:58 UTC, Miguel París Díaz
none Details | Review
tee: fix race conditions while forwarding sticky events (2.88 KB, patch)
2016-06-29 08:43 UTC, Miguel París Díaz
none Details | Review
tee: fix race conditions while forwarding sticky events (2.91 KB, patch)
2016-06-29 12:40 UTC, Miguel París Díaz
none Details | Review
tee: fix race conditions while forwarding sticky events (3.04 KB, patch)
2018-03-14 11:41 UTC, Miguel París Díaz
none Details | Review
Simple program to force deadlock (4.56 KB, text/x-csrc)
2018-08-23 09:31 UTC, Miguel París Díaz
  Details
tee: fix race conditions while forwarding sticky events (3.68 KB, patch)
2018-08-23 09:35 UTC, Miguel París Díaz
none Details | Review

Description Jose Antonio Santos Cadenas 2015-07-10 08:48:16 UTC
Created attachment 307207 [details] [review]
tee: Avoid race condition while forwarding sticky events

I have found that in some circumstances a source pad receives data without having a segment event, printing this error:

Unexpected critical/warning: gstpad.c:4258:gst_pad_push_data:<tee10:src_1> Got data flow before segment event

It is a race condition, because sink pad did not have segment event while adding the new source pad and it is received before the source pad is added to the tee but after the sticky events are forwarded, so this event is not in the new src pad.

Previous pads are not failing, just the new one. And this happens rarely 1 on 500 or so.

The solution I found is to lock the sink stream while adding the pad and once it is added forward the sticky events. With this patch I can not reproduce the issue.
Comment 1 Sebastian Dröge (slomo) 2015-07-10 09:18:10 UTC
Comment on attachment 307207 [details] [review]
tee: Avoid race condition while forwarding sticky events

I'm not much in favour of using the stream lock here. It might block the application or whoever requested the pad for a while.

I think the best would be to remember when we last sent sticky events, and check in the chain function if we need to update them. Some counter/cookie thing should work for that too.
Comment 2 Jose Antonio Santos Cadenas 2015-07-10 10:03:59 UTC
I've tried to push sticky events on chain function, using a flag to check if they were synchronized or not, but the patch becomes too complicated, because if the next we receive is not data but an event we have to forward events too (otherwise we can have the warning of receiving sticky events missordered), but I think that is too much overhead to check all the source pads sticky events whenever we receive an event at sink pad.

I think that getting the stream mutex is less harmful than have to increase complexity whenever we are receiving events and/or buffers.
Comment 3 Jose Antonio Santos Cadenas 2015-07-10 12:10:01 UTC
I've been reading the code a little bit more and one of the bigger problems I see is that the event function should be re-implemented and gst_pad_forward could not be used. Instead we may need a function like the one pushing the buffers, checking for each pad if it needs the sticky events to be forwarded.
Comment 4 Jose Antonio Santos Cadenas 2015-07-10 13:13:59 UTC
Created attachment 307224 [details] [review]
tee: Avoid race condition while forwarding sticky events

This patch does not get the stream mutex while requesting the pad, but does not solve the issue completely, just reduces fail to 1 on 700.

Do you think if there is something missing in the patch?

I still get the error, but this time not in tee source pad pad but on its peer:

Unexpected critical/warning: gstpad.c:4013:gst_pad_chain_data_unchecked:<queue15:sink> Got data flow before segment event
Comment 5 Jose Antonio Santos Cadenas 2015-07-10 13:15:29 UTC
Created attachment 307225 [details] [review]
tee: Avoid race condition while forwarding sticky events

Sorry, wrong patch

This patch does not get the stream mutex while requesting the pad, but does not solve the issue completely, just reduces fail to 1 on 700.

Do you think if there is something missing in the patch?

I still get the error, but this time not in tee source pad pad but on its peer:

Unexpected critical/warning: gstpad.c:4013:gst_pad_chain_data_unchecked:<queue15:sink> Got data flow before segment event
Comment 6 Tim-Philipp Müller 2016-01-19 18:24:48 UTC
Do you think you could whip up a unit test that reproduces the problem (doesn't have to be 100% reliable, but if it fails often enough that'd be useful)?
Comment 7 Jose Antonio Santos Cadenas 2016-01-20 08:17:42 UTC
It's being a while since I found this, I'll try to recover the test I did.
Comment 8 Jose Antonio Santos Cadenas 2016-01-20 08:39:39 UTC
I recover the test and with gstreamer 1.6.0 but there are issues (not checket yet) on 1.7.1
Comment 9 Jose Antonio Santos Cadenas 2016-01-20 08:40:36 UTC
Created attachment 319432 [details]
Test

I'ts not a unit test but a simple program that shows issues with 1.7.1
Comment 10 Miguel París Díaz 2016-03-18 14:36:30 UTC
Created attachment 324280 [details] [review]
tee: fix race condition while forwarding sticky events
Comment 11 Miguel París Díaz 2016-03-18 14:37:44 UTC
The patch uploaded fixes the race condition in a easy and efficient way without changing the code a lot.
Comment 12 Miguel París Díaz 2016-03-18 15:09:31 UTC
Review of attachment 324280 [details] [review]:

::: plugins/elements/gsttee.c
@@ +420,3 @@
   /* Forward sticky events to the new srcpad */
   gst_pad_sticky_events_foreach (tee->sinkpad, forward_sticky_events, srcpad);
   GST_OBJECT_FLAG_SET (srcpad, GST_PAD_FLAG_PROXY_CAPS);

Can we set this flag before entering the critical section (before forwarding sticky events)?
Comment 13 Sebastian Dröge (slomo) 2016-03-18 17:20:40 UTC
It seems weird to set it afterwards... It's part of the pad setup and should be done before actually doing anything with the pad. So yes, feel free to move it please :)
Comment 14 Miguel París Díaz 2016-03-21 09:13:41 UTC
Created attachment 324415 [details] [review]
tee: fix race condition while forwarding sticky events

Now, the flag of srcpad is set before entering the critical section, so it is the minimum possible.
Comment 15 Sebastian Dröge (slomo) 2016-03-21 09:17:12 UTC
Review of attachment 324415 [details] [review]:

Please add some more explanation to the commit message about what exactly the problem here is. Why is this needed at all? All sticky events are going to be copied over before the pad is added to the element, once the pad is added to the element all events will be passed through as usual. Where is the race?

::: plugins/elements/gsttee.c
@@ +306,3 @@
 gst_tee_init (GstTee * tee)
 {
+  g_rec_mutex_init (&tee->mutex_events);

Why does this need a recursive mutex?
Comment 16 Miguel París Díaz 2016-03-21 09:42:19 UTC
Created attachment 324416 [details] [review]
tee: fix race condition while forwarding sticky events

Add an explanation of the race condition in the commit message.
Comment 17 Miguel París Díaz 2016-03-21 09:45:11 UTC
Review of attachment 324415 [details] [review]:

::: plugins/elements/gsttee.c
@@ +306,3 @@
 gst_tee_init (GstTee * tee)
 {
+  g_rec_mutex_init (&tee->mutex_events);

It is needed because if the app creates a new src pad from the same thread of gst_pad_event_default (in gst_tee_sink_event), a deadlock will happen.
Comment 18 Sebastian Dröge (slomo) 2016-03-21 09:50:35 UTC
Comment on attachment 324416 [details] [review]
tee: fix race condition while forwarding sticky events

I think instead of all this it would then be better to add some kind of events cookie (e.g. a simple counter) and before pushing a buffer always check if it's up to date and otherwise update the sticky events before pushing. That works without yet another (recursive!) mutex and should be simpler.

The same bug is probably in other elements too, e.g. output-selector
Comment 19 Miguel París Díaz 2016-03-21 09:53:13 UTC
Which is the problem of using a recursive mutex?
I think that all these kinds of problems that GStreamer has is because of not using a recursive mutex.
Comment 20 Sebastian Dröge (slomo) 2016-03-21 09:58:17 UTC
They are expensive and we already have a lot of these, and in this specific case a mutex seems not even like the simpler solution.
Comment 21 Miguel París Díaz 2016-03-21 10:06:11 UTC
OK, so if you think that your solution is easier and cheaper, could you upload a patch with your proposal please?
So, we will be able to discuss over the code ;).
Comment 22 Sebastian Dröge (slomo) 2016-03-21 10:29:13 UTC
The main problem is btw not that it's expensive, but that every unneeded mutex is a liability that might cause your future self a deadlock later.
Comment 23 Miguel París Díaz 2016-03-28 06:31:56 UTC
For now, we are going to use this patch in our own fork.
If we detect some deadlock or another problem, we will report it here.
Comment 24 Sebastian Dröge (slomo) 2016-03-29 08:03:55 UTC
Created attachment 324923 [details] [review]
tee: Set GST_PAD_FLAG_PROXY_CAPS before forwarding sticky events
Comment 25 Sebastian Dröge (slomo) 2016-03-29 08:04:00 UTC
Created attachment 324924 [details] [review]
tee: Prevent race condition between adding pads and forwarding sticky events

Between copying all sticky events to the new pad and adding it to the element,
a new sticky event might arrive that would then never be forwarded to the new
pad. To be forwarded automatically it would have to be added to the element
already.

Instead just remember when we have to recheck sending sticky events and do so
in the chain function if needed.
Comment 26 Sebastian Dröge (slomo) 2016-03-29 08:04:43 UTC
Miguel, can you test this one with your testcase? Thanks
Comment 27 Miguel París Díaz 2016-03-29 14:05:00 UTC
Created attachment 324939 [details] [review]
tee: Prevent race condition between adding pads and forwarding sticky events

Patch based on the Sebastian's idea but simpler taking into account that the race condition can only take place for new src pads.
Comment 28 Miguel París Díaz 2016-03-29 14:09:17 UTC
Created attachment 324940 [details] [review]
tee: Prevent race condition between adding pads and forwarding sticky events

I overwritten the author, in this patch I put Sebastian as author because he had the cookie idea.
Comment 29 Miguel París Díaz 2016-03-29 14:17:28 UTC
(In reply to Sebastian Dröge (slomo) from comment #26)
> Miguel, can you test this one with your testcase? Thanks

It works fine, or at least I haven't found a way of breaking it ;).
See the new patch following the idea of cookie please.
Comment 30 Miguel París Díaz 2016-03-29 15:04:49 UTC
Created attachment 324944 [details] [review]
tee: force Sticky event misordering

(In reply to Miguel París Díaz from comment #29)
> (In reply to Sebastian Dröge (slomo) from comment #26)
> > Miguel, can you test this one with your testcase? Thanks
> 
> It works fine, or at least I haven't found a way of breaking it ;).
> See the new patch following the idea of cookie please.

I think that with the solution based on cookies we can have "Sticky event misordering" problems.
The race condition to see this is:
 1 - New src pad when the sink pad does not have any sticky event
 2 - Forward sticky events from sink pad to src pad (but there is not any event)
 3 - The 'stream-start' event arrives to the sink pad (but it is not forwarded to the src pad because it has not been added yet).
 4 - The src pad is added to the tee element.
 5 - The 'caps' event arrives to the sink pad and it is forwarded to the src pad, so we have the 'Sticky event misordering' error (because the 'stream-start' haven't been forwarded from the chain function yet.

With the related patch I have forced the problem:
gstpad.c:5031:store_sticky_event:<tee3:src_1> Sticky event misordering, got 'caps' before 'stream-start'

Does this way of forcing this situation make sense?
Comment 31 Sebastian Dröge (slomo) 2016-03-30 12:43:12 UTC
Review of attachment 324940 [details] [review]:

::: plugins/elements/gsttee.c
@@ +567,3 @@
   gboolean res;
 
+  res = gst_pad_event_default (pad, parent, event);

You can't use gst_pad_event_default() here, that's racy and why I didn't do that :)

But in the forward function you will *also* need to check the current pad cookie and if it's wrong you additionally need to forward all events there. That should fix the sticky event misordering problem
Comment 32 Miguel París Díaz 2016-03-31 09:12:00 UTC
OK, I am doing a patch to solve the misordering problem.

Moreover, why using gst_pad_event_default is racy?
Comment 33 Sebastian Dröge (slomo) 2016-03-31 09:28:03 UTC
Actually that shouldn't matter here. But you still can't use it because you need to do checks before forwarding the event ;)
Comment 34 Miguel París Díaz 2016-03-31 11:51:28 UTC
Created attachment 325078 [details] [review]
tee: Prevent race condition between adding pads and forwarding sticky events
Comment 35 Miguel París Díaz 2016-03-31 11:58:09 UTC
Created attachment 325079 [details] [review]
tee: Prevent race condition between adding pads and forwarding sticky events
Comment 36 Miguel París Díaz 2016-03-31 12:01:56 UTC
Created attachment 325080 [details] [review]
tee: do not override query function for sink pad
Comment 37 Sebastian Dröge (slomo) 2016-04-02 08:06:27 UTC
Review of attachment 325080 [details] [review]:

We don't do this Change-Id stuff in the commit messages :) Please also explain why you remove it in the commit message while you're editing it anyway and add a link to the bug report there (e.g. use git-bz)
Comment 38 Sebastian Dröge (slomo) 2016-04-02 08:10:23 UTC
Review of attachment 325079 [details] [review]:

::: plugins/elements/gsttee.c
@@ +141,3 @@
   gboolean removed;
+
+  gboolean do_fw_sticky_events;

Maybe "initial_sticky_events_pending" is more descriptive. This is not about general forwarding of them, but of the very initial ones.

@@ -416,3 @@
   GST_OBJECT_FLAG_SET (srcpad, GST_PAD_FLAG_PROXY_CAPS);
-  /* Forward sticky events to the new srcpad */
-  gst_pad_sticky_events_foreach (tee->sinkpad, forward_sticky_events, srcpad);

By *not* copying the sticky events here, the tee srcpad is not going to have any caps/segment/etc anymore. This might break some application assumptions.
Comment 39 Miguel París Díaz 2016-04-12 09:42:28 UTC
(In reply to Sebastian Dröge (slomo) from comment #38)
> Review of attachment 325079 [details] [review] [review]:

> @@ -416,3 @@
>    GST_OBJECT_FLAG_SET (srcpad, GST_PAD_FLAG_PROXY_CAPS);
> -  /* Forward sticky events to the new srcpad */
> -  gst_pad_sticky_events_foreach (tee->sinkpad, forward_sticky_events,
> srcpad);
> 
> By *not* copying the sticky events here, the tee srcpad is not going to have
> any caps/segment/etc anymore. This might break some application assumptions.

It is done when the sink pad receives an event (event_forward_func) or a buffer (gst_tee_handle_data).
Could be any problem?
Comment 40 Sebastian Dröge (slomo) 2016-04-12 10:47:59 UTC
Yes, because the pad won't have caps immediately when requesting a pad. But only after some random time.
Comment 41 Miguel París Díaz 2016-04-12 11:10:40 UTC
Yes, but this is the point and the race condition that we want to solve.
Forwarding sticky events before adding the src pad does not ensure that this pad has the caps, because the caps can arrive between forwarding sticky events (probably any event) and the pad is added, so we are in the same situation.
The only way of ensuring this is that forwarding sticky events and adding the pad was done in a critical section.

Anyway, what could happen if the src pad does not have the caps until *just before* a new event or a buffer arrives to the sink pad?
Comment 42 Sebastian Dröge (slomo) 2016-04-12 12:06:35 UTC
For demuxers, applications generally assume that they can get caps immediately when the pad is available. There might be applications assuming the same for the case when requesting a new pad from an already running pipeline. Those applications would fail now.
Comment 43 Miguel París Díaz 2016-04-12 13:13:24 UTC
OK, I see the point and you are right, in these cases with the last patch it would *always* fail.
But with the current implementation and even with the other patches proposed in  in this thread (based on flags and cookies), it would *sometimes* fail if the non desirable race condition happens.
This race condition is:
 1 - New src pad when the sink pad does not have any sticky event
 2 - Forward sticky events from sink pad to src pad (but there is not any event)
 3 - A new sticky event (eg: caps) arrives to the sink pad (but it is not forwarded to the src pad because it has not been added yet).
 4 - The src pad is added to the tee element.
 5 - Demuxers or apps fail because there is not any caps in the src pad.

To solve this problem the forwarding sticky events and adding the pad must be done in a critical section, the same that the event function of the sink pad.
This critical section could be performed with the patch https://bug752213.bugzilla-attachments.gnome.org/attachment.cgi?id=324416, but:
> The main problem is btw not that it's expensive, but that every unneeded
> mutex is a liability that might cause your future self a deadlock later

If I am not wrong, this only can happen if this mutex (mutex_events) is locked into another section locked by a mutex-X && mutex-X is locked into another section locked by this mutex (mutex_events).
Where mutex-X only can be:
 - GST_OBJECT_LOCK (pad)
 - GST_OBJECT_LOCK (tee)

So, can happen that
 - a new src pad of the tee is requeste
 - or gst_tee_sink_event is called
with mutex-X locked?

Other situations that can cause a deadlock?

Could we do a better solution taking into account that the cookies and flags ones did not fix the problem?
Comment 44 Sebastian Dröge (slomo) 2016-04-13 06:40:33 UTC
Not sure why you need a recursive mutex here, or any mutex at all. Why doesn't any of the existing solutions work if you just forward the sticky events that are currently there in request_new_pad() before the pad is added... and forward all pending sticky events on the next serialized event/buffer?
Comment 45 Miguel París Díaz 2016-04-13 12:19:09 UTC
(In reply to Sebastian Dröge (slomo) from comment #44)
> Not sure why you need a recursive mutex here, or any mutex at all. Why
> doesn't any of the existing solutions work if you just forward the sticky
> events that are currently there in request_new_pad() before the pad is
> added... and forward all pending sticky events on the next serialized
> event/buffer?

Because the race condition that I told you in my previous comment #43 causes the situation you told me in the comment #40

> > By *not* copying the sticky events here, the tee srcpad is not going to have
> > any caps/segment/etc anymore. This might break some application assumptions.

> > Yes, because the pad won't have caps immediately when requesting a pad. But
> > only after some random time.

These both things can happen if the race condition takes place.

Refs
[1] https://bugzilla.gnome.org/show_bug.cgi?id=752213#c43
[2] https://bugzilla.gnome.org/show_bug.cgi?id=752213#c40
Comment 46 Miguel París Díaz 2016-06-23 14:23:25 UTC
Created attachment 330260 [details] [review]
tee: do not override query function for sink pad
Comment 47 Miguel París Díaz 2016-06-23 14:58:13 UTC
Created attachment 330267 [details]
Simple program to force sticky event misordeing in tee src's peer pad
Comment 48 Miguel París Díaz 2016-06-23 14:58:48 UTC
Created attachment 330268 [details] [review]
pad: simple patch to force race conditions
Comment 49 Miguel París Díaz 2016-06-23 14:59:35 UTC
Hello,
I have found another race condition due not only the tee but the pad too.
I have done a simple app (attachment 330267 [details]) and a simple commit in gstpad.c (attachment 330268 [details] [review]) to force this race condition, which provokes a sticky event misordering in the peer pad of the tee src pads
(in our case in the sink pad of the second fakesink)

There are 2 threads:
 - streaming_thread: managed by the task of the SrcPad of a FakeSrc
 - app_thread: performs the creation and linkage of the tee src pads

The race condition is:
  1 - [app_thread] tee0 and fakesink0 are linked
  2 - [streaming_thread] stream-start event arrives to the tee0:sink pad
    2.1 - it is forwarded to tee0:src_0 and fakesink0:sink
  3 - [app_thread] Just:
      - after forwarding the event to all tee src pads
      - and bedore storing the sticky event in tee0:sink pad
    a new tee src pad is added (tee:src_1)
      - The stream-start is NOT forwarded because the forwarding iteration has already finished
      - the stream start is NOT stored because tee0:sink has stored the event yer
  4 - [streaming_thread] caps event arrives to the tee0:sink pad
    4.1 - it is forwarded to all tee src pads and to fakesink0:sink and fakesink:1:sink pads
        So, fakesink1:sink receives the caps event without having the stream-start event
  5 - [app_thread]
    5.1 - fakesink1:sink is unlinked from tee:src_1
    5.2 - tee:src_1 is released
    5.3 - fakesink1:sink is linked to a new tee src pad (tee:src_2)
      5.3.1 - stream-start event is stored in tee:src_2
      5.3.2 - stream-start event is tried to store into fakesik1:sink
            Here we have the misordering error
Comment 50 Miguel París Díaz 2016-06-29 08:43:20 UTC
Created attachment 330537 [details] [review]
tee: fix race conditions while forwarding sticky events

Fix the problems caused by the race conditions in tee and in pad.
Comment 51 Miguel París Díaz 2016-06-29 12:40:13 UTC
Created attachment 330571 [details] [review]
tee: fix race conditions while forwarding sticky events
Comment 52 Sebastian Dröge (slomo) 2016-11-01 18:54:00 UTC
Comment on attachment 324923 [details] [review]
tee: Set GST_PAD_FLAG_PROXY_CAPS before forwarding sticky events

Attachment 324923 [details] pushed as 25bf63d - tee: Set GST_PAD_FLAG_PROXY_CAPS before forwarding sticky events
Comment 53 Miguel París Díaz 2018-03-14 11:41:51 UTC
Created attachment 369659 [details] [review]
tee: fix race conditions while forwarding sticky events

Replace to fix event refcount issue.
Comment 54 Nicolas Dufresne (ndufresne) 2018-03-14 13:00:43 UTC
Review of attachment 369659 [details] [review]:

Does it passes all tests without deadlock ? Did you run gst-validate ? Can you provide a unit test that triggers the race ? This patch is very scary, specially that the is no doc in the commit to explain what it is fixing and why it won't cause issues.
Comment 55 Olivier Crête 2018-03-14 14:50:21 UTC
Review of attachment 369659 [details] [review]:

This has the same problem as slomo's patches adding the stream lock.

::: plugins/elements/gsttee.c
@@ +561,3 @@
+
+  GST_TEE_EVENTS_LOCK (GST_PAD_PARENT (pad));
+  res = gst_pad_event_default (pad, parent, event);

I think this could block at the sink on a preroll if it's a GAP event
Comment 56 Miguel París Díaz 2018-03-16 16:04:23 UTC
Hello @stormer and @ocrete,
first of all thanks for you feedback.

Answering @stormer questions:
> Does it passes all tests without deadlock?
  Yes, unit tests pass in all projects (gstreamer, gst-plugins-XYZ, etc.)

> Did you run gst-validate?
  How I can run it?

> Can you provide a unit test that triggers the race?
  Make a deterministic unit test is quite complex, but if you take a look at my explanation about the race condition, it is quite clear that it exist.

> This patch is very scary, specially that the is no doc in the commit to explain what it is fixing and why it won't cause issues.
First of all we all need to agree about the solution. For now you can take this issue thread as doc, where all details are discussed.

Answering @ocrete:
> I think this could block at the sink on a preroll if it's a GAP event
Could you please put an example?

In addition, I would like to mention that this patch is running in our production environment (since more than one year ago) fixing the race condition and we didn't notice any deadlock.
Anyway, it might have a deadlock that is not reproducible in our use case.
Comment 57 Miguel París Díaz 2018-08-23 09:31:07 UTC
Created attachment 373443 [details]
Simple program to force deadlock

Hi folks!!

A rare but real use case has been found that shows that attachment 369659 [details] [review] provokes a deadlock.
I have invested some efforts to understand it and I achieved the simple program attached that reproduces the deadlock:

There are 3 threads:
 - app_thread: performs the initial pipeline creation and waits until the expected actions are done
 - src_streaming_thread: managed by the task of the SrcPad of a FakeSrc
 - queue_streaming_thread: managed by the task of the SrcPad of a Queue

Requirements to force the deadlock:
-  Make the fakesrc to push 2 buffers and the EOS
 - Make the queue to be filled with 1 buffer

The race condition that provokes the deadlock is:
  1 - [app_thread] tee0, queue0 and fakesink0 are linked
    1.1 - waits until the buffer from (5.3) is pushed.
  2 - [src_streaming_thread] push the first buffer
    2.1 - it is forwarded to tee0:src_0 and queue0:sink
  3 - [queue_streaming_thread] push the first buffer
    3.1 - it is blocked at buffers_probe WAITING to link tee and fakesink1
  4 - [src_streaming_thread] push the second buffer
    4.1 - it is forwarded to tee0:src_0 and queue0:sink
    4.2 - as queue_streaming_thread is blocked, the queue is filled
  5 - [src_streaming_thread] push the EOS
    5.1 - tee: hold GST_TEE_EVENTS_LOCK and push EOS to tee:src_0
    5.2 - SIGNAL queue_streaming_thread to link tee and fakesink1
    5.3 - push a new buffer downstream
    5.4 - LOCKED because the queue is OVERRUN
  6 - [queue_streaming_thread] perform tee and fakesink1 linking
    6.1 - LOCKED at gst_tee_request_new_pad because GST_TEE_EVENTS_LOCK is hold by the src_streaming_thread

The deadlock happens due to (5.4) and (6.1)
Comment 58 Miguel París Díaz 2018-08-23 09:35:32 UTC
Created attachment 373444 [details] [review]
tee: fix race conditions while forwarding sticky events

Fix sticky events misordering avoiding deadlock.
In other words this new patch satisfies the test apps attachment 330267 [details] and attachment 373443 [details].
Comment 59 GStreamer system administrator 2018-11-03 12:29:01 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/gstreamer/issues/122.