After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 747610 - Inconsistent bin children state when a child fails to switch from NULL to READY
Inconsistent bin children state when a child fails to switch from NULL to READY
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gstreamer (core)
git master
Other Linux
: Normal normal
: 1.5.1
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2015-04-10 10:29 UTC by Vincent Penquerc'h
Modified: 2015-04-15 15:03 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
undo upwards state change on children of a bin where a subsequent child fails (4.37 KB, patch)
2015-04-15 10:10 UTC, Vincent Penquerc'h
reviewed Details | Review
undo upwards state change on children of a bin where a subsequent child fails (2.52 KB, patch)
2015-04-15 11:33 UTC, Vincent Penquerc'h
none Details | Review
undo upwards state change on children of a bin where a subsequent child fails (5.45 KB, patch)
2015-04-15 12:32 UTC, Vincent Penquerc'h
none Details | Review
undo upwards state change on children of a bin where a subsequent child fails (5.35 KB, patch)
2015-04-15 12:52 UTC, Vincent Penquerc'h
none Details | Review
undo upwards state change on children of a bin where a subsequent child fails (5.70 KB, patch)
2015-04-15 14:24 UTC, Vincent Penquerc'h
none Details | Review
undo upwards state change on children of a bin where a subsequent child fails (5.24 KB, patch)
2015-04-15 14:47 UTC, Vincent Penquerc'h
committed Details | Review

Description Vincent Penquerc'h 2015-04-10 10:29:43 UTC
I'm putting this in -bad for now as it's triggered by a -bad test case, but it may be an issue in core if it turns out it's a bug in GstBin.

In -bad, the dtlssrtpdec element is a bin, which has, among others, a funnel and a dtlsdec.

In the states unit test, the dtlsdec will fail to get from NULL to READY due to not being set up. The funnel has previously switched from NULL to READY. Due to the dtlsdec failure, the bin itself will fail its state switch, and keep in NULL. However, the funnel is in READY, and stays there.

This will cause the funnel to complain it's being deleted while not in NULL. An attempt to switch the bin to NULL will not switch the funnel to NULL, as the bin is already in NULL.

The bin code doesn't seem to try to switch back child elements that sucesfully switched when a subsequent child fails in the same batch. This seems to be the cause of the problem, but doing this might also bring issues (like what happens if a failure happens in such a switch back).
Comment 1 Olivier Crête 2015-04-14 02:23:56 UTC
A downwards state change to NULL should never block or fail. I think the same applies if the target is READY. So I think it makes sense to undo state changes from NULL upwards.
Comment 2 Sebastian Dröge (slomo) 2015-04-14 07:52:02 UTC
Yes, that seems like something that has to be fixed in GstBin.

(In reply to Olivier Crête from comment #1)
> A downwards state change to NULL should never block or fail. I think the
> same applies if the target is READY.

"should", but it could. And things like closing a device can actually block for a while. But that seems unrelated to this issue, we should at least try to change to the previous state again
Comment 3 Vincent Penquerc'h 2015-04-15 10:10:53 UTC
Created attachment 301611 [details] [review]
undo upwards state change on children of a bin where a subsequent child fails
Comment 4 Sebastian Dröge (slomo) 2015-04-15 10:58:20 UTC
Review of attachment 301611 [details] [review]:

::: gst/gstbin.c
@@ +2651,3 @@
+   * up, we need to switch those elements back down to the previous state to be
+   * in sync with the bin */
+  changed_elements = NULL;

Why not just set the state to the previous one for *all* elements. No need to remember the ones that were successful IMHO, that's just making things more complicated everywhere for an error case.
Comment 5 Vincent Penquerc'h 2015-04-15 11:03:52 UTC
Fair enough, though I think doing it for all elements would actually be complex since one would have to add an extra iterator loop/switch construct.
Comment 6 Sebastian Dröge (slomo) 2015-04-15 11:05:27 UTC
gst_iterator_foreach() :)
Comment 7 Vincent Penquerc'h 2015-04-15 11:33:24 UTC
Created attachment 301618 [details] [review]
undo upwards state change on children of a bin where a subsequent child fails

Nice, done :)
Comment 8 Sebastian Dröge (slomo) 2015-04-15 11:43:38 UTC
Review of attachment 301618 [details] [review]:

Looks generally good, but please also a unit test :)

::: gst/gstbin.c
@@ +2545,3 @@
+{
+  GstElement *e = g_value_get_object (data);
+  GstState state = (GstState) user_data;

GPOINTER_TO_INT()

Otherwise this will give compiler warnings on weird compilers/platforms :)

@@ +2547,3 @@
+  GstState state = (GstState) user_data;
+
+  gst_element_set_state (e, state);

Maybe check return value here and print something if things go wrong

@@ +2830,3 @@
+          gst_element_state_get_name (current));
+      do {
+        ret = gst_iterator_foreach (it, &reset_state, (gpointer) current);

GINT_TO_POINTER()
Comment 9 Vincent Penquerc'h 2015-04-15 12:32:53 UTC
Created attachment 301622 [details] [review]
undo upwards state change on children of a bin where a subsequent child fails
Comment 10 Sebastian Dröge (slomo) 2015-04-15 12:47:58 UTC
Review of attachment 301622 [details] [review]:

Looks good, just a bit weird in the test :)

::: tests/check/generic/states.c
@@ +232,3 @@
+
+  /* we want at least one before and one after */
+  g_assert (sizeof (mid) / sizeof (mid[0]) >= 3);

??? (also G_N_ELEMENTS())
Comment 11 Vincent Penquerc'h 2015-04-15 12:52:39 UTC
Created attachment 301624 [details] [review]
undo upwards state change on children of a bin where a subsequent child fails

If you meant "what is that statement for ?", then it is because it needs at least three children to test states for a child already changed, the one that fails, and one that was not already changed. I'm kinda assuming the bin doesn't mess up the ordering though, which I'm not sure about.
Comment 12 Sebastian Dröge (slomo) 2015-04-15 13:03:31 UTC
Review of attachment 301624 [details] [review]:

::: tests/check/generic/states.c
@@ +232,3 @@
+
+  /* we want at least one before and one after */
+  g_assert (G_N_ELEMENTS (mid) >= 3);

But you just created it with 3 elements 5 lines above :)

@@ +254,3 @@
+
+  /* make the middle element fail to switch up */
+  GST_ELEMENT_GET_CLASS (mid[1])->change_state = fail_to_go_to_ready;

Here you're changing change_state for all identity elements... not just your 3, but all that are ever created in the same process from this point onwards.
Comment 13 Vincent Penquerc'h 2015-04-15 13:11:27 UTC
> But you just created it with 3 elements 5 lines above :)

Yes. So it passes. Which is good, no ? Am I missing something ?

> @@ +254,3 @@
> +
> +  /* make the middle element fail to switch up */
> +  GST_ELEMENT_GET_CLASS (mid[1])->change_state = fail_to_go_to_ready;
> 
> Here you're changing change_state for all identity elements... not just your
> 3, but all that are ever created in the same process from this point onwards.

Doh. I'd started with elements and did not catch when I remembered it needed to be on the class. I'll fix that after lunch.
Comment 14 Sebastian Dröge (slomo) 2015-04-15 13:18:35 UTC
(In reply to Vincent Penquerc'h from comment #13)
> > But you just created it with 3 elements 5 lines above :)
> 
> Yes. So it passes. Which is good, no ? Am I missing something ?

Well, keep it then :) It seems confusing to me, I first thought that something else was supposed to check here instead of the obviously correct thing.

> > @@ +254,3 @@
> > +
> > +  /* make the middle element fail to switch up */
> > +  GST_ELEMENT_GET_CLASS (mid[1])->change_state = fail_to_go_to_ready;
> > 
> > Here you're changing change_state for all identity elements... not just your
> > 3, but all that are ever created in the same process from this point onwards.
> 
> Doh. I'd started with elements and did not catch when I remembered it needed
> to be on the class. I'll fix that after lunch.

Enjoy lunch :) I think the best you can do here is to use a fakesink. With async=false it can do state changes directly, and it has a property (state-error) to have it fail specific state changes.
Comment 15 Vincent Penquerc'h 2015-04-15 14:24:00 UTC
Created attachment 301640 [details] [review]
undo upwards state change on children of a bin where a subsequent child fails
Comment 16 Sebastian Dröge (slomo) 2015-04-15 14:38:58 UTC
Review of attachment 301640 [details] [review]:

::: tests/check/generic/states.c
@@ +259,3 @@
+  /* make the middle element fail to switch up */
+  identity_state_change = GST_ELEMENT_GET_CLASS (mid[1])->change_state;
+  GST_ELEMENT_GET_CLASS (mid[1])->change_state = fail_to_go_to_ready;

No, use state-error=null-to-ready on the fakesink

Otherwise if your test fails, the vfunc will be wrong for all future fakesinks.
Comment 17 Vincent Penquerc'h 2015-04-15 14:47:47 UTC
Created attachment 301646 [details] [review]
undo upwards state change on children of a bin where a subsequent child fails

Not really since I was using only one and resetting its change state function afterwards, but here's a version with the fakesink property.
Comment 18 Vincent Penquerc'h 2015-04-15 14:49:42 UTC
Oh - do you mean that a test failing causes a hidden return from the function, bypassing the cleanup, and then it continues the next test fnuction in the same program ? If it does this, then I now understand why you said that :)
Comment 19 Sebastian Dröge (slomo) 2015-04-15 14:56:04 UTC
Review of attachment 301646 [details] [review]:

Yes, exactly because of that :)


Did you check if your test fails without your fix? If so, please push
Comment 20 Vincent Penquerc'h 2015-04-15 15:03:11 UTC
Yes, I had been testing both on each update :)

Thanks.

commit a3b42ec42a085f4767930f55d5bb0b4f4fc8321a
Author: Vincent Penquerc'h <vincent.penquerch@collabora.co.uk>
Date:   Wed Apr 15 11:02:54 2015 +0100

    bin: undo upward state changes on children when a child fails
    
    When a bin changes states upwards, and a child fails to change,
    any child that was already switched will not be reset to its
    original state, leaving its state inconsistent with the bin,
    which does not change state due to the failure.
    
    If the state change was from NULL to READY, it means that deleting
    this bin will cause those children to be deleted while not in
    NULL state, which is a Bad Thing. For other upward changes, it
    is less of a problem, as a subsequent switch back to NULL will
    cause an actual downwards change on those inconsistent elements,
    albeit from the "wrong" state.
    
    We now reset state to the original one when a child fails.
    
    Includes unit test.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=747610