Bug 505770 – gst_element_get_state() should unblock if element posts an error

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 505770 - gst_element_get_state() should unblock if element posts an error


Summary:	gst_element_get_state() should unblock if element posts an error


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	gstreamer (core)
Version:	0.10.14
Hardware:	Other Linux

Importance:	Normal blocker
Target Milestone:	0.10.30
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

URL:
Whiteboard:

Duplicates:	538176 (view as bug list)
Depends on:
Blocks:	623318

Reported:	2007-12-26 18:13 UTC by Andrzej Szombierski
Modified:	2010-07-01 19:30 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
possible patch (3.25 KB, patch) 2010-03-04 13:07 UTC, Wim Taymans	none	Details \| Review

Description Andrzej Szombierski 2007-12-26 18:13:38 UTC

If a pipeline fails to preroll, calls to gst_element_get_state result in a deadlock.

I've encountered this bug in a specific scenario, although I think it would happen in other cases as well. In my case there is a simple pipeline:
filesrc location=some-file  '!' decodebin '!' fakesink

Now, if the mime-type of some-file is recognised, yet no codec for the format is found, the pipeline (correctly) fails to enter PAUSED state. The problem is that any attempt to check for this condition using gst_element_get_state result in a deadlock.

Steps to reproduce:
1. Open gst-launch sources. Append the  line
       gst_element_get_state (pipeline, NULL, NULL, GST_CLOCK_TIME_NONE);
   after (around line 700)
       fprintf (stderr, _("ERROR: pipeline doesn't want to preroll.\n"));

2. Build.
3. Launch the above pipeline with a file for which a codec is not available.
4. gst-launch prints out the following text, then deadlocks.
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
ERROR: from element /pipeline0/filesrc0: Internal data flow error.
Additional debug info:
gstbasesrc.c(1816): gst_base_src_loop (): /pipeline0/filesrc0:
streaming task paused, reason not-linked (-1)
ERROR: pipeline doesn't want to preroll.

Comment 1 Wim Taymans 2007-12-26 18:30:25 UTC

This is as designed. You can check for this condition by:

- regularly poll the state (with a timeout, so that you don't block forever).
- act when you receive the ASYNC_DONE message on the bus.

Another option would be to make the get_state function return a FAILURE when an error is posted from the pipeline. This is not done yet, do you think it's needed?

Comment 2 Andrzej Szombierski 2007-12-26 20:05:56 UTC

The documentation states clearly:
"If the element completes the state change or goes into an error, this function returns immediately with a return value of GST_STATE_CHANGE_SUCCESS or GST_STATE_CHANGE_FAILURE respectively."

I wish to synchronously change the state and leave the glib main loop alone, so waiting for error events is not really an option. Polling with a nonzero timeout - it will possibly work, but a timeout does not tell me if the state-change has failed, or it is just taking longer than usual.

Comment 3 Andrzej Szombierski 2008-02-06 22:41:44 UTC

I don't mean to be rude or annoy you in any way, but the behavior I've described is:
- inconsistent with the documentation
- difficult to avoid without really dirty hacks
- contrary to the "principle of least astonishment" :)

This said, I'm quite confident to call it a bug.

<rant mode on>
I was going to write how much this blocks my project, but who cares. Please fix this bug, or at least point me in the right direction so I can understand how this voodoo code works.
<rant mode off>

Comment 4 Tim-Philipp Müller 2008-02-06 23:48:09 UTC

The documentation needs to be made clearer then. Currently, you're supposed to do things like this:

 if (gst_element_set_state() == GST_STATE_CHANGE_FAILURE)
   goto handle_error;

 ... either handle state changes or errors asynchroneously, e.g. by
     running a main loop and setting up a bus watch, or by doing
     something like:

 msg = gst_bus_timed_pop_filtered (GST_ELEMENT_BUS (pipeline), -1,
     GST_MESSAGE_ERROR | GST_MESSAGE_ASYNC_DONE);

 if (GST_MESSAGE_TYPE (msg) == GST_MESSAGE_ERROR)
    goto handle error_message;

 ...

Comment 5 Wim Taymans 2009-03-18 16:14:18 UTC

ping? do you think it's interesting to unblock a _get_state() with an error when an error message is posted on the element? I think it's possible to do this and I'm starting to think that it would be a nice thing. Opinions?

Comment 6 Tim-Philipp Müller 2009-03-18 16:35:29 UTC

> Do you think it's interesting to unblock a _get_state() with an error
> when an error message is posted on the element? I think it's possible to do
> this and I'm starting to think that it would be a nice thing. Opinions?

I think that would probably be a good idea. It's just what people expect, and there's not a lot of code out there that really gets this stuff right (because it is very tedious if you don't have an entirely async state machine). And I don't expect anything to break if we change this.

Comment 7 Tim-Philipp Müller 2009-04-16 22:39:28 UTC

*** Bug 538176 has been marked as a duplicate of this bug. ***

Comment 8 Sebastian Dröge (slomo) 2009-09-10 08:27:49 UTC

So let's get this done :)

Comment 9 Sebastian Dröge (slomo) 2009-09-10 09:43:51 UTC

(In reply to comment #5)
> ping? do you think it's interesting to unblock a _get_state() with an error
> when an error message is posted on the element? I think it's possible to do
> this and I'm starting to think that it would be a nice thing. Opinions?

How would this be implemented? The state cond needs to be signalled when an error is posted but for this we would need to look at the bus. How do you want to look at all bus messages?

Comment 10 Wim Taymans 2009-09-10 09:51:14 UTC

gstbin has a sync bus handler where it receives all bus messages from all children. When you see and error, see if there is a state change going on, mark the state change as error and signal the element state gcond to unblock the pending wait.

The reason why it's not currently done is because it is not deterministic. Sometimes a state change would work without error (and you get the error later) or sometimes it would fail because the error message was posted faster.

Maybe it's better to unblock the gcond without changing the state change return value. It needs some thinking.

Comment 11 Wim Taymans 2010-03-04 13:07:09 UTC

Created attachment 155225 [details] [review]
possible patch

This is a possible patch. It marks the bin state change return value as FAILURE and wakes up any pending get_state() functions.

Comment 12 Wim Taymans 2010-03-04 14:19:04 UTC

It probably does not cause problems in any apps I know but it still sits weird.. Now we would return a FAILURE state change return even if there was no state change return failure...

Comment 13 Sebastian Dröge (slomo) 2010-03-13 05:09:26 UTC

Maybe we should add a new return value for this case?

Comment 14 Tim-Philipp Müller 2010-04-04 13:02:59 UTC

> It probably does not cause problems in any apps I know but it still sits
> weird.. Now we would return a FAILURE state change return even if there was no
> state change return failure...

Not sure I understand the problem. There was a failure and we return FAILURE, that seems fine to me. Preroll/dataflow is part of the state change mechanism, so it seems ok to classify this as a 'state change failure' IMHO.

Adding a new return value would cause more problems than solve things. I'd rather keep current behaviour than add yet another return value.

Comment 15 Thomas Vander Stichele 2010-04-05 23:22:23 UTC

I ran into the exact same problem today while coding an application, and my unit tests where failing on a build slave that didn't have flac installed and thus my pipeline errored out and never left get_state() as I expected.

I'm not sure why adding a return value would complicate things; as a bonus it also allows code to safely test for whether the behaviour has changed and it can safely use get_state()

Comment 16 Tim-Philipp Müller 2010-04-05 23:59:30 UTC

> I'm not sure why adding a return value would complicate things; as a bonus it
> also allows code to safely test for whether the behaviour has changed and it
> can safely use get_state()

It can safely use get_state() in any case, unless you can think of a case where a program might actually want to block forever waiting for something that won't be happening.

Adding a new return value would complicate things because:

 - 99% of the code out there doesn't even handle the
   existing values correctly

 - the actual problem is that most people who use this
   function don't understand the details of the state
   change mechanism, the various failure scenarios, how
   things work and/or are supposed to work, and whether
   or not they should be using get_state() here or if it's safe.
   Adding yet another return value won't help

 - adding another return value risks breaking existing
   code that actually tries to handle the return values;
   of course you could argue that that code isn't correct
   from the start, but that wouldn't not particularly helpful
   IMHO (and then we could just argue that code using
   get_state() incorrectly is broken and we don't need this
   fix at all).


Maybe we should just deprecate _get_state() and replace it with a bunch of new, better functions, such as gst_pipeline_wait_for_preroll (pipeline, &error), gst_pipeline_wait_for_preroll_with_timeout (pipeline, &error, -1); gst_element_get_states (&cur_state, &pending_state); or something like that.

Comment 17 Wim Taymans 2010-04-09 10:01:48 UTC

I think we should then push the Patch in Comment 11 after the next freeze and see what happens.

Comment 18 Tim-Philipp Müller 2010-05-10 10:41:43 UTC

Now might be a good time?

Comment 19 Wim Taymans 2010-05-10 11:05:50 UTC

yes, let's try it

Comment 20 Wim Taymans 2010-05-25 17:20:29 UTC

commit 88c6896fb9db7d61429d6305314d549c818545da
Author: Wim Taymans <wim.taymans@collabora.co.uk>
Date:   Tue May 25 19:17:44 2010 +0200

    gstbin: unlock _get_state() on error
    
    When an error message is received on the bus, mark the bin as being in the error
    state and unlock all current _get_state() calls with an error.
    
    Fixes #505770

Comment 21 Tim-Philipp Müller 2010-07-01 00:24:34 UTC

Re-opening this, because according to git-bisect it's responsible for causing crashes in totem, when trying to play the next file after totem has gotten an error (e.g. file is encrypted or is a random text file), leading to this:



** Message: Error: This appears to be a text file
gstdecodebin2.c(1795): type_found (): /GstPlayBin2:play/GstURIDecodeBin:uridecodebin0/GstDecodeBin2:decodebin20:
decodebin2 cannot decode plain text files
...
0:00:03.312382906  7912      0x11ec100 DEBUG               playbin2 gstplaybin2.c:2993:autoplug_select_cb:<play> checking factory qtdemux

(lt-totem:7912): GLib-GObject-CRITICAL **: g_object_get: assertion `G_IS_OBJECT (object)' failed
Segmentation fault




Program received signal SIGSEGV, Segmentation fault.
0x00007ffff328b6c2 in IA__g_value_set_object (value=0x7fffffffa780, v_object=0x12d1de0) at /tmp/buildd/glib2.0-2.24.1/gobject/gobject.c:2840
2840	/tmp/buildd/glib2.0-2.24.1/gobject/gobject.c: No such file or directory.
	in /tmp/buildd/glib2.0-2.24.1/gobject/gobject.c
(gdb) bt

+ Trace 222656

#0 IA__g_value_set_object
at /tmp/buildd/glib2.0-2.24.1/gobject/gobject.c line 2840
#1 gst_play_bin_get_property
at gstplaybin2.c line 1860
#2 object_get_property
at /tmp/buildd/glib2.0-2.24.1/gobject/gobject.c line 935
#3 IA__g_object_get_valist
at /tmp/buildd/glib2.0-2.24.1/gobject/gobject.c line 1555
#4 IA__g_object_get
at /tmp/buildd/glib2.0-2.24.1/gobject/gobject.c line 1645
#5 playbin_source_notify_cb
at bacon-video-widget-gst-0.10.c line 2395
#6 IA__g_closure_invoke
at /tmp/buildd/glib2.0-2.24.1/gobject/gclosure.c line 767
#7 signal_emit_unlocked_R
at /tmp/buildd/glib2.0-2.24.1/gobject/gsignal.c line 3248
#8 IA__g_signal_emit_valist
at /tmp/buildd/glib2.0-2.24.1/gobject/gsignal.c line 2981
#9 IA__g_signal_emit
at /tmp/buildd/glib2.0-2.24.1/gobject/gsignal.c line 3038
#10 g_object_dispatch_properties_changed
at /tmp/buildd/glib2.0-2.24.1/gobject/gobject.c line 801
#11 gst_object_dispatch_properties_changed
at gstobject.c line 509
#12 g_object_notify_queue_thaw
at /tmp/buildd/glib2.0-2.24.1/gobject/gobjectnotifyqueue.c line 120
#13 IA__g_object_notify
at /tmp/buildd/glib2.0-2.24.1/gobject/gobject.c line 888
#14 notify_source_cb
at gstplaybin2.c line 3099
#15 IA__g_closure_invoke
at /tmp/buildd/glib2.0-2.24.1/gobject/gclosure.c line 767
#16 signal_emit_unlocked_R
at /tmp/buildd/glib2.0-2.24.1/gobject/gsignal.c line 3248
#17 IA__g_signal_emit_valist
at /tmp/buildd/glib2.0-2.24.1/gobject/gsignal.c line 2981
#18 IA__g_signal_emit
at /tmp/buildd/glib2.0-2.24.1/gobject/gsignal.c line 3038
#19 g_object_dispatch_properties_changed
at /tmp/buildd/glib2.0-2.24.1/gobject/gobject.c line 801
#20 gst_object_dispatch_properties_changed
at gstobject.c line 509
#21 g_object_notify_queue_thaw
at /tmp/buildd/glib2.0-2.24.1/gobject/gobjectnotifyqueue.c line 120
#22 IA__g_object_notify
at /tmp/buildd/glib2.0-2.24.1/gobject/gobject.c line 888
#23 setup_source
at gsturidecodebin.c line 1669
#24 gst_uri_decode_bin_change_state
at gsturidecodebin.c line 2146
#25 gst_element_change_state
at gstelement.c line 2532
#26 gst_element_set_state_func
at gstelement.c line 2488
#27 gst_bin_element_set_state
at gstbin.c line 2121
#28 gst_bin_change_state_func
at gstbin.c line 2420
#29 gst_pipeline_change_state
at gstpipeline.c line 475
#30 gst_play_bin_change_state
at gstplaybin2.c line 3573
#31 gst_element_change_state
at gstelement.c line 2532
#32 gst_element_set_state_func
at gstelement.c line 2488
#33 bacon_video_widget_open
at bacon-video-widget-gst-0.10.c line 3745
#34 totem_action_set_mrl_with_warning
at totem-object.c line 1721



Haven't investigated exact cause yet though.

Comment 22 Tim-Philipp Müller 2010-07-01 19:30:44 UTC

Closing again. See bug #623318 for the playbin2 issue exposed by this.