After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 440463 - gstbin unit test sometimes hangs waiting for ASYNC_DONE message
gstbin unit test sometimes hangs waiting for ASYNC_DONE message
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gstreamer (core)
git master
Other Linux
: Normal blocker
: 0.10.13
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2007-05-22 14:42 UTC by Tim-Philipp Müller
Modified: 2007-05-24 14:45 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
bin:5,GST_STATES:5 debug log for one of the hangs (not exactly the same as the one above) (19.64 KB, text/plain)
2007-05-22 14:48 UTC, Tim-Philipp Müller
  Details
possible patch (2.34 KB, patch)
2007-05-22 16:31 UTC, Wim Taymans
none Details | Review
improved patch (3.47 KB, patch)
2007-05-23 15:49 UTC, Wim Taymans
accepted-commit_now Details | Review

Description Tim-Philipp Müller 2007-05-22 14:42:33 UTC
Running make gst/gstbin.forever I regularly get hangs in the gst/gstbin unit test. This is not always the same test it seems. Sometimes it spins without using cpu, sometimes it hangs using 100% cpu.

Here's a stack trace:

(gdb) thread apply all bt


Comment 1 Tim-Philipp Müller 2007-05-22 14:48:51 UTC
Created attachment 88614 [details]
bin:5,GST_STATES:5 debug log for one of the hangs (not exactly the same as the one above)
Comment 2 Wim Taymans 2007-05-22 16:31:00 UTC
Created attachment 88626 [details] [review]
possible patch

The problem was because the get_state function recalculated the state and discovered that it was no longer async. Then the state_continue function tries to see if it needs to perform additional actions after the state change completed. It saw nothing was pending and exited while it should have posted an ASYNC_DONE message. 

This patch always makes the state_continue function post ASYNC_DONE. We can do this unconditionally because we know that the ASYNC_DONE was not posted yet when we started the function.
Comment 3 Jan Schmidt 2007-05-22 17:22:56 UTC
The patch didn't apply completely cleanly for me because of bilboed's changes this morning, but it applied well enough.

It seems to fix the immediate issue of hanging. When running the gstbin check forever though, I get this after a few hundred iterations: 

gst/gstbin.c:699:F:bin tests:test_children_state_change_order_semi_sink:0: No state change message within 1 second (#209)

That could be a bug in the test though, I've no idea.

Wim, how confident are you that this doesn't introduce any other problems?
Comment 4 Tim-Philipp Müller 2007-05-22 17:28:10 UTC
> I get this after a few hundred iterations: 
> 
> gst/gstbin.c:699:F:bin tests:test_children_state_change_order_semi_sink:0: No
> state change message within 1 second (#209)
> 
> That could be a bug in the test though, I've no idea.
> 
> Wim, how confident are you that this doesn't introduce any other problems?

Same here, but I thought it was because I was compiling stuff in another terminal :)

I still got a hang in the unit test once with the patch applied, but haven't been able to reproduce it since.
Comment 5 Wim Taymans 2007-05-23 13:31:20 UTC
there is one problem left: in certain situations no state-change message is posted, for the same reasons the ASYNC_DONE was not posted before.
Comment 6 Wim Taymans 2007-05-23 15:49:52 UTC
Created attachment 88679 [details] [review]
improved patch

this is an improved patch and restores the state_lock around recalc_state that was wrongly removed during the rewrite. The problem with some messages not being posted is still here and I can't see how to make a quick fix.
Comment 7 Wim Taymans 2007-05-23 17:19:12 UTC
ok, last patch indeed posts all messages but the ASYNC one might be out-of-order, which should not be considered a regression since it's a new message. The test just deals badly with it and ideally the ASYNC_DONE should be posted in the right order but that will require a slight rework.
Comment 8 Tim-Philipp Müller 2007-05-23 18:09:31 UTC
Seems to work fine now, the only warning I now get occasionally is this:

gst/gstbin.c:303:F:bin tests:test_message_state_changed_children:0: sink (0x8082148) refcount is 4 instead of 3
Comment 9 Jan Schmidt 2007-05-24 11:17:47 UTC
That message is because the state change recalculation thread hasn't finished yet, I believe. I'd change the test to allow either 3 or 4 as a valid refcount.