After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 635784 - ringbuffer: make sure to not start if the may_start flag is FALSE
ringbuffer: make sure to not start if the may_start flag is FALSE
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-base
unspecified
Other All
: Normal blocker
: 0.10.33
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2010-11-25 16:14 UTC by Håvard Graff (hgr)
Modified: 2011-04-18 09:41 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
patch (1.17 KB, patch)
2010-11-25 16:14 UTC, Håvard Graff (hgr)
none Details | Review

Description Håvard Graff (hgr) 2010-11-25 16:14:32 UTC
Created attachment 175244 [details] [review]
patch

This would potentially deadlock when changing the element-state up and down.
Comment 1 Sebastian Dröge (slomo) 2010-12-12 16:02:16 UTC
Looks good to me... Wim?
Comment 2 Wim Taymans 2010-12-29 11:37:01 UTC
I think this papers over another problem and might not fix it fundamentally. Can you provide more info about what sequence of events lead to a deadlock or a test case?
Comment 3 Tim-Philipp Müller 2011-01-06 20:30:16 UTC
Håvard: ping?
Comment 4 Håvard Graff (hgr) 2011-01-07 00:43:15 UTC
Pong! I have been quite busy lately, but will try to find time to provide a better overview of the problem :)
Comment 5 Tim-Philipp Müller 2011-01-11 15:14:22 UTC
Removing blocker status.
Comment 6 Tobias Mueller 2011-04-15 17:52:20 UTC
Håvard: Ping
Comment 7 Håvard Graff (hgr) 2011-04-15 18:02:13 UTC
Ping noted. I am reverting the patch and reproducing locally as to get some callstacks :) BRB
Comment 8 Håvard Graff (hgr) 2011-04-15 19:37:09 UTC
Here are the deadlocking callstacks:

Callstack 1:
 	libgthread-2.0-0.dll!g_mutex_lock_errorcheck_impl(_GMutex * mutex=0x012933b0, const unsigned long magic=4175530711, char * const location=0x0054b8f8)  Line 112 + 0xc bytes	C
>	libgstbase-0.10-0.dll!gst_base_src_set_playing(_GstBaseSrc * basesrc=0x0125d178, int live_play=0)  Line 2911 + 0x2a bytes	C
 	libgstbase-0.10-0.dll!gst_base_src_change_state(_GstElement * element=0x0125d178, GstStateChange transition=GST_STATE_CHANGE_PLAYING_TO_PAUSED)  Line 3115 + 0xb bytes	C
 	libgstaudio-0.10-0.dll!gst_base_audio_src_change_state(_GstElement * element=0x0125d178, GstStateChange transition=GST_STATE_CHANGE_PLAYING_TO_PAUSED)  Line 1149 + 0x27 bytes	C
 	libtaaaudio.dll!gst_duplex_audio_src_change_state(_GstElement * element=0x0125d178, GstStateChange transition=GST_STATE_CHANGE_PLAYING_TO_PAUSED)  Line 448 + 0x27 bytes	C
 	libgstreamer-0.10-0.dll!gst_element_change_state(_GstElement * element=0x0125d178, GstStateChange transition=GST_STATE_CHANGE_PLAYING_TO_PAUSED)  Line 2717 + 0x15 bytes	C
 	libgstreamer-0.10-0.dll!gst_element_set_state_func(_GstElement * element=0x0125d178, GstState state=GST_STATE_PAUSED)  Line 2673 + 0xd bytes	C
 	libgstreamer-0.10-0.dll!gst_element_set_state(_GstElement * element=0x0125d178, GstState state=GST_STATE_PAUSED)  Line 2574 + 0x15 bytes	C
 	gst-plugins-taa-test.exe!test_intense_paused_playing_on_src()  Line 617 + 0xb bytes	C


Callstack 2:
 	libgthread-2.0-0.dll!g_cond_wait_errorcheck_impl(_GCond * cond=0x012ae650, _GMutex * mutex=0x012ae600, const unsigned long magic=4175530711, char * const location=0x00a9eff0)  Line 206 + 0x10 bytes	C
 	libgstaudio-0.10-0.dll!wait_segment(_GstRingBuffer * buf=0x01215bd8)  Line 1478 + 0x2f bytes	C
 	libgstaudio-0.10-0.dll!gst_ring_buffer_read(_GstRingBuffer * buf=0x01215bd8, unsigned __int64 sample=0, unsigned char * data=0x0120e380, unsigned int len=480)  Line 1893 + 0x9 bytes	C
 	libgstaudio-0.10-0.dll!gst_base_audio_src_create(_GstBaseSrc * bsrc=0x0125d178, unsigned __int64 offset=18446744073709551615, unsigned int length=1920, _GstBuffer * * outbuf=0x0167fa80)  Line 811 + 0x19 bytes	C
 	libgstbase-0.10-0.dll!gst_base_src_get_range(_GstBaseSrc * src=0x0125d178, unsigned __int64 offset=18446744073709551615, unsigned int length=0, _GstBuffer * * buf=0x0167fa80)  Line 2153 + 0x21 bytes	C
 	libgstbase-0.10-0.dll!gst_base_src_loop(_GstPad * pad=0x01293cf0)  Line 2410 + 0x19 bytes	C
 	libgstreamer-0.10-0.dll!gst_task_func(_GstTask * task=0x012c6e58)  Line 318 + 0x11 bytes	C
 	libgstreamer-0.10-0.dll!default_func(TaskData * tdata=0x011f6368, _GstTaskPool * pool=0x011ede78)  Line 70 + 0x9 bytes	C
 	libglib-2.0-0.dll!g_thread_pool_thread_proxy(void * data=0x011eddd8)  Line 325 + 0x14 bytes	C
 	libglib-2.0-0.dll!g_thread_create_proxy(void * data=0x0120dae0)  Line 1955 + 0x10 bytes	C
 	libgthread-2.0-0.dll!g_thread_proxy(void * data=0x012c6f68)  Line 509 + 0x10 bytes	C


As you can see, Cs1 is waiting for the LIVE_LOCK, that is held by Cs2 in gst_base_src_loop. Cs2 on the other hand is waiting for data to be written to the ringbuffer.

The code-snippets of interest is the state-change PLAYING_TO_PAUSED in gstbaseaudiosrc.c:

case GST_STATE_CHANGE_PLAYING_TO_PAUSED:
  GST_DEBUG_OBJECT (src, "PLAYING->PAUSED");
  gst_ring_buffer_may_start (src->ringbuffer, FALSE);
  gst_ring_buffer_pause (src->ringbuffer);
  break;

and the beginning of wait_segment inside gstringbuffer.c:

if (G_UNLIKELY (g_atomic_int_get (&buf->abidata.ABI.may_start) == FALSE))
  goto no_start;

  GST_DEBUG_OBJECT (buf, "start!");
  segments = g_atomic_int_get (&buf->segdone);
  gst_ring_buffer_start (buf);

Now, if both _may_start and _pause is called *in between* the ABI.may_start check and gst_ring_buffer_start, what will happen?

The buffer will basically change state from PAUSED to PLAYING, and will go on to wait forever in the wait, since all mechanisms that should prevent that have failed: 
1. Setting state-flags have been reverted by gst_ringbuffer_start ()
2. The may_start check came to early (may_start = FALSE happened just after the check)
3. The signaling of the waiter has also happened to early (not waiting yet)

The way to combat this is to add a check inside _start to see if the buffer really is allowed to start, and this is safe because both _pause and _start take the buf-lock, so it will be serialized nicely.

... and that is exactly what this patch does! :)

One could also argue that this patch makes the check:
if (G_UNLIKELY (g_atomic_int_get (&buf->abidata.ABI.may_start) == FALSE))
  goto no_start;

...superfluous, since it will be checked for properly, serialized with _pause, inside _start, but I have not analyzed whether this would have any other side-effects.
Comment 9 Håvard Graff (hgr) 2011-04-15 19:40:22 UTC
Clear deadlock in BaseAudioSrc with easy fix. Sounds like blocker :)
Comment 10 Tobias Mueller 2011-04-15 23:29:19 UTC
But it's not necessarily the case. Thanks for your interested in making GStreamer better, but adjusting priority or severity fields doesn't help it. Your patch will be seen soon and this issue hopefully resolved.
Comment 11 Sebastian Dröge (slomo) 2011-04-16 05:26:55 UTC
I've to say that your explanation makes sense and the patch too. Let's get this into 0.10.33 :)

Tim?
Comment 12 Håvard Graff (hgr) 2011-04-16 06:30:27 UTC
If you want a test-case, all I do is to instantiate a src that inherits from BaseAudioSrc, and then do something like this:

for (i = 0; i < 10000000; i++)
{
  gst_element_set_state (src, GST_STATE_PLAYING);
  g_usleep (G_USEC_PER_SEC / 10000);
  gst_element_set_state (src, GST_STATE_PAUSED);
  g_usleep (G_USEC_PER_SEC / 10000);
  g_print("\r%d", i);
}

If the counting stops, I´ve got the deadlock. Without the patch it stops within a minute. With, I have ran it for 5 hours and no problem.
Comment 13 Wim Taymans 2011-04-18 09:35:50 UTC
Yes, I understand now. It also then looks like the other may_start check is not needed anymore.
Comment 14 Wim Taymans 2011-04-18 09:41:09 UTC
commit d9f1b3736e0e6927e513e95ba934236101ca0a82
Author: Håvard Graff <havard.graff@.eu.tandberg.int>
Date:   Thu Nov 25 17:01:53 2010 +0100

    ringbuffer: make sure to not start if the may_start flag is FALSE
    
    Fixes #635784