GNOME Bugzilla – Bug 735663
dashdemux: synchronize with the download loop thread before signalling it
Last modified: 2014-09-19 06:46:23 UTC
Created attachment 284805 [details] [review] [PATCH] dashdemux: synchronize with the download loop thread to signal it to continue There is an issue when dashdemux is working with local files that sometimes the downloader src element (filesrc) manages to download the whole file before the "download loop" thread (which has set filesrc to PLAYING) manages to reach its g_cond_wait() call where it waits for filesrc to send EOS (or ERROR). In that case, the g_cond_signal() doesn't have any effect and afterwards the "download loop" thread enters g_cond_wait() and waits forever. This patch solves the issue by locking the mutex associated with the GCond before g_cond_signal(), which ensures that g_cond_wait() has been reached.
The same has to be fixed in hlsdemux and mssdemux too then, they have the same code.
Created attachment 285470 [details] [review] [PATCH V2] dashdemux: synchronize with the download loop thread to signal it to continue Patch updated with the following changes: When a stream src error message is handled, the download fragment cond is signaled asynchrounously since we cannot hold the related lock from this function as it can be called when the element changes its state and it can result in a deadlock when gst_dash_demux_download_uri sync the stream src element state with its parent. Anyway i'm not sure if it's right to handle that asynchronously.
Created attachment 285496 [details] [review] [PATCH]dashdemux: properly wait fragment downloads Patch updated with the following changes: We do not rely entirely anymore on having the fragmemt download condition being signaled because it could happen before g_cond_wait is called. We cannot use the fragment download lock in the handle_message as it can be called when gst_dash_demux_download_uri set the state of the src element. I dropped the previous version of the patch since we do not have the guarantee that the source function will be executed (using g_idle_add). The code is a bit raw but does the job. Maybe waiting for the condition to be signaled can be dropped as we can only rely on the new variable fragment_download_signaled with its related lock. Comments welcome :)
Review of attachment 285496 [details] [review]: Very good catch! I suggested some minor corrections below. See if you agree. ::: ext/dash/gstdashdemux.c @@ +432,3 @@ + g_mutex_lock (&stream->fragment_download_signaled_lock); + stream->fragment_download_signaled = TRUE; I think the signaling has to be done after this variable is set, inside the same lock. @@ +1089,3 @@ gst_element_set_state (stream->src, GST_STATE_READY); g_cond_signal (&stream->fragment_download_cond); + g_mutex_lock (&stream->fragment_download_signaled_lock); Same as above. @@ +2014,3 @@ g_cond_signal (&stream->fragment_download_cond); + g_mutex_lock (&stream->fragment_download_signaled_lock); + stream->fragment_download_signaled = TRUE; Same here. @@ +2034,3 @@ + + g_mutex_lock (&stream->fragment_download_signaled_lock); + stream->fragment_download_signaled = TRUE; Same. @@ +2217,3 @@ "Waiting for fragment download to finish: %s", uri); + + end_time = g_get_monotonic_time () + 5 * G_TIME_SPAN_MILLISECOND; If at all, this should be configurable via a property. I'd prefer to have this as a separate patch. Let's focus here on just adding the boolean flag and making sure it properly works. ::: ext/dash/gstdashdemux.h @@ +87,3 @@ GMutex fragment_download_lock; GCond fragment_download_cond; + GMutex fragment_download_signaled_lock; Any reason to add a new lock? Can't use the fragment_download_lock for the signaling and the variable? I think it makes sense to use the same.
(In reply to comment #4) > Review of attachment 285496 [details] [review]: > > Very good catch! > > I suggested some minor corrections below. See if you agree. > > ::: ext/dash/gstdashdemux.c > @@ +432,3 @@ > > + g_mutex_lock (&stream->fragment_download_signaled_lock); > + stream->fragment_download_signaled = TRUE; > > I think the signaling has to be done after this variable is set, inside the > same lock. Updated in the new version of the patch. > > @@ +1089,3 @@ > gst_element_set_state (stream->src, GST_STATE_READY); > g_cond_signal (&stream->fragment_download_cond); > + g_mutex_lock (&stream->fragment_download_signaled_lock); > > Same as above. > > @@ +2014,3 @@ > g_cond_signal (&stream->fragment_download_cond); > + g_mutex_lock (&stream->fragment_download_signaled_lock); > + stream->fragment_download_signaled = TRUE; > > Same here. > > @@ +2034,3 @@ > + > + g_mutex_lock (&stream->fragment_download_signaled_lock); > + stream->fragment_download_signaled = TRUE; > > Same. > > @@ +2217,3 @@ > "Waiting for fragment download to finish: %s", uri); > + > + end_time = g_get_monotonic_time () + 5 * G_TIME_SPAN_MILLISECOND; > > If at all, this should be configurable via a property. > > I'd prefer to have this as a separate patch. Let's focus here on just adding > the boolean flag and making sure it properly works. > > ::: ext/dash/gstdashdemux.h > @@ +87,3 @@ > GMutex fragment_download_lock; > GCond fragment_download_cond; > + GMutex fragment_download_signaled_lock; > > Any reason to add a new lock? Can't use the fragment_download_lock for the > signaling and the variable? I think it makes sense to use the same. Thanks for the review. We need to introduce a new lock as gst_dash_demux_download_uri is called when fragment_download_lock is already held. gst_dash_demux_download_uri calls gst_element_sync_state_with_parent/gst_element_set_state which is likely to call gst_dash_demux_handle_message. If the same lock is used > deadlock.
Created attachment 285728 [details] [review] [PATCH]dashdemux: properly wait fragment downloads
Created attachment 285983 [details] [review] dashdemux: synchronize with the download loop thread to signal it to continue Can you check if this version also fixes the issue for you? It reduces the locking area and doesn't require an additional mutex
Created attachment 285984 [details] [review] hlsdemux: synchronize with the download loop thread to signal it to continue Same for hlsdemux
Created attachment 285985 [details] [review] mssdemux: synchronize with the download loop thread to signal it to continue And mssdemux
Review of attachment 285728 [details] [review]: ::: ext/dash/gstdashdemux.c @@ +430,1 @@ stream->last_ret = GST_FLOW_CUSTOM_ERROR; Shouldn't this also be protected by the mutex? @@ +431,3 @@ + + g_mutex_lock (&stream->fragment_download_signaled_lock); + stream->fragment_download_signaled = TRUE; Maybe "download_finished" instead? Also why not just store the flow return here to distinguish between erorrs and other things?
Comment on attachment 285985 [details] [review] mssdemux: synchronize with the download loop thread to signal it to continue Same comment about the name of the variable, otherwise seems ok
Created attachment 286138 [details] [review] dashdemux: synchronize with the download loop thread to signal it to continue Updated according to comments. Please test and let me know if it works for your case so I can merge and update the patches for hls and mss
Review of attachment 286138 [details] [review]: ::: ext/dash/gstdashdemux.c @@ -2005,3 @@ } - stream->last_ret = ret; Shouldn't this be kept with some locking, to assign stream->last_ret to GST_FLOW_NOT_LINKED when this is the case ?
(In reply to comment #13) > Review of attachment 286138 [details] [review]: > > ::: ext/dash/gstdashdemux.c > @@ -2005,3 @@ > } > > - stream->last_ret = ret; > > Shouldn't this be kept with some locking, to assign stream->last_ret to > GST_FLOW_NOT_LINKED when this is the case ? I've test the patch and it seems to work fine with the previous change i've mentionned. If that can help you here is my branch that includes all the remaining dash fixes rebased on top of master: http://cgit.collabora.com/git/user/mateo/gst-plugins-bad.git/log/?h=dash-fixes
We don't want to set that everytime to avoid taking the lock on the chain function. So we only set it if it is != _OK, otherwise it will continue being _OK till the end. commit e289ab07c13c9f958f04e396285b567b570120d1 Author: George Kiagiadakis <george.kiagiadakis@collabora.com> Date: Fri Sep 12 02:36:47 2014 -0300 mssdemux: synchronize with the download loop thread to signal it to continue If EOS or ERROR happens before the download loop thread has reached its g_cond_wait() call, then the g_cond_signal doesn't have any effect and the download loop thread stucks later. https://bugzilla.gnome.org/show_bug.cgi?id=735663 commit f4546b64ea6fd89eb3471258a880b850555d98c5 Author: George Kiagiadakis <george.kiagiadakis@collabora.com> Date: Fri Sep 12 02:35:44 2014 -0300 hlsdemux: synchronize with the download loop thread to signal it to continue If EOS or ERROR happens before the download loop thread has reached its g_cond_wait() call, then the g_cond_signal doesn't have any effect and the download loop thread stucks later. https://bugzilla.gnome.org/show_bug.cgi?id=735663 commit 55032ae5fee1d6253632f4ce7f2912b3d36e3a66 Author: George Kiagiadakis <george.kiagiadakis@collabora.com> Date: Fri Aug 29 12:38:12 2014 +0200 dashdemux: synchronize with the download loop thread to signal it to continue If EOS or ERROR happens before the download loop thread has reached its g_cond_wait() call, then the g_cond_signal doesn't have any effect and the download loop thread stucks later. https://bugzilla.gnome.org/show_bug.cgi?id=735663
This sounds like it should go into 1.4 after the 1.4.2 release. Ok?