After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 778830 - v4l2dec: Fix race when going from PAUSED to READY
v4l2dec: Fix race when going from PAUSED to READY
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-plugins-good
unspecified
Other All
: Normal normal
: 1.11.91
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2017-02-17 13:26 UTC by Thibault Saunier
Modified: 2017-04-07 18:05 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
v4l2dec: Fix race when going from PAUSED to READY (2.52 KB, patch)
2017-02-17 13:26 UTC, Thibault Saunier
none Details | Review
v4l2dec: Fix race when going from PAUSED to READY (5.57 KB, patch)
2017-02-21 19:47 UTC, Thibault Saunier
none Details | Review
pad: Add API to get the current state of a task (2.29 KB, patch)
2017-02-21 19:47 UTC, Thibault Saunier
none Details | Review
pad: Add API to get the current state of a task (2.30 KB, patch)
2017-02-21 20:05 UTC, Thibault Saunier
committed Details | Review
v4l2dec: Fix race when going from PAUSED to READY (5.92 KB, patch)
2017-04-07 13:20 UTC, Thibault Saunier
accepted-commit_now Details | Review

Description Thibault Saunier 2017-02-17 13:26:34 UTC
See commit message
Comment 1 Thibault Saunier 2017-02-17 13:26:39 UTC
Created attachment 346071 [details] [review]
v4l2dec: Fix race when going from PAUSED to READY

Running `gst-validate-launcher -t validate.file.playback.change_state_intensive.vorbis_vp8_1_webm`
on odroid XU4 (s5p-mfc v4l2 driver) often leads to:

  ERROR:../subprojects/gst-plugins-good/sys/v4l2/gstv4l2videodec.c:215:gst_v4l2_video_dec_stop: assertion failed: (g_atomic_int_get (&self->processing) == FALSE)

This happens when the following race happens:

- T0: Main thread
- T1: Upstream streaming thread
- T2. v4l2dec processing thread)

[The decoder is in PAUSED state]

T0. The validate scenario runs `Executing (36/40) set-state: state=null repeat=40`
T1- The decoder handles a frame
T2- A decoded frame is push downstream
T2- Downstream returns FLUSHING as it is already flushing changing state
T2- The decoder stops its processing thread and sets `->processing = FALSE`
T1- The decoder handles another frame
T1- `->process` is FALSE so the decoder restarts its streaming thread
T0- In v4l2dec-> stop the processing thread is stopped
NOTE: At this point the processing thread loop never started.
T0- assertion failed: (g_atomic_int_get (&self->processing) == FALSE)

The taken solution in that patch is to delay the setting of
`v4l2dec->process` to the time where the processing loop is
actually started.
Comment 2 Nicolas Dufresne (ndufresne) 2017-02-17 15:03:30 UTC
Review of attachment 346071 [details] [review]:

Have you tested some aggressive seeks (where we don't wait for the seek to complete) ? This patch needs work since it make gst_v4l2_video_dec_loop_stopped() and some other code useless. It's all complicated, and I never found a good solution. What happens is that sometimes, the pad task will stop before gst_v4l2_video_dec_loop() is called. In this case, we ended up with bugs, since the handle_frame() should return FLUSHING.

I think I was confused when I wrote this. Later Philipp Zabel found that we can read and sync on the pad task state. This is being used in _finish() function. I think the right solution is to completely remove this processing state and to use the pad state instead to detect early flush (while _process() is being called).
Comment 3 Thibault Saunier 2017-02-17 15:09:01 UTC
> Have you tested some aggressive seeks (where we don't wait for the seek to complete) ? This patch needs work since it make gst_v4l2_video_dec_loop_stopped() and some other code useless. 

Yeah, I ran the validate testsuite with all the scrub seeking tests with that patch and there seem to be no regression at all (and the testsuite results are pretty good on the platform, nice work!).

> It's all complicated, and I never found a good solution. What happens is that sometimes, the pad task will stop before gst_v4l2_video_dec_loop() is called. In this case, we ended up with bugs, since the handle_frame() should return FLUSHING.

My patch is about that case in the case of going back to READY, I am not sure what is your concerns about seeking here.

> I think I was confused when I wrote this. Later Philipp Zabel found that we can read and sync on the pad task state. This is being used in _finish() function. I think the right solution is to completely remove this processing state and to use the pad state instead to detect early flush (while _process() is being called).

That makes sense, will give that a try.
Comment 4 Thibault Saunier 2017-02-21 19:47:26 UTC
Created attachment 346389 [details] [review]
v4l2dec: Fix race when going from PAUSED to READY

Running `gst-validate-launcher -t validate.file.playback.change_state_intensive.vorbis_vp8_1_webm`
on odroid XU4 (s5p-mfc v4l2 driver) often leads to:

  ERROR:../subprojects/gst-plugins-good/sys/v4l2/gstv4l2videodec.c:215:gst_v4l2_video_dec_stop: assertion failed: (g_atomic_int_get (&self->processing) == FALSE)

This happens when the following race happens:

- T0: Main thread
- T1: Upstream streaming thread
- T2. v4l2dec processing thread)

[The decoder is in PAUSED state]

T0. The validate scenario runs `Executing (36/40) set-state: state=null repeat=40`
T1- The decoder handles a frame
T2- A decoded frame is push downstream
T2- Downstream returns FLUSHING as it is already flushing changing state
T2- The decoder stops its processing thread and sets `->processing = FALSE`
T1- The decoder handles another frame
T1- `->process` is FALSE so the decoder restarts its streaming thread
T0- In v4l2dec-> stop the processing thread is stopped
NOTE: At this point the processing thread loop never started.
T0- assertion failed: (g_atomic_int_get (&self->processing) == FALSE)

Here I am removing the whole ->processing logic to base it all on the
GstTask state to avoid duplicating the knowledge.
Comment 5 Thibault Saunier 2017-02-21 19:47:58 UTC
Created attachment 346390 [details] [review]
pad: Add API to get the current state of a task

Avoiding the user to need to deal with the locking himself etc.

API:
  gst_pad_task_get_state
Comment 6 Tim-Philipp Müller 2017-02-21 19:55:34 UTC
Comment on attachment 346390 [details] [review]
pad: Add API to get the current state of a task

>+ * Get @pad task state. If not task is currently
>+ * set, GST_TASK_STOPPED is returned.

not task -> no task

>+ * Returns: The current state of @pad' task.

@pad' -> pad's

>+GstTaskState
>+gst_pad_get_task_state (GstPad * pad)
>+{
>+  ...
>+  g_return_val_if_fail (GST_IS_PAD (pad), FALSE);

Should probably return a GstTaskState and not a boolean :)
Comment 7 Thibault Saunier 2017-02-21 20:05:06 UTC
Created attachment 346392 [details] [review]
pad: Add API to get the current state of a task

- Fix documenation typos
- Return a GstTaskStatus on assertions
Comment 8 Nicolas Dufresne (ndufresne) 2017-02-22 08:50:04 UTC
Review of attachment 346389 [details] [review]:

It would also be nice to make a version with an internal version of pad_get_state(), so we could merge in 1.10.

::: sys/v4l2/gstv4l2videodec.c
@@ +478,3 @@
   /* When flushing, decoding thread may never run */
+  if (gst_pad_get_task_state (GST_VIDEO_DECODER_SRC_PAD (self)) !=
+      GST_TASK_STOPPED) {

Is that really possible in the task implementation ? When I read gst_task_func(), the state will always be stopped when this is called. That's because we take a ref in start_task, and will only leave gst_task_func() when the state is STOPPED.

What this is trying to do, is to catch the case were we flush before our task function get called. Because we have no guaranty to be called once. In that case, the output_flow won't be set properly.

Maybe with a small rework, we could manage to initially set the output_flow to FLUSHING, and get away with that ?
Comment 9 Nicolas Dufresne (ndufresne) 2017-02-22 08:52:07 UTC
Review of attachment 346392 [details] [review]:

::: gst/gstpad.c
@@ +6090,3 @@
+  task = GST_PAD_TASK (pad);
+  if (task == NULL)
+    goto no_task;

I think the goto is not really needed here. Just set ret = GST_TASK_STOPPED, and then else  ...

::: gst/gstpad.h
@@ +1418,3 @@
 gboolean		gst_pad_pause_task			(GstPad *pad);
 gboolean		gst_pad_stop_task			(GstPad *pad);
+GstTaskState	gst_pad_get_task_state		(GstPad *pad);

Does not seem to align with other declarations.
Comment 10 Thibault Saunier 2017-02-24 19:21:48 UTC
Comment on attachment 346392 [details] [review]
pad: Add API to get the current state of a task

commit 6326c764ee85fa17f1d7838f6e4d5824e07e3132
Author: Thibault Saunier <thibault.saunier@osg.samsung.com>
Date:   Fri Feb 17 15:48:17 2017 -0300

    pad: Add API to get the current state of a task
    
    Avoiding the user to need to deal with the locking himself etc.
    
    API:
      gst_pad_task_get_state
    
    https://bugzilla.gnome.org/show_bug.cgi?id=778830
Comment 11 Nicolas Dufresne (ndufresne) 2017-04-07 00:37:48 UTC
Are you still looking into this issue ?
Comment 12 Thibault Saunier 2017-04-07 03:03:29 UTC
Yes, I have an updated version of the patch which seems to work well on odroid and wanted to check if I could test with vivid, gonna propose tomorrow.
Comment 13 Thibault Saunier 2017-04-07 13:20:53 UTC
Created attachment 349475 [details] [review]
v4l2dec: Fix race when going from PAUSED to READY

Running `gst-validate-launcher -t validate.file.playback.change_state_intensive.vorbis_vp8_1_webm`
on odroid XU4 (s5p-mfc v4l2 driver) often leads to:

  ERROR:../subprojects/gst-plugins-good/sys/v4l2/gstv4l2videodec.c:215:gst_v4l2_video_dec_stop: assertion failed: (g_atomic_int_get (&self->processing) == FALSE)

This happens when the following race happens:

- T0: Main thread
- T1: Upstream streaming thread
- T2. v4l2dec processing thread)

[The decoder is in PAUSED state]

T0. The validate scenario runs `Executing (36/40) set-state: state=null repeat=40`
T1- The decoder handles a frame
T2- A decoded frame is push downstream
T2- Downstream returns FLUSHING as it is already flushing changing state
T2- The decoder stops its processing thread and sets `->processing = FALSE`
T1- The decoder handles another frame
T1- `->process` is FALSE so the decoder restarts its streaming thread
T0- In v4l2dec-> stop the processing thread is stopped
NOTE: At this point the processing thread loop never started.
T0- assertion failed: (g_atomic_int_get (&self->processing) == FALSE)

Here I am removing the whole ->processing logic to base it all on the
GstTask state to avoid duplicating the knowledge.
Comment 14 Nicolas Dufresne (ndufresne) 2017-04-07 13:56:05 UTC
Review of attachment 349475 [details] [review]:

Loogs good to me. I'm quite happy the we are finally removing some of the atomic state.
Comment 15 Thibault Saunier 2017-04-07 18:05:52 UTC
commit 7b7a809818a0fa77dee1e340b42473a1e6e9ea8a
Author: Thibault Saunier <thibault.saunier@osg.samsung.com>
Date:   Fri Feb 17 10:01:08 2017 -0300

    v4l2dec: Fix race when going from PAUSED to READY
    
    Running `gst-validate-launcher -t validate.file.playback.change_state_intensive.vorbis_vp8_1_webm`
    on odroid XU4 (s5p-mfc v4l2 driver) often leads to:
    
      ERROR:../subprojects/gst-plugins-good/sys/v4l2/gstv4l2videodec.c:215:gst_v4l2_video_dec_stop: assertion failed: (g_atomic_int_get (&self->processing) == FALSE)
    
    This happens when the following race happens:
    
    - T0: Main thread
    - T1: Upstream streaming thread
    - T2. v4l2dec processing thread)
    
    [The decoder is in PAUSED state]
    
    T0. The validate scenario runs `Executing (36/40) set-state: state=null repeat=40`
    T1- The decoder handles a frame
    T2- A decoded frame is push downstream
    T2- Downstream returns FLUSHING as it is already flushing changing state
    T2- The decoder stops its processing thread and sets `->processing = FALSE`
    T1- The decoder handles another frame
    T1- `->process` is FALSE so the decoder restarts its streaming thread
    T0- In v4l2dec-> stop the processing thread is stopped
    NOTE: At this point the processing thread loop never started.
    T0- assertion failed: (g_atomic_int_get (&self->processing) == FALSE)
    
    Here I am removing the whole ->processing logic to base it all on the
    GstTask state to avoid duplicating the knowledge.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=778830