After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 733897 - vaapidecode hangs when the downstream returns an error
vaapidecode hangs when the downstream returns an error
Status: RESOLVED FIXED
Product: gstreamer-vaapi
Classification: Other
Component: general
git master
Other Linux
: Normal normal
: ---
Assigned To: gstreamer-vaapi maintainer(s)
gstreamer-vaapi maintainer(s)
Depends on:
Blocks: 720305
 
 
Reported: 2014-07-29 06:38 UTC by Matthew Waters (ystreet00)
Modified: 2014-07-29 14:13 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
vaapidecode: return from decode loop properly when downstream returns an error (2.69 KB, patch)
2014-07-29 06:43 UTC, Matthew Waters (ystreet00)
none Details | Review
vaapidecode: return from decode loop properly when downstream returns an error (2.80 KB, patch)
2014-07-29 06:57 UTC, Matthew Waters (ystreet00)
committed Details | Review
vaapidecode: decode and output all pending frames on normal EOS (1.94 KB, patch)
2014-07-29 14:03 UTC, Gwenole Beauchesne
none Details | Review

Description Matthew Waters (ystreet00) 2014-07-29 06:38:08 UTC
Pipeline: ... ! vaapidecode ! {glimagesink,cluttersink}

The above pipeline hangs when the sink returns an error (e.g. when the window is closed)
Comment 1 Matthew Waters (ystreet00) 2014-07-29 06:43:42 UTC
Created attachment 281907 [details] [review]
vaapidecode: return from decode loop properly when downstream returns an error
Comment 2 Matthew Waters (ystreet00) 2014-07-29 06:57:39 UTC
Created attachment 281909 [details] [review]
vaapidecode: return from decode loop properly when downstream returns an error

There was still a possible hang/race when decode_loop() signalled before _finish() had the chance to g_cond_wait()
Comment 3 Gwenole Beauchesne 2014-07-29 07:51:28 UTC
Review of attachment 281909 [details] [review]:

LGTM. I'd probably commit after 0.5.9 release. Otherwise, I'd need to respin the tarballs and test again on every build back to 0.10. Or, is this an issue that is bound to occur in common situations? Thanks.

::: gst/vaapi/gstvaapidecode.c
@@ +450,3 @@
         return;
     }
+    GST_VIDEO_DECODER_STREAM_UNLOCK(vdec);

Maybe could we just move this one up. I like being able to count the number of lock vs. unlock operations to make sure we didn't have forgotten any. :)
Comment 4 Gwenole Beauchesne 2014-07-29 08:33:15 UTC
commit 6003596e82b9aaaa699135f89bd7839ae7792cb7
Author: Matthew Waters <ystreet00@gmail.com>
Date:   Tue Jul 29 16:22:01 2014 +1000

    vaapidecode: properly return from decode loop on downstream errors.
    
    Fixes a hang/race on shutdown where _decode_loop() had already completed
    its execution and _finish() was waiting on a GCond for decode_loop()
    to complete.  Also fixes the possible race where _finish() is called
    but _decode_loop() endlessly returns before signalling completion
    iff the decoder instance returns GST_FLOW_OK.
    
    Found with: ... ! vaapidecode ! {glimagesink,cluttersink}
    
    https://bugzilla.gnome.org/show_bug.cgi?id=733897
    
    [factored out GST_VIDEO_DECODER_STREAM_UNLOCK() call]
    Signed-off-by: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Comment 5 Gwenole Beauchesne 2014-07-29 14:00:29 UTC
Re-opening.
Comment 6 Gwenole Beauchesne 2014-07-29 14:03:08 UTC
Created attachment 281953 [details] [review]
vaapidecode: decode and output all pending frames on normal EOS

The previous brought an issue whereby it would have returned too early in normal conditions.

The gst_vaapidecode_decode_loop() function is called within a separate task to fetch and output all frames that were decoded so far. So, if the decoder_loop_status is forcibly set to EOS when _finish() is called, then we are bound to exist the task without submitting the pending frames.

If the downstream element error'ed out, then the gst_pad_push() would propagate up an error and so we will get it right for cutting off _finish() early in that case.

Tested locally and it works for both my use cases (normal exist), and yours (abnormal exit from downstream). Could you please give it a try? Thanks.
Comment 7 Matthew Waters (ystreet00) 2014-07-29 14:10:35 UTC
(In reply to comment #6)
> Tested locally and it works for both my use cases (normal exist), and yours
> (abnormal exit from downstream). Could you please give it a try? Thanks.

Works here :)
Comment 8 Gwenole Beauchesne 2014-07-29 14:13:08 UTC
Pushed. Thanks. Hopefully, QA auto-testing bot would not notice it overnight. :)
Comment 9 Gwenole Beauchesne 2014-07-29 14:13:56 UTC
commit 528f486513d406c45f1e30436f391f726bbb3b3d
Author: Gwenole Beauchesne <gwenole.beauchesne@intel.com>
Date:   Tue Jul 29 15:47:46 2014 +0200

    vaapidecode: decode and output all pending frames on normal EOS.
    
    The gst_vaapidecode_decode_loop() function is called within a separate
    task to fetch and output all frames that were decoded so far. So, if
    the decoder_loop_status is forcibly set to EOS when _finish() is called,
    then we are bound to exist the task without submitting the pending
    frames.
    
    If the downstream element error'ed out, then the gst_pad_push() would
    propagate up an error and so we will get it right for cutting off
    _finish() early in that case.
    
    This is a regression from 6003596.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=733897