After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 774654 - Flushing can hang if output port queue of pending buffers is empty
Flushing can hang if output port queue of pending buffers is empty
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-omx
git master
Other Linux
: Normal normal
: 1.10.3
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2016-11-17 23:40 UTC by formruga
Modified: 2016-12-22 12:48 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Hack: swap the drain (5.88 KB, patch)
2016-11-25 12:55 UTC, minfrin
none Details | Review
Hack: swap flush with stop (3.40 KB, patch)
2016-11-26 11:25 UTC, minfrin
none Details | Review
Swap flush with stop (3.30 KB, patch)
2016-12-11 20:44 UTC, minfrin
none Details | Review
Swap flush with stop (2.84 KB, patch)
2016-12-11 20:47 UTC, minfrin
none Details | Review
Flush before we stop the srcpad loop. Flushing could hang if output port queue of pending buffers was empty. (3.33 KB, patch)
2016-12-18 19:49 UTC, minfrin
committed Details | Review

Description formruga 2016-11-17 23:40:00 UTC
Steps in gst_omx_video_dec_flush are:
0) Pause the components
1) Wait until the srcpad loop is stopped
2) Flush the ports
3) Resume components
4) Unset flushing to allow ports to accept data again

The problem lays in sequence: 0 -> 1, because srcpad loop (gst_omx_video_dec_loop) can stuck in function gst_omx_port_acquire_buffer if output port queue of pending buffers (port->pending_buffers) is empty and component is in pause state (component processing is halted, no new buffers will be generated).

For me sequence should be changed into 1 -> 0. i.e. first wait until the srcpad loop is stopped then pause the components.

Interestingly, the sequence was changed by commit d7533445 "omx: Let base classes handle EOS. https://bugzilla.gnome.org//show_bug.cgi?id=734774". I don't find relation between the commit message and changed sequence.
Comment 1 Sebastian Dröge (slomo) 2016-11-21 07:29:04 UTC
It was probably part of making the code (which seems wrong) the same as in the audio classes. Can you provide a patch for changing it?
Comment 2 formruga 2016-11-21 21:04:53 UTC
I need some more time to ensure myself, if I think right. After reading OpenMAX IL specification and checking some IL implementation, I have some doubts.
Comment 3 minfrin 2016-11-25 12:55:30 UTC
Created attachment 340751 [details] [review]
Hack: swap the drain

I tried this patch to try out this theory, however it doesn't make a difference to me.

Also seeing the same hang.
Comment 4 minfrin 2016-11-26 02:03:09 UTC
We appear to be deadlocked here in gstomx.c:

  if (!g_queue_is_empty (&comp->messages)) {
    signalled = TRUE;
  } else if (timeout == GST_CLOCK_TIME_NONE) {
    GST_ERROR_OBJECT (comp->parent, "%s waiting for signal", comp->name);
// we hang on g_cond_wait
    g_cond_wait (&comp->messages_cond, &comp->messages_lock);
    GST_ERROR_OBJECT (comp->parent, "%s finished waiting for signal", comp->name);
    signalled = TRUE;
  } else {
    signalled =
        g_cond_wait_until (&comp->messages_cond, &comp->messages_lock,
        wait_until);
  }

It looks like we're waiting for callbacks from the GPU, and these callbacks never arrive.
Comment 5 formruga 2016-11-26 08:57:55 UTC
Proper sequence of steps in gst_omx_video_dec_flush are:
0) Pause the components
1) Set flushing to get back buffers from components
2) Wait until the srcpad loop is stopped
3) Resume components
4) Unset flushing to allow ports to accept data again

As one can see wait has to be after set flushing. In OMXIL specification put a component into pause, does not mean that any OMX buffers will be returned, thus gst-omx acquire buffer function can't returned to srcpad task if no messages are in comp->messages.
Comment 6 minfrin 2016-11-26 11:25:03 UTC
Created attachment 340794 [details] [review]
Hack: swap flush with stop

Tried this variation on the change as you described, but I still get a hang.

The new hang seems to make more sense though, it hangs like this:

0:00:11.868245518 25138 0x74090260 DEBUG            omxvideodec gstomxvideodec.c:2187:gst_omx_video_dec_handle_frame:<omxh264dec-omxh264dec0> Handling frame
0:00:11.868429057 25138 0x74090260 DEBUG                    omx gstomx.c:1263:gst_omx_port_acquire_buffer:<omxh264dec-omxh264dec0> Acquiring video_decode buffer from port 130
0:00:11.868600461 25138 0x74090260 DEBUG                    omx gstomx.c:1376:gst_omx_port_acquire_buffer:<omxh264dec-omxh264dec0> video_decode port 130 has pending buffers
0:00:11.868728012 25138 0x74090260 DEBUG                    omx gstomx.c:1390:gst_omx_port_acquire_buffer:<omxh264dec-omxh264dec0> Acquired buffer 0x744526c8 (0x65bc0810) from video_decode port 130: 0
0:00:11.868896031 25138 0x74090260 DEBUG            omxvideodec gstomxvideodec.c:2339:gst_omx_video_dec_handle_frame:<omxh264dec-omxh264dec0> Passing frame offset 0 to the component
0:00:11.869035300 25138 0x74090260 DEBUG                    omx gstomx.c:1412:gst_omx_port_release_buffer:<omxh264dec-omxh264dec0> Releasing buffer 0x744526c8 (0x65bc0810) to video_decode port 130
0:00:11.869901018 25138 0x74090260 DEBUG                    omx gstomx.c:1458:gst_omx_port_release_buffer:<omxh264dec-omxh264dec0> Released buffer 0x744526c8 to video_decode port 130: None (0x00000000)
0:00:11.870133151 25138 0x74090260 DEBUG            omxvideodec gstomxvideodec.c:2386:gst_omx_video_dec_handle_frame:<omxh264dec-omxh264dec0> Passed frame to component
0:00:11.871020692 25138 0x74090260 DEBUG            omxvideodec gstomxvideodec.c:2187:gst_omx_video_dec_handle_frame:<omxh264dec-omxh264dec0> Handling frame
0:00:11.871217305 25138 0x74090260 DEBUG                    omx gstomx.c:1263:gst_omx_port_acquire_buffer:<omxh264dec-omxh264dec0> Acquiring video_decode buffer from port 130
0:00:11.871396261 25138 0x74090260 DEBUG                    omx gstomx.c:1368:gst_omx_port_acquire_buffer:<omxh264dec-omxh264dec0> Queue of video_decode port 130 is empty
0:00:11.871539176 25138 0x74090260 DEBUG                    omx gstomx.c:434:gst_omx_component_wait_message:<omxh264dec-omxh264dec0> video_decode waiting for signal
<--- we hang here

Given that "Queue of video_decode port 130 is empty" it seems strange that we would choose to block waiting for something from the port when it appears nothing is expected to come out of the port.

Am I reading this correctly?
Comment 7 formruga 2016-11-26 12:13:16 UTC
But set flushing in gst_omx_video_dec_flush, before wait until the srcpad loop stops, should generate message OMX_CommandFlush from component and then GST_OMX_MESSAGE_FLUSH should be available in comp->message. 

Sequence is
if (g_queue_is_empty (&port->pending_buffers))
gst_omx_component_wait_message (comp, GST_CLOCK_TIME_NONE);
Comment 8 formruga 2016-11-26 12:23:53 UTC
Continuation of comment 7 (I can't edit my comment):

if flush message is sent by component, gst_omx_component_wait_message will succeeded and gst_omx_component_handle_messages will be called. For GST_OMX_MESSAGE_FLUSH, flags: flushing and flushed are set, which should ensure that gst_omx_port_acquire_buffer will be return with GST_OMX_ACQUIRE_BUFFER_FLUSHING value. GST_OMX_ACQUIRE_BUFFER_FLUSHING is a reason of exit from gst_omx_video_dec_loop, which finally stop a task.

I wonder, why you don't have in log messages about flushing. What OMX driver (for what device) do you use?
Comment 9 minfrin 2016-11-26 23:42:25 UTC
I'm using a Raspberry Pi 3.

I have discovered it is possible to extract logging out of the graphics card itself, and when I run the omxh264dec directly into omxh264enc I get the decoder behaving as follows:

pi@towerofpi9:~$ sudo vcdbg log msg 2>&1 | grep decode

// good start, we begin receiving packets to decode

222323.608: video_decode:3:RIL: setup image requirements (544x576(544x576)x7)
222327.784: video_decode:3:RIL: decoding 24140 bytes, 2361053us, flags 30, (cb:0x3e4e9780)
222327.797: video_decode:3:RIL: asking for more
222327.848: video_decode:3:RIL: decoding 8352 bytes, 2200522us, flags 10, (cb:0x3e4e979c)
222327.862: video_decode:3:RIL: asking for more
222328.261: video_decode:3:RIL: new image requirements (544x576(544x576)x7)
222331.257: video_decode:3:RIL: decoding 4153 bytes, 2120348us, flags 10, (cb:0x3e4e97b8)
222331.269: video_decode:3:RIL: asking for more
222334.031: video_decode:3:RIL: decoding 1999 bytes, 2080321us, flags 10, (cb:0x3e4e97d4)
222334.042: video_decode:3:RIL: asking for more
222380.267: video_decode:3:RIL: decoding 2174 bytes, 2160310us, flags 10, (cb:0x3e4e97f0)
222380.279: video_decode:3:RIL: asking for more
222403.209: video_decode:3:RIL: decoding 5503 bytes, 2280310us, flags 10, (cb:0x3e4e980c)
222403.220: video_decode:3:RIL: asking for more
222419.430: video_decode:3:RIL: decoding 2611 bytes, 2240304us, flags 10, (cb:0x3e4e9828)
222419.441: video_decode:3:RIL: asking for more

// we start getting callbacks at this point

222419.733: video_decode:3:RIL:cb:next_frame_time:2080321000
222419.751: video_decode:3:RIL:cb:discard (cb:0x3e4e97d4), w:1 v:1 f:0 s:0
222547.894: video_decode:3:RIL: decoding 1927 bytes, 2320304us, flags 10, (cb:0x3e4e9844)
222547.906: video_decode:3:RIL: asking for more
222548.201: video_decode:3:RIL:cb:discard (cb:0x3e4e97b8), w:1 v:1 f:0 s:0
222627.263: video_decode:3:RIL: decoding 15224 bytes, 2680303us, flags 10, (cb:0x3e4e9860)
222627.275: video_decode:3:RIL: asking for more
222627.589: video_decode:3:RIL:cb:discard (cb:0x3e4e97f0), w:1 v:1 f:0 s:0
222653.226: video_decode:3:RIL: decoding 9048 bytes, 2520303us, flags 10, (cb:0x3e4e987c)
222653.240: video_decode:3:RIL: asking for more
222653.569: video_decode:3:RIL:cb:discard (cb:0x3e4e979c), w:1 v:1 f:0 s:0
222676.996: video_decode:3:RIL: decoding 4368 bytes, 2440303us, flags 10, (cb:0x3e4e9898)
222677.009: video_decode:3:RIL: asking for more
222677.308: video_decode:3:RIL:cb:discard (cb:0x3e4e9828), w:1 v:1 f:0 s:0
222691.396: video_decode:3:RIL: decoding 1874 bytes, 2400303us, flags 10, (cb:0x3e4e98b4)
222691.410: video_decode:3:RIL: asking for more
222691.700: video_decode:3:RIL:cb:discard (cb:0x3e4e980c), w:1 v:1 f:0 s:0
222733.372: video_decode:3:RIL: decoding 2100 bytes, 2480303us, flags 10, (cb:0x3e4e98d0)
222733.384: video_decode:3:RIL: asking for more
222733.678: video_decode:3:RIL:cb:discard (cb:0x3e4e9844), w:1 v:1 f:0 s:0
222753.796: video_decode:3:RIL: decoding 5020 bytes, 2600303us, flags 10, (cb:0x3e4e98ec)
222753.810: video_decode:3:RIL: asking for more

// we seem to get actual frames now

222754.149: video_decode:3:RIL:cb:I time:2361053, cb:0x3e4e9780, ar:64/33, crop:0x0 544x576, interval:40000000, interlace:1,1,0
222754.206: video_decode:3:RIL:cb:B time:2400303, cb:0x3e4e98b4, ar:64/33, crop:0x0 544x576, interval:40000000, interlace:1,1,0
222754.226: video_decode:3:RIL:resolution changing
223368.123: video_decode:3:RIL:output dec_frames_full:3e4ea700 out:3e512460
223368.987: video_decode:3:RIL:output done, dec_frames_full:3e4ea774 out:0
223370.926: video_decode:3:RIL:output dec_frames_full:3e4ea774 out:3e5124e0
223371.879: video_decode:3:RIL:output done, dec_frames_full:0 out:0
223376.495: video_decode:3:RIL: decoding 1963 bytes, 2560303us, flags 10, (cb:0x3e4e9908)
223376.509: video_decode:3:RIL: asking for more
223376.846: video_decode:3:RIL:cb:B time:2440303, cb:0x3e4e9898, ar:64/33, crop:0x0 544x576, interval:40000000, interlace:1,1,0
223376.870: video_decode:3:RIL:output dec_frames_full:3e4ea7e8 out:3e512560
223378.386: video_decode:3:RIL:output done, dec_frames_full:0 out:0
223389.831: video_decode:3:RIL: decoding 1911 bytes, 2640303us, flags 10, (cb:0x3e4e9924)
223389.847: video_decode:3:RIL: asking for more
223390.172: video_decode:3:RIL:cb:B time:2480303, cb:0x3e4e98d0, ar:64/33, crop:0x0 544x576, interval:40000000, interlace:1,1,0
223390.197: video_decode:3:RIL:output dec_frames_full:3e4ea85c out:3e5125e0
223391.665: video_decode:3:RIL:output done, dec_frames_full:0 out:0
223397.317: video_decode:3:RIL: decoding 14845 bytes, 3000303us, flags 10, (cb:0x3e4e9940)
223397.329: video_decode:3:RIL: asking for more
223400.407: video_decode:3:RIL: decoding 8384 bytes, 2840303us, flags 10, (cb:0x3e4e995c)
223400.423: video_decode:3:RIL: asking for more
223400.833: video_decode:3:RIL:cb:B time:2520303, cb:0x3e4e987c, ar:64/33, crop:0x0 544x576, interval:40000000, interlace:1,1,0
223410.292: video_decode:3:RIL: decoding 4390 bytes, 2760303us, flags 10, (cb:0x3e4e9978)
223410.303: video_decode:3:RIL: asking for more
223410.637: video_decode:3:RIL:cb:B time:2560303, cb:0x3e4e9908, ar:64/33, crop:0x0 544x576, interval:40000000, interlace:1,1,0
223413.963: video_decode:3:RIL: decoding 1950 bytes, 2720303us, flags 10, (cb:0x3e4e9994)
223413.977: video_decode:3:RIL: asking for more
223414.291: video_decode:3:RIL:cb:B time:2600303, cb:0x3e4e98ec, ar:64/33, crop:0x0 544x576, interval:40000000, interlace:1,1,0
223417.884: video_decode:3:RIL: decoding 2287 bytes, 2800303us, flags 10, (cb:0x3e4e99b0)
223417.896: video_decode:3:RIL: asking for more
223418.219: video_decode:3:RIL:cb:B time:2640303, cb:0x3e4e9924, ar:64/33, crop:0x0 544x576, interval:40000000, interlace:1,1,0
223421.297: video_decode:3:RIL: decoding 4460 bytes, 2920303us, flags 10, (cb:0x3e4e99cc)
223421.311: video_decode:3:RIL: asking for more
223421.636: video_decode:3:RIL:cb:P time:2680303, cb:0x3e4e9860, ar:64/33, crop:0x0 544x576, interval:40000000, interlace:1,1,0
223421.689: video_decode:3:RIL:cb:B time:2720303, cb:0x3e4e9994, ar:64/33, crop:0x0 544x576, interval:40000000, interlace:1,1,0
223425.006: video_decode:3:RIL: decoding 2044 bytes, 2880303us, flags 10, (cb:0x3e4e99e8)
223425.020: video_decode:3:RIL: asking for more
223425.338: video_decode:3:RIL:cb:B time:2760303, cb:0x3e4e9978, ar:64/33, crop:0x0 544x576, interval:40000000, interlace:1,1,0

// suddenly, no more callbacks

223428.601: video_decode:3:RIL: decoding 2043 bytes, 2960303us, flags 10, (cb:0x3e4e9a04)
223428.613: video_decode:3:RIL: asking for more
223433.073: video_decode:3:RIL: decoding 14092 bytes, 3320303us, flags 10, (cb:0x3e4e9a20)
223433.085: video_decode:3:RIL: asking for more
223443.807: video_decode:3:RIL: decoding 8278 bytes, 3160303us, flags 10, (cb:0x3e4e9a3c)
223443.821: video_decode:3:RIL: asking for more
223447.946: video_decode:3:RIL: decoding 5403 bytes, 3080303us, flags 10, (cb:0x3e4e9a58)
223447.959: video_decode:3:RIL: asking for more
223451.931: video_decode:3:RIL: decoding 2508 bytes, 3040303us, flags 10, (cb:0x3e4e9a74)
223451.944: video_decode:3:RIL: asking for more
223456.181: video_decode:3:RIL: decoding 2662 bytes, 3120303us, flags 10, (cb:0x3e4e9a90)
223456.194: video_decode:3:RIL: asking for more
223460.990: video_decode:3:RIL: decoding 5032 bytes, 3240303us, flags 10, (cb:0x3e4e9aac)
223461.003: video_decode:3:RIL: asking for more
223469.775: video_decode:3:RIL: decoding 2365 bytes, 3200303us, flags 10, (cb:0x3e4e9ac8)
223469.788: video_decode:3:RIL: asking for more
223475.646: video_decode:3:RIL: decoding 2221 bytes, 3280303us, flags 10, (cb:0x3e4e9ae4)
223475.659: video_decode:3:RIL: asking for more
223530.151: video_decode:3:RIL: decoding 25602 bytes, 3640303us, flags 30, (cb:0x3e4e9b00)
223530.165: video_decode:3:RIL: asking for more
223569.608: video_decode:3:RIL: decoding 7733 bytes, 3480303us, flags 10, (cb:0x3e4e9b1c)
223569.621: video_decode:3:RIL: asking for more
223576.654: video_decode:3:RIL: decoding 5181 bytes, 3400303us, flags 10, (cb:0x3e4e9b38)
223576.666: video_decode:3:RIL: asking for more
223588.723: video_decode:3:RIL: decoding 2259 bytes, 3360303us, flags 10, (cb:0x3e4e9b54)
223588.735: video_decode:3:RIL: asking for more
223621.823: video_decode:3:RIL: decoding 2313 bytes, 3440303us, flags 10, (cb:0x3e4e9b70)
223621.837: video_decode:3:RIL: asking for more
223636.732: video_decode:3:RIL: decoding 4955 bytes, 3560303us, flags 10, (cb:0x3e4e9b8c)
223636.746: video_decode:3:RIL: asking for more
223657.505: video_decode:3:RIL: decoding 2369 bytes, 3520303us, flags 10, (cb:0x3e4e9ba8)
223657.518: video_decode:3:RIL: asking for more
223740.549: video_decode:3:RIL: decoding 2257 bytes, 3600303us, flags 10, (cb:0x3e4e9bc4)
223740.562: video_decode:3:RIL: asking for more
223796.533: video_decode:3:RIL: decoding 13347 bytes, 3960303us, flags 10, (cb:0x3e4e9be0)
223796.546: video_decode:3:RIL: asking for more
223835.193: video_decode:3:RIL: decoding 8569 bytes, 3800303us, flags 10, (cb:0x3e4e9bfc)
223835.207: video_decode:3:RIL: asking for more
223844.099: video_decode:3:RIL: decoding 4585 bytes, 3720303us, flags 10, (cb:0x3e4e9c18)
223844.114: video_decode:3:RIL: asking for more
223852.417: video_decode:3:RIL: decoding 2397 bytes, 3680303us, flags 10, (cb:0x3e4e9c34)
223852.430: video_decode:3:RIL: asking for more
223891.909: video_decode:3:RIL: decoding 2312 bytes, 3760303us, flags 10, (cb:0x3e4e9c50)
223891.922: video_decode:3:RIL: asking for more
223907.790: video_decode:3:RIL: decoding 5108 bytes, 3880303us, flags 10, (cb:0x3e4e9c6c)
223907.804: video_decode:3:RIL: asking for more
223923.713: video_decode:3:RIL: decoding 2616 bytes, 3840303us, flags 10, (cb:0x3e4e9c88)
223923.726: video_decode:3:RIL: asking for more
224011.923: video_decode:3:RIL: decoding 2479 bytes, 3920303us, flags 10, (cb:0x3e4e9ca4)
224011.936: video_decode:3:RIL: asking for more
224075.808: video_decode:3:RIL: decoding 13723 bytes, 4280303us, flags 10, (cb:0x3e4e9cc0)
224075.822: video_decode:3:RIL: asking for more
224119.127: video_decode:3:RIL: decoding 9421 bytes, 4120303us, flags 10, (cb:0x3e4e9cdc)
224119.139: video_decode:3:RIL: asking for more
224134.403: video_decode:3:RIL: decoding 5754 bytes, 4040303us, flags 10, (cb:0x3e4e9cf8)
224134.416: video_decode:3:RIL: asking for more
224148.553: video_decode:3:RIL: decoding 3066 bytes, 4000303us, flags 10, (cb:0x3e4e9d14)
224148.564: video_decode:3:RIL: asking for more
224188.054: video_decode:3:RIL: decoding 2654 bytes, 4080303us, flags 10, (cb:0x3e4e9d30)
224188.066: video_decode:3:RIL: asking for more
224197.547: video_decode:3:RIL: decoding 5316 bytes, 4200303us, flags 10, (cb:0x3e4e9d4c)
224197.566: video_decode:3:RIL: asking for more
224214.089: video_decode:3:RIL: decoding 2524 bytes, 4160303us, flags 10, (cb:0x3e4e9d68)
224214.106: video_decode:3:RIL: asking for more
224301.559: video_decode:3:RIL: decoding 2719 bytes, 4240303us, flags 10, (cb:0x3e4e9d84)
224301.578: video_decode:3:RIL: asking for more
224358.659: video_decode:3:RIL: decoding 13821 bytes, 4600303us, flags 10, (cb:0x3e4e9da0)
224358.677: video_decode:3:RIL: asking for more
224390.496: video_decode:3:RIL: decoding 8669 bytes, 4440320us, flags 10, (cb:0x3e4e9dbc)
224390.517: video_decode:3:RIL: asking for more
224406.641: video_decode:3:RIL: decoding 4800 bytes, 4360326us, flags 10, (cb:0x3e4e9dd8)
224406.661: video_decode:3:RIL: asking for more
224420.517: video_decode:3:RIL: decoding 2339 bytes, 4320332us, flags 10, (cb:0x3e4e9df4)
224420.537: video_decode:3:RIL: asking for more
224453.355: video_decode:3:RIL: decoding 2394 bytes, 4400332us, flags 10, (cb:0x3e4e9e10)
224453.374: video_decode:3:RIL: asking for more
224468.612: video_decode:3:RIL: decoding 5020 bytes, 4520337us, flags 10, (cb:0x3e4e9e2c)
224468.632: video_decode:3:RIL: asking for more
224486.511: video_decode:3:RIL: decoding 2448 bytes, 4480343us, flags 10, (cb:0x3e4e9e48)
224486.530: video_decode:3:RIL: asking for more
224653.833: video_decode:3:RIL: decoding 2513 bytes, 4560343us, flags 10, (cb:0x3e4e9e64)
224653.853: video_decode:3:RIL: asking for more
224713.624: video_decode:3:RIL: decoding 27329 bytes, 4920348us, flags 30, (cb:0x3e4e9e80)
224713.642: video_decode:3:RIL: asking for more
224766.332: video_decode:3:RIL: decoding 9225 bytes, 4760359us, flags 10, (cb:0x3e4e9e9c)
224766.348: video_decode:3:RIL: asking for more
224779.945: video_decode:3:RIL: decoding 5881 bytes, 4680359us, flags 10, (cb:0x3e4e9eb8)
224779.961: video_decode:3:RIL: asking for more
224785.573: video_decode:3:RIL: decoding 2547 bytes, 4640359us, flags 10, (cb:0x3e4e9ed4)
224785.591: video_decode:3:RIL: asking for more
224814.986: video_decode:3:RIL: decoding 2622 bytes, 4720359us, flags 10, (cb:0x3e4e9ef0)
224815.003: video_decode:3:RIL: asking for more
224837.437: video_decode:3:RIL: decoding 5590 bytes, 4840359us, flags 10, (cb:0x3e4e9f0c)
224837.454: video_decode:3:RIL: asking for more
224852.394: video_decode:3:RIL: decoding 2636 bytes, 4800359us, flags 10, (cb:0x3e4e9f28)
224852.411: video_decode:3:RIL: asking for more
224941.373: video_decode:3:RIL: decoding 2374 bytes, 4880358us, flags 10, (cb:0x3e4e9f44)
224941.389: video_decode:3:RIL: asking for more
224990.446: video_decode:3:RIL: decoding 13314 bytes, 5240358us, flags 10, (cb:0x3e4e9f60)
224990.462: video_decode:3:RIL: asking for more
225023.058: video_decode:3:RIL: decoding 8752 bytes, 5080358us, flags 10, (cb:0x3e4e9f7c)
225023.078: video_decode:3:RIL: asking for more
225048.351: video_decode:3:RIL: decoding 5105 bytes, 5000358us, flags 10, (cb:0x3e4e9f98)
225048.371: video_decode:3:RIL: asking for more
225058.508: video_decode:3:RIL: decoding 2481 bytes, 4960358us, flags 10, (cb:0x3e4e9fb4)
225058.528: video_decode:3:RIL: asking for more
225085.374: video_decode:3:RIL: decoding 2484 bytes, 5040358us, flags 10, (cb:0x3e4e9fd0)
225085.394: video_decode:3:RIL: asking for more
225107.343: video_decode:3:RIL: decoding 4644 bytes, 5160358us, flags 10, (cb:0x3e4e9fec)
225107.362: video_decode:3:RIL: asking for more
225117.303: video_decode:3:RIL: decoding 2673 bytes, 5120358us, flags 10, (cb:0x3e4ea008)
225117.321: video_decode:3:RIL: asking for more
225197.247: video_decode:3:RIL: decoding 2127 bytes, 5200358us, flags 10, (cb:0x3e4ea024)
225197.266: video_decode:3:RIL: asking for more
225253.310: video_decode:3:RIL: decoding 12836 bytes, 5560358us, flags 10, (cb:0x3e4ea040)
225253.330: video_decode:3:RIL: asking for more
225311.350: video_decode:3:RIL: decoding 8612 bytes, 5400358us, flags 10, (cb:0x3e4ea05c)
225311.368: video_decode:3:RIL: asking for more
225318.159: video_decode:3:RIL: decoding 5067 bytes, 5320358us, flags 10, (cb:0x3e4ea078)
225318.178: video_decode:3:RIL: asking for more
225326.041: video_decode:3:RIL: decoding 2710 bytes, 5280358us, flags 10, (cb:0x3e4ea094)
225326.058: video_decode:3:RIL: asking for more
225358.392: video_decode:3:RIL: decoding 2410 bytes, 5360358us, flags 10, (cb:0x3e4ea0b0)
225358.410: video_decode:3:RIL: asking for more
225372.811: video_decode:3:RIL: decoding 5574 bytes, 5480358us, flags 10, (cb:0x3e4ea0cc)
225372.828: video_decode:3:RIL: asking for more
225383.517: video_decode:3:RIL: decoding 2583 bytes, 5440358us, flags 10, (cb:0x3e4ea0e8)
225383.534: video_decode:3:RIL: asking for more
225465.921: video_decode:3:RIL: decoding 2316 bytes, 5520358us, flags 10, (cb:0x3e4ea104)
225465.933: video_decode:3:RIL: asking for more
225523.481: video_decode:3:RIL: decoding 12573 bytes, 5880358us, flags 10, (cb:0x3e4ea120)
225523.494: video_decode:3:RIL: asking for more
225562.546: video_decode:3:RIL: decoding 8271 bytes, 5720358us, flags 10, (cb:0x3e4ea13c)
225562.562: video_decode:3:RIL: asking for more
225566.221: video_decode:3:RIL: decoding 4546 bytes, 5640358us, flags 10, (cb:0x3e4ea158)
225566.234: video_decode:3:RIL: asking for more
225580.588: video_decode:3:RIL: decoding 2503 bytes, 5600358us, flags 10, (cb:0x3e4ea174)
225580.600: video_decode:3:RIL: asking for more
225623.347: video_decode:3:RIL: decoding 2626 bytes, 5680358us, flags 10, (cb:0x3e4ea190)
225623.360: video_decode:3:RIL: asking for more
225635.841: video_decode:3:RIL: decoding 5393 bytes, 5800358us, flags 10, (cb:0x3e4ea1ac)
225635.854: video_decode:3:RIL: asking for more
225666.334: video_decode:3:RIL: decoding 2575 bytes, 5760358us, flags 10, (cb:0x3e4ea1c8)
225666.348: video_decode:3:RIL: asking for more
225724.574: video_decode:3:RIL: decoding 2730 bytes, 5840358us, flags 10, (cb:0x3e4ea1e4)
225724.587: video_decode:3:RIL: asking for more
225772.234: video_decode:3:RIL: decoding 11491 bytes, 6160358us, flags 10, (cb:0x3e4e97d4)
225772.248: video_decode:3:RIL: asking for more
225814.333: video_decode:3:RIL: decoding 7261 bytes, 6040356us, flags 10, (cb:0x3e4e97b8)
225814.344: video_decode:3:RIL: asking for more
225827.537: video_decode:3:RIL: decoding 5143 bytes, 5960353us, flags 10, (cb:0x3e4e97f0)
225827.548: video_decode:3:RIL: asking for more
225844.679: video_decode:3:RIL: decoding 2857 bytes, 5920352us, flags 10, (cb:0x3e4e979c)
225844.689: video_decode:3:RIL: asking for more
225891.662: video_decode:3:RIL: decoding 2496 bytes, 6000352us, flags 10, (cb:0x3e4e9828)
225891.674: video_decode:3:RIL: asking for more
225895.068: video_decode:3:RIL: decoding 4114 bytes, 6120350us, flags 10, (cb:0x3e4e980c)
225895.080: video_decode:3:RIL: asking for more
226057.897: video_decode:3:RIL: decoding 2355 bytes, 6080349us, flags 10, (cb:0x3e4e9844)
226057.908: video_decode:3:RIL: asking for more
226099.888: video_decode:3:RIL: decoding 26365 bytes, 6280349us, flags 30, (cb:0x3e4e98b4)
226099.900: video_decode:3:RIL: asking for more
226114.744: video_decode:3:RIL: decoding 5685 bytes, 6240342us, flags 10, (cb:0x3e4e9898)
226114.758: video_decode:3:RIL: asking for more
226185.112: video_decode:3:RIL: decoding 2670 bytes, 6200340us, flags 10, (cb:0x3e4e9780)
226185.125: video_decode:3:RIL: asking for more
226242.586: video_decode:3:RIL: decoding 10679 bytes, 6600340us, flags 10, (cb:0x3e4e98d0)
226242.600: video_decode:3:RIL: asking for more
226275.839: video_decode:3:RIL: decoding 8748 bytes, 6440337us, flags 10, (cb:0x3e4e987c)
226275.851: video_decode:3:RIL: asking for more
226291.469: video_decode:3:RIL: decoding 5643 bytes, 6360335us, flags 10, (cb:0x3e4e9908)
226291.484: video_decode:3:RIL: asking for more
226314.483: video_decode:3:RIL: waiting for a cb structure

// and we're stuck
Comment 10 minfrin 2016-12-07 12:52:24 UTC
I got some information back from the RPi people at https://github.com/raspberrypi/firmware/issues/686 as follows:

> video_decode:3:RIL:resolution changing is the codec reports that
> the stream has changed format, ie there was a new set of header
> bytes in the stream. The codec processing will stop until you
> disable the output port and reconfigure it for the new
> resolution/format as buffers need reallocating.

> waiting for a cb structure means that you have fed in the maximum
> number of frames to the input FIFO and it is refusing to take any
> more, normally because you aren't draining the output (because you
> haven't dealt with the resolution change). There are 32 callback
> structures allocated which more than covers most normal situations.

The hang is happening in gstomxh264dec.

What seems to be happening in my case is that the graphics card is noticing that the resolution has changed, logged as follows:

222754.226: video_decode:3:RIL:resolution changing

What is supposed to happen is that gst-omx drains the port, then reconfigures the port, and only then will further packets be processed by the graphics card.

What appears to be happening is that gst-omx drains the port, but then never reconfigures the port. The graphics card buffers get full, and then hang. The whole pipeline is hung at this point.

Does gst-omx support dynamic resolution changes?
Comment 11 minfrin 2016-12-10 12:52:32 UTC
Yes, we do support dynamic resolution changes, although the mechanism currently doesn't seem to work.

After some more picking apart of the code, it seems we hang because we haven’t reconfigured the port correctly in response to OMX_EventPortSettingsChanged.

What we’re supposed to do when OMX_EventPortSettingsChanged arrives is disable the port, reconfigure, then enable the port again.

What we do instead is miss out the disable step - we attempt to reconfigure and enable only.

The reason we miss out the disable step is because the port reports itself as already disabled, and so the disable step is skipped.

It appears we might have never been enabled because the following two lines have appeared:

0:00:26.136954131 16918 0x740c8630 DEBUG            omxvideodec gstomxvideodec.c:963:gst_omx_video_dec_reconfigure_output_port:<omxh264dec-omxh264dec0> Failed to negotiate with feature memory:GLMemory
0:00:26.245094851 16918 0x740c8630 ERROR            omxvideodec gstomxvideodec.c:971:gst_omx_video_dec_reconfigure_output_port:<omxh264dec-omxh264dec0> Failed to negotiate RGBA for EGLImage

At a glance, these two log statements seem unrelated, however they appear here, where the second is a fallback after the failure of the first:

     gst_omx_port_get_port_definition (self->dec_out_port, &port_def);
     GST_VIDEO_DECODER_STREAM_LOCK (self);
     state = gst_video_decoder_set_output_state (GST_VIDEO_DECODER (self),
         GST_VIDEO_FORMAT_RGBA, port_def.format.video.nFrameWidth,
         port_def.format.video.nFrameHeight, self->input_state);

     /* at this point state->caps is NULL */
     if (state->caps)
       gst_caps_unref (state->caps);
     state->caps = gst_video_info_to_caps (&state->info);
     gst_caps_set_features (state->caps, 0,
         gst_caps_features_new (GST_CAPS_FEATURE_MEMORY_GL_MEMORY, NULL));

     /* try to negotiate with caps feature */
     if (!gst_video_decoder_negotiate (GST_VIDEO_DECODER (self))) {

       GST_DEBUG_OBJECT (self,
           "Failed to negotiate with feature %s",
           GST_CAPS_FEATURE_MEMORY_GL_MEMORY);

       if (state->caps)
         gst_caps_replace (&state->caps, NULL);

       /* fallback: try to use EGLImage even if it is not in the caps feature */
       if (!gst_video_decoder_negotiate (GST_VIDEO_DECODER (self))) {
         gst_video_codec_state_unref (state);
         GST_ERROR_OBJECT (self, "Failed to negotiate RGBA for EGLImage");
         GST_VIDEO_DECODER_STREAM_UNLOCK (self);
         goto no_egl;
       }
     }

This code then follows the no_egl path, which seems to bypass this step:

     err = gst_omx_port_set_enabled (self->dec_out_port, TRUE);
     if (err != OMX_ErrorNone)
       goto no_egl;

The decoder is now running, but in an unexpectedly “disabled” state, which throws everything off track.

It seems there are two problems, the first being that gst_video_decoder_negotiate() fails twice in a row.

The first gst_video_decoder_negotiate() fails because the downstream omxh264enc element does not support GLMemory for some reason. It appears support for this was added to the decoder, but not the encoder.

The second gst_video_decoder_negotiate() fails because the downstream omxh264enc element does not support RGBA. It is not clear why RGBA was chosen, given that it is not a supported input format for much of OMX.

Given that we failed to negotiate twice, we now follow the no_egl path, however this path doesn't appear to place OMX in a state where it can renegotiate in future.

It is not yet clear to me what the no_egl path does that is different to the path that would have been taken if this RPI specific code wasn't there at all.
Comment 12 minfrin 2016-12-11 02:29:18 UTC
Simple workaround:

diff --git a/omx/gstomxvideodec.c b/omx/gstomxvideodec.c
index 1e7d436..6c07c9f 100644
--- a/omx/gstomxvideodec.c
+++ b/omx/gstomxvideodec.c
@@ -928,6 +928,7 @@ gst_omx_video_dec_reconfigure_output_port (GstOMXVideoDec * self)
 
   /* At this point the decoder output port is disabled */
 
+goto skip;
 #if defined (USE_OMX_TARGET_RPI) && defined (HAVE_GST_GL)
   {
     OMX_STATETYPE egl_state;
@@ -1106,6 +1110,7 @@ gst_omx_video_dec_reconfigure_output_port (GstOMXVideoDec * self)
     self->eglimage = FALSE;
   }
 #endif
+skip:
   port = self->dec_out_port;
 
   /* Update caps */

Seems the GST_CAPS_FEATURE_MEMORY_GL_MEMORY support inside gst_omx_video_dec_reconfigure_output_port() breaks dynamic resolution changes.

gst_omx_video_dec_reconfigure_output_port() is called on dynamic resolution callbacks, but instead of reading the dynamic resolution from the callback, we hardcode the resolution instead to GLMemory+RGBA, then RGBA, completely blowing away the resolution we just received.

I am not sure what the fix is, as I am not 100% sure what the GLMemory code is trying to do. It looks like this code is in the wrong place, we should be negotiating GLMemory support, not forcing GLMemory support downstream surely?
Comment 13 minfrin 2016-12-11 12:43:21 UTC
Moved the dynamic resolution change issue into its own bug: https://bugzilla.gnome.org/show_bug.cgi?id=775948
Comment 14 minfrin 2016-12-11 20:44:04 UTC
Created attachment 341776 [details] [review]
Swap flush with stop

Cleaner version of the hack patch - this seems to be working for me so far.
Comment 15 minfrin 2016-12-11 20:47:40 UTC
Created attachment 341777 [details] [review]
Swap flush with stop

Patch cleaner still.
Comment 16 minfrin 2016-12-18 19:49:33 UTC
Created attachment 342182 [details] [review]
Flush before we stop the srcpad loop. Flushing could hang if  output port queue of pending buffers was empty.

Formal fix.
Comment 17 Sebastian Dröge (slomo) 2016-12-19 08:52:34 UTC
commit a1e613dd89b7b8d07d12fdb2657e75aa9ece5450
Author: Graham Leggett <minfrin@sharp.fm>
Date:   Sun Dec 18 19:46:55 2016 +0000

    omx*dec: Flush before we stop the srcpad loop
    
    Flushing could otherwise hang if output port queue of pending buffers was empty
    
    https://bugzilla.gnome.org/show_bug.cgi?id=774654