After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 762509 - vaapidecoder: h264: decoder stores too many pictures in the DPB before output
vaapidecoder: h264: decoder stores too many pictures in the DPB before output
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gstreamer-vaapi
git master
Other Linux
: Normal normal
: git master
Assigned To: Matt Staples
GStreamer Maintainers
P1
: 781748 (view as bug list)
Depends on: 783588
Blocks:
 
 
Reported: 2016-02-23 05:45 UTC by Jan Schmidt
Modified: 2017-07-10 17:50 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
debug log showing number of pictures in DPB (65.38 KB, text/x-csrc)
2016-02-23 05:45 UTC, Jan Schmidt
  Details
Patch for discussion. May break more complex streams. (1.13 KB, patch)
2016-02-23 05:46 UTC, Jan Schmidt
none Details | Review
Proposed fix (1.19 KB, patch)
2016-06-24 17:19 UTC, Jan Schmidt
rejected Details | Review
push frames downstream asap instead of waiting until they're ejected from the DPB (4.25 KB, patch)
2017-03-24 13:31 UTC, Matt Staples
none Details | Review
interlaced sample to test (976.56 KB, video/mp4)
2017-04-07 23:51 UTC, sreerenj
  Details
sample having negative poc which is also became undecodable with the proposed patch (1.17 MB, application/octet-stream)
2017-04-08 00:01 UTC, sreerenj
  Details
send frames out as soon as possible (4.89 KB, patch)
2017-05-26 15:19 UTC, Matt Staples
reviewed Details | Review
infrastructure for vaapide decoder-specific properties (22.28 KB, patch)
2017-06-05 14:33 UTC, Matt Staples
reviewed Details | Review
libs: decoder: h264: push frames as soon as possible (5.02 KB, patch)
2017-07-07 17:47 UTC, Víctor Manuel Jáquez Leal
committed Details | Review

Description Jan Schmidt 2016-02-23 05:45:13 UTC
Created attachment 321918 [details]
debug log showing number of pictures in DPB

The H.264 VAAPI decoder only outputs a frame if the DPB has as many frames as the DPB size (as taken from the maximum of the level/profile info and the SPS num_ref_frames), which means it ends up decoding and storing lots of decoded frames which could otherwise be already output.

The frames are then all output together when an IDR arrives. Even in a sequence with only I & P slices, the decoder accumulates many frames (up to 9 in the attached log, to which I've added a printf showing the DPB growing and shrinking).

That also means the VAAPI decoder's response to the latency query is way too low, so the decoder won't work properly in a live playback scenario.

Attaching a patch for discussion which attempts to bump out pictures from the DPB more often.
Comment 1 Jan Schmidt 2016-02-23 05:46:27 UTC
Created attachment 321919 [details] [review]
Patch for discussion. May break more complex streams.
Comment 2 sreerenj 2016-03-07 15:33:18 UTC
In theory, we are supposed to invoke the dpb_bump() and eventually outputting from dpb too after each frame decode, *if needed* . If doing dpb_bump() after just finishing the dpb_add() (which is your patch) is fixing something, then real issue could be somewhere else IMHO. Could you please share the video sample?
Comment 3 Jan Schmidt 2016-03-31 16:18:05 UTC
I put up a sample stream at http://noraisin.net/down/h264_constrained_baseline.pcap

You can play it with

gst-launch-1.0 filesrc location=h264_constrained_baseline.pcap ! pcapparse ! application/x-rtp,media=video,clock-rate=90000,encoding-name=H264,payload=96 ! rtph264depay ! vaapisink

If you check the logs, you'll see it that when it gets to the IDR, it has a large backlog of frames that are all then decoded and output, even though they could have been output earlier:

0:00:03.551815645  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder_h264.c:3255:init_picture: <IDR>
0:00:03.551831009  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:376:push_frame: push frame 16 (surface 0x04000003)
0:00:03.551851334  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:376:push_frame: push frame 17 (surface 0x04000004)
0:00:03.551863345  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:376:push_frame: push frame 18 (surface 0x04000005)
0:00:03.551873942  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:376:push_frame: push frame 19 (surface 0x04000006)
0:00:03.551885331  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:376:push_frame: push frame 20 (surface 0x04000007)
0:00:03.551895753  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:376:push_frame: push frame 21 (surface 0x04000008)
0:00:03.551906464  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:376:push_frame: push frame 22 (surface 0x04000009)
0:00:03.551916365  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:376:push_frame: push frame 23 (surface 0x0400000a)
0:00:03.551926310  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:376:push_frame: push frame 24 (surface 0x0400000b)
0:00:03.551934691  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder_h264.c:1893:init_picture_poc_0: decode picture order count type 0
0:00:03.551942149  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder_h264.c:4128:decode_slice: slice (29599 bytes)
0:00:03.551993820  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder_h264.c:2147:init_picture_refs_pic_num: decode picture numbers
0:00:03.551999397  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder_h264.c:2826:exec_picture_refs_modification: execute ref_pic_list_modification()
0:00:03.552005130  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder_objects.c:262:gst_vaapi_picture_decode: decode picture 0x0400000c
0:00:03.552156281  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:397:pop_frame: pop frame 16 (surface 0x04000003)
0:00:03.552198621  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:397:pop_frame: pop frame 17 (surface 0x04000004)
0:00:03.552208102  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:397:pop_frame: pop frame 18 (surface 0x04000005)
0:00:03.552223381  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:397:pop_frame: pop frame 19 (surface 0x04000006)
0:00:03.552235232  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:397:pop_frame: pop frame 20 (surface 0x04000007)
0:00:03.552248771  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:397:pop_frame: pop frame 21 (surface 0x04000008)
0:00:03.552258471  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:397:pop_frame: pop frame 22 (surface 0x04000009)
0:00:03.552267505  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:397:pop_frame: pop frame 23 (surface 0x0400000a)
0:00:03.552276886  4169       0xc520f0 DEBUG                  vaapi gstvaapidecoder.c:397:pop_frame: pop frame 24 (surface 0x0400000b)
Comment 4 sreerenj 2016-04-11 13:13:21 UTC
Okay, finally I did some analysis :)

-- This patch is actually tweaking the spec....As per spec we are allowed to do dpb_bump() in 6 scenarios (section C.4.5.3) which all are covered
by the current code .

In detail: the only cases where we can do dpb_bump() other than reaching a dpb full condition are:
   1: the stream should have adaptive memory control enabled for ref picture  marking (which is not the case for attached stream)
   2: an IDR frame appeared in the stream

So I am not confident whether the patch is safe for all the decoding scenarios. For eg: if there is a p frame (which is not yet decoded) which encoded with lower poc than the poc of currently outputting picture (i mean currently outputting picture due to the enablement of your patch) , it will break the whole process.

-- As per my knowledge, for a correctly encoded stream, it should have a VUI() header with bitstream_restriction_flag set to TRUE and max_dec_frame_buffering
field set with appropriate value which get preference over the value we calculated otherwise from LEVEL limits.
So in theory, the stream you have attached should have the VUI header extension with max_dec_frame_buffering value set to 1. Because all the frames have at most one ref frames, and i can't see any order change in POC. But unfortunately this stream doesn't have the VUI extension, so we calculated the dpb size
from LEVEL specification (Table provided in A.3.1) which is 9.
Which ended up accumulating more frames in dpb before outputting the queued frames.

-- Are you getting better performance with software decoder?? I can see buffer dropping while using avdec_h264 too.
Comment 5 Jan Schmidt 2016-04-11 13:19:58 UTC
To be clear, this patch is *not* safe, and definitely causes trouble with different streams. It's only here to make the problem clearer.
Comment 6 sreerenj 2016-04-11 13:29:51 UTC
(In reply to Jan Schmidt from comment #5)
> To be clear, this patch is *not* safe, and definitely causes trouble with
> different streams. It's only here to make the problem clearer.

Yup, but is there any problem here? :)
Everything working as expected, ideally the stream should have a VUI extension, but missing unfortunately, which causes more queuing .
Comment 7 Stanislav 2016-04-11 21:50:49 UTC
I have found 1 more important bug - when used vaapih264enc and then vaapidecode in same pipeline - image is sluttering every few seconds - checked on 2 different intel CPU's (i7 and recent i5) - BUT patch from this thread fixes it, and also it reduces decoder latency by about 50%.
example pipeline:

gst-launch-1.0 decklinkvideosrc mode=11 connection=2 ! deinterlace mode=1 method=4 fields=2 ! videoconvert ! vaapipostproc format=23 ! vaapih265enc rate-control=cqp ! video/x-h265,width=1920,height=1080,framerate=25/1,stream-format=byte-stream ! h264parse ! vaapidecodebin ! vaapisink sync=false
Comment 8 Stanislav 2016-04-11 21:55:19 UTC
mistake in example pipeline, correct version:

gst-launch-1.0 decklinkvideosrc mode=11 connection=2 ! deinterlace mode=1 method=4 fields=2 ! videoconvert ! vaapipostproc format=23 ! vaapih264enc rate-control=cqp ! video/x-h264,width=1920,height=1080,framerate=25/1,stream-format=byte-stream ! h264parse ! vaapidecodebin ! vaapisink sync=false
Comment 9 Jan Schmidt 2016-06-24 17:19:03 UTC
Created attachment 330336 [details] [review]
Proposed fix

I was passed this patch and asked to submit it upstream. It provides an actual fix for the problem by correcting an incorrect comparison - the number of reference frames in the SPS should be the maximum frames we retain in the decoder, not the minimum
Comment 10 sreerenj 2016-06-27 13:03:35 UTC
Review of attachment 330336 [details] [review]:

I am rejecting this patch because of 2 reasons:

1: It is breaking the MVC playback.
You can try with reference streams MVCDS-1.264 or any of the MVCDS samples.

2: As per the spec (E.2.1)  The value of max_dec_frame_buffering shall be
*greater than or equal* to max_num_ref_frames (== sps->num_ref_frames) . And if the syntax element
max_dec_frame_buffering is not present(== in theory we consider this as no VUI header, but still approximating the value of max_dec_frame_buffering for finding the dpb size),
the value should be inferred to be equal to MaxDpbFrames (which is equal to 9 in the mentioned stream).
Comment 11 Matt Staples 2017-03-20 23:09:23 UTC
I have had this same problem with a number of sources (specifically, security cameras accessed via RTSP) that don't provide the optional vui header information. I see the same behavior where lots of decoded pictures are held in the DPB, and not sent downstream until they're bumped out by the next reference frame.  For some sources, the sink ends up dropping many of those frames (as too late), creating a very jittery presentation. 

Note that if I replace the vaapi decoder with avdec_h264, the same streams play smoothly and with low latency.  In order to get the vaapi decoder to play smoothly, without dropping any frames, I have to add about 500ms of latency to the pipeline, so that's not really a viable solution for viewing and controlling PTZ cameras.

So I think there may be two problems here: one is that given the current design, the vaapi h264 decoder element seems to be under-reporting its latency.  If it's going to potentially hold on to, say, 16 frames, then I think it needs to report that much latency so that the sink doesn't end up dropping frames when they're eventually released.

The second problem is that I think the current design could be improved to reduce latency by following the ffmpeg library's model and outputting frames for display as soon possible, but while leaving references to those frames in the DPB per requirements of the h264 spec.  (Even if a is still potentially needed for reference, it should be displayable as soon as its fully decoded and there are no sequence gaps that need to be filled in by later frames.)

So I've been experimenting with that approach, and I believe I've proved the concept, but I don't know the best way to determine when there are gaps in the sequence.  h264dec.c from ffmpeg has some complicated bookkeeping along those lines, but I'm not yet familiar enough either set of code to see how to apply the ffmpeg logic to vaapidecoder_h264.  
Is that something that could be inferred by comparing the number of 'ready' frames in the DPB with the GstVaapiDecoderH264Private members, RefPicList0_count and RefPicList1_count?
Comment 12 Jan Schmidt 2017-03-22 05:40:42 UTC
I think you're on the right track, and it's like what I wanted to try but have never found time for.
Comment 13 Matt Staples 2017-03-24 13:31:21 UTC
Created attachment 348647 [details] [review]
push frames downstream asap instead of waiting until they're ejected from the DPB

This patch works for all of the camera sources that I tried, including some that that produce non-trivial GOP patterns like I(BBrBP)*.  Still, it relies to some extent on the POC being counted by 2s. So it might need some tweaks to handle other encodings, for which I didn't have any examples to test against.
Comment 14 Jan Schmidt 2017-04-04 15:10:56 UTC
I was hoping that Sree might have time to review your patch, with his stronger knowledge of the spec. I'm trying to find time in case he doesn't have any spare.
Comment 15 Víctor Manuel Jáquez Leal 2017-04-07 12:17:18 UTC
@sree ping?
Comment 16 sreerenj 2017-04-07 23:50:50 UTC
I just tried to decode few samples before reviewing to make sure it actually works. But unfortunately it is not working for interlaced samples.
I can have a look into the patch when it doesn't break anything :)
Comment 17 sreerenj 2017-04-07 23:51:41 UTC
Created attachment 349510 [details]
interlaced sample to test
Comment 18 sreerenj 2017-04-07 23:57:05 UTC
As a side note, POC can be any valu including -ve numbers.
Comment 19 sreerenj 2017-04-08 00:01:38 UTC
Created attachment 349511 [details]
sample having negative poc which is also became undecodable with the proposed patch
Comment 20 Matt Staples 2017-04-08 01:15:16 UTC
Thanks.  It may be a few days before I have time to dig back into it, but the samples should be really helpful.
Comment 21 Víctor Manuel Jáquez Leal 2017-04-27 06:12:50 UTC
*** Bug 781748 has been marked as a duplicate of this bug. ***
Comment 22 Matt Staples 2017-05-25 12:47:57 UTC
Sorry, I haven't had time to work much on this, but I do have an update.

Correcting my previous patch so that it plays interleaved files was trivial.  It's just a matter of adding "&& gst_vaapi_frame_store_is_complete (priv->dpb[found_index])" after "*can_be_output = found_picture...".

I could post an updated patch that at fixes that problem. It does get both of the attached samples to play. The reason I haven't posted an update yet is that, while they play, frames can be played out of order if the input POC values are out of order as they are in the attached samples.

After initial failures at trying to correct that problem, I started studying the libav (libav) decoder to see how it handled that scenario.  Turns out it does not properly handle out-of-order frames either.  Unlike my currently broken solution, which actually sends them out of order, libav appears to just drop any frames that have earlier POCs than the most recently sent frame.

A quick test with the second sample showed that vaapidecode (without my changes) produced 73 frames, while libav produced only 66 frames.

So if libav's solution is an acceptable, I believe I could quickly update my patch to work as well as that. If not, then maybe there's a solution patch that uses an optional mode (e.g, "quick" vs "strict" modes, defaulting to the current strict behavior)?
Comment 23 sreerenj 2017-05-25 17:51:22 UTC
Which frame poc is out of order?I can see all frames have increasing POC while outputting. But We have some heuristics in gst-vaapi to handle the loss-of picture to make the playback smooth. Are you talking about the same ?
Comment 24 Matt Staples 2017-05-26 00:28:23 UTC
The decode order in attachment 349511 [details] starts with POC values -72, -71, -76, -75, -78, -77, -74, -73,...

With the original vaapi code, the first few frames (or rather field pairs) are held by the DPB and then sent out in proper numerical order starting with field pair (-77,-78).

My change causes the first complete field pair (-71,-72) to go out as soon as it's fully decoded, so when it subsequently completes decoding field pair (-76, -75), it's too late to send that out in order.

The libav decoder appears to handle that problem by simply dropping any frames that can't be sent out in order.  I'd like to amend my patch to do the same thing, however, I'm not sure that's a spec-compliant implementation.  Does the libav decoder pass compliance tests?
Comment 25 sreerenj 2017-05-26 06:28:05 UTC
(In reply to Matt Staples from comment #24)
> The decode order in attachment 349511 [details] starts with POC values -72,
> -71, -76, -75, -78, -77, -74, -73,...
> 
> With the original vaapi code, the first few frames (or rather field pairs)
> are held by the DPB and then sent out in proper numerical order starting
> with field pair (-77,-78).
> 

So what is the issue here. POC of frames in decoding order won't be incremental if there are B frames.
For eg: 
Decode order:              I P B B P (decode order)
POC in Dec order:          0 6 2 4 12
POC in Display order:      0 2 4 6 8


For the above stream:
Decode order:  -72/-71  -76/-75   -78/-77  -74/-73 
Display order: -78/-77  -76/-75   -74/-73  -71/-72
Comment 26 Matt Staples 2017-05-26 15:19:49 UTC
Created attachment 352653 [details] [review]
send frames out as soon as possible

This patch replaces my previous patch, with corrections for handling interleaved data, and for handling negative POC values.  It plays all of the attached samples (as well as all other sources I've tested with).

In reply to sreerenj's comment (#25):
I agree with everything you've said; the un-modified vaapi code decodes and displays in the correct order.  This bug only has to do with how long it holds onto frames before sending them downstream.  So it's about timing, not order.

The display-order problems that I mentioned above were just with my proposed patch, and only with the attached samples, not with any other sources I've tested with, including those with B-frames.  
I've been struggling to correct that, and I believe this new patch works as well as the software decoder does with those files.

My patch doesn't have any problems with your first example. It sends the initial I-frame (POC=0) downstream immediately.  The next decoded frame, POC=6, indicates a gap, so it can't be sent downstream until after frames with POC=2 and POC=4 are decoded.  So the timing of sending frames downstream looks something like the following: 
Decode order:              I  P  B  B    P
POC in Dec order:          0  6  2  4    12
Frames sent after decode:   0     2  4 6

The attached test file has the problem that first decoded frame is not also the first first frame in display order. (I believe that's unusual or maybe something that only happens when a file starts in mid-GOP?)  Anyway, frame -72/-71 gets sent downstream immediately.  Then, when frame -76/-75 is decoded, it's too late to display it in the proper order, so I have no choice but to drop it and the next couple frames.  The next frame sent downstream is -70/-69 and then the rest of file plays in the proper order with no dropped frames. 

Note that the software decoder appears to have the same issue and actually drops more frames with that file than my patched code does.
Comment 27 sreerenj 2017-05-26 21:43:23 UTC
(In reply to Matt Staples from comment #26)
> Created attachment 352653 [details] [review] [review]
> send frames out as soon as possible
> 
> 
> The attached test file has the problem that first decoded frame is not also
> the first first frame in display order. (I believe that's unusual or maybe
> something that only happens when a file starts in mid-GOP?)  Anyway, frame
> -72/-71 gets sent downstream immediately.  Then, when frame -76/-75 is
> decoded, it's too late to display it in the proper order, so I have no
> choice but to drop it and the next couple frames.  The next frame sent
> downstream is -70/-69 and then the rest of file plays in the proper order
> with no dropped frames. 

It could be rare, but technically it is valid to have a GOP of "IBB..." in decoding order, and B frames come before the first I frame in display order. So dropping frame is not seems to be the solution.
Comment 28 Víctor Manuel Jáquez Leal 2017-05-29 16:56:22 UTC
Review of attachment 352653 [details] [review]:

Regardless Sree's comment about the spec problem this patch might bring, let me comment the patch itself:

1. You included in the patch a change to the common submodule: https://bugzilla.gnome.org/attachment.cgi?id=352653&action=diff#a/common_sec1
   :) I guess that's no needed in you patch.

::: gst-libs/gst/vaapi/gstvaapidecoder_h264.c
@@ +838,3 @@
+        found_picture = pic;
+        found_index = i;
+        found_poc = pic->base.poc;

thanks for removing this comma operator. It always made me wary.

@@ +1692,3 @@
+                  &can_output)) >= 0) && can_output) {
+    dpb_output (decoder, priv->dpb[found_index]);
+  }

IMO this code is a bit complex. I would rewrite it like this:

while (TRUE) {
  found_index = dpb_find_lowest_poc_for_output (decoder, priv->current_picture, NULL, &can_output);
  if (found_output < 0 || !can_output)
    break;
  dpb_output (decoder, priv->dpb[found_index])
}

@@ +1727,3 @@
+  GST_DEBUG ("added %d to dpb", picture->base.poc);
+  /* TODO: only call the following if we're in optional
+     'low-latency' mode? */

we would need to add a property for all the decoders when it only would make sense for H264, and that's confusing.

Another option would be, when registering the element, the property will be added if the encoder is H264.
Comment 29 Matt Staples 2017-05-29 18:49:57 UTC
(In reply to sreerenj from comment #27)
> (In reply to Matt Staples from comment #26)
> > Created attachment 352653 [details] [review] [review] [review]
> > send frames out as soon as possible
> > 
> 
> It could be rare, but technically it is valid to have a GOP of "IBB..." in
> decoding order, and B frames come before the first I frame in display order.
> So dropping frame is not seems to be the solution.

Could you point me to an example of that?  I'm curious how libav decoder would handle that. I suspect it wouldn't handle it well since it drops frames in a similar way to my proposed patch.  As I've learned though the gstlibavdec element is not requesting strict compliance from the ffmpeg decoder code, so I believe the vaapidecoder is simply more correct.

I found that unlike gstlibavenc, which exposes a 'compliance' property, gstlibavdec just uses the default value of FF_COMPLIANCE_NORMAL.  By hacking the code to use FF_COMPLIANCE_STRICT, I got it to behave more like vaapidecoder does. Specifically, with the 2nd attached file, it holds on to about 9 more frames before sending any downstream, and it doesn't appear to drop any frames.  Whereas the as-is (FF_COMPLIANCE_NORMAL) version of libavdec behaves more like my patched version of vaapidecoder.

So anyway, I think there's a president for having some sort of compliance property to allow users to decide between strict compliance and lower latency (for applicable video sources).  It might be even be nice to add that property to both libavdec and vaapidec, although to avoid introducing problems with existing code, they'd need to have different default behavior.

Would that be an acceptable solution?
Comment 30 sreerenj 2017-05-29 20:10:42 UTC
(In reply to Matt Staples from comment #29)
> (In reply to sreerenj from comment #27)
> > (In reply to Matt Staples from comment #26)
> > > Created attachment 352653 [details] [review] [review] [review] [review]
> > > send frames out as soon as possible
> > > 
> > 
> > It could be rare, but technically it is valid to have a GOP of "IBB..." in
> > decoding order, and B frames come before the first I frame in display order.
> > So dropping frame is not seems to be the solution.
> 
> Could you point me to an example of that? 

The attached sample is having similar pattern IBBBP... I don't have other samples with me now.But I am pretty sure I have seen similar samples before. IIRC, it was something encoded with MediaSDK.
Comment 31 sreerenj 2017-05-29 20:23:23 UTC
I liked the idea of compliance property STRICT/NORMAL(may be something other than normal could be more ideal since we tweaking the spec).

BTW, there are other things which we should consider: your patch checking for a difference in POC <=2 , which is not true always. There could be poc's like 0,4,8 with only I and P frames. A perfect real life use case example is Hierarchical P frame encoded sample with 3 layers were the application(streaming source or something) or the decoder it self is dropping the 3rd layer.

Your patch will help in to improve the latency issue of some samples like the one mentioned in comment 3. 
But I believe, the patch should go under a property, not the default case for sure.
Comment 32 sreerenj 2017-05-29 20:38:44 UTC
H265 decoder seems to have a better mechanism to output the frames from dpb. Initially I thought we could adopt something from H265, but it is not easy since HEVC has a mandatory field in SPS to check the numer of reorder frames.
Comment 33 Matt Staples 2017-05-30 14:04:59 UTC
(In reply to sreerenj from comment #31)
> I liked the idea of compliance property STRICT/NORMAL(may be something other
> than normal could be more ideal since we tweaking the spec).
> 
Ok, I'll work on adding a new property.  And per the code review, I'll try to make it specific to h264 instead of being available to all flavors of vaapidecode.
I was assuming it would default to the current behavior, which I might call "strict". Maybe "low-latency" for the new mode?

> BTW, there are other things which we should consider: your patch checking
> for a difference in POC <=2 , which is not true always. There could be poc's
> like 0,4,8 with only I and P frames. A perfect real life use case example is
> Hierarchical P frame encoded sample with 3 layers were the
> application(streaming source or something) or the decoder it self is
> dropping the 3rd layer.
> 
In the case of POC spacing > 2, I the patch would be unable to send frames frames sooner.  So it wouldn't provide low-latency, but since all frames will eventually get sent in the proper order upon normal calls to dpb_bump(), it will naturally fall back to original (strict) behavior.
Comment 34 Orestis Floros 2017-06-02 21:15:27 UTC
(In reply to Matt Staples from comment #33)
> Ok, I'll work on adding a new property.  And per the code review, I'll try
> to make it specific to h264 instead of being available to all flavors of
> vaapidecode.

Did you progress with that? A vaapih264dec-specific property is needed for bug 732265 and bug 732266 too.
Comment 35 Matt Staples 2017-06-03 01:32:21 UTC
(In reply to Orestis Floros from comment #34)
> (In reply to Matt Staples from comment #33)
> > Ok, I'll work on adding a new property.  And per the code review, I'll try
> > to make it specific to h264 instead of being available to all flavors of
> > vaapidecode.
> 
> Did you progress with that? A vaapih264dec-specific property is needed for
> bug 732265 and bug 732266 too.

No, I was hoping to get some time to work on it this weekend.  I didn't realize I'd be in uncharted territory though, so that should make it interesting!
Comment 36 Matt Staples 2017-06-05 14:33:55 UTC
Created attachment 353187 [details] [review]
infrastructure for vaapide decoder-specific properties

This patch adds infrastructure to vaapidecide for allowing each type-specific decoder to define its own properties.  h264's new low-latency property serves as a first example.  I added it as a separate patch (as opposed to combining it with the previous patch for this bug) since that might make it easier to review this feature independently, and perhaps use this new infrastructure for the other two open issues that need to add h264-specific properties.
Comment 37 Matt Staples 2017-06-06 13:27:52 UTC
Review of attachment 353187 [details] [review]:

::: gst-libs/gst/vaapi/gstvaapidecoder_h264.c
@@ +4966,3 @@
+      GstVaapiDecoderPropsH264 *propsh264 = (GstVaapiDecoderPropsH264 *)
+          props;
+

This doesn't support changes to properties after decoding starts.  If that's a requirement, then decoder->priv could be modified to hold a reference to propsh264 instead of simply copying the property value over here.

::: gst/vaapi/gstvaapidecode.c
@@ +985,3 @@
   g_mutex_clear (&decode->surface_ready_mutex);
 
+  gst_vaapi_decoder_props_replace (&decode->decoder_props, NULL);

This should be wrapped in "if (decoder->decoder_props)"
Comment 38 Víctor Manuel Jáquez Leal 2017-06-06 15:12:54 UTC
Review of attachment 353187 [details] [review]:

Overall the proposal, from my point of view, is bit convoluted. I guess it can be done simpler by just installing properties, with a function pointer in decoders map structure, at class_init()

::: gst-libs/gst/vaapi/gstvaapiencoder_h264.c
@@ +2164,2 @@
   guint num_roi;
+#endif

This changeset is unrelated with the purpose of the patch.
Comment 39 Matt Staples 2017-06-06 17:00:56 UTC
(In reply to Víctor Manuel Jáquez Leal from comment #38)
> Review of attachment 353187 [details] [review] [review]:
> 
> Overall the proposal, from my point of view, is bit convoluted. I guess it
> can be done simpler by just installing properties, with a function pointer
> in decoders map structure, at class_init()
> 
> ::: gst-libs/gst/vaapi/gstvaapiencoder_h264.c
> @@ +2164,2 @@
>    guint num_roi;
> +#endif
> 
That's what I initially thought too.  I ran into several problems though:

First, the various decoder classes derive from GstVaapiMiniObject instead of GObject, so the normal property registration scheme doesn't work for them.

Second, the decoder object doesn't get constructed when the GstVaapiDecode is constructed.  Instead, that's delayed until caps are negotiated.  Also, the decoders class_init isn't called until the first object is created. 

So if properties were installed in the h264 decoder's class_init, then they wouldn't be available to gst-inspect or gst-launch.  That's why I added the decoder-specific install_props function.

A related problem is that gst-launch will set properties before the pipeline is linked together and playing. Hence the addition of the GstVaapiDecoderProps class.  That allows codec-specific properties to get configured before the runtime caps negotiation that eventually creates the codec-specific Decoder object.  When the Decoder object is created, the 'props' object gets passed to its constructor to initialize it.

So it's definitely more involved than I had originally envisioned, but I couldn't think of anything simpler that would work with the existing design.  I'm totally open to suggestions though.


> This changeset is unrelated with the purpose of the patch.

Yes, I wasn't sure of the procedure for thaat. This, and two other bugs (732265 and 732266) need to add codec-specific properties, but there's no framework in place to support that yet.  Maybe a new bug (feature request) should be opened, and all three bugs could reference that as a dependency?  Then this last patch could be moved there to keep this thread focused on the original bug?
Comment 40 sreerenj 2017-06-12 23:26:24 UTC
Hm, more work than imagined !

BTW, how about using an enum property(eg: property like "tune") than a boolean low-latency field?  So that, may be we could add more adventurous stuffs in future under a LESS_RESTRICTIVE codec compliance option.

This is just a suggestion, make patch only if everyone agrees. If it doesn't make sense then go ahead with the property "low-latency" itself :)
Comment 41 sreerenj 2017-06-13 01:25:05 UTC
Another way could be to create a GstVaapiDecoderXXX each time we register the element in gst/vaapi/gstvaapidecode.c, but only with a general caps. Then later set the actual caps while negotiating . But this might need major changes in the current design !
Comment 42 Matt Staples 2017-06-14 03:08:17 UTC
(In reply to sreerenj from comment #40)
> Hm, more work than imagined !
> 
> BTW, how about using an enum property(eg: property like "tune") than a
> boolean low-latency field?  So that, may be we could add more adventurous
> stuffs in future under a LESS_RESTRICTIVE codec compliance option.
> 
> This is just a suggestion, make patch only if everyone agrees. If it doesn't
> make sense then go ahead with the property "low-latency" itself :)

That sounds ok to me.  I'm not sure about the exact naming though.  Did you have specific names in mind for the property itself and the default and new (low-latency equivalent) values?
Comment 43 Matt Staples 2017-06-14 03:12:42 UTC
(In reply to sreerenj from comment #41)
> Another way could be to create a GstVaapiDecoderXXX each time we register
> the element in gst/vaapi/gstvaapidecode.c, but only with a general caps.
> Then later set the actual caps while negotiating . But this might need major
> changes in the current design !

I'd like to explore that idea further, but we should move this discussion of the property framework to the new dependent bug 783588.  I just uploaded a new patch there, which is independent of the low-latency behavior of this bug.
Comment 44 Víctor Manuel Jáquez Leal 2017-06-14 14:52:52 UTC
I was thinking in another approach to activate this 'low-latency' mode. Assuming that this case is going to happen mostly with live sources, why not enable it when upstream reports the source as a live one?

    query = gst_query_new_latency ();
    is_live = FALSE;
    /* Check if upstream is live. If it isn't we can enable frame based
     * threading, which is adding latency */
    if (gst_pad_peer_query (GST_VIDEO_DECODER_SINK_PAD (vdec), query)) {
      gst_query_parse_latency (query, &is_live, NULL, NULL);
    }
    gst_query_unref (query);

    gst_vaapi_decoder_h264_set_low_latency (decoder, is_live);

What do you think?
Comment 45 Matt Staples 2017-06-15 02:35:35 UTC
(In reply to Víctor Manuel Jáquez Leal from comment #44)
> I was thinking in another approach to activate this 'low-latency' mode.
> Assuming that this case is going to happen mostly with live sources, why not
> enable it when upstream reports the source as a live one?
> 
>     query = gst_query_new_latency ();
>     is_live = FALSE;
>     /* Check if upstream is live. If it isn't we can enable frame based
>      * threading, which is adding latency */
>     if (gst_pad_peer_query (GST_VIDEO_DECODER_SINK_PAD (vdec), query)) {
>       gst_query_parse_latency (query, &is_live, NULL, NULL);
>     }
>     gst_query_unref (query);
> 
>     gst_vaapi_decoder_h264_set_low_latency (decoder, is_live);
> 
> What do you think?

Interesting idea, but I'm not sure the need for this mode correlates entirely with live sources. Wouldn't playback of recorded live streams have the same behavior?  

Although I called it 'low-latency' mode, it's largely about avoiding the jittery rendering that's caused when the actual latency exceeds the value that the decoder reports to the sink. So I think it would be useful in playback as well as for live streams.

Conversely, I don't know whether it's possible for live streams to have the IBB... pattern for which the low-latency mode breaks strict conformance. If it is, then users might still want the strict behavior for those streams.
Comment 46 Víctor Manuel Jáquez Leal 2017-07-07 17:47:31 UTC
Created attachment 355114 [details] [review]
libs: decoder: h264: push frames as soon as possible

Push frames downstream as soon as possible instead of waiting until
they are ejected from the DPB.

This patch makes the decoder not comply with the H.264 specification,
but it is required for some video cameras.

https://bugzilla.gnome.org/show_bug.cgi?id=762509

Signed-off-by: Víctor Manuel Jáquez Leal <vjaquez@igalia.com>
Comment 47 Víctor Manuel Jáquez Leal 2017-07-07 17:49:07 UTC
Comment on attachment 353187 [details] [review]
infrastructure for vaapide decoder-specific properties

this belongs to bug 783588
Comment 48 Víctor Manuel Jáquez Leal 2017-07-07 17:51:10 UTC

(In reply to Víctor Manuel Jáquez Leal from comment #46)
> Created attachment 355114 [details] [review] [review]
> libs: decoder: h264: push frames as soon as possible
> 
> Push frames downstream as soon as possible instead of waiting until
> they are ejected from the DPB.
> 
> This patch makes the decoder not comply with the H.264 specification,
> but it is required for some video cameras.
> 
> https://bugzilla.gnome.org/show_bug.cgi?id=762509
> 
> Signed-off-by: Víctor Manuel Jáquez Leal <vjaquez@igalia.com>

Matt, please check this version of your patch. If it is OK I'll push it next week with patches on bug 783588
Comment 49 Matt Staples 2017-07-09 16:42:22 UTC
(In reply to Víctor Manuel Jáquez Leal from comment #48)
> 
> (In reply to Víctor Manuel Jáquez Leal from comment #46)
> > Created attachment 355114 [details] [review] [review] [review]
> > libs: decoder: h264: push frames as soon as possible
> > 
> > Push frames downstream as soon as possible instead of waiting until
> > they are ejected from the DPB.
> > 
> > This patch makes the decoder not comply with the H.264 specification,
> > but it is required for some video cameras.
> > 
> > https://bugzilla.gnome.org/show_bug.cgi?id=762509
> > 
> > Signed-off-by: Víctor Manuel Jáquez Leal <vjaquez@igalia.com>
> 
> Matt, please check this version of your patch. If it is OK I'll push it next
> week with patches on bug 783588

This looks good to me.  Thanks!
Comment 50 Víctor Manuel Jáquez Leal 2017-07-10 17:50:00 UTC
Attachment 355114 [details] pushed as 66d26da - libs: decoder: h264: push frames as soon as possible