GNOME Bugzilla – Bug 752802
avviddec: Sends incorrect CAPS event before the correct one
Last modified: 2015-08-17 08:55:17 UTC
In gst_ffmpegviddec_get_buffer2(), the AVPicture for some 1080p streams contains 1088 instead of 1080 as the height, so it negotiates with the wrong format and later renegotiates to the right one when a frame is actually produced. This is a regression that was introduced by 5b6c8ee8, and it's caught by a number of gst-validate tests using the sample H1_00094_1920x1280_MTS. This possibly also needs to be fixed in ffmpeg. Although I wonder if that size isn't the size that needs to be allocated as a scratch buffer, i.e. the required allocation size rather than the final picture size.
Created attachment 308111 [details] [review]
avviddec: Don't try to negotiate using width/height from get_buffer2

These are coded values, not the size of the output frame. So they cannot be used in the caps; the real ones will come with the decoded frame. The allocation query is also pretty much useless as direct rendering is hard-disabled anyway.
(In reply to Olivier Crête from comment #1)
> Created attachment 308111 [details] [review] [review]
> avviddec: Don't try to negotiate using width/height from get_buffer2
>
> These are coded values, not the size of the output frame. So they cannot
> be used in the caps; the real ones will come with the decoded frame. The
> allocation query is also pretty much useless as direct rendering is
> hard-disabled anyway.

Why did we "hard disable" that? It used to work fine, I actually spent a lot of time on it.
@stormer: That's a question for slomo; if I re-enable it, it crashes in some of the gst-validate tests.

In practice, I don't think we can do direct rendering before a frame is produced with libav and GstVideoDecoder, because it doesn't differentiate between the caps in the allocation query and the caps event.
(In reply to Olivier Crête from comment #3)
> @stormer: That's a question for slomo; if I re-enable it, it crashes in some
> of the gst-validate tests.
>
> In practice, I don't think we can do direct rendering before a frame is
> produced with libav and GstVideoDecoder, because it doesn't differentiate
> between the caps in the allocation query and the caps event.

I don't think this is entirely true. In 90% of the cases, you can get the width/height from upstream. The rest of the caps is the same. It probably crashes in a case where the stream changes resolution. Imho, if we know it from the upstream caps, we should drain the decoder and then start again; it's much simpler. I guess I'll have to check that later.

That means we'll move from zero to 2 copies in the avdec_* ! glimagesink case now, sad moment.
Comment on attachment 308111 [details] [review]
avviddec: Don't try to negotiate using width/height from get_buffer2

As discussed on IRC, this replaces one problem with another... and the "if (TRUE ||)" should go away in any case.
Created attachment 308264 [details] [review]
avviddec: Use the AVContext width/height for the initial negotiation

Using the size from the AVFrame doesn't work as it is the coded size, not the actual size.

This is a second take on the patch: since it is the first buffer, there is no renegotiation and the AVContext will have the right size.
Review of attachment 308264 [details] [review]:

::: ext/libav/gstavviddec.c
@@ +846,3 @@
+    ffmpegdec->pic_height = picture->height;
+  else
+    ffmpegdec->pic_height = context->height;

Will the picture always have the correct size after the first negotiation? That is, you never get 1088 as height in the picture in get_buffer2 after the first call to it?
After the first negotiation, it will not do it from get_buffer2() but only after the picture is out, if I understand correctly?
Created attachment 308339 [details] [review]
avviddec: Disable direct rendering before a frame has been produced

The information in the AVFrame in get_buffer2() is incomplete: the interlacing information is missing and the width/height is bogus. So we cannot use it to produce a CAPS event. Instead we need to wait for an actual buffer to be produced to do the caps negotiation and enable direct rendering.

Partially reverts the effects of 982f526
Actually, this doesn't work either: since we can't change the stride after doing it once with the fallback alloc, we need to do the negotiation either in _get_buffer2() or earlier (like from the input state). And I guess if anything changed, then you lose direct rendering anyway.
This is worse than I thought; here are some bits from my discussion on #ffmpeg-devel:

<nevcairiel> interlacing is of no consequence to the memory requirements for the image buffer, and all other relevant properties are well known at that time
<ocrete> in many hardware, you need to initialise the display to get the ram from
<nevcairiel> and get_buffer2 is not a callback to initialize your display, its a callback to get a memory buffer to decode the frame into
<nevcairiel> if you use it for something else, you get to deal with problems from that :)
<wm4> are you sure software decoding would be fast enough to even reach realtime on such a crappy restricted device?
<wm4> but a memcpy kills it?
<wm4> at this point you could probably decode a frame to software memory, initialize the display, and then decode to video ram
<wm4> by starting again

So I think we have two options:
1. Drop the whole downstream allocation codepath from gst-libav, it may have the side-effect of simplifying the code quite a bit.
2. Decode enough to get one output frame, get the caps, and then start decoding for real...
(In reply to Olivier Crête from comment #11)
> So I think we have two options:
> 1. Drop the whole downstream allocation codepath from gst-libav, it may have
> the side-effect of simplifying the code quite a bit.
> 2. Decode enough to get one output frame, get the caps, and then start
> decoding for real...

Or 3., switch back to libav :)

2. is not a solution as the format might change in the middle of the stream, which then only leaves 1. and 3.
Hmm, even if we do 1, how would we get the resolution after decoding? The width/height of the AVFrame is coded_width/height... the width/height of the context is the display width/height of the last *input*.
An advantage of 1 is also that we no longer push buffers downstream that are still mapped read-write ;)
The width/height in the final AVFrame are ok; it's only wrong during get_buffer2(). ffmpeg sets it to the coded value just in this case.
Then let's go with 1. and be unhappy about the performance degradation?
Option 4: In get_buffer2(), we can do an allocation query, but using only the partial caps we can get to, which are height/width/pixel-aspect-ratio, without first pushing a caps event; this means not using the bufferpool support from GstVideoDecoder. Then hope that these will still work with the final caps, and if they don't, make a copy.
Option 5: Before the initial output buffer has been created, create malloc'ed buffers with a stride that matches the one that will be computed in decide_allocation() later on; this way we can switch to direct rendering later if possible. If downstream turns out not to accept this stride, then we can normalize it with gst_video_frame_copy().
What's up with this? Are we trying to figure out which option to go for? What are the chances ffmpeg people will fix up their API in the longer run?
(In reply to Tim-Philipp Müller from comment #19)
> What's up with this? Are we trying to figure out which option to go for?

The plan that should work to gain back partial direct rendering is to start direct rendering using an internal video buffer pool (aligned as needed). Then, when we know the final caps, we negotiate the allocation, try the new pool (allocating a buffer), check if the strides match, and if so, exchange the pools to effectively do direct rendering with the downstream pool. In all other cases, we'd copy the frame out, which is what ffmpeg does internally in non-direct-rendering mode.

> What are the chances ffmpeg people will fix up their API in the longer
> run?

None, and their argument is strong: they start decoding as soon as they can figure out the allocation size. Waiting longer makes no sense and would increase delays. GStreamer does not have this distinction (allocation size vs. final caps). I'm starting to agree it is slightly a GStreamer limitation to confuse the allocation information with the final caps. I have communicated my intention to try and address this limitation: allowing the allocation query to happen before the caps event, with caps that may or may not match the final caps (the caps should still match in allocation size, at least when combined with VideoAlignment).
(In reply to Nicolas Dufresne (stormer) from comment #20)
> > What are the chances ffmpeg people will fix up their API in the longer
> > run?
>
> None, and their argument is strong: they start decoding as soon as they can
> figure out the allocation size. Waiting longer makes no sense and would
> increase delays. GStreamer does not have this distinction (allocation size
> vs. final caps). I'm starting to agree it is slightly a GStreamer
> limitation to confuse the allocation information with the final caps.

The caps in the ALLOCATION query are the allocation sizes, the caps in the CAPS event/query are the display/render sizes. It's all there :)

The main problem is that it's not well supported to have the ALLOCATION query before the CAPS event.
(In reply to Sebastian Dröge (slomo) from comment #21)
> The caps in the ALLOCATION query are the allocation sizes, the caps in the
> CAPS event/query are the display/render sizes. It's all there :)
>
> The main problem is that it's not well supported to have the ALLOCATION
> query before the CAPS event.

That, and the GstVideoDecoder base class forcing those caps to be the same.
And it's not just "not well": GstBaseTransform does not support the allocation query before caps, basically because it has no idea yet if it's going passthrough or not.
Neither of those is unfixable though, and it's something we should fix sooner or later.
I'm not sure yet how the transform element could decide between forwarding the query or replying to it without caps. Let me know if you have an idea. Choosing between passthrough or not may depend on the interlacing mode, and that is one piece of information we don't initially get from libav/ffmpeg.
It can assume that the final caps will give the same answer as the caps in the query, if they don't, it can always send a reconfigure event upstream.
So let's go here with the internal buffer pool, as described by Nicolas?
Created attachment 309367 [details] [review]
avviddec: Re-enable direct rendering

This is achieved by using a temporary internal pool. We can then switch to a downstream pool if the downstream pool buffers have matching strides.
Attachment 309367 [details] pushed as e3bdfb5 - avviddec: Re-enable direct rendering

I've cleaned up a bit more, removed the unneeded have_dr boolean in the class instance and changed a few comments to clarify. I've also rechecked the resulting behavior and it works as expected.

I think it's worth mentioning that the first few frames will be copied from the internal pool memory to the external pool memory until the decoder has started allocating from the downstream pool. The reason is that for now, gluploader (and probably others) doesn't allow going upward in the selection of the best upload path. So if we end up sending system memory, it will choose the raw upload path and will never go back to the basic upload path for GL memory. It's also the best in the case where downstream could only receive a specific memory type like DMABUF.