GNOME Bugzilla – Bug 752802
avviddec: Sends incorrect CAPS event before the correct one
Last modified: 2015-08-17 08:55:17 UTC
In gst_ffmpegviddec_get_buffer2(), the AVPicture for some 1080p streams contains 1088 instead of 1080 as the height, so it negotiates with the wrong format and later renegotiates to the right one when a frame is actually produced. This is a regression that was introduced by 5b6c8ee8, and it's caught by a number of gst-validate tests using the sample H1_00094_1920x1280_MTS. This possibly also needs to be fixed in ffmpeg. Although I wonder if that size isn't the size that needs to be allocated as a scratch buffer, i.e. the required allocation size rather than the final picture size.
Created attachment 308111 [details] [review]
avviddec: Don't try to negotiate using width/height from get_buffer2

These are coded values, not the size of the output frame. So they cannot be used in the caps; the real ones will come with the decoded frame. The allocation query is also pretty much useless as direct rendering is hard-disabled anyway.
(In reply to Olivier Crête from comment #1)
> Created attachment 308111 [details] [review] [review]
> avviddec: Don't try to negotiate using width/height from get_buffer2
>
> These are coded values, not the size of the output frame. So they cannot
> be used in the caps; the real ones will come with the decoded frame. The
> allocation query is also pretty much useless as direct rendering is
> hard-disabled anyway.

Why did we "hard disable" that? It used to work fine, I actually spent a lot of time on it.
@stormer: That's a question for slomo; if I re-enable it, it crashes in some of the gst-validate tests.

In practice, I don't think we can do direct rendering before a frame is produced with libav and GstVideoDecoder, because it doesn't differentiate between the caps in the allocation query and the caps event.
(In reply to Olivier Crête from comment #3)
> @stormer: That's a question for slomo; if I re-enable it, it crashes in some
> of the gst-validate tests.
>
> In practice, I don't think we can do direct rendering before a frame is
> produced with libav and GstVideoDecoder, because it doesn't differentiate
> between the caps in the allocation query and the caps event.

I don't think this is entirely true. In 90% of the cases, you can get the width/height from upstream. The rest of the caps is the same. It probably crashes in a case where the stream changes resolution. Imho, if we know it from the upstream caps, we should drain the decoder and then start again; it's much simpler. I guess I'll have to check that later.

That means we'll move from zero to 2 copies in the avdec_* ! glimagesink case now, sad moment.
Comment on attachment 308111 [details] [review]
avviddec: Don't try to negotiate using width/height from get_buffer2

As discussed on IRC, this replaces one problem with another... and the "if (TRUE ||)" should go away in any case.
Created attachment 308264 [details] [review]
avviddec: Use the AVContext width/height for the initial negotiation

Using the size from the AVFrame doesn't work as it is the coded size, not the actual size.

This is a second take on the patch: since it is the first buffer, there is no renegotiation and the AVContext will have the right size.
Review of attachment 308264 [details] [review]:

::: ext/libav/gstavviddec.c
@@ +846,3 @@
+    ffmpegdec->pic_height = picture->height;
+  else
+    ffmpegdec->pic_height = context->height;

Will the picture always have the correct size after the first negotiation? That is, you never get 1088 as height in the picture in get_buffer2 after the first call to it?
After the first negotiation, it will not do it from get_buffer2() but only after the picture is out, if I understand correctly?
Created attachment 308339 [details] [review]
avviddec: Disable direct rendering before a frame has been produced

The information in the AVFrame in get_buffer2() is incomplete: the interlacing information is missing and the width/height is bogus. So we cannot use it to produce a CAPS event. Instead we need to wait for an actual buffer to be produced to do the caps negotiation and enable direct rendering.

Partially reverts the effects of 982f526
Actually, this doesn't work either: since we can't change the stride after doing it once with the fallback alloc, we need to do the negotiation either in _get_buffer2() or earlier (like from the input state). And I guess if anything changed, then you lose direct rendering anyway.
This is worse than I thought; here are some bits from my discussion on #ffmpeg-devel:

<nevcairiel> interlacing is of no consequence to the memory requirements for the image buffer, and all other relevant properties are well known at that time
<ocrete> in many hardware, you need to initialise the display to get the ram from
<nevcairiel> and get_buffer2 is not a callback to initialize your display, its a callback to get a memory buffer to decode the frame into
<nevcairiel> if you use it for something else, you get to deal with problems from that :)
<wm4> are you sure software decoding would be fast enough to even reach realtime on such a crappy restricted device?
<wm4> but a memcpy kills it?
<wm4> at this point you could probably decode a frame to software memory, initialize the display, and then decode to video ram
<wm4> by starting again

So I think we have two options:
1. Drop the whole downstream allocation codepath from gst-libav, it may have the side-effect of simplifying the code quite a bit.
2. Decode enough to get one output frame, get the caps, and then start decoding for real...
(In reply to Olivier Crête from comment #11)
> So I think we have two options:
> 1. Drop the whole downstream allocation codepath from gst-libav, it may have
> the side-effect of simplifying the code quite a bit.
> 2. Decode enough to get one output frame, get the caps, and then start
> decoding for real...

Or 3., switch back to libav :)

2. is not a solution as the format might change in the middle of the stream, which then only leaves 1. and 3.
Hmm, even if we do 1, how would we get the resolution after decoding? The width/height of the AVFrame is coded_width/height... the width/height of the context is the display width/height of the last *input*.
An advantage of 1 is also that we no longer push buffers downstream that are still mapped read-write ;)
The width/height in the final AVFrame are ok; it's only wrong during get_buffer2(). ffmpeg sets it to the coded value just in this case.
Then let's go with 1. and be unhappy about the performance degradation?
Option 4: In get_buffer2(), we can do an allocation query, but using only the partial caps we can get to, which are height/width/pixel-aspect-ratio, without first pushing a caps event; this means not using the bufferpool support from GstVideoDecoder. Then hope that these will still work with the final caps, and if they don't, make a copy.
Option 5: Before the initial output buffer has been created, create malloc'ed buffers with a stride that matches the one that will be computed in decide_allocation() later on; this way we can switch to direct rendering later if possible. If downstream turns out not to accept this stride, then we can normalize it with gst_video_frame_copy().
What's up with this? Are we trying to figure out which option to go for? What are the chances ffmpeg people will fix up their API in the longer run?
(In reply to Tim-Philipp Müller from comment #19)
> What's up with this? Are we trying to figure out which option to go for?

The plan that should work to gain back partial direct rendering is to start direct rendering using an internal video buffer pool (aligned as needed). Then, when we know the final caps, we negotiate the allocation, try the new pool (allocating a buffer), check if the strides match, and if so, exchange the pools to effectively do direct rendering with the downstream pool. In all other cases, we'd copy the frame out, which is what ffmpeg does internally in non-direct-rendering mode.

> What are the chances ffmpeg people will fix up their API in the longer
> run?

None, and their argument is strong: they start decoding as soon as they can figure out the allocation size. Waiting longer makes no sense and would increase delays. GStreamer does not have this distinction (allocation size vs. final caps). I'm starting to agree it is slightly a GStreamer limitation to confuse the allocation information with the final caps. I have communicated my intention to try and address this limitation: allowing the allocation query to happen before the caps event, with caps that may or may not match the final caps (the caps should still match in allocation size, at least when combined with VideoAlignment).
(In reply to Nicolas Dufresne (stormer) from comment #20)
> > What are the chances ffmpeg people will fix up their API in the longer
> > run?
>
> None, and their argument is strong: they start decoding as soon as they can
> figure out the allocation size. Waiting longer makes no sense and would
> increase delays. GStreamer does not have this distinction (allocation size
> vs. final caps). I'm starting to agree it is slightly a GStreamer
> limitation to confuse the allocation information with the final caps.

The caps in the ALLOCATION query are the allocation sizes, the caps in the CAPS event/query are the display/render sizes. It's all there :)

The main problem is that it's not well supported to have the ALLOCATION query before the CAPS event.
(In reply to Sebastian Dröge (slomo) from comment #21)
> The caps in the ALLOCATION query are the allocation sizes, the caps in the
> CAPS event/query are the display/render sizes. It's all there :)
>
> The main problem is that it's not well supported to have the ALLOCATION
> query before the CAPS event.

That, and the GstVideoDecoder base class forcing those caps to be the same.
And it's not just "not well": GstBaseTransform does not support the allocation query before caps, basically because it has no idea yet if it's going passthrough or not.
Neither of those is unfixable though, and it's something we should fix sooner or later.
I'm not sure yet how the transform element could decide between forwarding the query or replying to it without caps. Let me know if you have an idea. Choosing between passthrough or not may depend on the interlacing mode, and that is one piece of information we don't initially get from libav/ffmpeg.
It can assume that the final caps will give the same answer as the caps in the query, if they don't, it can always send a reconfigure event upstream.
So let's go here with the internal buffer pool, as described by Nicolas?
Created attachment 309367 [details] [review]
avviddec: Re-enable direct rendering

This is achieved by using a temporary internal pool. We can then switch to a downstream pool if the downstream pool buffers have matching strides.
Attachment 309367 [details] pushed as e3bdfb5 - avviddec: Re-enable direct rendering

I've cleaned up a bit more, removed the unneeded have_dr boolean in the class instance and changed a few comments to clarify. I've also rechecked the resulting behavior and it works as expected.

I think it's worth mentioning that the first few frames will be copied from the internal pool memory to the external pool memory until the decoder has started allocating from the downstream pool. The reason is that for now, gluploader (and probably others) doesn't allow going upward in the selection of the best upload path. So if we end up sending system memory, it will choose the raw upload path and will never go back to the basic upload path for GL memory. It's also the best in the case where downstream could only receive a specific memory type like DMABUF.