GNOME Bugzilla – Bug 687183
videodecoder: Allow to negotiate a buffer pool before output format is known
Last modified: 2013-11-11 12:43:34 UTC
There is no way we can force a HW decoder like vaapidecoder to decode into a specific format. The hardware selects its best format for a given chroma type (which we can specify: 4:2:0, 4:2:2 or 4:4:4). If I understood correctly, all GStreamer video decoders are supposed to set the GST_VIDEO_FORMAT in the caps during negotiation. In the case of vaapi, we only get the decoded video format from the VA surface once the driver has rendered data into it. So there is no way to know the decoder's output video format until we have finished decoding at least one buffer. So IMHO we need a better negotiation approach to handle this in GStreamer 1.0.
The only way to do this is to have the decoder allocate suitable memory, decode into that, then negotiate and hope the sink supports the VA surface.
Hm, that is bad.. which means I have to create/use a buffer that does not belong to the pool!!
(In reply to comment #2)
> Hm, that is bad.. which means I have to create/use a buffer that does not
> belong to the pool!!

No, you create buffers from a bufferpool that is optimal for the decoder to decode into.
Then it needs a full deactivate/activate cycle of the pool if the default format changes... right?
Yes, you will have to guess, and if you guess wrong, it'll have to change.
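(For reference, a minimal sketch of that deactivate/reconfigure/activate cycle using the stock GstBufferPool API; the helper name and parameters are made up for illustration:)

    #include <gst/gst.h>

    /* Hypothetical helper: deactivate the pool, swap in the new caps/size,
     * and activate it again. All calls are standard GstBufferPool API. */
    static gboolean
    reconfigure_pool (GstBufferPool * pool, GstCaps * new_caps,
        guint size, guint min_buffers, guint max_buffers)
    {
      GstStructure *config;

      /* deactivating flushes the pool; buffers are freed once they
       * have all returned */
      if (!gst_buffer_pool_set_active (pool, FALSE))
        return FALSE;

      config = gst_buffer_pool_get_config (pool);
      gst_buffer_pool_config_set_params (config, new_caps, size,
          min_buffers, max_buffers);
      if (!gst_buffer_pool_set_config (pool, config))
        return FALSE;

      return gst_buffer_pool_set_active (pool, TRUE);
    }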
That is what I am doing now :) ... but it's not a nice idea, since different drivers may choose different formats as their preferred choice.
Is this still a problem with the way we thought about implementing this, with caps features, specific VA memory, etc.?
Yes, this is still a major issue. It has nothing to do with caps features, VA-specific memory, etc. New caps with video/x-surface might be a solution :) ..
Well, "video/x-raw(VASurface)" is the new "video/x-surface". You're right that you would need to negotiate a format, but that becomes only relevant if someone wants to map the buffer to system memory.
(In reply to comment #9)
> Well, "video/x-raw(VASurface)" is the new "video/x-surface". You're right
> that you would need to negotiate a format, but that only becomes relevant
> if someone wants to map the buffer to system memory.

How can you negotiate a buffer pool without knowing the output video format (the format of the decoder output)? What I did in my branch was to hard-code the video format to NV12 (the only format supported by the Sandy Bridge Intel driver) and negotiate the pool. But this is not the right way: the format will differ between drivers. We need to decode the first buffer and then check what format the driver has provided.
Well, you need to negotiate a format before you can allocate things and have a buffer pool. Isn't it the case for VA that there might be an internal format, but in the end you can generate whatever you want if you map the surface to system memory? In that case, just negotiate whatever downstream wants and be done with it. Or am I missing something here?
We can only specify the chroma type (4:2:0, 4:2:2 or 4:4:4) during VA surface creation. The hardware then selects its best format (NV12, YV12, etc.) for that chroma type; we have no control over it. That's why I said we will only know the surface format once the decoding of the first buffer has finished.

Regarding system memory mapping: there are ways (the vaDeriveImage API in libva) to retrieve the surface data without any colorspace conversion, i.e. the underlying surface data as-is. Is it clear now? Or Gwenole can explain it better if needed :)
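(To illustrate the vaDeriveImage point, a minimal hedged sketch; it assumes a valid VADisplay and a surface that has already been decoded into, and the helper name is made up:)

    #include <stdint.h>
    #include <va/va.h>

    /* Sketch: discover the format the driver actually picked by deriving
     * an image from an already-decoded surface. */
    static uint32_t
    query_surface_fourcc (VADisplay dpy, VASurfaceID surface)
    {
      VAImage image;
      uint32_t fourcc = 0;

      if (vaDeriveImage (dpy, surface, &image) == VA_STATUS_SUCCESS) {
        /* no colorspace conversion happens; this is the raw layout,
         * e.g. VA_FOURCC_NV12 on Sandy Bridge */
        fourcc = image.format.fourcc;
        vaDestroyImage (dpy, image.image_id);
      }
      return fourcc;
    }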
Yes, I don't have an answer about how to solve that though. We should find a solution that does not introduce new, non-"video/x-raw" caps. Maybe Wim has an idea?
Ok, so after talking to Wim: it is allowed to do an allocation query before caps are set, and even without caps inside the query. That's what you could do. However, in autoplugging scenarios that means your queries won't be answered at all, because downstream is only linked once the caps are known.
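(A minimal sketch of such a caps-less ALLOCATION query from the decoder's side; the helper name is made up, but the query calls are the stock GStreamer API:)

    #include <gst/gst.h>

    /* Hypothetical helper: ask downstream for a pool before any caps are
     * set. NULL caps are allowed in the allocation query. */
    static GstBufferPool *
    query_pool_without_caps (GstPad * srcpad)
    {
      GstQuery *query;
      GstBufferPool *pool = NULL;
      guint size, min, max;

      query = gst_query_new_allocation (NULL, TRUE);

      if (gst_pad_peer_query (srcpad, query) &&
          gst_query_get_n_allocation_pools (query) > 0) {
        /* downstream may return an unconfigured pool; it can only be
         * configured/activated once the format (and buffer size) is known */
        gst_query_parse_nth_allocation_pool (query, 0, &pool, &size,
            &min, &max);
      }

      gst_query_unref (query);
      return pool;
    }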
(In reply to comment #14)
> Ok, so after talking to Wim: it is allowed to do an allocation query before
> caps are set, and even without caps inside the query. That's what you could
> do.

Does that mean negotiating a pool without setting the output video format from the decoder (and invoking gst_video_decoder_set_output_state() after we finish decoding the first buffer)? Or setting the output format to GST_VIDEO_FORMAT_ENCODED? Then I worry that it will lead to buffer pool re-negotiation again..

> However, in autoplugging scenarios that means your queries won't be answered
> at all, because downstream is only linked once the caps are known.
(In reply to comment #15)
> Does that mean negotiating a pool without setting the output video format
> from the decoder (and invoking gst_video_decoder_set_output_state() after
> we finish decoding the first buffer)? Or setting the output format to
> GST_VIDEO_FORMAT_ENCODED?

Yes, not ENCODED. It will need changes to the base class though, as it currently requires a format to be set before negotiating a pool.

> Then I worry that it will lead to buffer pool re-negotiation again..

Can happen, yes.
And it also needs changes in decodebin so it continues autoplugging in your specific case even though no caps event has been seen yet, so that your allocation query reaches the sink. Btw, you should set all possible formats in the allocation query caps (NV12, I420, YV12, NV21, etc.). That way decodebin can at least make some decisions.
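(Sketch of what those allocation-query caps could look like; the exact format list is an example and should be whatever the driver could actually produce:)

    #include <gst/gst.h>

    /* Sketch: advertise every format the decoder might end up producing,
     * so decodebin has something to base its decisions on. */
    static GstQuery *
    make_allocation_query_with_formats (void)   /* hypothetical helper */
    {
      GstCaps *caps = gst_caps_from_string (
          "video/x-raw, format=(string){ NV12, I420, YV12, NV21 }");
      GstQuery *query = gst_query_new_allocation (caps, TRUE);

      gst_caps_unref (caps);
      return query;
    }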
(In reply to comment #16)
> Yes, not ENCODED. It will need changes to the base class though, as it
> currently requires a format to be set before negotiating a pool.
>
> > Then I worry that it will lead to buffer pool re-negotiation again..
>
> Can happen, yes.

Hm, I don't want buffer pool renegotiation. As I see it, there will be a renegotiation for sure, since we only set the output format after decoding the first buffer (maybe we can avoid this with more patches to the base decoder, not sure though). If pool renegotiation happens anyway, this gives no advantage over the current implementation :)
Well, there should be renegotiation once decodebin finishes autoplugging because of the allocation query. However, patches to decodebin (there's a TODO comment already) and to GstVideoDecoder are definitely required to make this work properly.

Different question though: can't you just create the buffer pool in the decoder and then let its buffers be used downstream (in the sink)? Are there any disadvantages to that?
(In reply to comment #19)
> Different question though: can't you just create the buffer pool in the
> decoder and then let its buffers be used downstream (in the sink)? Are
> there any disadvantages to that?

It is possible. In my older gstreamer-vaapi-1.0 branch the decoder created the pool when the renderer was a software sink. The current gstreamer-vaapi has entirely different mechanisms (and does not support vaapidecode + software renderer, for many reasons). It is also using internal pools for handling VA surfaces. I have to think about the disadvantages :)

One major issue with buffer pool re-negotiation in the vaapi case: the VA surfaces are bound to the VA context, so we cannot just create new VA surfaces for decoding. If we destroy surfaces and create new ones, that means destroying the VA context as well, which is an expensive operation. Gwenole was trying to do something about this in the driver, but I don't know how that progressed.
Hi, I think I also missed some notifications. Anyway, you should not assume anything about the underlying VA surface format. Never. You cannot know it (there is no API for it), and you may not be able to actually retrieve anything from it. It just serves as an "opaque" type. E.g. some VA surfaces can be encrypted: once you have decoded into one, you cannot read from it (only the GPU can, for rendering).

I am not really concerned about vaapidecode + download use cases, because I don't see any sensible one except for debugging purposes (validation). Thus, I don't really want to expose that. The only use cases that matter are (i) vaapidecode + HW renderer, and (ii) SW decode + HW renderer, with the HW renderer supporting VASurfaceID as a source.
(In reply to comment #20)
> One major issue with buffer pool re-negotiation in the vaapi case: the VA
> surfaces are bound to the VA context, so we cannot just create new VA
> surfaces for decoding. If we destroy surfaces and create new ones, that
> means destroying the VA context as well, which is an expensive operation.
> Gwenole was trying to do something about this in the driver, but I don't
> know how that progressed.

Well, there unfortunately won't be progress on that. Some VA drivers, which are not really open, depend on the fact that a VA context holds VA surfaces, as in: it also holds extra internal buffers/mappings to those VA surfaces. If we kill the VA context, we kill those internal buffers(!), which could also represent decode state that the HW requires for correctly decoding the next frames.

In short, it's better to stick to the current model: you allocate a decode session (VA context) once, and it lives until the encoded resolution changes, thus requiring new VA surfaces. Though, in order to initialize that context, you also need to know the maximum number of VA surfaces to allocate, i.e. you need to parse at least the sequence packets/layer.
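(A hedged sketch of that model: one VA context created up front with all its surfaces bound to it. The helper is hypothetical; num_surfaces would come from parsing the sequence headers, e.g. the DPB size:)

    #include <va/va.h>

    /* Sketch: allocate the decode session (VA context) once, binding all
     * surfaces to it. It then lives until the encoded resolution changes. */
    static VAStatus
    create_decode_session (VADisplay dpy, VAConfigID config,
        int width, int height,
        VASurfaceID * surfaces, int num_surfaces, VAContextID * context)
    {
      /* `config` is assumed to come from vaCreateConfig() for the
       * codec profile in use */
      return vaCreateContext (dpy, config, width, height, VA_PROGRESSIVE,
          surfaces, num_surfaces, context);
    }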
(In reply to comment #21)
> Anyway, you should not assume anything about the underlying VA surface
> format. Never. You cannot know it (there is no API for it), and you may not
> be able to actually retrieve anything from it. It just serves as an
> "opaque" type.
>
> I am not really concerned about vaapidecode + download use cases, because I
> don't see any sensible one except for debugging purposes (validation).
> Thus, I don't really want to expose that.

Well, it's something you should really expose to conform with how things are supposed to work in GStreamer :) It will be slow and not always work, but it should work where possible.

> In short, it's better to stick to the current model: you allocate a decode
> session (VA context) once, and it lives until the encoded resolution
> changes, thus requiring new VA surfaces. Though, in order to initialize
> that context, you also need to know the maximum number of VA surfaces to
> allocate, i.e. you need to parse at least the sequence packets/layer.

Could you maybe just allocate inside the decoder? That sounds like a safer solution given these constraints...
Also, I just noticed that decodebin should not wait for vaapidec to send a caps event before it continues autoplugging. vaapidec only has raw caps on the srcpad template, so it could immediately be considered complete, even before we get the caps event. That would also allow the ALLOCATION query to go through to the sink. It still needs to be implemented though, and has the disadvantage that for some pads you can't immediately get the current caps after decodebin has exposed them.
Does that make sense for vaapi? Are you working on that?
I am not working on this at the moment. It seems like we are discussing multiple issues here... are we? Maybe I can come back to this later if Gwenole (or someone else) is not working on it.
Ok, so what should we do with this bug? There are many potential solutions to all your problems, and some need to be implemented in GStreamer first... it just depends on you now, I guess ;)
Hm, okay. I have to read all the comments once more :) For gstreamer-vaapi, I would like to get some comments from Gwenole regarding the removal of "video/x-surface" + the 1.2 porting first ;)
I think Victor already did a lot of work on that, especially in WebKit, where video/x-surface is not used anymore when building against GStreamer 1.1.
Yup, I have seen those patches. Actually I am the one who removed x-surface and added caps features on top of his patches ;).. It still needs some review from gb.
Removal of the video/x-surface media type was already planned the day it was introduced in -bad for GStreamer 0.10. The reason is that GStreamer 1.0 now offers API that lets you better represent, and manipulate in software, those HW-allocated buffers.
Created attachment 250471 [details] [review]
videodecoder: try to negotiate the buffer pool even though there is no o/p format.

I am attaching a patch to handle this case in the base decoder. The remaining parts need to be handled in gstreamer-vaapi, I guess. I didn't look into the gstreamer-vaapi changes or the autoplugging issues. It still needs some work to get the autoplugging working correctly as per slomo's comments, right?
Yes. Also, I'm not sure that patch is enough; I think there are more places in the base class that assume an output context after negotiation.
(In reply to comment #33)
> Yes. Also, I'm not sure that patch is enough; I think there are more places
> in the base class that assume an output context after negotiation.

Aha, you are right. I will try to add the necessary changes. Thanks...

Note: I will look into the gstreamer-vaapi issues later (once __gb__ integrates the existing 1.2 stuff).
Created attachment 251352 [details] [review]
videodecoder: try to negotiate the buffer pool even though there is no o/p format.

I kept gst_video_decoder_allocate_output_frame() untouched and left all the sanity checks inside that routine as they are. So, effectively, applications can invoke gst_video_decoder_allocate_output_buffer() without setting the output video format, but not gst_video_decoder_allocate_output_frame(). Please let me know if there are any issues with this patch.
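(A hedged sketch of the behaviour described above, as I read the patch; the helper name and the simplified error handling are made up:)

    #include <gst/video/gstvideodecoder.h>

    /* Sketch: with the patch, a buffer can be requested before any output
     * format/state has been set; allocate_output_frame() keeps its checks. */
    static GstBuffer *
    grab_early_output_buffer (GstVideoDecoder * vdec)
    {
      /* may trigger (caps-less) pool negotiation internally */
      GstBuffer *buf = gst_video_decoder_allocate_output_buffer (vdec);

      if (buf == NULL) {
        /* no pool and no sized output state yet: nothing to hand out */
      }
      return buf;
    }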
Created attachment 251376 [details] [review]
videodecoder: error out from __allocate_output_frame() if it fails to negotiate.

This patch is just an enhancement to videodecoder to error out from gst_video_decoder_allocate_output_frame() if it fails to negotiate (irrespective of the issues in this bug report).
Comment on attachment 251376 [details] [review]
videodecoder: error out from __allocate_output_frame() if it fails to negotiate.

This is not correct: you don't want to return NOT_NEGOTIATED if negotiation failed just because the srcpad or downstream was flushing, for example.
(In reply to comment #35)
> Created an attachment (id=251352) [details] [review]
> videodecoder: try to negotiate the buffer pool even though there is no o/p
> format.

Can you show me the code where this is used in gst-vaapi? Just to understand how it would be used properly. This should probably also be documented somewhere.
Created attachment 251450 [details] [review]
videodecoder: error out from __allocate_output_frame() if it fails to negotiate.

Reworked based on slomo's comment.
(In reply to comment #38)
> Can you show me the code where this is used in gst-vaapi? Just to understand
> how it would be used properly. This should probably also be documented
> somewhere.

I haven't implemented this in gstreamer-vaapi yet. That will surely need more effort.
Created attachment 257412 [details] [review]
videodecoder: try to negotiate the buffer pool even though there is no o/p format.

I have rebased the older patch on top of git master. There might be other changes needed as well. Let's implement this feature in gstreamer-vaapi first.
The patch to gstreamer-vaapi is here: https://gitorious.org/vaapi/sree-gstreamer-vaapi/commits/48f317c637119c44b88ff932fe7759719946c613

The patch delays the gst_video_decoder_set_output_state() invocation in the vaapi decoder until it finishes decoding one buffer. It still does nothing with the allocation query :) because the current gstreamer-vaapi uses a separate surface pool, which is created inside the decoder, and the pool/memory/meta created via the allocation query and the other GStreamer mechanisms only gets references to the actual decoded surface.
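(A hedged sketch of that deferral; the helper is hypothetical, and the point at which the real format becomes known, after the first decoded buffer, is assumed to be handled by the caller:)

    #include <gst/video/gstvideodecoder.h>

    /* Sketch: only set the output state once the first buffer has been
     * decoded and the driver's real format is known. `ref_state` is the
     * input codec state the element keeps around. */
    static void
    set_deferred_output_state (GstVideoDecoder * vdec, GstVideoFormat format,
        guint width, guint height, GstVideoCodecState * ref_state)
    {
      GstVideoCodecState *out_state;

      out_state = gst_video_decoder_set_output_state (vdec, format,
          width, height, ref_state);
      gst_video_codec_state_unref (out_state);

      /* renegotiate so caps and pool reflect the driver's actual choice */
      gst_video_decoder_negotiate (vdec);
    }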
I tried the two patches and they don't fix my problem in WebKit. The decoder still tries to negotiate NV12 with the sink, even though the sink provides a GLUploadMeta.
(In reply to comment #43)
> I tried the two patches and they don't fix my problem in WebKit. The decoder
> still tries to negotiate NV12 with the sink, even though the sink provides a
> GLUploadMeta.

The decoder will always try to negotiate the format the driver internally selected for decoding, over which we have no control at the moment. And that is NV12.
(In reply to comment #44)
> The decoder will always try to negotiate the format the driver internally
> selected for decoding, over which we have no control at the moment. And
> that is NV12.

But if the sink requires a GL texture anyway, why would it matter? I don't understand why I need to add NV12 support to my sink, especially for VA-API.
We have to negotiate the format the decoder provides with whichever sink accepts the buffer. When using the upload meta, the necessary format conversion is done within the sink. For explicit format conversion we are supposed to use the vapostproc element, which is not yet ready.
@slomo: Is there any problem with committing the proposed patch?

Another question regarding the format negotiation issue: is it possible to negotiate an output buffer format of ENCODED between decoder and sink, and keep it as ENCODED throughout the lifetime of the pipeline? Use case: suppose the decoded buffer is encrypted and we don't know the type/format of the raw data within the buffer. Only a dedicated sink can do the rendering in this case. That means we have to forbid mapping the encrypted memory to system memory, since that is not possible without knowing the output format.
Review of attachment 257412 [details] [review]:

::: gst-libs/gst/video/gstvideodecoder.c
@@ +2941,3 @@
   gst_video_info_init (&vinfo);
+  if (outcaps)
+    gst_video_info_from_caps (&vinfo, outcaps);

The vinfo is probably used some lines below; I'm not sure if the default values in there make sense or if something else should be done instead.

@@ +3198,3 @@
   GST_VIDEO_DECODER_STREAM_LOCK (decoder);
+  if (G_UNLIKELY (!decoder->priv->output_state
+          || decoder->priv->output_state_changed

How can you even allocate a buffer without knowing the output format? I think this looks wrong. Negotiation shouldn't be allowed to be triggered from buffer allocation if the output format is not known yet; it should be triggered elsewhere.
(In reply to comment #48)
> Review of attachment 257412 [details] [review]:
>
> The vinfo is probably used some lines below; I'm not sure if the default
> values in there make sense or if something else should be done instead.

I think this is correct, because vinfo.size is zero in both places where it is used, and that is what we need. Right?

> How can you even allocate a buffer without knowing the output format? I
> think this looks wrong.

I think there is no issue here either :). As per our previous discussions on IRC, we can create a buffer pool without knowing the caps. Which means the upstream element would just return a bufferpool object. The downstream element can create the pool using its internal memory allocator, which doesn't need the exact format, only the chroma type.

> Negotiation shouldn't be allowed to be triggered from buffer allocation if
> the output format is not known yet; it should be triggered elsewhere.

No. We should allow negotiation to be triggered from the buffer allocation point itself, so that we get the buffer pool object from upstream.
(In reply to comment #48)
> Review of attachment 257412 [details] [review]:
>
> The vinfo is probably used some lines below; I'm not sure if the default
> values in there make sense or if something else should be done instead.

I think this is correct, because vinfo.size is zero in both places where it is used, and that is what we need. Right?

> How can you even allocate a buffer without knowing the output format? I
> think this looks wrong.

I think there is no issue here either :). As per our previous discussions on IRC, we can create a buffer pool without knowing the caps. Which means the downstream element would just return a bufferpool object. The upstream element can create the pool using its internal memory allocator, which doesn't need the exact format, only the chroma type.

> Negotiation shouldn't be allowed to be triggered from buffer allocation if
> the output format is not known yet; it should be triggered elsewhere.

No. We should allow negotiation to be triggered from the buffer allocation point itself, so that we get the buffer pool object from downstream.
Please disregard comment 49, where I mixed up the words downstream/upstream. It is corrected in comment 50.
(In reply to comment #44)
> The decoder will always try to negotiate the format the driver internally
> selected for decoding, over which we have no control at the moment. And
> that is NV12.

However, if the GLTextureUploadMeta will then provide something that behaves like an RGBA texture, the caps should also say RGBA.
So, do the vaapidecode src caps need to be updated or not, Sree?
(In reply to comment #53)
> So, do the vaapidecode src caps need to be updated or not, Sree?

I think we are trying to discuss two topics here :(. If I understood correctly, your question is: "Do we need to update the src caps of vaapidecode with RGB (or whatever format) that the sink can support via GLUploadMeta?" I would say *no* :) Could you please open a new bug against gstreamer-vaapi for that (if you have issues)? Otherwise this bug report will become a mess :)
Ok then, see bug 711828.
(In reply to comment #55) > Ok then, see bug 711828. Thanks :)
@slomo: Do you agree with comment 50? Do you have any other concerns about the patch?
(In reply to comment #50)
> > The vinfo is probably used some lines below; I'm not sure if the default
> > values in there make sense or if something else should be done instead.
>
> I think this is correct, because vinfo.size is zero in both places where it
> is used, and that is what we need. Right?

Probably.

> > How can you even allocate a buffer without knowing the output format? I
> > think this looks wrong.
>
> I think there is no issue here either :). As per our previous discussions
> on IRC, we can create a buffer pool without knowing the caps. Which means
> the downstream element would just return a bufferpool object. The upstream
> element can create the pool using its internal memory allocator, which
> doesn't need the exact format, only the chroma type.

Yes, that seems correct after all. But you should add some checks afterwards to verify that we either have a buffer pool now or have an output_state with size != 0 (see the fallback path). If neither of the two is TRUE, an error should be returned, probably GST_FLOW_NOT_NEGOTIATED.
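(A hedged sketch of that suggested check; the helper is hypothetical, and the field access follows GstVideoCodecState's public `info` member:)

    #include <gst/video/gstvideodecoder.h>

    /* Sketch: after (attempted) negotiation, succeed only if we got a pool
     * or an output state whose GstVideoInfo carries a usable size. */
    static GstFlowReturn
    check_negotiation_result (GstBufferPool * pool, GstVideoCodecState * state)
    {
      if (pool != NULL)
        return GST_FLOW_OK;
      if (state != NULL && state->info.size != 0)
        return GST_FLOW_OK;   /* fallback allocation path can still work */
      return GST_FLOW_NOT_NEGOTIATED;
    }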
Created attachment 259550 [details] [review] videodecoder: try to negotiate the buffer pool even though there is no o/p format.
Done! Please let me know if it needs any further modification.
commit cd52ff313e2c9eca0d20787f598b1d752da4c5f7
Author: Sreerenj Balachandran <sreerenj.balachandran@intel.com>
Date: Mon Nov 11 14:10:53 2013 +0200

videodecoder: try to negotiate the buffer pool even though there is no o/p format

We can get an allocation query before the caps event, and even without caps inside the query. In such cases downstream can return a bufferpool object without actually configuring it. This makes it possible to negotiate the bufferpool without knowing the output video format. For example, some hardware-accelerated decoders can only determine the output video format after decoding at least one buffer.

https://bugzilla.gnome.org/show_bug.cgi?id=687183
Thanks!