GNOME Bugzilla – Bug 737599
Some decoders need AVI's header suggested buffer size
Last modified: 2018-01-23 11:16:47 UTC
Some decoders need AVI's header dwSuggestedBufferSize to properly calculate the size for the input buffers. The attached patch adds that information to the source GstCaps object.
Created attachment 287361 [details] [review] riff: Add input buffer size to GstCaps
Review of attachment 287361 [details] [review]: I'm not sure this is the right fix. That would mean the "some decoder" only works with muxed stream that do expose this information. I'm not saying this issue does not exist, e.g. v4l2videodec currently pick an arbitrary buffer size, but in the longer term, the idea is that it should be able to reallocate it's input buffers to fit large size. When demuxer will start using downstream buffer pools, they should setup the pool to fit this size, so the decoder never have to reallocate. I'd like a wider reflection on this to take place, so we actually solve the global issue, and not just AVI use case for "some decoder".
I agree that this is not a general solution. However, I found out that the proprietary decoder (I do not have the sources) that I have to work with can decode some very specific files (interleaved divx on avi containers) only if it has a clue on the needed input buffer size. I think that the the value in the AVI header can be considered a valuable hint on the input buffer size. As we have it, we can pass it around and it might be helpful (it actually is for me). I would say that is harmless to do so, we are just providing more information in case it is needed, regardless of any future more general solution.
(In reply to comment #3) > I agree that this is not a general solution. However, I found out that the > proprietary decoder (I do not have the sources) that I have to work with can > decode some very specific files (interleaved divx on avi containers) only if it > has a clue on the needed input buffer size. > > I think that the the value in the AVI header can be considered a valuable hint > on the input buffer size. As we have it, we can pass it around and it might be > helpful (it actually is for me). I would say that is harmless to do so, we are > just providing more information in case it is needed, regardless of any future > more general solution. No, if we commit to this, we will have to maintain it. Vendor specific code should remain vendor specific. If we upstream a mechanism to solve this issue, we should upstream the right one and commit to it.
Ok. Let's discuss about the general solution. Certainly, the ideal situation would be to have the right input buffer size from the start, and we must try to do that for any stream. However, calculating the input buffer size can be tricky, especially in cases like live streams. Therefore, in some cases we will need to re-allocate input buffers, and it is something that must be supported by the media stack. But this is costly in terms of time, especially for hardware decoders, as we could need to gather contiguous physical memory, flash caches, etc., possible causing video freezes. So we should avoid that if possible. In case the stream/container already provides a sensible suggestion for the input buffer size, we can take it as an *optional* parameter for our initial calculations of the buffer size, so we avoid the situation just described. So this change fits in that situation (it provides information that can be used as part of the parameters for calculating the input buffer size) and it is then part of the general solution. In case other containers provide this kind of information we could also use it, once we agree on the name and format of the parameter: it is not vendor specific stuff. Note that this does not conflict with the need to support buffer re-allocation: it is just a way to diminish the chance of having to re-allocate. I do not see this as really different from the "width" and "height" that are filled in GstCaps, as we could also calculate them from the stream instead of using those in the container. But it is convenient to get them from the container.
Would someone please decide if they are convinced by the arguments in comment #5 and either merge the patch or mark it as not accepted? I'm getting some pressure to put this in Ubuntu's package but base is currently unpatched (wrt. Debian) and I really want it to stay that way.
At first glance it does not seem like a good idea to me. Could you make a sample file available by any chance where this is a problem?
We have seen this problem playing interleaved divx files with a MediaTek HW decoder. Their OMX closed-source library needs a good hint for the size of the input buffers. For instance: http://trailers.divx.com/divx_prod/profiles/Micayala_DivX1080p_ASP.divx http://trailers.divx.com/divx_prod/profiles/Fashion_DivX720p_ASP.divx This change will eventually be more beneficial when we upstream additional changes in gst-plugins-bad for hardware accelerated decoding.
The decoder should be able to find this kind of information in the codec_data buffer in the caps, either explicitly, or implicitly via the profile/level and resolution. Have you tried to dissect that yet?
No. But pointers on how/where codec specific data is created will be appreciated.
I'm not sure I understand the question. It's in the avi/riff headers and is extracted by libgstriff: Try: $ gst-discoverer-1.0 -v 737599-Micayala_DivX1080p_ASP.divx | grep codec_data In order to make sense of this codec data blurb you need to have a look at the spec for the codec.
Created attachment 288646 [details] Micayala decoded CSD I have decoded the CSD for the Micayala video, which is composed of MPEG4 headers. You can find the CSD and all the fields in the attached text file. The more interesting fields for this bug are vbv_buffer_size and vbv_occupancy, which could be used as hints for calculating the input buffer size. The size is in both cases 2 MBytes. I have also analysed the file with avprobe, doing: $ avprobe -show_packets Micayala_DivX1080p_ASP.divx This command shows metadata for the different packets that compose the input stream. The one with biggest size was [packets.packet.6035] codec_type=video stream_index=0 pts=N/A pts_time=N/A dts=2317 dts_time=96.638208 duration=1 duration_time=0.041708 size=461151.000000 pos=222467064 flags=K The number 461151 is the one we also find in the dwSuggestedBufferSize field in the AVI header. Therefore, the codec has inserted the maximum packet size (461151) as the suggested input buffer size. This is more accurate than the 2 MB of the MPEG4 headers, which look more like an upper limit. So we can use the CSD data to calculate the input buffers, but it seems like codecs tend to put there big default values, while they put the largest chunk they found while coding in the header, which is the exact value we need. This is just for DivX, I guess it can be interesting to see what other codecs do. In my opinion this makes information in the AVI header useful, as you cannot really find the same info in the CSD, unless you parse the full file to check for the maximum packet size before starting the decoding.
Thank you for your further analysis. Let's close this bug though. I don't think this is something we want to do, and this isn't really something that should be needed. The video bitstream + codec_data should have (and does have) all the information required.