GNOME Bugzilla – Bug 785011
videodecoder: Default buffer pool supports GstVideoAlignment but downstream might not
Last modified: 2017-12-06 20:35:08 UTC
Created attachment 355731 [details] tarball of log files of 1.10.4 vs 1.12.0

So I tracked it down as far as between 1.10.4 (worked) and 1.12.0 (doesn't work)... playback of my MKV files has the video all junk, like this: http://www.enlightenment.org/ss/e-596c574b7ef3f0.39995477.png

it only seems to happen with MKV. MP4 files with the exact same YUV format (GST_VIDEO_FORMAT_I420 + GST_VIDEO_COLOR_MATRIX_BT709) at the exact same resolution are fine, but MKV files are not. the plane offsets and stride values per frame are exactly the same between a working MP4 video source and a non-working MKV video source. i know MKV is just a container... but basically the metadata (plane offsets, stride, video buffer format) is constant between 1.10.4 and 1.12.0. i've attached an xz-compressed tarball of logs for 1.10.4 and 1.12.0 playing the exact same file.

FYI - we have our own video sink. we map the gst buffers, then use the offset and stride values to supply YUV etc. data to a canvas buffer. the mapping succeeds. the querying of metadata works. just the data in the buffer seems to be junk in 1.12 (or of a completely different layout/format) vs 1.10.4, even though all the format/layout data is the same. this is not using vaapi at all. no vdpau either. FYI there is another bug with vaapi where the strides and plane offsets are also wrong when vaapi is decoding, but that's a separate matter.
Can you provide such a MKV file and a way to reproduce the problem?
ummmm.... all my mkv files are big... i can just take the first 1mb of one... http://devs.enlightenment.org/~raster/t.mkv - it exhibits the junk. yes, gst-play etc. work well because they don't use mmapped buffers. yes - we keep the buffer mapped for as long as we need the data.
Ok so you have a custom sink that uses mmap'd memory. Can you give some more details about how that works? Does it provide memory to the decoder (I assume you use avdec_h264?) or copy inside the sink from the decoder memory to the mmap'd memory? What exactly is that mmap'd memory based on? Do you get properly decoded video inside your sink, e.g. if you write out the memory to a file and check what it contains? Also it seems unlikely that this is related to Matroska, it must be something related to the decoder element. That it works for MP4 must be coincidence.
no special allocation of buffers. it's what playbin decides. we just get the show_frame calls and then ref the gst buffer and send it to the mainloop to apply to the canvas. the map doesn't fail... so since we don't specially allocate the memory... it should just keep working as long as buffers are mappable. the memory the sink gets is forwarded by handle right to the mainloop that then forwards pointers to that memory to the canvas... and we keep the buffer mapped as long as the canvas is pointing to this memory.

the canvas uses the same convert/texture upload functions and code regardless of gst version... so downgrade gst... then it works. so i see correctly decoded data at that point on screen with 1.10.4 and garbage with 1.12.0.

indeed i don't think it's related to matroska. it is just a container format - as i mentioned in the original. :) it is something deeper, but oddly both mp4 and mkv files use the same size, stride, plane offsets and yuv format... so it's something specific to the decoder that ends up being run with the mkv as opposed to the mp4. the mkv's have been working for years... until some time between 1.10.4 and 1.12.0 of gst.

could it be that the decoder is re-using the buffer even though we keep it mapped, and filling it with junk before we have a chance to upload? i haven't checked the buffer at show_frame time specifically... i'd have to write myself a little saver and decoder to something i can more easily see... haven't done that yet.
Ah so "mmap" was just a typo, we're only talking about a mapped buffer here. Not mmap'd memory :) You could use gst_buffer_copy_deep() in your sink instead of gst_buffer_ref() to check that theory. That would also be my first guess, yes.
indeed i'll give it a go and see. yeah, mmap is a typo. i use mmap a lot... :) habit to just type it in a technical situation. :) if a deep copy gets us clean data then we're staring at "the map state of the buffer is ignored now when gst recycles it" where before it worked. if not.... hmmmmm....
and doing a deep copy instead of a ref... still garbage.

  send->frame = gst_buffer_ref(buffer);

changed to:

  send->frame = gst_buffer_copy_deep(buffer);

:(
Ok, so you already have garbage at that point. Can you check with gst_video_convert_sample() (if you don't have a GstSample, create one from the buffer and caps you have there) to see if the buffer is really containing garbage at that point? E.g. convert it to "image/jpeg" and write it out to a file.
i'll give it a go... but i'm pretty certain i have garbage, because the converters on our side are the ones we use with libxine, libvlc and also vlc with shm buffers, gstreamer 0.10 too... and even then, given the exact same size, yuv format, plane and stride values, they work with an mp4 and fail with the mkv. changing zero code except downgrading to 1.10.4 from 1.12.0 fixes it. so i'm 99.99% sure the converters/display pipeline is right. if the only thing that changes is the gst version it goes from broken to working (if downgrading). i'll give it a go to try save it out... but i'm pretty sure what the result will be. it must have something to do with the code path we trigger because we map buffers... :/ i'll see about the GstSample.
oh btw... if you watch the garbage carefully... it changes over time in a way that is consistent with the video content changing. it's as if there is actually decoded data there but in some totally bizarre layout/format... i've seen similar kinds of garbage when stride and plane offsets are totally wrong (this is the case for vaapi btw - i figured out the plane offsets and stride values are not correctly rounded to what the hw decoder actually uses - with some guesswork i figured out the strides had to be 64-byte aligned and the planes were similar - last i checked this was still broken with vaapi+gst. just saying that the effects i see are very similar visually).
Yes, it looks like a stride issue, but I assume you're handling strides and plane offsets correctly? That is, you use gst_video_frame_map() and work with the data from the GstVideoFrame? Which decoder is used in your case, avdec_h264?
my hunch seems to be onto something. here is a video. its stride is 856 per line for the y plane: http://www.enlightenment.org/ss/e-596db48c546332.81805607.png but if i force it to 1440: http://www.enlightenment.org/ss/e-596db4d2c9f1f3.33464318.png i'm beginning to make out sensible content in the y plane... well, kind of... we match format with gst_video_info_from_caps(&info, caps); where caps are the ones provided to set_caps in the sink. we match info.finfo->format and info.colorimetry.matrix against our table of format and matrix entries. format and matrix end up being GST_VIDEO_FORMAT_I420 and GST_VIDEO_COLOR_MATRIX_BT709 for both the working mp4 and the "broken" mkv.
it's not just stride. plane offsets seem to leave junk around... and yup, we are using stride and plane offsets. the way we feed data to the canvas is through a series of row pointers for planar data. first it's N pointers for each y row, THEN N/2 for u and N/2 for v after that... so all we need to do is calculate the pointer location for the beginning of each row. at least for planar formats like I420 this is fine (it allows for non-constant strides, interestingly enough...) so the converter is very simple:

  ptr = gst_data + info->plane_offset[0];
  jump = info->stride[0];
  for (i = 0; i < rh; i++, ptr += jump) rows[i] = ptr;

  ptr = gst_data + info->plane_offset[1];
  jump = info->stride[1];
  for (j = 0; j < (rh / 2); j++, i++, ptr += jump) rows[i] = ptr;

  ptr = gst_data + info->plane_offset[2];
  jump = info->stride[2];
  for (j = 0; j < (rh / 2); j++, i++, ptr += jump) rows[i] = ptr;

:)
gst_video_convert_sample() gets me a correct output image as a png. i threw this in:

  GstSample *from_sample, *to_sample;
  GstBuffer *buf;
  GstMapInfo map_info;
  GstCaps *caps;
  GError *err = NULL;

  g_object_get(send->sink, "last-sample", &from_sample, NULL);
  caps = gst_caps_from_string("image/png");
  to_sample = gst_video_convert_sample(from_sample, caps,
                                       GST_CLOCK_TIME_NONE, &err);
  gst_caps_unref(caps);
  gst_sample_unref(from_sample);
  buf = gst_sample_get_buffer(to_sample);
  if (gst_buffer_map(buf, &map_info, GST_MAP_READ))
    {
       if (!g_file_set_contents("/home/raster/out.png",
                                (const char *)map_info.data,
                                map_info.size, &err))
         {
            GST_WARNING("Could not save thumbnail: %s", err->message);
            g_error_free(err);
         }
       gst_buffer_unmap(buf, &map_info);
    }
  gst_sample_unref(to_sample);

quick and dirty but it got me out.png.
Can you try using gst_video_frame_map() and getting strides/offsets from the GstVideoFrame instead of the GstVideoInfo as generated by the caps? Can you also link to your sink code, or is it not public?
i dumped just the y plane data: http://www.enlightenment.org/ss/e-596dc245b80205.49695288.png

i dumped it as a pgm (quick and dirty):

  printf("y plane off=%i stride=%i\n",
         (int)send->info.offset[0], (int)send->info.stride[0]);
  FILE *f = fopen("/home/raster/out.pgm", "w");
  fprintf(f, "P5\n%i %i\n255\n", send->info.width, send->info.height);
  int i;
  char *p = map.data;
  p += send->info.offset[0];
  for (i = 0; i < send->info.height; i++)
    {
       fwrite(p, send->info.width, 1, f);
       p += send->info.stride[0];
    }
  fclose(f);

the y plane offset and stride:

  y plane off=0 stride=856

map.data is the mapped data ptr. send->info is the GstVideoInfo struct content from gst... so what is in the mapped data memory is seemingly in some other layout/format than what the format + color matrix say and what the plane and stride values say... or they are simply not even using the same memory/data... ?
There are "default" strides/offsets for a given video format plus resolution. This is what you get in GstVideoInfo when you fill it from caps. However, decoders + sinks may negotiate different arrangements, also on a per-frame basis. There is the possibility to attach strides/offsets to individual video buffers in form of GstVideoMeta. Also, theoretically buffers could have multiple memories, one for each plane. If you use the gst_video_frame_map() API it will handle all that automatically and properly for you.
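To make the comment above concrete, here is a minimal sketch of what a sink's frame handler could look like when it uses the frame-mapping API instead of a raw buffer map; the function and variable names are illustrative, not taken from any actual sink:

```c
/* Sketch: gst_video_frame_map() resolves any per-buffer GstVideoMeta
 * (non-default strides/offsets, multiple memories per plane) for you.
 * "info" is the GstVideoInfo obtained from the caps in set_caps(). */
#include <gst/video/video.h>

static gboolean
show_frame_sketch (GstVideoInfo * info, GstBuffer * buffer)
{
  GstVideoFrame frame;
  guint plane;

  if (!gst_video_frame_map (&frame, info, buffer, GST_MAP_READ))
    return FALSE;

  for (plane = 0; plane < GST_VIDEO_FRAME_N_PLANES (&frame); plane++) {
    guint8 *data = GST_VIDEO_FRAME_PLANE_DATA (&frame, plane);
    gint stride = GST_VIDEO_FRAME_PLANE_STRIDE (&frame, plane);

    /* feed data/stride to the canvas here; these may differ from the
     * defaults computed by gst_video_info_from_caps() */
    (void) data;
    (void) stride;
  }

  gst_video_frame_unmap (&frame);
  return TRUE;
}
```

Keeping the GstVideoFrame around (instead of unmapping immediately) also works if the consumer needs the pointers to stay valid, mirroring how the sink currently keeps the buffer mapped.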
well the GstVideoInfo we get is from the set_caps() in the sink. so we use whatever we get from gst_video_info_from_caps() with the caps provided... so the caps should be right, correct? the caps should then change if they negotiate something new, even on the fly? we get the strides/plane offsets buffer by buffer from the info we get from the caps... we've been mapping the buffer like forever using the info offset/stride... i never knew gst_video_frame_map() even existed. if the buffer map is invalid... shouldn't it fail? shouldn't the offset/stride data be... like all 0's or something? because from an api point of view... it looks like it is/should be working.
The buffer is not invalid, and from the caps you can only get incomplete information about strides/offsets in the general case. That's why I asked for the code of your sink: if you only want the "default" values that you can get from the caps, you must not claim supporting GstVideoMeta and GstVideoAlignment in your sink for example. If you don't claim that, getting buffers with strides/offsets different than the defaults from the caps would be a bug elsewhere.
oh btw. i forgot. gst_video_frame_map() works. i can get sensible data out of it. why did this change and was mapping the gst buffer always invalid?
Why are you ignoring my questions? :) It might be a bug, but that depends on your sink's code. If your sink claims to support GstVideoMeta, GstVideoAlignment, and similar, then you could've always gotten such buffers. There is additional information in the buffer that you need to make use of for getting the strides/offsets. Mapping the buffer is not invalid, but you have to interpret the data correctly. If your sink does not claim to support GstVideoMeta, etc., then this is a bug in the decoder. Which one is used here?
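For reference, this is roughly how a sink can check whether a given buffer carries that additional layout information; a hedged sketch, with an illustrative helper name:

```c
/* Sketch: gst_buffer_get_video_meta() returns non-NULL when the producer
 * attached per-buffer layout (e.g. padded strides from video alignment).
 * In that case the meta's strides/offsets override the caps-derived
 * defaults in the GstVideoInfo. */
#include <gst/video/video.h>

static void
inspect_layout_sketch (GstBuffer * buffer, const GstVideoInfo * caps_info)
{
  GstVideoMeta *meta = gst_buffer_get_video_meta (buffer);

  if (meta != NULL) {
    /* decoder attached its own layout, e.g. 64-byte aligned strides */
    g_print ("meta stride[0]=%d offset[0]=%" G_GSIZE_FORMAT "\n",
        meta->stride[0], meta->offset[0]);
  } else {
    /* no meta: the defaults from the caps apply */
    g_print ("default stride[0]=%d\n",
        GST_VIDEO_INFO_PLANE_STRIDE (caps_info, 0));
  }
}
```

gst_video_frame_map() performs exactly this check internally, which is why it keeps working when a plain buffer map plus caps-derived strides does not.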
i'd give you a link to the code if i could, but due to network setup, i can't. well, i can't browse our cgit and give you the link... it'll have to wait another 5 hours or so... we don't advertise anything to do with GstVideoMeta or GstVideoAlignment in our sink at all. it's quite an old sink dating back to the 0.10 days... it's been moved along and updated for 1.0 over time...
git clone git://git.enlightenment.org/core/efl.git

look in src/modules/gstreamer1/ - i can't give a direct link. emotion_sink.c is the sink core, emotion_alloc.c is the buffer allocation code, and emotion_convert.c is the conversion from gst buffer info to what evas will understand (note we support multiple media libs so we have to abstract and be pretty generic - we don't support every yuv format on the planet though, but when we encounter a new one we need to support, we add support to the canvas). emotion_gstreamer.c is the core module abstraction api that the emotion core will call into via the engine func table.
https://git.enlightenment.org/core/efl.git/tree/src/modules/emotion/gstreamer1 FWIW. Thanks, I'll take a short look. I wasn't aware that this is the emotion code, which I actually ported to GStreamer 1.x back then.
yeah. thanks. i just can't see cgit from here (i have to use an ssh tunnel as a socks proxy, and given the server setup socks can't see git.enlightenment.org from the tunnel), so i was doing other things until i can get to somewhere i can look. :) and yup, it's emotion. i've done some improvements over time like actually supporting all the stride and plane info and so on... :) so i'm not trying to ignore your questions... i just didn't have an acceptable url/answer yet. i did have other information i had gathered thanks to your help and pointers. :) i basically have a bit of rejigged code that uses gst_video_frame_map() if it works... and falls back to a buffer map. is gst_video_frame_map() pretty much guaranteed for gst 1.0 and on? or was it added in a specific revision after 1.0?
Yes, that sink should never ever get buffers with GstVideoMeta or any other meta that affects the memory layout. The VAAPI problem is probably exactly that, but that would be a bug in vaapidec that should be reported separately.
(In reply to Carsten Haitzler (Rasterman) from comment #25) > is gst_video_frame_map() pretty much guaranteed for gst 1.0 and on? or was > it added during a specific revision after 1.0? That's supposed to work for all versions of 1.x, and the recommended way of doing this. In addition you probably want to add code like this https://cgit.freedesktop.org/gstreamer/gst-plugins-bad/tree/sys/d3dvideosink/d3dvideosink.c#n466 (without the CROP meta and only until line 470!) to your sink for improved performance (if you can indeed handle all strides/offsets). Without this, various decoders will have to do an internal copy. Nonetheless, it's a bug somewhere that you found here.
Ok, it's the following code in gstavviddec.c: > have_alignment = > gst_buffer_pool_has_option (pool, GST_BUFFER_POOL_OPTION_VIDEO_ALIGNMENT); Which is of course always TRUE if downstream did not provide a pool. GstVideoBufferPool, which is created by default, always supports that. Just that downstream might not be able to handle that. The problem is not the metas but the buffer pool option.
> That's supposed to work for all versions of 1.x, and the recommended way of doing

ok. cool. this means at least this fix can go into emotion and efl 1.20. it seems there is a gstreamer bug for when sinks don't advertise alignment/meta and basic mapping now doesn't work anymore. i'm of course going to put the gst_video_frame_map() in anyway... i just want to be able to tell people "either downgrade gst to 1.10 or upgrade efl to 1.20"... i might backport to efl 1.19 as a bugfix/workaround too. and absolutely the vaapi issue is another ticket. one thing at a time. :)

reality is i actually want to put in a full zero-copy path... but it seems pretty involved with gst to do that. well, more specifically it kind of doesn't quite work with our abstraction/hiding. there's the eglsink stuff... but we actually try and hide egl vs glx etc. ... we support either pixmap id's, texture id's or fbo id's, or wl_buffers, wl_dmabufs... but we totally hide egl (and glx)... so gst and evas abstractions are just not meeting in the middle and i've been scratching my head as to what to do... but... that is not something to solve in this ticket. when i get to it i'll file a ticket about vaapi and can go from there. i'll look into the d3dvideosink you point to as well - probably before the vaapi stuff. i like to be conservative for now and keep things to what is needed to make things work. :)
> Which is of course always TRUE if downstream did not provide a pool.

indeed we don't provide a pool. we leave it up to gst to do that... :) aha. so that would explain it. must have been added between 1.10 and 1.12 (that line). :)
(In reply to Carsten Haitzler (Rasterman) from comment #30)
> > Which is of course always TRUE if downstream did not provide a pool.
>
> indeed we don't provide a pool. we leave it up to gst to do that... :) aha.
> so that would explain it. must have been added between 1.10 and 1.12 (that
> line). :)

No, you were just lucky before :) But some things changed to make this happen more likely now, it seems.

(In reply to Carsten Haitzler (Rasterman) from comment #29)
> and absolutely the vaapi issue is another ticket. one thing at a time. :)
> reality is i actually want to put in a full zero copy path... but it seems
> pretty involved with gst to do that. well more specifically it kind of
> doesn't quite work with our abstraction.hiding

If you want to discuss this, it would be good to get together at the next hackfest or conference :)

> when i get to it i'll file a ticket about vaapi and can go from there

vaapi should also work with gst_video_frame_map()

> i'll look into the d3dvideosink you point to as well - probably before the
> vaapi stuff

You literally just need to implement those 5 lines of code there. Don't look at anything else of d3dvideosink, it's not that great :) It was just the first place that came to my mind that has the GstVideoMeta code implemented.
I guess one solution here would be to allow configuring on the GstVideoBufferPool if it should support these things during creation, and not support it for pools created by default in base classes, etc. Also needs careful checking everywhere, where GstVideoBufferPool is currently used to cover similar cases.
For VAAPI, see https://bugzilla.gnome.org/show_bug.cgi?id=785054
(In reply to Sebastian Dröge (slomo) from comment #32)
> I guess one solution here would be to allow configuring on the
> GstVideoBufferPool if it should support these things during creation, and
> not support it for pools created by default in base classes, etc. Also needs
> careful checking everywhere, where GstVideoBufferPool is currently used to
> cover similar cases.

No, one should just not rely on the pool option to decide about using GstVideoMeta or not. There is only one valid way to know if downstream supports GstVideoMeta, and that is using gst_query_find_allocation_meta().
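A rough sketch of what that means on the producer side; the function name and structure are illustrative, not the actual avviddec code:

```c
/* Sketch: in decide_allocation, only enable the video-alignment pool
 * option when downstream actually advertised GstVideoMeta in the
 * ALLOCATION query - not merely because the default GstVideoBufferPool
 * happens to support the option. */
#include <gst/video/video.h>

static void
decide_allocation_sketch (GstQuery * query, GstBufferPool * pool,
    GstStructure * config)
{
  gboolean have_video_meta =
      gst_query_find_allocation_meta (query, GST_VIDEO_META_API_TYPE, NULL);

  if (have_video_meta &&
      gst_buffer_pool_has_option (pool,
          GST_BUFFER_POOL_OPTION_VIDEO_ALIGNMENT)) {
    /* safe: downstream can interpret non-default strides/offsets
     * via the GstVideoMeta attached to each buffer */
    gst_buffer_pool_config_add_option (config,
        GST_BUFFER_POOL_OPTION_VIDEO_ALIGNMENT);
  }
  /* else: keep the default layout so plain buffer-mapping sinks
   * (like the emotion sink here) see what the caps promised */
}
```

This is exactly the distinction the bug title describes: the pool supporting GstVideoAlignment says nothing about whether downstream can consume the resulting layout.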
> In addition you probably want to add code like this
> https://cgit.freedesktop.org/gstreamer/gst-plugins-bad/tree/sys/d3dvideosink/d3dvideosink.c#n466
> (without the CROP meta and only until line 470!)

ahhh, a bit more complicated than just a few lines of adding the meta property things... i have to add the whole propose_allocation thing... :) we totally left that alone. would just adding the meta and crop meta (with NULL) then calling the parent class propose_allocation be enough? i'd have to look at the implementation to have a better guess... digging that up may take time. i.e.:

  static gboolean
  emotion_video_sink_propose_allocation(GstBaseSink *bsink, GstQuery *query)
  {
     gst_query_add_allocation_meta(query, GST_VIDEO_META_API_TYPE, NULL);
     gst_query_add_allocation_meta(query, GST_VIDEO_CROP_META_API_TYPE, NULL);
     return GST_CALL_PARENT_WITH_DEFAULT(GST_BASE_SINK_CLASS,
                                         propose_allocation, (bsink, query),
                                         TRUE);
  }

?
Yes, that's exactly what I meant :)
For the actual bug here, as discussed on IRC the GstVideoAlignment should only ever be used if GstVideoMeta is supported. Which avviddec seems to do though. So not sure yet what happens here exactly.
> Yes, that's exactly what I meant :)

awesome. so there is a fix in upstream efl git master. we're actually due to release 1.20 really soon. also this seems to have fixed some vaapi issues too with strides/offsets EXCEPT for one of my videos where mapping buffers entirely fails all the time... can't map. :( sad. this is likely a hardware limitation or a vaapi limitation with how those buffers can be accessed from the cpu side... but zero-copy methods do seem to work. so that'll be on my list of things to look into when i next get some time. so so so many things to do. thanks so much for the help and guidance.

for now i think this bug is a "dual issue". it's partly efl just being so old school, still mapping buffers directly and not using the frame map api... and that now broke... and it breaking is an incompatibility with gstreamer that wasn't detected, as no one else was old school directly mapping buffers and using those offsets/strides. it may be that we have to actually roll emotion into evas at some point instead of having it outside. then it can have access to egl stuff (or glx and so on)... and then the egl etc. sinks would be totally viable. until then i might go for only supporting zero copy on wayland and use wl_dmabufs... i can't remember. there is a wayland sink. does it produce wl_dmabufs?
There is waylandsink, which renders to a wl_surface (or subsurface) that you pass from your application. It can only consume dmabuf though. There might still be some work on this one, as compositor support is flaky in Weston and missing in gnome-shell. I don't know the state of the EFL compositor in this regard. p.s. You can also import a DMABuf into GL.
waylandsink does not support padded SHM buffers, yet it still gets padded buffers from libav based decoders. The result on screen is shifted Y, U and V planes which doesn't look very good. waylandsink could support those using the wp_viewport extension by applying cropping on the buffer. But until this is the case, the buffer pool negotiation shouldn't result in padded buffers. Note that I did apply the patch on weston so that GL texture uploads use the correct stride for U and V planes (https://patchwork.freedesktop.org/patch/180767/).
Hum actually waylandsink uses a GstVideoBufferPool so it's a bug that it does not use the video alignment from upstream. I'll open another bug for this.
Ok, I'm trying to figure out what bug is left here; to me there is no bug really.

(In reply to Sebastian Dröge (slomo) from comment #37)
> For the actual bug here, as discussed on IRC the GstVideoAlignment should
> only ever be used if GstVideoMeta is supported. Which avviddec seems to do
> though. So not sure yet what happens here exactly.

That's not true, you can enable GstVideoAlignment as long as downstream supports GstVideoMeta. The trick is that the strides and offsets are updated accordingly (and stored in GstVideoMeta). The problem is whenever something upstream assumes downstream supports GstVideoMeta. While that is not validated, upstream needs to pick padding that does not disturb the chroma. That's how we avoid spurious copies in something like "avdec_... ! videoconvert ! ximagesink". avdec_ uses a default pool and activates the video alignment to do direct rendering, and then videoconvert will gst_video_frame_map() as if there was nothing special about the buffer.
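For illustration, when an element does enable the alignment, the updated per-buffer layout would be recorded roughly like this; a sketch, assuming "info" is the alignment-adjusted GstVideoInfo:

```c
/* Sketch: after applying GstVideoAlignment, the element stores the real
 * per-plane offsets and strides on each buffer via GstVideoMeta, so a
 * downstream gst_video_frame_map() sees the padded layout instead of
 * the caps-derived defaults. */
#include <gst/video/video.h>

static void
attach_layout_sketch (GstBuffer * buffer, const GstVideoInfo * info)
{
  gsize offset[GST_VIDEO_MAX_PLANES];
  gint stride[GST_VIDEO_MAX_PLANES];
  guint i;

  for (i = 0; i < GST_VIDEO_INFO_N_PLANES (info); i++) {
    offset[i] = GST_VIDEO_INFO_PLANE_OFFSET (info, i);
    stride[i] = GST_VIDEO_INFO_PLANE_STRIDE (info, i);
  }

  gst_buffer_add_video_meta_full (buffer, GST_VIDEO_FRAME_FLAG_NONE,
      GST_VIDEO_INFO_FORMAT (info), GST_VIDEO_INFO_WIDTH (info),
      GST_VIDEO_INFO_HEIGHT (info), GST_VIDEO_INFO_N_PLANES (info),
      offset, stride);
}
```

This is why elements like videoconvert keep working with aligned buffers: the meta travels with each buffer and the frame-map API consults it.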
The issue with waylandsink is fixed in another bug, and the efl sink is an outside project and got fixed, so I'm closing this one now. Thanks for the report.