After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 754120 - avdec_hevc: Segfault in hevc decode with glimagesink and AVX2 CPU feature
avdec_hevc: Segfault in hevc decode with glimagesink and AVX2 CPU feature
Status: RESOLVED FIXED
Product: GStreamer
Classification: Platform
Component: gst-libav
git master
Other Linux
: Normal blocker
: 1.5.91
Assigned To: GStreamer Maintainers
GStreamer Maintainers
Depends on:
Blocks:
 
 
Reported: 2015-08-26 14:06 UTC by sreerenj
Modified: 2015-09-15 15:46 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
sample h265 stream (346.25 KB, application/octet-stream)
2015-08-26 14:06 UTC, sreerenj
  Details
avvidec: fix crash in AVX2-based H.265 decoder code path with ximagesink/glimagesink (1.07 KB, patch)
2015-08-28 08:42 UTC, Tim-Philipp Müller
committed Details | Review
avvideodec: ensure required mem alignment fixing avdec_h265 crashes with ximagesink/glimagesink (3.52 KB, patch)
2015-09-11 22:45 UTC, Tim-Philipp Müller
none Details | Review
videopool: ensure allocation alignment is consistent with video alignment requirements (1.99 KB, patch)
2015-09-11 22:46 UTC, Tim-Philipp Müller
accepted-commit_now Details | Review
avvideodec: ensure required mem alignment fixing avdec_h265 crashes with ximagesink/glimagesink (v2) (3.06 KB, patch)
2015-09-15 09:34 UTC, Tim-Philipp Müller
committed Details | Review
videopool: ensure allocation alignment is consistent with video alignment requirements (v2) (2.02 KB, patch)
2015-09-15 09:36 UTC, Tim-Philipp Müller
committed Details | Review
gl: memory: take into account video stride alignment requirements (2.56 KB, patch)
2015-09-15 11:00 UTC, Tim-Philipp Müller
none Details | Review
gl: bufferpool take into account video stride alignment requirements (v2) (3.45 KB, patch)
2015-09-15 12:36 UTC, Tim-Philipp Müller
committed Details | Review

Description sreerenj 2015-08-26 14:06:58 UTC
Created attachment 310033 [details]
sample h265 stream

gst-launch-1.0 filesrc location= SLIST_D_Sony_9.bin ! h265parse ! avdec_h265 ! glimagesink

There are other samples too which are failing.
Comment 1 Nicolas Dufresne (ndufresne) 2015-08-26 15:41:08 UTC
While comparing setup, Tim found that it crash in a routine using AVX2 extension. It makes a lot of sense that I can't reproduce as this routine is not called (I have avx, but not avx2 here). Looks like it's i7 specific crash.
Comment 2 Sebastian Dröge (slomo) 2015-08-26 19:36:42 UTC
The big question now: does it also crash when using ffplay (from the same ffmpeg version!) on this file?
Comment 3 Sebastian Dröge (slomo) 2015-08-26 19:40:54 UTC
Crashes here, but only when using glimagesink, not when using xvimagesink. Also not when using videoconvert to convert to e.g. RGBA and then glimagesink. So I would assume it's something special to glimagesink here, maybe alignment isn't properly handled.

ffplay does not crash.
Comment 4 Nicolas Dufresne (ndufresne) 2015-08-26 21:48:04 UTC
AVX2 enabled CPU owner will have to debug this, or provide me with more information as I don't own such a PC (sorry).
Comment 5 Nicolas Dufresne (ndufresne) 2015-08-26 22:06:25 UTC
I forgot to mention, but with glimagesink, we have one buffer per plane, which is less forgiving then 1 allocation for all planes (like ffmpeg itself and most of our sink would do).
Comment 6 Tim-Philipp Müller 2015-08-28 08:42:47 UTC
Created attachment 310157 [details] [review]
avvidec: fix crash in AVX2-based H.265 decoder code path with ximagesink/glimagesink

This fixes it for me.
Comment 7 Tim-Philipp Müller 2015-08-28 08:56:37 UTC
Waiting for sree to confirm.

I don't fully understand why this is required though, it might just be coincidence that this fixes it. I didn't find anything that says that 128-bit alignment is required for vinserti128.

It crashes here:

ff_hevc_transform_add32_8_avx2 () at libavcodec/x86/hevc_res_add.asm:186
186	    TR_ADD_SSE_16_32_8   0, r0,      r0+r2

which is defined as:

%macro TR_ADD_SSE_16_32_8 3
    mova             xm2, [r1+%1   ]
    mova             xm6, [r1+%1+16]
%if cpuflag(avx2)
    vinserti128       m2, m2, [r1+%1+32], 1
    vinserti128       m6, m6, [r1+%1+48], 1
%endif
%if cpuflag(avx)
    psubw             m1, m0, m2
    psubw             m5, m0, m6
%else
    mova              m1, m0
    mova              m5, m0
    psubw             m1, m2
    psubw             m5, m6
%endif
    packuswb          m2, m6
    packuswb          m1, m5

    mova             xm4, [r1+%1+mmsize*2   ]
    mova             xm6, [r1+%1+mmsize*2+16]
%if cpuflag(avx2)
    vinserti128       m4, m4, [r1+%1+96 ], 1
    vinserti128       m6, m6, [r1+%1+112], 1
%endif
%if cpuflag(avx)
    psubw             m3, m0, m4
    psubw             m5, m0, m6
%else
    mova              m3, m0
    mova              m5, m0
    psubw             m3, m4
    psubw             m5, m6
%endif
    packuswb          m4, m6
    packuswb          m3, m5

    paddusb           m2, [%2]
    paddusb           m4, [%3]
    psubusb           m2, m1
    psubusb           m4, m3
    mova            [%2], m2
    mova            [%3], m4
%endmacro
Comment 8 Sebastian Dröge (slomo) 2015-08-28 09:02:16 UTC
I would expect an instruction that works on 128 bit data requires the data to be aligned to 128 bits :)
Comment 9 Tim-Philipp Müller 2015-08-28 09:07:25 UTC
Even then, that's 16 bytes (what we currently use for alignment), not 32 bytes.

Need to dig a bit more.
Comment 10 sreerenj 2015-08-28 09:36:52 UTC
Tested, and it is working fine with vaapisink and glimagesink...

I am not sure about the alignment issue , but what I can see is that, all Intel platforms >= IVB are using 128 bit alignment in vaapi-intel-driver..
Comment 11 Nicolas Dufresne (ndufresne) 2015-08-28 12:16:46 UTC
Review of attachment 310157 [details] [review]:

Looks fine to me. Arguably the problem should be reported to ffmpeg, as the align2 function should deal with that.
Comment 12 Tim-Philipp Müller 2015-08-28 12:26:52 UTC
I'm not really convinced yet that this is a proper fix and not just something that makes it work by coincidence.
Comment 13 Tim-Philipp Müller 2015-08-30 13:19:32 UTC
Did some more poking around.

Does not seem to be threading related (setting max-threads=1 and adding locking to the function doesn't help.

If I comment out the two 'mova' at the end of TR_ADD_SSE_16_32_8 it works:
http://git.videolan.org/?p=ffmpeg.git;a=blob;f=libavcodec/x86/hevc_res_add.asm;hb=ed450d4acfc9fea0dae2df2b0a543ec0d602d31a#l132

Can't see any obvious alignment issues with the pointers passed to it, judging from printfs (and it doesn't even seem to require 16-byte alignment).
Comment 14 Sebastian Dröge (slomo) 2015-09-02 16:10:40 UTC
Any news here, new insights?
Comment 15 Tim-Philipp Müller 2015-09-02 18:28:23 UTC
Not really. I am not sure if the alignment thing is not a red herring.

This also seems to "fix" it for me:

@@ -700,7 +700,7 @@ gst_ffmpegviddec_ensure_internal_pool (GstFFMpegVidDec * ffmpegdec,
   config = gst_buffer_pool_get_config (ffmpegdec->internal_pool);
 
   caps = gst_video_info_to_caps (&info);
-  gst_buffer_pool_config_set_params (config, caps, info.size, 2, 0);
+  gst_buffer_pool_config_set_params (config, caps, info.size, 10, 0);
   gst_buffer_pool_config_set_allocator (config, NULL, &params);
   gst_buffer_pool_config_add_option (config, GST_BUFFER_POOL_OPTION_VIDEO_META);
 
@@ -805,10 +805,8 @@ gst_ffmpegviddec_get_buffer2 (AVCodecContext * context, AVFrame * picture,
     if (c < GST_VIDEO_INFO_N_PLANES (&ffmpegdec->pool_info)) {
       picture->data[c] = GST_VIDEO_FRAME_PLANE_DATA (&dframe->vframe, c);
       picture->linesize[c] = GST_VIDEO_FRAME_PLANE_STRIDE (&dframe->vframe, c);
-
       if (ffmpegdec->stride[c] == -1)
         ffmpegdec->stride[c] = picture->linesize[c];
-
       /* libav does not allow stride changes, decide allocation should check
        * before replacing the internal pool with a downstream pool.
        * https://bugzilla.gnome.org/show_bug.cgi?id=704769
@@ -1802,7 +1800,7 @@ gst_ffmpegviddec_decide_allocation (GstVideoDecoder * decoder, GstQuery * query)
   }
 
   config = gst_buffer_pool_get_config (pool);
-  gst_buffer_pool_config_set_params (config, state->caps, size, min, max);
+  gst_buffer_pool_config_set_params (config, state->caps, size, MAX (10, min), max);
   gst_buffer_pool_config_set_allocator (config, allocator, &params);
 
   have_videometa =
@@ -1873,7 +1871,7 @@ gst_ffmpegviddec_decide_allocation (GstVideoDecoder * decoder, GstQuery * query)
       gst_object_unref (pool);
       pool = gst_video_buffer_pool_new ();
       config = gst_buffer_pool_get_config (pool);
-      gst_buffer_pool_config_set_params (config, state->caps, size, min, max);
+      gst_buffer_pool_config_set_params (config, state->caps, size, MAX (10, min), max);
       gst_buffer_pool_config_set_allocator (config, NULL, &params);
       gst_buffer_pool_set_config (pool, config);
       update_pool = TRUE;

Not sure what to conclude from that :)
Comment 16 Nicolas Dufresne (ndufresne) 2015-09-02 18:43:33 UTC
I'm really surprise of that one, he answer might be in the way the libc heap is implemented. Allocating them all together (instead of on-demand) might result in having consecutive memory where it simply don't crash if you read passed the buffer boundary. Other then that, that pool can create buffer as needed, so it should have no impact.
Comment 17 Sebastian Dröge (slomo) 2015-09-06 15:40:59 UTC
Or the GstVideoAlignment stuff in glimagesink is not 100% correct yet. IIRC Tim said that it only fails with glimagesink but not with xvimagesink?
Comment 18 Nicolas Dufresne (ndufresne) 2015-09-08 14:27:41 UTC
(In reply to Sebastian Dröge (slomo) from comment #17)
> Or the GstVideoAlignment stuff in glimagesink is not 100% correct yet. IIRC
> Tim said that it only fails with glimagesink but not with xvimagesink?

I was trying to say this can be explained by xvimagesink having 1 big buffer allocated, while glimagesink have N independent allocations. If you overshoot xvimagesink allocation, you will generally not crash, but with glimagesink, if you fall out of the last page, you are likely to crash. Some SIMD instructions can confuse valgrind, so I would not trust it here.

Another fact, is that xvimagesink (and generic videopool) will add some extra padding lines between planes, glimagesink does not do that, there is no obvious reason to (upstream should be requesting that if needed).
Comment 19 Sebastian Dröge (slomo) 2015-09-09 15:34:08 UTC
Interesting observation here. It also crashes with the GstVideoAlignment used by videoconvert:

gst-launch-1.0 filesrc location= SLIST_D_Sony_9.bin ! h265parse ! avdec_h265 ! videoconvert ! "video/x-raw,format=AYUV" ! glimagesink


But not with the same and "videoconvert ! xvimagesink" as the sink. Not sure what to make out of that :)
Comment 20 Tim-Philipp Müller 2015-09-11 22:43:42 UTC
 gstavviddec.c:663:gst_ffmpegvideodec_prepare_dr_pool:<avdec_h265-0> aligned dimension 832x480 -> 864x512 padding t:16 l:16 r:16 b:17, stride_align 31:31:31:31
gstvideopool.c:181:video_buffer_pool_set_config: valign 31 31 31 31
gstvideopool.c:236:video_buffer_pool_alloc:<videobufferpool0> alloc 740160 align 15 ...
Comment 21 Tim-Philipp Müller 2015-09-11 22:45:13 UTC
Created attachment 311188 [details] [review]
avvideodec: ensure required mem alignment fixing avdec_h265 crashes with ximagesink/glimagesink

Make sure the alignment requirement in GstAllocationParams
matches the GstVideoAlignment requirements. This fixes
issues with avdec_h265 crashing in the avx2 code path when
used with playbin and ximagesink/glimagesink as videosink.
    
The internal video pool would allocate buffers with an
alignment of 15 even though GstVideoAlignment specified
a stride_align requirement of 31 (which comes from ffmpeg).
Comment 22 Tim-Philipp Müller 2015-09-11 22:46:28 UTC
Created attachment 311189 [details] [review]
videopool: ensure allocation alignment is consistent with video alignment requirements

Make sure GstAllocationParams alignment is not less than
any alignment requirement specified via GstVideoAlignment.
Comment 23 Tim-Philipp Müller 2015-09-11 22:47:27 UTC
All of these patches fix the issue for me (on their own).
Comment 24 Tim-Philipp Müller 2015-09-12 08:35:29 UTC
Olivier suggested on IRC to move the loop that checks the alignments into a utility function. This means a caller would still have to do get_allocator() to get the AllocationParams and then call set_allocator() again if it was updated. It's the most transparent way of doing things though.

Alternatively, one could also simply do all of this in gst_buffer_pool_config_set_video_alignment(). This means that the GstAllocationParams may change underneath the caller, which may or may not be a problem in practice. Needs to be documented I guess.
Comment 25 Sebastian Dröge (slomo) 2015-09-12 08:55:08 UTC
(In reply to Tim-Philipp Müller from comment #24)
> Olivier suggested on IRC to move the loop that checks the alignments into a
> utility function. This means a caller would still have to do get_allocator()
> to get the AllocationParams and then call set_allocator() again if it was
> updated. It's the most transparent way of doing things though.

Is it going to be needed in many places (more than 2) that it makes sense to have an utility function for it?

> Alternatively, one could also simply do all of this in
> gst_buffer_pool_config_set_video_alignment(). This means that the
> GstAllocationParams may change underneath the caller, which may or may not
> be a problem in practice. Needs to be documented I guess.

I think this should be handled explicitly by the caller of that API. The fallback in videopool is IMHO just to prevent stupid mistakes and get a GST_ERROR() out there while still doing something sensible that most likely does not crash :)
Comment 26 Sebastian Dröge (slomo) 2015-09-13 08:00:27 UTC
(In reply to Sebastian Dröge (slomo) from comment #25)
> (In reply to Tim-Philipp Müller from comment #24)
> > Olivier suggested on IRC to move the loop that checks the alignments into a
> > utility function. This means a caller would still have to do get_allocator()
> > to get the AllocationParams and then call set_allocator() again if it was
> > updated. It's the most transparent way of doing things though.
> 
> Is it going to be needed in many places (more than 2) that it makes sense to
> have an utility function for it?

And with that I mean, a utility function doesn't seem very useful for this. Code that uses the videoalignment API has to set the separate stride aligns anyway, it could as well just set the allocation parameters accordingly then. I don't think a utility function will make much of a difference here.
Comment 27 Nicolas Dufresne (ndufresne) 2015-09-13 16:04:28 UTC
Interesting, I had this reflection that the stride alignment was incompletly applied.

I think doing so in the pools is best. Note that set_video_alignment() can't really change the alloc params silently, as for most base class the parameters need to be updated in the query, otherwise they get reset.

We are missing patches for the glpool, which is different. No need for a loop in this case, since we have N allocation, just pick the max between the parameters et de shocked stride align.

It's also likely xnimagesink need to be checked.
Comment 28 Tim-Philipp Müller 2015-09-15 09:34:43 UTC
Created attachment 311338 [details] [review]
avvideodec: ensure required mem alignment fixing avdec_h265 crashes with ximagesink/glimagesink (v2)

Slightly different approach: figure out max_alignment first, then make it consistent. This means that if the alignment requirement in the GstAllocationParams is higher than the ones in GstVideoAlignment, the stride alignments get updated accordingly too. Before it was just one way.
Comment 29 Tim-Philipp Müller 2015-09-15 09:36:40 UTC
Created attachment 311340 [details] [review]
videopool: ensure allocation alignment is consistent with video alignment requirements (v2)

Different approach: figure out max alignment requirement first, then make sure both the GstAllocationParams alignment and the GstVideoAlignment stride requirements are consistent. That means the max will be configured everywhere, also if the params alignment is higher than the strides one. Before the fix-up was only one way, now it's both ways.
Comment 30 Tim-Philipp Müller 2015-09-15 11:00:29 UTC
Created attachment 311347 [details] [review]
gl: memory: take into account video stride alignment requirements

This fixes it with glimagesink for me (without any the other patches).

Not that we fix up mem->valign.stride_align[n] here, which seems the right thing to do but I'm not sure it's actually looked at for anything but the padding.
Comment 31 Matthew Waters (ystreet00) 2015-09-15 11:04:08 UTC
Review of attachment 311347 [details] [review]:

Looks mostly ok to me.

::: gst-libs/gst/gl/gstglmemory.c
@@ +646,3 @@
+  /* update to match max alignment requirement */
+  align_params.align = max_align;
+  mem->valign.stride_align[plane] = max_align;

Why don't we update the entire structure?
Comment 32 Tim-Philipp Müller 2015-09-15 12:36:50 UTC
Created attachment 311358 [details] [review]
gl: bufferpool take into account video stride alignment requirements (v2)

Updated patch: moved the checking/adjusting into glbufferpool, so that it's all done before gst_video_info_align() is called. Apparently calling it multiple times is troublesome. In glmemory we just do sanity checking now but don't adjust anything (maybe we should just drop this part?).
Comment 33 Matthew Waters (ystreet00) 2015-09-15 12:50:11 UTC
Review of attachment 311358 [details] [review]:

Looks fine by me.
Comment 34 Nicolas Dufresne (ndufresne) 2015-09-15 13:34:56 UTC
Review of attachment 311358 [details] [review]:

Why don't gst_video_info_align() maximize the stride alignment then ? We could add a new method, to which we pass a allocation alignment, and get this helper to update everything at once no ?
Comment 35 Tim-Philipp Müller 2015-09-15 15:41:58 UTC
Ok, pushed after further discussion on IRC.

The patches may not be optimal, but they err in favour of being safe. We can improve things some more and add some utility functions after the release. There's bug #755068 to track this.

commit 6fadf448de1cfe1b379b0fff8fa3c0f2eb6a4a79
Author: Tim-Philipp Müller <tim@centricular.com>
Date:   Fri Aug 28 09:38:53 2015 +0100

    avvidec: increase default alignment to 32 bytes
    
    Change default alignment from 16 to 32 bytes, which fixes crashes
    when decoding H.265 using AVX2-based decoder code paths and when
    using ximagesink/glimagesink.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=754120

commit a0ebef96372a61f8dc658995fb4f1480512b365b
Author: Tim-Philipp Müller <tim@centricular.com>
Date:   Fri Sep 11 23:19:21 2015 +0100

    avvideodec: ensure required mem alignment fixing avdec_h265 crashes with ximagesink/glimagesink
    
    Make sure the alignment requirement in GstAllocationParams
    matches the GstVideoAlignment requirements. This fixes
    issues with avdec_h265 crashing in the avx2 code path when
    used with playbin and ximagesink/glimagesink as videosink.
    
    The internal video pool would allocate buffers with an
    alignment of 15 even though GstVideoAlignment specified
    a stride_align requirement of 31 (which comes from ffmpeg).
    
    https://bugzilla.gnome.org/show_bug.cgi?id=754120



commit 8b96b52a62567d70ce827db00fd0f50f79133ee5
Author: Tim-Philipp Müller <tim@centricular.com>
Date:   Fri Sep 11 23:36:47 2015 +0100

    videopool: ensure allocation alignment is consistent with video alignment requirements
    
    Make sure GstAllocationParams alignment is not less than
    any alignment requirement specified via GstVideoAlignment.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=754120

commit fe8de1857a704af9ed6160dfdf8f93131e78ba19
Author: Tim-Philipp Müller <tim@centricular.com>
Date:   Tue Sep 15 11:34:12 2015 +0100

    gl: bufferpool take into account video stride alignment requirements
    
    when allocating memory. Fixes crashes with avdec_h265 in the AVX2
    code path which requires 32-byte stride alignment, but the
    GstAllocationParams only specified a 16-byte alignment.
    
    https://bugzilla.gnome.org/show_bug.cgi?id=754120