Bug 763663 – libvpx segfaults on Windows/x86



Summary:	libvpx segfaults on Windows/x86


Status:	RESOLVED FIXED

Product:	GStreamer
Classification:	Platform
Component:	cerbero
Version:	git master
Hardware:	Other Windows

Importance:	Normal blocker
Target Milestone:	1.10.3
Assigned To:	GStreamer Maintainers
QA Contact:	GStreamer Maintainers

Reported:	2016-03-15 07:54 UTC by Sebastian Dröge (slomo)
Modified:	2017-01-25 12:22 UTC


Attachments
Fix vpx crashes on win32 by patching the Cerbero recipe (8.73 KB, patch) 2016-05-20 01:53 UTC, Nirbheek Chauhan, committed
fix crash adding -mstackrealign (793 bytes, patch) 2017-01-25 00:15 UTC, Nicola, committed

Description Sebastian Dröge (slomo) 2016-03-15 07:54:34 UTC

This used to work:

> gst-launch-1.0 videotestsrc ! vp8enc ! fakesink


Thread 5 (Thread 2636.0xea8)

#0 vp8_fast_quantize_b_ssse3
at vp8/encoder/x86/quantize_ssse3.c line 114
#1 vp8_encode_intra4x4block
at vp8/encoder/encodeintra.c line 71
#2 pick_intra4x4block
at vp8/encoder/pickinter.c line 276
#3 pick_intra4x4mby_modes
at vp8/encoder/pickinter.c line 316
#4 vp8_pick_intra_mode
at vp8/encoder/pickinter.c line 1553
#5 vp8cx_encode_intra_macroblock
at vp8/encoder/encodeframe.c line 1196
#6 encode_mb_row
at vp8/encoder/encodeframe.c line 504
#7 vp8_encode_frame
at vp8/encoder/encodeframe.c line 944
#8 encode_frame_to_data_rate
at vp8/encoder/onyx_if.c line 4372
#9 vp8_get_compressed_data
#10 vp8e_encode
at vp8/vp8_cx_iface.c line 946
#11 vpx_codec_encode
at vpx/src/vpx_encoder.c line 223
#12 gst_vpx_enc_handle_frame
at gstvpxenc.c line 1896
#13 gst_video_encoder_chain
at gstvideoencoder.c line 1480
#14 gst_pad_chain_data_unchecked
at gstpad.c line 4155
#15 gst_pad_push_data
at gstpad.c line 4407
#16 gst_pad_push
at gstpad.c line 4526
#17 gst_queue_push_one
at gstqueue.c line 1338
#18 gst_queue_loop
at gstqueue.c line 1485
#19 gst_task_func
at gsttask.c line 332
#20 g_thread_pool_thread_proxy
at gthreadpool.c line 307
#21 g_thread_proxy
at gthread.c line 778
#22 g_thread_win32_proxy
at gthread-win32.c line 450
#23 msvcrt!_cexit
from C:\WINDOWS\system32\msvcrt.dll
#24 ??
#25 msvcrt!_beginthreadex
from C:\WINDOWS\system32\msvcrt.dll
#26 KERNEL32!BaseThreadInitThunk
from C:\WINDOWS\system32\kernel32.dll
#27 ntdll!LdrRemoveLoadAsDataTable
from C:\WINDOWS\SYSTEM32\ntdll.dll
#28 ??
#29 ntdll!LdrRemoveLoadAsDataTable
from C:\WINDOWS\SYSTEM32\ntdll.dll
#30 ??
#31 ntdll!RtlCaptureContext
from C:\WINDOWS\SYSTEM32\ntdll.dll
#32 ??

Comment 1 Sebastian Dröge (slomo) 2016-03-15 07:55:55 UTC

It still works on Windows/x86-64, though.

Comment 2 Sebastian Dröge (slomo) 2016-03-17 07:58:58 UTC

Might be a problem with our toolchain and would be good to get a libvpx DLL from somewhere else for testing.

Comment 3 Sebastian Dröge (slomo) 2016-03-21 07:55:16 UTC

Unfortunately, we fail to build libvpx as a shared library on non-Linux platforms.

Comment 4 Nirbheek Chauhan 2016-05-20 00:26:35 UTC

I can reproduce it on 32-bit even when I built libvpx and all of gstreamer with Visual Studio 2015 (still as a static library). Also, both vp8enc and vp9enc crash.

Comment 5 Nirbheek Chauhan 2016-05-20 00:32:03 UTC

Backtrace from Visual Studio 2015:

 	_vpx_sub_pixel_variance16xh_ssse3 
 	vpx_sub_pixel_variance16x16_ssse3 Line 387
 	vp8_find_best_sub_pixel_step_iteratively Line 279
 	vp8_rd_pick_inter_mode Line 2367
 	vp8cx_encode_inter_macroblock Line 1271
 	encode_mb_row Line 511
 	vp8_encode_frame Line 947
 	encode_frame_to_data_rate Line 4374
 	vp8_get_compressed_data Line 5553
 	vp8e_encode Line 946
	vpx_codec_encode Line 223
 	gst_vpx_enc_handle_frame Line 1906
 	gst_video_encoder_chain Line 1480
 	gst_pad_chain_data_unchecked Line 4176
 	gst_pad_push_data Line 4431
 	gst_pad_push Line 4547
 	gst_base_src_loop Line 2850
 	gst_task_func Line 332
 	default_func Line 68
 	g_thread_pool_thread_proxy Line 308
 	g_thread_proxy Line 780
 	g_thread_win32_proxy Line 452
 	[External Code]	
 	[Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]

Comment 6 Nirbheek Chauhan 2016-05-20 01:53:20 UTC

Created attachment 328238
Fix vpx crashes on win32 by patching the Cerbero recipe

Upstream has two patches that fix this problem. Attached patch fixes it in Cerbero. 

Honestly I'm surprised this shipped with 1.5.0 even. Looks like upstream doesn't test vpx encoding on Windows properly.

Comment 7 Sebastian Dröge (slomo) 2016-05-20 06:00:24 UTC

commit 6455c862f79423b60af257afeeee9b11fa844234
Author: Nirbheek Chauhan <nirbheek@centricular.com>
Date:   Fri May 20 07:19:33 2016 +0530

    libvpx: Fix crashes on win32
    
    See https://bugzilla.gnome.org/show_bug.cgi?id=763663

Comment 8 Sebastian Dröge (slomo) 2016-06-17 12:23:56 UTC

Still happens here

Comment 9 Nirbheek Chauhan 2016-06-23 19:51:02 UTC

I can no longer reproduce the crash with my MSVC builds, so it's much more difficult to debug this now.

Comment 10 Sebastian Dröge (slomo) 2016-08-26 17:35:12 UTC

This might be solved now with libvpx 1.6.0, someone will have to try :)

Comment 11 Sebastian Dröge (slomo) 2016-08-31 10:27:59 UTC

Still a problem with libvpx 1.6.0

Comment 12 Nicola 2017-01-09 15:48:06 UTC

(In reply to Sebastian Dröge (slomo) from comment #2)
> Might be a problem with our toolchain and would be good to get a libvpx DLL
> from somewhere else for testing.

This bug is not caused by the ancient Cerbero toolchain. I just compiled GStreamer and libvpx with the Arch Linux mingw-w64 toolchain (https://www.archlinux.org/packages/?sort=&q=mingw-w64&maintainer=&flagged=), and I can confirm the crash on x86, while it works on x86_64.

Comment 13 Nicola 2017-01-16 14:08:18 UTC

Same problem with libvpx 1.6.1. I tried to get a backtrace to report upstream, with no luck: gdb seems unable to produce one.

(gdb) run
Starting program: D:\Condivisione\MingGW-W64\i686\i686-w64-mingw32-gst-launch-1.0.exe videotestsrc "!" vp8enc "!" fakesink
[New Thread 4324.0x3bc]
[New Thread 4324.0x59c]
[New Thread 4324.0x3f0]
[New Thread 4324.0x1560]
warning: Can not parse XML library list; XML support was disabled at compile time
[New Thread 4324.0x1598]
[New Thread 4324.0x105c]
Setting pipeline to PAUSED ...
[New Thread 4324.0x1090]
Pipeline is PREROLLING ...
Redistribute latency...

Thread 7 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 4324.0x1090]
0x648472c9 in ?? ()
(gdb) bt


#0 ??

Comment 14 Nirbheek Chauhan 2017-01-17 19:46:42 UTC

Maybe try fiddling with the CFLAGS used for building libvpx? Turn off optimization (-O0), etc? The fact that this happens with GCC but not MSVC makes me think it's a compiler bug of some sort, perhaps related to optimization.

Comment 15 Nicola 2017-01-24 13:10:42 UTC

Here is a better backtrace:

(gdb) bt

#0 vpx_tm_predictor_16x16_sse2
from D:\Condivisione\GstCurrent\i686\libvpx.dll.4.1.0
#1 vp8_build_intra_predictors_mby_s
from D:\Condivisione\GstCurrent\i686\libvpx.dll.4.1.0
#2 ??
#3 ??
#4 ??
#5 ??
#6 ??
#7 ??
#8 ??
#9 ??
#10 ??
#11 ??
#12 ??
#13 ??
#14 ??
#15 ??
#16 ??
#17 ??
#18 ??
#19 ??
#20 vp8_bmode_prob
from D:\Condivisione\GstCurrent\i686\libvpx.dll.4.1.0

(gdb) disass $pc-32,$pc+32
Dump of assembler code from 0x66cdf939 to 0x66cdf979:
   0x66cdf939 <vpx_tm_predictor_16x16_sse2+25>: push   %edx
   0x66cdf93a <vpx_tm_predictor_16x16_sse2+26>: lock movdqa (%edx),%xmm0
   0x66cdf93f <vpx_tm_predictor_16x16_sse2+31>: punpckhbw %xmm1,%xmm2
   0x66cdf943 <vpx_tm_predictor_16x16_sse2+35>: movdqa %xmm0,%xmm4
   0x66cdf947 <vpx_tm_predictor_16x16_sse2+39>: punpckhbw %xmm1,%xmm4
   0x66cdf94b <vpx_tm_predictor_16x16_sse2+43>: punpcklbw %xmm1,%xmm0
   0x66cdf94f <vpx_tm_predictor_16x16_sse2+47>: mov    $0xfffffff8,%edx
   0x66cdf954 <vpx_tm_predictor_16x16_sse2+52>: pshufhw $0xff,%xmm2,%xmm2
=> 0x66cdf959 <vpx_tm_predictor_16x16_sse2+57>: movdqa (%ebx),%xmm3
   0x66cdf95d <vpx_tm_predictor_16x16_sse2+61>: punpckhqdq %xmm2,%xmm2
   0x66cdf961 <vpx_tm_predictor_16x16_sse2+65>: psubw  %xmm2,%xmm0
   0x66cdf965 <vpx_tm_predictor_16x16_sse2+69>: psubw  %xmm2,%xmm4
   0x66cdf969 <vpx_tm_predictor_16x16_sse2+73>: movdqa %xmm3,%xmm5
   0x66cdf96d <vpx_tm_predictor_16x16_sse2+77>: punpckhbw %xmm1,%xmm5
   0x66cdf971 <vpx_tm_predictor_16x16_sse2+81>: punpcklbw %xmm1,%xmm3
   0x66cdf975 <vpx_tm_predictor_16x16_sse2+85>: lea    0x0(,%ecx,8),%esi
End of assembler dump.
(gdb) info all-registers
eax            0x3290c40        53021760
ecx            0x10     16
edx            0xfffffff8       -8
ebx            0x385f254        59109972
esp            0x385f208        0x385f208
ebp            0x81818181       0x81818181
esi            0x81818181       -2122219135
edi            0x480    1152
eip            0x66cdf959       0x66cdf959 <vpx_tm_predictor_16x16_sse2+57>
eflags         0x10206  [ PF IF RF ]
cs             0x23     35
ss             0x2b     43
ds             0x2b     43
es             0x2b     43
fs             0x53     83
gs             0x2b     43
st0            0        (raw 0x00000000000000000000)
st1            0        (raw 0x00000000000000000000)
st2            0        (raw 0x00000000000000000000)
st3            0.25     (raw 0x3ffd8000000000000000)
st4            7        (raw 0x4001e000000000000000)
st5            1        (raw 0x3fff8000000000000000)
st6            1.25     (raw 0x3fffa000000000000000)
st7            13.454342644059432       (raw 0x4002d744fccad69d6800)
fctrl          0x27f    639
fstat          0x420    1056
ftag           0xffff   65535
fiseg          0x0      0
fioff          0x66bfa9ca       1723836874
foseg          0x0      0
fooff          0x385f33c        59110204
fop            0x0      0
xmm0           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x7f, 0x0, 0x7f, 0x0, 0x7f, 0x0,
    0x7f, 0x0, 0x7f, 0x0, 0x7f, 0x0, 0x7f, 0x0, 0x7f, 0x0}, v8_int16 = {0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f,
    0x7f}, v4_int32 = {0x7f007f, 0x7f007f, 0x7f007f, 0x7f007f}, v2_int64 = {0x7f007f007f007f, 0x7f007f007f007f},
  uint128 = 0x007f007f007f007f007f007f007f007f}
xmm1           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>},
  v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0},
  uint128 = 0x00000000000000000000000000000000}
xmm2           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0xd, 0x0, 0xf0, 0x0, 0xad, 0x0,
    0xba, 0x0, 0x7f, 0x0, 0x7f, 0x0, 0x7f, 0x0, 0x7f, 0x0}, v8_int16 = {0xd, 0xf0, 0xad, 0xba, 0x7f, 0x7f, 0x7f,
    0x7f}, v4_int32 = {0xf0000d, 0xba00ad, 0x7f007f, 0x7f007f}, v2_int64 = {0xba00ad00f0000d, 0x7f007f007f007f},
  uint128 = 0x007f007f007f007f00ba00ad00f0000d}
xmm3           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>},
  v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0},
  uint128 = 0x00000000000000000000000000000000}
xmm4           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x7f, 0x0, 0x7f, 0x0, 0x7f, 0x0,
    0x7f, 0x0, 0x7f, 0x0, 0x7f, 0x0, 0x7f, 0x0, 0x7f, 0x0}, v8_int16 = {0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f, 0x7f,
    0x7f}, v4_int32 = {0x7f007f, 0x7f007f, 0x7f007f, 0x7f007f}, v2_int64 = {0x7f007f007f007f, 0x7f007f007f007f},
  uint128 = 0x007f007f007f007f007f007f007f007f}
xmm5           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>},
  v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0},
  uint128 = 0x00000000000000000000000000000000}
xmm6           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x0, 0x0}, v16_int8 = {0x0 <repeats 16 times>},
  v8_int16 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_int32 = {0x0, 0x0, 0x0, 0x0}, v2_int64 = {0x0, 0x0},
  uint128 = 0x00000000000000000000000000000000}
xmm7           {v4_float = {0x0, 0x0, 0x0, 0x0}, v2_double = {0x8000000000000000, 0x8000000000000000}, v16_int8 = {
    0x75, 0x1a, 0xf8, 0xff, 0xf8, 0xff, 0xf8, 0xff, 0xf8, 0xff, 0xf8, 0xff, 0xf8, 0xff, 0xf8, 0xff}, v8_int16 = {
    0x1a75, 0xfff8, 0xfff8, 0xfff8, 0xfff8, 0xfff8, 0xfff8, 0xfff8}, v4_int32 = {0xfff81a75, 0xfff8fff8, 0xfff8fff8,
    0xfff8fff8}, v2_int64 = {0xfff8fff8fff81a75, 0xfff8fff8fff8fff8}, uint128 = 0xfff8fff8fff8fff8fff8fff8fff81a75}
mxcsr          0x1f80   [ IM DM ZM OM UM PM ]
mm0            {uint64 = 0x0, v2_int32 = {0x0, 0x0}, v4_int16 = {0x0, 0x0, 0x0, 0x0}, v8_int8 = {0x0, 0x0, 0x0, 0x0,
    0x0, 0x0, 0x0, 0x0}}
mm1            {uint64 = 0x0, v2_int32 = {0x0, 0x0}, v4_int16 = {0x0, 0x0, 0x0, 0x0}, v8_int8 = {0x0, 0x0, 0x0, 0x0,
    0x0, 0x0, 0x0, 0x0}}
mm2            {uint64 = 0x0, v2_int32 = {0x0, 0x0}, v4_int16 = {0x0, 0x0, 0x0, 0x0}, v8_int8 = {0x0, 0x0, 0x0, 0x0,
    0x0, 0x0, 0x0, 0x0}}
mm3            {uint64 = 0x8000000000000000, v2_int32 = {0x0, 0x80000000}, v4_int16 = {0x0, 0x0, 0x0, 0x8000},
  v8_int8 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80}}
mm4            {uint64 = 0xe000000000000000, v2_int32 = {0x0, 0xe0000000}, v4_int16 = {0x0, 0x0, 0x0, 0xe000},
  v8_int8 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xe0}}
mm5            {uint64 = 0x8000000000000000, v2_int32 = {0x0, 0x80000000}, v4_int16 = {0x0, 0x0, 0x0, 0x8000},
  v8_int8 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80}}
mm6            {uint64 = 0xa000000000000000, v2_int32 = {0x0, 0xa0000000}, v4_int16 = {0x0, 0x0, 0x0, 0xa000},
  v8_int8 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xa0}}
mm7            {uint64 = 0xd744fccad69d6800, v2_int32 = {0xd69d6800, 0xd744fcca}, v4_int16 = {0x6800, 0xd69d, 0xfcca,
    0xd744}, v8_int8 = {0x0, 0x68, 0x9d, 0xd6, 0xca, 0xfc, 0x44, 0xd7}}

I'll try to report upstream
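
A note on reading the dump above: the faulting instruction is `movdqa (%ebx),%xmm3`, and `movdqa` requires its memory operand to be 16-byte aligned. The following is a minimal sketch that checks the `ebx` value from the register dump against that requirement. The function name `misalignment` is made up for illustration, and the reading that `ebx` points into the stack (it is close to `esp`) is an assumption, not something the trace proves.

```python
# Values copied from the register dump in this comment.
EBX = 0x385F254          # memory operand of the faulting `movdqa (%ebx),%xmm3`
MOVDQA_ALIGNMENT = 16    # movdqa needs a 16-byte aligned memory operand


def misalignment(address: int, alignment: int = MOVDQA_ALIGNMENT) -> int:
    """Return how many bytes `address` lies past the previous aligned boundary."""
    return address % alignment


offset = misalignment(EBX)
print(f"ebx = {EBX:#x}, offset from a 16-byte boundary = {offset}")
```

A non-zero offset means the aligned load faults, which would be consistent with the SIGSEGV reported at this instruction.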

Comment 16 Nicola 2017-01-24 14:27:54 UTC

Reported upstream:

https://bugs.chromium.org/p/webm/issues/detail?id=1363

Comment 17 Nicola 2017-01-25 00:15:30 UTC

Created attachment 344189
fix crash adding -mstackrealign

Untested with Cerbero, but -mstackrealign solves the segfault when cross-compiling with GCC 6.3.1 from Arch Linux.

This CFLAGS option was suggested by the libvpx developers, see:

https://bugs.chromium.org/p/webm/issues/detail?id=1363
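
For context, GCC documents -mstackrealign as realigning the stack at function entry, so that code expecting 16-byte stack alignment for SSE can be called from code that only keeps 4-byte alignment. The sketch below only illustrates the rounding arithmetic involved, applied to the `esp` value from the register dump in comment 15. The helper name `realign_down` is hypothetical, and this is not the code GCC actually emits.

```python
def realign_down(sp: int, alignment: int = 16) -> int:
    """Round a stack pointer down to the previous multiple of `alignment`.

    `alignment` must be a power of two for the mask to be valid.
    """
    return sp & ~(alignment - 1)


esp = 0x385F208  # esp from the register dump in comment 15
aligned = realign_down(esp)
print(f"{esp:#x} -> {aligned:#x}")
```

The stack grows downwards on x86, which is why the pointer is rounded down rather than up.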

Comment 18 Sebastian Dröge (slomo) 2017-01-25 08:47:33 UTC

commit a041bb1d0eac0ee597b504992d88c25d776c42e0
Author: Nicola Murino <nicola.murino@gmail.com>
Date:   Wed Jan 25 01:11:15 2017 +0100

    libvpx: Fix crash on win32
    
    Using -mstackrealign solves the segfault when compiling with mingw
    It was suggested by libvpx developers here:
    https://bugs.chromium.org/p/webm/issues/detail?id=1363
    
    https://bugzilla.gnome.org/show_bug.cgi?id=763663

Comment 19 Sebastian Dröge (slomo) 2017-01-25 08:48:04 UTC

Will backport to 1.10 once it's confirmed that this doesn't break anything else.