GNOME Bugzilla – Bug 797039
vaapi: display: race condition when extracting image formats
Last modified: 2018-08-31 16:27:23 UTC
Created attachment 373485 [details]
reproduction script

Confirmed happening both on stable and master, and on multiple hardware.

The attached test program basically repeatedly runs this multi-decoder pipeline:

gst-launch-1.0 filesrc location=/tmp/sample.h264 ! tee name=tee ! queue ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink

Generating sample
Setting pipeline to PAUSED ...
...
Pipeline is PREROLLING ...
Got context from element 'vaapipostproc3': gst.gl.GLDisplay=context, gst.gl.GLDisplay=(GstGLDisplay)"\(GstGLDisplayX11\)\ gldisplayx11-0";
double free or corruption (fasttop)
Error after 4 iterations

Note that removing vaapipostproc generates random internal data stream errors.
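The attached script itself is not reproduced in this report. As a rough sketch only, assuming it does nothing more than relaunch the command until one run fails, a loop of this kind could look like the following. The names `run_until_failure` and `max_iterations` are hypothetical, and the command is passed in as an argument list rather than hard-coded.

```python
import subprocess


def run_until_failure(cmd, max_iterations=100):
    """Run cmd repeatedly; return the 1-based iteration that failed, or None.

    A run counts as failed when the process exits with a non-zero status
    or is killed by a signal (subprocess reports that as a negative
    return code, for example -6 for SIGABRT).
    """
    for iteration in range(1, max_iterations + 1):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            return iteration
    # Every run succeeded within the iteration budget.
    return None
```

Passing the gst-launch-1.0 command line above as a list of arguments should return the iteration at which the crash first occurs, or None if it never triggers within the budget.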
Might be similar to this one which was affecting encoders: https://bugzilla.gnome.org/show_bug.cgi?id=773546
Hi Florent, Can you post a backtrace?
Well, i'm afraid there is none; running inside gdb:

$ gdb python
(gdb) run test_corrupt.py
Starting program: /usr/bin/python test_corrupt.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Setting pipeline to PAUSED ...
...
Freeing pipeline ...
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
double free or corruption (fasttop)
Error after 9 iterations
[Inferior 1 (process 7580) exited normally]
Sorry, just managed to reproduce it without my script

(gdb) run filesrc location=/tmp/sample.h264 ! tee name=tee ! queue ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink
Starting program: /home/fthiery/gst-build/build/subprojects/gstreamer/tools/gst-launch-1.0 filesrc location=/tmp/sample.h264 ! tee name=tee ! queue ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Setting pipeline to PAUSED ...
[New Thread 0x7ffff1e3f700 (LWP 8518)]
[New Thread 0x7ffff163e700 (LWP 8519)]
[New Thread 0x7ffff163e700 (LWP 8520)]
[Thread 0x7ffff163e700 (LWP 8519) exited]
[New Thread 0x7ffff0e3d700 (LWP 8521)]
[New Thread 0x7fffe268c700 (LWP 8522)]
[New Thread 0x7fffe1b8b700 (LWP 8523)]
[New Thread 0x7fffe138a700 (LWP 8524)]
[New Thread 0x7fffe0b89700 (LWP 8525)]
[New Thread 0x7fffd3fff700 (LWP 8526)]
[New Thread 0x7fffd37fe700 (LWP 8527)]
Pipeline is PREROLLING ...
[New Thread 0x7fffd2ffd700 (LWP 8528)]
double free or corruption (fasttop)
Got context from element 'vaapipostproc3': gst.gl.GLDisplay=context, gst.gl.GLDisplay=(GstGLDisplay)"\(GstGLDisplayX11\)\ gldisplayx11-0";
Got context from element 'vaapipostproc3': gst.vaapi.Display=context, gst.vaapi.Display=(GstVaapiDisplay)"\(GstVaapiDisplayGLX\)\ vaapidisplayglx0";

Thread 10 "queue2:src" received signal SIGABRT, Aborted.
+ Trace 238679
Thread 140736750155520 (LWP 8526)
With full details: (gdb) bt full
+ Trace 238680
Do you reproduce the problem too ? Please let me know if there is anything else i can do to help.
If you have a chance to run this in valgrind, a valgrind report would help a lot. Running your script does reproduce the issue locally, with GStreamer master.
It seems to be the same issue that the encoders had, as you mentioned in comment 1, but in postproc.
I am trying to run this until it fails, but of course this is a race condition, valgrind may be slowing it so much that the problem may not happen again. If i ever get it, is the following good enough or is a leak report necessary ?

$ while valgrind gst-launch-1.0 filesrc location=/tmp/sample.h264 ! tee name=tee ! queue ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink tee. ! queue ! h264parse ! vaapih264dec ! vaapipostproc ! "video/x-raw, format=(string)NV12, width=(int)1280, height=(int)720, framerate=(fraction)30" ! fakesink; do :; done
...
Got EOS from element "pipeline0".
Execution ended after 0:00:14.416327794
Setting pipeline to PAUSED ...
Setting pipeline to READY ...
Setting pipeline to NULL ...
Freeing pipeline ...
==27980==
==27980== HEAP SUMMARY:
==27980==     in use at exit: 747,612 bytes in 3,654 blocks
==27980==   total heap usage: 349,937 allocs, 346,283 frees, 8,687,337,467 bytes allocated
==27980==
==27980== LEAK SUMMARY:
==27980==    definitely lost: 16,880 bytes in 4 blocks
==27980==    indirectly lost: 5,056 bytes in 12 blocks
==27980==      possibly lost: 8,140 bytes in 71 blocks
==27980==    still reachable: 700,536 bytes in 3,440 blocks
==27980==                       of which reachable via heuristic:
==27980==                         length64           : 792 bytes in 18 blocks
==27980==                         newarray           : 1,648 bytes in 23 blocks
==27980==         suppressed: 0 bytes in 0 blocks
==27980== Rerun with --leak-check=full to see details of leaked memory
==27980==
==27980== For counts of detected and suppressed errors, rerun with: -v
==27980== ERROR SUMMARY: 756483 errors from 559 contexts (suppressed: 2 from 2)
Created attachment 373506 [details] [review]
libs: display: lock at extracting available image formates

When running several vaapi elements concurrently, there is a race condition at initialization when extracting the available formats for images and subpictures.

This patch adds a lock while those arrays are filled.
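The patch itself is C code inside gstreamer-vaapi and is not shown in this thread. Purely as an illustration of the pattern it describes (a shared, lazily filled list of formats guarded by a lock), here is a minimal Python sketch. `SharedDisplay`, `query_formats` and `get_image_formats` are hypothetical names, not the library's API.

```python
import threading


class SharedDisplay:
    """Toy stand-in for a display object shared by several elements."""

    def __init__(self, query_formats):
        # query_formats is a hypothetical callable standing in for the
        # driver query that lists the supported image formats.
        self._query_formats = query_formats
        self._lock = threading.Lock()
        self._image_formats = None  # filled lazily, on first use

    def get_image_formats(self):
        # Hold the lock across both the "already filled?" check and the
        # fill itself, so that only one thread populates the shared list.
        with self._lock:
            if self._image_formats is None:
                self._image_formats = list(self._query_formats())
            # Return a copy so callers cannot mutate the shared list.
            return list(self._image_formats)
```

Without the lock, two threads could both see the list as empty and fill it at the same time, which is the kind of interleaving that shows up as heap corruption in the C code.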
Florent, Test this patch. It is quick and unthought, but let me know how it goes with it.
Review of attachment 373506 [details] [review]: Looks like it works for me. That patch looks good also. Florent do you confirm ?
(In reply to Florent Thiéry from comment #9)
> I am trying to run this until it fails, but of course this is a race
> condition, valgrind may be slowing it so much that the problem may not
> happen again.

Just let this run for the whole night, did not crash at all as suspected because of the slowdown. I applied your patch, and so far so good, many thanks ! I confirm it fixes the issue.
Attachment 373506 [details] pushed as 1f0b2fb - libs: display: lock at extracting available image formates
Many thanks; would you mind pushing this on 1.14 too (i tested it too) ?
Pushed in branch 1.14 * 4ae96e9b libs: display: lock at extracting available image formates