GNOME Bugzilla – Bug 728502
ximagesrc has a serious shmem leak
Last modified: 2014-05-22 18:15:31 UTC
Created attachment 274672 [details] [review] ximagesrc shmem leak patch Seems I found quite old ximagesrc leak that, I suppose, affects all platforms since 0.10 (or maybe earlier). Here is description of how I discovered it and some technical background about the bug. For my arm board with Ubuntu 12.04 I wrote an app that captures a desktop, encodes video to h264 and streams it to server. Everything went good until I noticed that app always got killed by system within one or two hours since it was started. I began to suspect oom_killer. After hours of memory monitoring I detected that both resident and virtual memory are constantly growing. Most growing was a virtual memory. By disabling all elements in pipeline I found a minimal pipeline that leaks: gst-launch ximagesrc ! queue ! fakesrc Depending on amount of memory in your system you'll get 'Killed' message after running this simple pipeline. I tested this pipeline on following platforms: * ubuntu 12.04 arm + gstreamer0.10 * fresh archlinux x86_64 + gstreamer1.0 * ubuntu 13.10 x86_64 + gstreamer0.10/1.0 All these distros were affected. One note about ubuntu 13.10 x86_64: it leaks with much slower rates than other tested distros, but also leaks. I don't know why, but this is a fact. As I said, I suppose that all other platforms are also affected, since this bug isn't platform specific. How to measure. You need to run something that changes the desktop constantly (looped animated gif for instance). After that you need to run pipeline and then run oom_score checker: cat /proc/$(pidof gst-launch-0.10)/oom_score you'll notice that score is growing: from 3-5 to 15-20 within few minutes. On my arm system with 2Gb ram it could take up to 1 hour to reach score ~500 when process is killed by system. Next measurement: what is leaking. If we'll run a command: cat /proc/$(pidof gst-launch-0.10)/maps | grep SYSV0 we will see something like this: 41d00000-42084000 rw-s 00000000 00:04 316244015 /SYSV00000000 (deleted) 42114000-42498000 rw-s 00000000 00:04 316276788 /SYSV00000000 (deleted) 43000000-43384000 rw-s 00000000 00:04 316309558 /SYSV00000000 (deleted) 4340b000-4378f000 rw-s 00000000 00:04 316342327 /SYSV00000000 (deleted) 4378f000-43b13000 rw-s 00000000 00:04 316375096 /SYSV00000000 (deleted) 43b3e000-43ec2000 rw-s 00000000 00:04 316407866 /SYSV00000000 (deleted) 43f22000-442a6000 rw-s 00000000 00:04 316440636 /SYSV00000000 (deleted) 442f7000-4467b000 rw-s 00000000 00:04 316506174 /SYSV00000000 (deleted) 44738000-44abc000 rw-s 00000000 00:04 316538943 /SYSV00000000 (deleted) These records are shared memory blocks occupied by the process. After several minutes you'll see many of them. They aren't freed and counted by system as used. In normal conditions there should be 2-3 such blocks. Why this happens. An ximagesrc element uses XShm extension (if available in system) to get desktop shots. Each such image (gstbuffer actually) has a memory shared with X server. Once pipeline frees buffer, it returned to ximagesrc and pushed into pool of unused buffers. Each time when element is asked for a new frame, it takes from pool first unused buffer and deletes all other buffers. Here is piece of code responsible for that: while (ximagesrc->buffer_pool != NULL) { ximage = ximagesrc->buffer_pool->data; if ((ximage->width != ximagesrc->width) || (ximage->height != ximagesrc->height)) { gst_ximage_buffer_free (ximage); } ximagesrc->buffer_pool = g_slist_delete_link (ximagesrc->buffer_pool, ximagesrc->buffer_pool); } There only one issue: g_slist_delete_link() only removes item from linked list, frees memory used by slist record, but it doesn't unrefs the buffer which it holds. Buffer lost, shmem not freed as well. Not detectable by valgrind. Patch with fix for 0.10.31 attached. Any comments are appreciated.
Additional drawback of this bug is that on some systems, even when app is killed, system doesn't frees shmem blocks from X server process. Hence X server process leaks also. I observed such behavior on ubuntu 12.04 arm, but on latest ubuntu 13.10 x86_64 system frees these shmem blocks from X server process. But this should be rechecked.
GStreamer 0.10 is not longer supported by the GStreamer project. If you observe similar bug against GStreamer 1.X, please re-open.
As I already stated, I observe it in all versions since 0.10, including 1.2
Created attachment 274679 [details] [review] HEAD ximagesrc shmem leak patch
Review of attachment 274679 [details] [review]: Would it be possible to provide a patch formatted using git --format-patch -1 ? This will also imply passing the code through the style checker. ::: sys/ximage/gstximagesrc.c @@ +457,3 @@ g_mutex_lock (&ximagesrc->pool_lock); while (ximagesrc->buffer_pool != NULL) { + if (ximage != NULL) // pool not empty yet, free previous ximage Style: don't use C++ style comment, we try to avoid right comment too.
Created attachment 274713 [details] [review] fix of shmem leak Hope this time I didn't missed anything.
Review of attachment 274713 [details] [review]: ::: sys/ximage/gstximagesrc.c @@ +458,3 @@ while (ximagesrc->buffer_pool != NULL) { + if (ximage != NULL) + gst_ximage_buffer_free (ximage); And 4 lines lower you'll get a double free if ((meta->width != ximagesrc->width) || (meta->height != ximagesrc->height)). To be honest I don't get what the code is trying to do, but I agree it's not right. Though this is the wrong fix.
Created attachment 276899 [details] [review] [PATCH] ximagesrc: Fix ximage leaks when buffer has more then one ximage From time to time, when the image_pool list has more then 1 element and I suppose at start, all but 1 pooled ximage are leaked. This is due to broken algorithm in gst_ximagesink_src_ximage_get(). There was also a risk of use after free for the case where the ximage size has changed. https://bugzilla.gnome.org/show_bug.cgi?id=728502 --- sys/ximage/gstximagesrc.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
I think this patch is more correct, would it be possible to test on your side too ?
Line after condition sets a new value for freed variable, so there is no double freeing: ::: sys/ximage/gstximagesrc.c @@ +458,3 @@ while (ximagesrc->buffer_pool != NULL) { + if (ximage != NULL) + gst_ximage_buffer_free (ximage); + ximage = ximagesrc->buffer_pool->data; Anyway, your patch also fixes a leak, but on the other hand it changes pool's behavior and brings less optimal memory usage. Once pipeline empties the pool, plugin allocates new buffer. When all buffers returned back to plugin - they are pooled. But there are no guaranties that pipeline will reuse all of them again and when your fix reuses only first buffer (still holding the rest of buffers in memory), we could get "hanged" buffers that never could be reused. My fix doesn't change behavior of original code which supposes freeing of all buffers from pool except the last. Please take this in account when you'll decide which fix will be merged.
This is not the definition of a buffer pool. The main idea of buffer pools (not sure why this one haven't been ported to use GstBufferPool) is to prevent doing allocation. It's base on the fact the if the pipeline needed N buffers at some point, it is likely to need N again. And when that happens, you don't want to go into slow path and reallocate stuff. Yes, buffer may remain unused for a while. But remember this is X11 and the XImage and XShmImage are virtual memory, which means it's never lost completely. And even if the XShmImage was not virtual memory (there might be special ddx out there), it's the cost of smooth streaming. Also, you said "When all buffers returned back to plugin - they are pooled." What's not correct here is that the buffer can be picked from the pool at any time, and come back one by one at any time (and from any thread). There is no "when all buffer returned" state. The only reason we have this loop, is basically to cleanup buffers that were allocated for the different resolution. Otherwise GStreamer crash if you modset to a lower resolution.
I know that sometimes better to avoid extra memory allocations for performance reasons, but I wasn't aware, that this was preferred approach for gstreamer. Thank you for your explanation. > buffer can be picked from the pool at any time, and come back one by one at any time (and from any thread). I completely agree with you how buffer works. It seems I had written my thoughts not clearly and I had simplified the use case, so you misunderstood me. Sorry for that. From my point of view shmem leak is fixed and nothing to add here.
Great, fixed in master and 1.2. 1.2 - commit e76510d7d7a65f77526d70a82df77aa9a8c0d37a master - commit 0746bca19036a89a3d3cef9265ab6e1a0568d563 Author: Nicolas Dufresne <nicolas.dufresne@collabora.com> Date: Tue May 20 17:37:49 2014 -0400 ximagesrc: Fix ximage leaks when buffer has more then one ximage From time to time, when the image_pool list has more then 1 element and I suppose at start, all but 1 pooled ximage are leaked. This is due to broken algorithm in gst_ximagesink_src_ximage_get(). There was also a risk of use after free for the case where the ximage size has changed. https://bugzilla.gnome.org/show_bug.cgi?id=728502
Thanks for your patch and for analysing this, Alexander!
Thanks for your advices guys!