GNOME Bugzilla – Bug 687493
Huge memory leak using Python, Gtk+, GStreamer and vp8enc
Last modified: 2013-09-23 13:56:49 UTC
Created attachment 227958 [details] Python test program
I am using ximagesrc as my source to capture 1920x1080 video, which is then encoded with vp8 and muxed into a webm container. The attached program develops a big memory leak that sometimes peaks at over 5 MB/s. I was unable to localize it; however, the same pipeline executed with gst-launch-1.0 works fine and shows no memory leaks, at least none this big. It looks as if something is wrong with GObject introspection, because the leak appears only when running the pipeline from Python and not from the command line. I was running this on Ubuntu 12.10.
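For illustration, building such a pipeline from Python through GObject introspection looks roughly like this (a minimal sketch, not the attached program; the encoder settings and output filename are assumptions):

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)

# Roughly the same pipeline as the gst-launch-1.0 one; vp8enc is left at its
# defaults here and the filename is just a placeholder.
pipeline = Gst.parse_launch(
    'ximagesrc endx=1919 endy=1079 use-damage=false show-pointer=true ! queue ! '
    'videorate ! video/x-raw,framerate=25/1 ! videoconvert ! '
    'vp8enc ! queue ! webmmux ! queue ! filesink location=test.webm')
pipeline.set_state(Gst.State.PLAYING)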
I cannot reproduce this with git master. Perhaps you could try 1.0.2? Or run it under valgrind, like this:
$ G_SLICE=always-malloc valgrind python testapp.py 2>&1 | tee valgrind.log
and make sure your program shuts down automatically after 10 or 20 seconds or so.
Created attachment 227966 [details] Valgrind log of a 30 seconds run.
This is the leak summary of the 30-second run with valgrind:
LEAK SUMMARY:
   definitely lost: 16,612 bytes in 6 blocks
   indirectly lost: 240 bytes in 10 blocks
     possibly lost: 88,119,273 bytes in 1,039 blocks
   still reachable: 18,325,091 bytes in 39,799 blocks
        suppressed: 0 bytes in 0 blocks
I can't reproduce this with 1.0.1 as on debian sid either, btw. Could you (a) install -dbg packages for gstreamer and the plugins, and (b) make another valgrind log please, this time over 60 seconds perhaps. Also, how do you stop it? Do you properly shut down the pipeline at the end, or just control-C?
I tried with the GStreamer developers team PPA, which includes version 1.0.2, but the result with my test script is the same. Five minutes into the recording, this is what I am getting in /proc:
VmPeak:  2054440 kB
VmSize:  2054440 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:   1181684 kB
VmRSS:   1173504 kB
VmData:   751024 kB
VmStk:       136 kB
VmExe:      2172 kB
VmLib:     23136 kB
VmPTE:      2816 kB
VmSwap:        0 kB
Something is not as it should be.
I stop the pipeline by sending it a Gst.Event.new_eos() event, then I wait for five seconds and set the state to NULL. It is a crude mechanism in this case, but I wanted minimal outside influence. The problem, however, is that with longer recordings my computer swaps out before the pipeline is stopped. I'll install the debug packages later today and try with those.
I've noticed something strange happening. When I run this test with valgrind, the process is killed as soon as the pipeline is set to playing. I tried running the example both with and without timer threads. In both cases only a single 'Killed' is printed in the terminal and nothing else happens. I am not sure what is getting killed, Python or valgrind. I assume the former, because I still get the valgrind output.
Instead of using a timer to NULL it, wait until playback is done with something like:
self.pipeline.get_bus().timed_pop_filtered(Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS)
This will block until the pipeline is actually done playing.
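In full, assuming self.pipeline is the pipeline object from the test program, the shutdown could look something like this (just a sketch of the suggested approach):

self.pipeline.send_event(Gst.Event.new_eos())   # push EOS into the pipeline
bus = self.pipeline.get_bus()
# Block until the EOS message reaches the bus, i.e. until the muxer and
# filesink have finished writing everything out.
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS)
self.pipeline.set_state(Gst.State.NULL)         # only now tear the pipeline down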
Thanks for the tip. In an actual application I am waiting for EOS and I am doing this properly. This was just a quick hack. :)
We still need a valgrind log with full debugging symbols in the GStreamer stack and libvpx, so setting back to NEEDINFO again. I wonder if this is a duplicate of #682714 - could you check with a more recent GStreamer? I still cannot reproduce this with git master, nor with my debian sid packages (python-gi 3.2.2-1 + gst 1.0.5). The biggest leak I get is in Python's PyString_InternInPlace / resize, and I don't see any leaks that increase over time (I did a 30-second run and a 300-second run).
I tried the test program again, this time with the first timeout set to 700 seconds. Within 5 minutes, memory consumption went from < 1% to 6.8% on my 16GB machine. I used htop to monitor memory consumption; it was steadily rising during the recording. After five minutes I stopped the program. This is really disturbing. I wish I could isolate this leak and get some useful valgrind logs. As I mentioned before, there is no such leakage with x264enc. Various leaks have been reported many times before for Kazam Screencaster. Relevant bug: https://bugs.launchpad.net/kazam/+bug/981224 If anyone can shed some light on this, I'd really appreciate it. Merely guessing, but could it be a problem in PyGObject and introspection? I've seen problems with it before.
The latest testing with Kazam Screen recorder running on Ubuntu 13.04 with GStreamer 1.0.6 showed that everything is OK. Memory consumption was never higher than 1.2%, which is expected. I did experience a lot of dropped frames one or two minutes into the video, but that is a separate issue and I'll look into it. It's probably just wrong settings.
I did some more testing and it turns out the memory leak is gone if I use these parameters:

self.videnc.set_property("cpu-used", 2)
self.videnc.set_property("end-usage", "vbr")
self.videnc.set_property("target-bitrate", 800000000)
self.videnc.set_property("static-threshold", 1000)
self.videnc.set_property("token-partitions", 2)
self.videnc.set_property("max-quantizer", 30)
self.videnc.set_property("threads", self.cores)

There is no memory leaking, but flushing the pipeline after stopping it takes a lot of time and there are many dropped frames in the video. The resulting framerate is around one frame per second, perhaps even less.

Using the following parameters:

self.videnc.set_property("cpu-used", 6)
self.videnc.set_property("deadline", 1000000)
self.videnc.set_property("min-quantizer", 15)
self.videnc.set_property("max-quantizer", 15)
self.videnc.set_property("threads", self.cores)

the framerate is what I expect it to be and flushing the pipeline takes no time - there is no wait after stopping the pipeline - but the memory leak is back. Two minutes into the video, htop is showing 6.5% memory usage on my 16GB machine.
I find it hard to believe that the same python program leaks with one set of settings on vp8enc but not with another. I just don't know how that might happen. Perhaps you could try to write a C test case with the exact same settings?
I just confirmed the case with gst-launch. Here are the command lines used:

No leak, terrible framerate:
gst-launch-1.0 -e ximagesrc endx=1919 endy=1079 use-damage=false show-pointer=true ! queue ! videorate ! video/x-raw,framerate=25/1 ! videoconvert ! vp8enc end-usage=vbr cpu-used=6 target-bitrate=80000000 threads=4 token-partitions=2 min-quantizer=10 max-quantizer=30 ! queue name=before_mux ! webmmux name=mux ! queue ! filesink location="test-videorate.webm"

Leak, excellent framerate:
gst-launch-1.0 -e ximagesrc endx=1919 endy=1079 use-damage=false show-pointer=true ! queue ! videorate ! video/x-raw,framerate=25/1 ! videoconvert ! vp8enc end-usage=vbr cpu-used=6 target-bitrate=80000000 threads=4 token-partitions=2 min-quantizer=10 max-quantizer=30 deadline=100000 ! queue name=before_mux ! webmmux name=mux ! queue ! filesink location="test-videorate.webm"

The only difference is the deadline parameter. If deadline is set to around 42000 or more, memory starts leaking. gst-launch is probably written in C, so this should be enough, I guess?
I can confirm that (e.g. tested the commands in David's latest comment) with stock install of Ubuntu 13.04.
I can reproduce this with git master and the second gst-launch-1.0 pipeline, thanks a lot!
Hi, is there any fix or patch for this huge memory leak? Regards, JasonP
Created attachment 247378 [details] Test case to check for a specific memory leak bug.
Here is a C program to try to reproduce this bug and debug it in valgrind. It is the equivalent of David Klasinc's leaking pipeline above, plus the num-buffers property set so that the pipeline stops automatically after a few frames. The instructions to compile it and to run valgrind on it are in the code. Valgrind reports millions of "possibly lost" bytes, but that might simply be because GLib does unusual things with pointers that can leave them pointing into the middle of an allocated block. It reports no "definitely lost" nor "indirectly lost" bytes. I am trying to increase the num-buffers parameter to see whether the number of "possibly lost" bytes increases, but my computer has only 2 processors, so it takes rather long. Maybe someone could try it on a more recent computer. Otherwise, maybe I am doing something wrong while reproducing this bug, since I can't reproduce an obvious memory leak.
Here is the proper way to build the C test case above if one has a local version of GStreamer built from Git:

libtool --mode=link gcc testcase_leak_vp8enc.c -Wall `pkg-config --cflags --libs gstreamer-1.0` -o testcase

Then, here are the details for the definitely lost bytes reported:

==7339== 2,640 (1,360 direct, 1,280 indirect) bytes in 5 blocks are definitely lost in loss record 2,299 of 2,356
==7339==    at 0x4C2CD7B: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==7339==    by 0x5BADCF0: g_malloc (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3600.0)
==7339==    by 0x5BC2EF2: g_slice_alloc (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3600.0)
==7339==    by 0x4E6705E: gst_buffer_new (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.6.0)
==7339==    by 0x7692E86: ??? (in /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstximagesrc.so)
==7339==    by 0x76919D2: ??? (in /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstximagesrc.so)
==7339==    by 0x7B003C1: ??? (in /usr/lib/x86_64-linux-gnu/libgstbase-1.0.so.0.6.0)
==7339==    by 0x7B01C1A: ??? (in /usr/lib/x86_64-linux-gnu/libgstbase-1.0.so.0.6.0)
==7339==    by 0x4EBFA70: ??? (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.6.0)
==7339==    by 0x5BCC6F1: ??? (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3600.0)
==7339==    by 0x5BCBEB4: ??? (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3600.0)
==7339==    by 0x512EF8D: start_thread (pthread_create.c:311)

I ran this on Ubuntu 13.04 64-bit on an Intel Core 2 Duo CPU.
It would be good if you could install full debugging symbols for gstreamer, the gst-plugins-base libs and the plugins.
In ximagesrc, according to the results above, it seems to me that some GstBuffer* created by calling gst_ximageutil_ximage_new() in gst_ximage_src_ximage_get() are never freed. The pool of buffers should be freed when the element is stopped or disposed.
Created attachment 247480 [details] Valgrind log for the C test case Here is the full Valgrind log for my C test case, with the full debugging symbols for gstreamer.
I wouldn't call 2640 bytes a huge memory leak. So from your log it seems that there is no huge leak, just some minor ones... which means that the huge memory usage must be in some other place where references to the memory are still held. You could check with massif, for example, where allocations are piling up.
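For example, assuming the testcase binary built above, something like:

valgrind --tool=massif ./testcase
ms_print massif.out.<pid>

massif's snapshots show which allocation sites the memory piles up at, even when it is still referenced and therefore not reported as leaked by memcheck.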
The buffer_pool code in ximagesrc looks rather suspicious. This should be changed to a real GstBufferPool and I wouldn't be surprised if the leaks disappear then. I expect that valgrind shows no larger leaks because the XImages are still referenced by the server all the time.
Just yesterday I got another user reporting Kazam eating 15GB of memory and slowing down the machine to a crawl. This time they used H.264. Is it possible that the leaking only happens when using GStreamer from Python and the problem is with GObject introspection? In my previous experience with introspection this happened more than once.
Also leaks these buffers with gst-launch-1.0 ximagesrc endx=1919 endy=1079 use-damage=false show-pointer=true num-buffers=20 ! fakesink
(In reply to comment #26)
> Just yesterday I got another user reporting Kazam eating 15GB of memory and
> slowing down the machine to a crawl. This time they used H.264.
>
> Is it possible that the leaking only happens when using GStreamer from Python
> and the problem is with GObject introspection?
>
> In my previous experience with introspection this happened more than once.

Quite possible, but didn't you say you were able to reproduce it with the C testcase too?
(In reply to comment #28)
> (In reply to comment #26)
> > Just yesterday I got another user reporting Kazam eating 15GB of memory and
> > slowing down the machine to a crawl. This time they used H.264.
> >
> > Is it possible that the leaking only happens when using GStreamer from Python
> > and the problem is with GObject introspection?
> >
> > In my previous experience with introspection this happened more than once.
>
> Quite possible, but didn't you say you were able to reproduce it with the C
> testcase too?

I don't see David saying any such thing in the conversation, no.
> > In my previous experience with introspection this happened more than once.
>
> Quite possible, but didn't you say you were able to reproduce it with the C
> testcase too?

I was able to reproduce some leaks with gst-launch. I never got around to writing a C testcase. When Python programs use certain C libraries, it can often happen that Python garbage collection stops being as awesome as they say it is. So if this is not reproducible from C, then introspection is something worth looking into. Right now, however, I am heavily occupied with work-related stuff and my time is limited. :/
Using the pipelines from comment #15, I can reproduce this in 1.0, but not with git master. So I assume it was fixed. Please re-open if you can still reproduce it.
Would it be possible for someone to post a reference to the relevant commit?
@david: I have no idea which commit fixed it. Someone would have to bisect to find out.