GNOME Bugzilla – Bug 687493
Huge memory leak using Python, Gtk+, GStreamer and vp8enc
Last modified: 2013-09-23 13:56:49 UTC
Created attachment 227958 [details] Python test program
I am using ximagesrc as my source to capture 1920x1080 video, which is then encoded with vp8 and muxed into a webm container. The attached program develops a big memory leak that sometimes peaks at over 5 MB/s. I was unable to localize it; however, the same pipeline executed with gst-launch-1.0 works fine and shows no memory leaks, at least none this big. It looks as if something is wrong with GObject introspection, because the leak appears only when running the pipeline from Python and not from the command line. I was running this on Ubuntu 12.10.
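For illustration, building such a pipeline from Python through GObject introspection looks roughly like this (a minimal sketch, not the attached program; the encoder settings and output filename are assumptions):

import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

Gst.init(None)

# Roughly the same pipeline as the gst-launch-1.0 one; vp8enc is left at its
# defaults here and the filename is just a placeholder.
pipeline = Gst.parse_launch(
    'ximagesrc endx=1919 endy=1079 use-damage=false show-pointer=true ! queue ! '
    'videorate ! video/x-raw,framerate=25/1 ! videoconvert ! '
    'vp8enc ! queue ! webmmux ! queue ! filesink location=test.webm')
pipeline.set_state(Gst.State.PLAYING)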
I cannot reproduce this with git master. Perhaps you could try 1.0.2? Or run it under valgrind, like this:
$ G_SLICE=always-malloc valgrind python testapp.py 2>&1 | tee valgrind.log
and make sure your program shuts down automatically after 10 or 20 seconds or so.
Created attachment 227966 [details] Valgrind log of a 30 seconds run.
This is the leak summary of the 30-second run with valgrind:
LEAK SUMMARY:
   definitely lost: 16,612 bytes in 6 blocks
   indirectly lost: 240 bytes in 10 blocks
     possibly lost: 88,119,273 bytes in 1,039 blocks
   still reachable: 18,325,091 bytes in 39,799 blocks
        suppressed: 0 bytes in 0 blocks
I can't reproduce this with 1.0.1 as on debian sid either, btw. Could you (a) install -dbg packages for gstreamer and the plugins, and (b) make another valgrind log please, this time over 60 seconds perhaps. Also, how do you stop it? Do you properly shut down the pipeline at the end, or just control-C?
I tried with the GStreamer developers team PPA, which includes version 1.0.2, but the result with my test script is the same. Five minutes into the recording, this is what I am getting in /proc:
VmPeak:  2054440 kB
VmSize:  2054440 kB
VmLck:         0 kB
VmPin:         0 kB
VmHWM:   1181684 kB
VmRSS:   1173504 kB
VmData:   751024 kB
VmStk:       136 kB
VmExe:      2172 kB
VmLib:     23136 kB
VmPTE:      2816 kB
VmSwap:        0 kB
Something is not as it should be.
I stop the pipeline by sending it a Gst.Event.new_eos() event, then I wait for five seconds and set the state to NULL. It is a crude mechanism in this case, but I wanted minimal outside influence. The problem, however, is that with longer recordings my computer swaps out before the pipeline is stopped. I'll install the debug packages later today and try with those.
I've noticed something strange happening. When I run this test with valgrind, the process is killed as soon as the pipeline is set to playing. I tried running the example both with and without timer threads. In both cases only a single 'Killed' is printed in the terminal and nothing else happens. I am not sure what is getting killed, Python or valgrind. I assume the former, because I still get the valgrind output.
Instead of using a timer to NULL it, wait until playback is done with something like:
self.pipeline.get_bus().timed_pop_filtered(Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS)
This will block until the pipeline is actually done playing.
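In full, assuming self.pipeline is the pipeline object from the test program, the shutdown could look something like this (just a sketch of the suggested approach):

self.pipeline.send_event(Gst.Event.new_eos())   # push EOS into the pipeline
bus = self.pipeline.get_bus()
# Block until the EOS message reaches the bus, i.e. until the muxer and
# filesink have finished writing everything out.
bus.timed_pop_filtered(Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS)
self.pipeline.set_state(Gst.State.NULL)         # only now tear the pipeline down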
Thanks for the tip. In an actual application I am waiting for EOS and I am doing this properly. This was just a quick hack. :)
We still need a valgrind log with full debugging symbols in the GStreamer stack and libvpx, so setting back to NEEDINFO again. I wonder if this is a duplicate of #682714 - could you check with a more recent GStreamer? I still cannot reproduce this with git master, nor with my debian sid packages (python-gi 3.2.2-1 + gst 1.0.5). The biggest leak I get is in Python's PyString_InternInPlace / resize, and I don't see any leaks that increase over time (I did a 30-second run and a 300-second run).
I tried the test program again, this time with the first timeout set to 700 seconds. Within 5 minutes, memory consumption went from < 1% to 6.8% on my 16GB machine. I used htop to monitor memory consumption; it was steadily rising during the recording. After five minutes I stopped the program. This is really disturbing. I wish I could isolate this leak and get some useful valgrind logs. As I mentioned before, there is no such leakage with x264enc. Various leaks have been reported many times before for Kazam Screencaster. Relevant bug: https://bugs.launchpad.net/kazam/+bug/981224 If anyone can shed some light on this, I'd really appreciate it. Merely guessing, but could it be a problem in PyGObject and introspection? I've seen problems with it before.
The latest testing with Kazam Screen recorder running on Ubuntu 13.04 with GStreamer 1.0.6 showed that everything is OK. Memory consumption was never higher than 1.2%, which is expected. I did experience a lot of dropped frames one or two minutes into the video, but that is a separate issue and I'll look into it. It's probably just wrong settings.
I did some more testing and it turns out the memory leak is gone if I use these parameters:

self.videnc.set_property("cpu-used", 2)
self.videnc.set_property("end-usage", "vbr")
self.videnc.set_property("target-bitrate", 800000000)
self.videnc.set_property("static-threshold", 1000)
self.videnc.set_property("token-partitions", 2)
self.videnc.set_property("max-quantizer", 30)
self.videnc.set_property("threads", self.cores)

There is no memory leaking, but flushing the pipeline after stopping it takes a lot of time and there are many dropped frames in the video. The resulting framerate is around one frame per second, perhaps even less.

Using the following parameters:

self.videnc.set_property("cpu-used", 6)
self.videnc.set_property("deadline", 1000000)
self.videnc.set_property("min-quantizer", 15)
self.videnc.set_property("max-quantizer", 15)
self.videnc.set_property("threads", self.cores)

the framerate is what I expect it to be and flushing the pipeline takes no time - there is no wait after stopping the pipeline - but the memory leak is back. Two minutes into the video, htop is showing 6.5% memory usage on my 16GB machine.
I find it hard to believe that the same python program leaks with one set of settings on vp8enc but not with another. I just don't know how that might happen. Perhaps you could try to write a C test case with the exact same settings?
I just confirmed the case with gst-launch. Here are the command lines used:

No leak, terrible framerate:
gst-launch-1.0 -e ximagesrc endx=1919 endy=1079 use-damage=false show-pointer=true ! queue ! videorate ! video/x-raw,framerate=25/1 ! videoconvert ! vp8enc end-usage=vbr cpu-used=6 target-bitrate=80000000 threads=4 token-partitions=2 min-quantizer=10 max-quantizer=30 ! queue name=before_mux ! webmmux name=mux ! queue ! filesink location="test-videorate.webm"

Leak, excellent framerate:
gst-launch-1.0 -e ximagesrc endx=1919 endy=1079 use-damage=false show-pointer=true ! queue ! videorate ! video/x-raw,framerate=25/1 ! videoconvert ! vp8enc end-usage=vbr cpu-used=6 target-bitrate=80000000 threads=4 token-partitions=2 min-quantizer=10 max-quantizer=30 deadline=100000 ! queue name=before_mux ! webmmux name=mux ! queue ! filesink location="test-videorate.webm"

The only difference is the deadline parameter. If deadline is set to around 42000 or more, memory starts leaking. gst-launch is probably written in C, so this should be enough, I guess?
I can confirm that (e.g. tested the commands in David's latest comment) with stock install of Ubuntu 13.04.
I can reproduce this with git master and the second gst-launch-1.0 pipeline, thanks a lot!
Hi, is there any fix or patch for this huge memory leak? Regards, JasonP
Created attachment 247378 [details] Test case to check for a specific memory leak bug.
Here is a C program to try to reproduce this bug and debug it in valgrind. It is the equivalent of David Klasinc's leaking pipeline above, plus the num-buffers property set so that the pipeline stops automatically after a few frames. The instructions to compile it and to run valgrind on it are in the code. Valgrind reports millions of "possibly lost" bytes, but that might simply be because GLib does unusual things with pointers that can leave them pointing into the middle of an allocated block. It reports no "definitely lost" nor "indirectly lost" bytes. I am trying to increase the num-buffers parameter to see whether the number of "possibly lost" bytes increases, but my computer has only 2 processors, so it takes rather long. Maybe someone could try it on a more recent computer. Otherwise, maybe I am doing something wrong while reproducing this bug, since I can't reproduce an obvious memory leak.
Here is the proper way to build the C test case above if one has a local version of GStreamer built from Git:

libtool --mode=link gcc testcase_leak_vp8enc.c -Wall `pkg-config --cflags --libs gstreamer-1.0` -o testcase

Then, here are the details for the definitely lost bytes reported:

==7339== 2,640 (1,360 direct, 1,280 indirect) bytes in 5 blocks are definitely lost in loss record 2,299 of 2,356
==7339==    at 0x4C2CD7B: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==7339==    by 0x5BADCF0: g_malloc (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3600.0)
==7339==    by 0x5BC2EF2: g_slice_alloc (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3600.0)
==7339==    by 0x4E6705E: gst_buffer_new (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.6.0)
==7339==    by 0x7692E86: ??? (in /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstximagesrc.so)
==7339==    by 0x76919D2: ??? (in /usr/lib/x86_64-linux-gnu/gstreamer-1.0/libgstximagesrc.so)
==7339==    by 0x7B003C1: ??? (in /usr/lib/x86_64-linux-gnu/libgstbase-1.0.so.0.6.0)
==7339==    by 0x7B01C1A: ??? (in /usr/lib/x86_64-linux-gnu/libgstbase-1.0.so.0.6.0)
==7339==    by 0x4EBFA70: ??? (in /usr/lib/x86_64-linux-gnu/libgstreamer-1.0.so.0.6.0)
==7339==    by 0x5BCC6F1: ??? (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3600.0)
==7339==    by 0x5BCBEB4: ??? (in /lib/x86_64-linux-gnu/libglib-2.0.so.0.3600.0)
==7339==    by 0x512EF8D: start_thread (pthread_create.c:311)

I ran this on Ubuntu 13.04 64-bit on an Intel Core 2 Duo CPU.
It would be good if you could install full debugging symbols for gstreamer, the gst-plugins-base libs and the plugins.
In ximagesrc, according to the results above, it seems to me that some GstBuffer* created by calling gst_ximageutil_ximage_new() in gst_ximage_src_ximage_get() are never freed. The pool of buffers should be freed when the element is stopped or disposed.
Created attachment 247480 [details] Valgrind log for the C test case Here is the full Valgrind log for my C test case, with the full debugging symbols for gstreamer.
I wouldn't call 2640 bytes a huge memory leak. So from your log it seems that there is no huge leak, just some minor ones... which means that the huge memory usage must be in some other place where references to the memory are still held. You could check with massif, for example, where allocations are piling up.
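For example, assuming the testcase binary built above, something like:

valgrind --tool=massif ./testcase
ms_print massif.out.<pid>

massif's snapshots show which allocation sites the memory piles up at, even when it is still referenced and therefore not reported as leaked by memcheck.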
The buffer_pool code in ximagesrc looks rather suspicious. This should be changed to a real GstBufferPool and I wouldn't be surprised if the leaks disappear then. I expect that valgrind shows no larger leaks because the XImages are still referenced by the server all the time.
Just yesterday I got another user reporting Kazam eating 15GB of memory and slowing down the machine to a crawl. This time they used H.264. Is it possible that the leaking only happens when using GStreamer from Python and the problem is with GObject introspection? In my previous experience with introspection this happened more than once.
Also leaks these buffers with gst-launch-1.0 ximagesrc endx=1919 endy=1079 use-damage=false show-pointer=true num-buffers=20 ! fakesink
(In reply to comment #26)
> Just yesterday I got another user reporting Kazam eating 15GB of memory and
> slowing down the machine to a crawl. This time they used H.264.
>
> Is it possible that the leaking only happens when using GStreamer from Python
> and the problem is with GObject introspection?
>
> In my previous experience with introspection this happened more than once.

Quite possible, but didn't you say you were able to reproduce it with the C testcase too?
(In reply to comment #28)
> (In reply to comment #26)
> > Just yesterday I got another user reporting Kazam eating 15GB of memory and
> > slowing down the machine to a crawl. This time they used H.264.
> >
> > Is it possible that the leaking only happens when using GStreamer from Python
> > and the problem is with GObject introspection?
> >
> > In my previous experience with introspection this happened more than once.
>
> Quite possible, but didn't you say you were able to reproduce it with the C
> testcase too?

I don't see David saying any such thing in the conversation, no.
> > In my previous experience with introspection this happened more than once.
>
> Quite possible, but didn't you say you were able to reproduce it with the C
> testcase too?

I was able to reproduce some leaks with gst-launch. I never got around to writing a C testcase. When Python programs use certain C libraries, it can often happen that Python garbage collection stops being as awesome as they say it is. So if this is not reproducible from C, then introspection is something worth looking into. Right now, however, I am heavily occupied with work-related stuff and my time is limited. :/
Using the pipelines from comment #15, I can reproduce this in 1.0, but not with git master. So I assume it was fixed. Please re-open if you can still reproduce it.
Would it be possible for someone to post a reference to the relevant commit?
@david: I have no idea which commit fixed it. Someone would have to bisect to find out.