GNOME Bugzilla – Bug 696157
Memory leak when changing background image
Last modified: 2013-09-18 08:14:32 UTC
Almost every time I change my background image, GNOME Shell's memory usage grows by about 50MB, and sometimes even by 200-400MB. The memory leak occurs whether there are active extensions or not. I am using git master. OS: Arch Linux
sounds blocker-worthy
It seems way worse then that. Memory usage goes up by 800MB to 1.3GB here when changing the background.
Someone should fix this bug :(
Created attachment 239674 [details] (xz compressed) valgrind output
I consider this to be a definitive blocker - we can't ship with a huge memory leak like that.
This is pretty easy to reproduce by running: while true; do for f in /usr/share/backgrounds/gnome/*; do gsettings set org.gnome.desktop.background picture-uri file://$f; sleep 1; done; done I have a couple of mutter patches which are important, but there's some changes needed on the gnome-shell side too, and i'm still working on that.
Created attachment 239721 [details] [review] background: fix task leak in load_file_async g_task_run_in_thread takes its own reference to the task passed in, so we can unref the initial reference.
Created attachment 239722 [details] [review] background: fix pixbuf leak in load_file_finish g_task_propagate_pointer relinishes the GTask of its reference to the propagated pointer, so we need to unref it ourselves when we're done with it.
i'll attach my in-progress gnome-shell work, too.
Created attachment 239723 [details] [review] background: always add content to cache We currently only add the first instance of a background to the cache. This means if the actor associated with that background is destroyed, the content will be evicted and it will need to get reloaded, even if it's already loaded on another actor. This commit ensures every content gets added to the cache.
Created attachment 239724 [details] [review] background: When updating actor content, evict old content from cache Normally backgrounds get evicted from the cache when their actor is destroyed. If the actor changes content without destroying itself, though, we should evict the old content from the cache, too.
Created attachment 239725 [details] [review] background: explicitly destroy content in-progress patch that needs more investigation.
Review of attachment 239721 [details] [review]: Looks good.
Review of attachment 239724 [details] [review]: Makes sense.
Review of attachment 239725 [details] [review]: This does not really "fix" anything while debugging I confirmed that dispose will get run after the GC kicks in.
Review of attachment 239722 [details] [review]: OK.
Review of attachment 239723 [details] [review]: Not directly related to the leak but looks good anyway.
Review of attachment 239723 [details] [review]: ::: js/ui/background.js @@ +167,3 @@ content = null; + this._images.push(content); Is it ok to push null here ?
Review of attachment 239722 [details] [review]: Hmm not "OK" ... causes g_object_unref: assertion `G_IS_OBJECT (object)' failed
Created attachment 239741 [details] [review] background: Don't add the pixbuf as user_data I have no idea why we do this ... we never access it again.
(In reply to comment #18) > Review of attachment 239723 [details] [review]: > > ::: js/ui/background.js > @@ +167,3 @@ > content = null; > > + this._images.push(content); > > Is it ok to push null here ? No, not a good idea.
Review of attachment 239741 [details] [review]: I don't think this is quite right. ::: src/compositor/meta-background.c @@ -1199,3 @@ - (CoglUserDataDestroyCallback) - g_object_unref); - The pixbuf owns its pixel data, but we're using it. cogl. We don't want to copy the that data to the texture, since it can be sizable, so we need to keep the pixbuf alive for the duration of the texture object's lifetime.
Created attachment 239756 [details] [review] background: fix pixbuf leak in load_file_finish g_task_propagate_pointer relinishes the GTask of its reference to the propagated pointer, so we need to unref it ourselves when we're done with it.
Created attachment 239757 [details] [review] background: always add content to cache We currently only add the first instance of a background to the cache. This means if the actor associated with that background is destroyed, the content will be evicted and it will need to get reloaded, even if it's already loaded on another actor. This commit ensures every content gets added to the cache.
Review of attachment 239756 [details] [review]: Looks good.
(In reply to comment #22) > Review of attachment 239741 [details] [review]: > > I don't think this is quite right. > > ::: src/compositor/meta-background.c > @@ -1199,3 @@ > - (CoglUserDataDestroyCallback) > - g_object_unref); > - > > The pixbuf owns its pixel data, but we're using it. cogl. We don't want to copy > the that data to the texture, since it can be sizable, so we need to keep the > pixbuf alive for the duration of the texture object's lifetime. OK I though that the texture just uploads the data to video memory and forgot about it ... doesn't seem to be the case.
Review of attachment 239725 [details] [review]: "after the GC kicks in" which right now isn't very often because we took out the periodic GC because of the deadlock bug.
Created attachment 239835 [details] [review] background: drop saturation and blur effects We don't use them anymore, so drop them.
Created attachment 239836 [details] [review] background: share snippets between pipelines Cogl automatically caches pipelines with no eviction policy, so we need to make sure to reuse snippets to prevent identical pipelines from getting cached separately.
Created attachment 239837 [details] [review] background: don't tank if background is destroyed before it gets a pipeline Right now we call unset_texture from MetaBackground's dispose method. unset_texture assumes there's a pipeline available, but there may not be if the object was just created. This commit fixes that incorrect assumption.
Created attachment 239838 [details] [review] background: properly disconnect background signals BackgroundManager connects to the changed signal in the backgrounds it manages. The signal ids for the changed signal connectionss are stored as state on the background manager object. If the background being managed changes while the manager is still loading the old background, then the signal id variable can get out of sync with the background object being managed. This commit ties the signal id to the background objects themselves, so there is no opportunity for them to desynchronize.
Created attachment 239839 [details] [review] workspaceThumbnail: always destroy bgManager when destroyed Right now we only destroy the bgManager object when the workspaceThumbnail is explicitly destroy with its destroy() method. This commit makes sure bgManager gets destroyed when the workspaceThumbnail actor is destroyed without calling destroy().
Created attachment 239840 [details] [review] background: don't load the same image more than once concurrently If a background gets requested from the cache while it's still being loaded from an earlier call, then there will be two concurrent loads of the same file. That concurrency is mitigates the effectiveness of the cache and also causes leaks. This commit consolidates file loads so that concurrency doesn't happen.
Review of attachment 239835 [details] [review]: Looks good. ::: src/meta/meta-background.h @@ +62,3 @@ { META_BACKGROUND_EFFECTS_NONE = 0, + META_BACKGROUND_EFFECTS_VIGNETTE = 1 << 1, Don't really have reason to keep the value so just use 1 << 0 (or 1). In theory we could just turn this into a boolean now maintaining flexibility doesn't really hurt.
Review of attachment 239836 [details] [review]: Looks good.
Review of attachment 239837 [details] [review]: ::: src/compositor/meta-background.c @@ +466,3 @@ + g_printerr ("freeing background\n"); + I'll drop this @@ +967,3 @@ + g_printerr ("freeing pixbuf %p\n", pixbuf); + g_object_unref (pixbuf); +} and these @@ +986,3 @@ return; } + g_printerr ("creating pixbuf '%s' %p\n", task_data->filename, pixbuf); and this @@ +990,3 @@ g_task_return_pointer (task, pixbuf, (GDestroyNotify) g_object_unref); + + g_object_weak_ref (G_OBJECT (pixbuf), (GWeakNotify) destroy_notify, pixbuf); and this.
Review of attachment 239757 [details] [review]: OK.
Review of attachment 239837 [details] [review]: OK (with the debug stuff removed).
Review of attachment 239838 [details] [review]: Makes sense.
Review of attachment 239839 [details] [review]: Looks good.
Review of attachment 239840 [details] [review]: Something is not right here ... When I change the background in this one it changes in the window thumbnail but the desktop background stays the same. Can be easily seen using your "while true" loop and going to the overview.
You forgot to attach the interval exception fix btw.
Created attachment 239844 [details] [review] background: fix animations with UINT_MAX duration gnome-desktop uses UINT_MAX as a sentinel value to mean, "don't ever update slide". This commit adds support for that.
Review of attachment 239844 [details] [review]: Considering gnome-desktop itself sets the duration to UINT_MAX, I'd kinda just prefer if we fixed GnomeBGSlideShow and respected the "fixed" part of get_slide.
(In reply to comment #44) > Review of attachment 239844 [details] [review]: > > Considering gnome-desktop itself sets the duration to UINT_MAX, I'd kinda just > prefer if we fixed GnomeBGSlideShow and respected the "fixed" part of > get_slide. perhaps, but regardless we shouldn't throw an exception if a file has a really big number in it. I guess a better "right now" fix than attachment 239844 [details] [review] is to explicitly check for any overflow, not just duration >= UINT_MAX
Created attachment 239847 [details] [review] background: don't choke on huge slide show durations if a slideshow file has a really large duration we'll currently throw an exception. This bug is aggravated by the fact that some versions of gnome-desktop use UINT_MAX as a sentinel value to mean, "don't ever update slide". This commit treats durations that would overflow as infinitely long.
Review of attachment 239847 [details] [review]: Hm, OK. I still want to fix this for 3.8.1 though.
So this is a question of risk/benefits/drawbacks. The risk seems to me to be relatively high, unless we get more people to test this. The benefit is not leaking. This seems like something that could go in 3.8.1 after it's had more testing. I'd be curious what Florian thinks. Tentatively I'd say 3.8.1 myself, but if others feel it's 3.8.0 material I wouldn't object.
> The benefit is not leaking. the leaks that were described in earlier comments are _massive_...
(In reply to comment #41) > Review of attachment 239840 [details] [review]: > > Something is not right here ... > When I change the background in this one it changes in the window thumbnail but > the desktop background stays the same. > > Can be easily seen using your "while true" loop and going to the overview. odd, I can't reproduce. are there any tracebacks in the log or the like?
(In reply to comment #50) > (In reply to comment #41) > > Review of attachment 239840 [details] [review] [details]: > > > > Something is not right here ... > > When I change the background in this one it changes in the window thumbnail but > > the desktop background stays the same. > > > > Can be easily seen using your "while true" loop and going to the overview. > > odd, I can't reproduce. are there any tracebacks in the log or the like? I can't either ... no idea what I did yesterday.
Review of attachment 239840 [details] [review]: OK looks good, I can no longer reproduce the bug I saw yesterday.
(In reply to comment #48) > So this is a question of risk/benefits/drawbacks. The risk seems to me to be > relatively high, unless we get more people to test this. The benefit is not > leaking. Well leaking such huge amount of memory can has very bad side effect if you have lets say only 2GB of memory (not *that* unlikely) changing the background a few times and you end up with disk trashing and a very slow system.
(In reply to comment #48) > So this is a question of risk/benefits/drawbacks. The risk seems to me to be > relatively high, unless we get more people to test this. The benefit is not > leaking. > > This seems like something that could go in 3.8.1 after it's had more testing. > I'd be curious what Florian thinks. It's "user-visible regression" vs. "big memory leak" for me here. I'm updating the whole stack now to not blame the patch set by mistake, but it does look like this needs some more work (so it's either 3.8.1 or delaying 3.8.0)
Lets give this patchset a closer look, then. I'm willing to wait until tomorrow, at least.
OK. If I can still reproduce after lunch, I'll bisect the patch set.
(In reply to comment #56) > OK. If I can still reproduce after lunch, I'll bisect the patch set. Reproduce what?
there shouldn't be any user-visible changes. what does it look like? Do you have both the mutter and the gnome-shell patches?
(In reply to comment #58) > there shouldn't be any user-visible changes. what does it look like? Failing to set the background in the overview or at all. So far I've only reproduced it with animated backgrounds. > Do you have both the mutter and the gnome-shell patches? Yes, everything from this bug.
bisect points to "background: explicitly destroy content"
Comment on attachment 239725 [details] [review] background: explicitly destroy content oh sorry, I should have obsoleted this attachment. it was just an in-progress patch I threw up while working on the problem. I don't have it in my tree atm.
To be clear, the mutter patches are: background: fix task leak in load_file_async background: fix pixbuf leak in load_file_finish background: drop saturation and blur effects background: share snippets between pipelines background: don't tank if background is destroyed before it gets a pipeline and the gnome shell patches are: background: always add content to cache background: When updating actor content, evict old content from cache background: properly disconnect background signals workspaceThumbnail: always destroy bgManager when destroyed background: don't load the same image more than once concurrently background: don't choke on huge slide show durations
Can we branch gnome-shell and push these to master if they're believed to be correct? It's a lot more convenient for people to test that way then; e.g. I can just wait for gnome-ostree to build them.
(In reply to comment #61) > (From update of attachment 239725 [details] [review]) > oh sorry, I should have obsoleted this attachment. it was just an in-progress > patch I threw up while working on the problem. I don't have it in my tree atm. Ah! In a quick test, the problem goes away with that patch removed.
(In reply to comment #63) > Can we branch gnome-shell and push these to master if they're believed to be > correct? Mmmh, I was actually planning to continue our tradition to not branch before 3.x.1 ...
To make testing easier, i've pushed this stuff to branches: https://git.gnome.org/browse/mutter/log/?h=wip/bg-fixes https://git.gnome.org/browse/gnome-shell/log/?h=wip/bg-fixes
(In reply to comment #66) > To make testing easier, i've pushed this stuff to branches: > > https://git.gnome.org/browse/mutter/log/?h=wip/bg-fixes > https://git.gnome.org/browse/gnome-shell/log/?h=wip/bg-fixes Current gnome-ostree now has this. Playing with changing around the backgrounds, it looks fine to me. I think I'll keep the hot pink just for fun...
okay i've talked to walters on irc and he's giving his 2/2 (and I talked to matthias earlier in person and he gave his 1/2). Florian is okay with me pushing now before he does the releases, too, so i'm going to push.
Attachment 239721 [details] pushed as 5ed6e37 - background: fix task leak in load_file_async Attachment 239756 [details] pushed as 13c7020 - background: fix pixbuf leak in load_file_finish Attachment 239835 [details] pushed as 0e58906 - background: drop saturation and blur effects Attachment 239836 [details] pushed as 47cf63b - background: share snippets between pipelines Attachment 239837 [details] pushed as 577e5e2 - background: don't tank if background is destroyed before it gets a pipeline
Attachment 239724 [details] pushed as b9da6d9 - background: When updating actor content, evict old content from cache Attachment 239757 [details] pushed as 3be489c - background: always add content to cache Attachment 239838 [details] pushed as f920fd8 - background: properly disconnect background signals Attachment 239839 [details] pushed as 18d850d - workspaceThumbnail: always destroy bgManager when destroyed Attachment 239840 [details] pushed as bed3bb4 - background: don't load the same image more than once concurrently Attachment 239847 [details] pushed as 1dadcee - background: don't choke on huge slide show durations
Still have a memory leak. Much smaller, in gnome-shell and gnome-control-center. Replace many backgrounds - and see.
If you run alt-f2 type lg click Memory then click "Full GC" does it recover the memory?
(In reply to comment #72) > If you run alt-f2 type lg click Memory then click "Full GC" does it recover the > memory? No, it's not.
Review of attachment 239840 [details] [review]: This broke vignettes in the overview, since we lost the { effects: params.effects } setter.
we've done a few more leak fixes. See bug 697119 and bug 697395
(the vignette bug mentioned in comment 74 was fixed in bug 696712)
I'm having the same problem :( Gnome Shell 3.8.3 Manjaro Linux