Bug 595321 – memory leaks (ongoing)

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 595321 - memory leaks (ongoing)


Summary:	memory leaks (ongoing)


Status:	RESOLVED FIXED

Product:	gnome-shell
Classification:	Core
Component:	general
Version:	unspecified
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	gnome-shell-maint
QA Contact:	gnome-shell-maint

URL:
Whiteboard:

Duplicates:	602853 (view as bug list)
Depends on:
Blocks:

Reported:	2009-09-15 23:13 UTC by Colin Walters
Modified:	2010-01-14 20:17 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
[ShellTextureCache] Fix memory leaks (1.74 KB, patch) 2009-09-16 00:14 UTC, Colin Walters	committed	Details \| Review
[ShellAppMonitor] Squash small memory leak (812 bytes, patch) 2009-09-16 23:45 UTC, Colin Walters	committed	Details \| Review
[ShellAppSystem] Unref GIcon we use internally (1.52 KB, patch) 2009-09-16 23:45 UTC, Colin Walters	committed	Details \| Review
[ShellAppSystem] Fix memory leak in getting applications from menu (949 bytes, patch) 2009-09-22 16:48 UTC, Colin Walters	committed	Details \| Review
[ShellTextureCache] Don't leak when loading an already-cached item (1.96 KB, patch) 2009-09-22 22:02 UTC, Colin Walters	committed	Details \| Review

Description Colin Walters 2009-09-15 23:13:31 UTC

This bug will have patches to both analyze the shell's memory usage, data I've gathered, and fixes.

Comment 1 Colin Walters 2009-09-15 23:59:15 UTC

== Application Add/Remove ==

So first up I tried doing "while true; do yum -y install pidgin; yum -y remove pidgin" in a loop, since the application system has nontrivial memory management bits and we do a LOT of stuff when the applications change.

A first observation is that the shell's RES grows about 1MB/s under this scenario.  My new bindings for JS_GC (bug 595323) don't appear to shrink the RES, but does Spidermonkey release pages back to the OS?  Don't know.

I haven't tried valgrind yet, but one really interesting data point (using the tool in bug 354457) is:

BigBox | created 15809 | destroyed 4758 | max-population 11051

So at some point we had over 11,000 BigBox instances floating around.  That sounds like quite a lot!  Is that a GC problem?  It's hard to know.  I need to improve that patch so that we can dump the data at any point rather than at exit.

Also, I'd really like a set of stack traces ordered by most frequent allocation point.

So, further work to be done on that patch.  However for this one I'll turn to valgrind.

Comment 2 Colin Walters 2009-09-16 00:14:16 UTC

Let's start with the "definitely lost".  They're small potatoes, but worth fixing.

==11710== 7,872 bytes in 62 blocks are definitely lost in loss record 263 of 312
==11710==    at 0x4A0776F: realloc (vg_replace_malloc.c:429)
==11710==    by 0x6BCE673: g_realloc (gmem.c:170)
==11710==    by 0x6BEE7CE: g_string_maybe_expand (gstring.c:361)
==11710==    by 0x6BEEE35: g_string_insert_len (gstring.c:696)
==11710==    by 0x6BEF233: g_string_append_len (gstring.c:844)
==11710==    by 0x6BB26EC: g_build_path_va (gfileutils.c:1460)
==11710==    by 0x6BB299C: g_build_filename (gfileutils.c:1730)
==11710==    by 0x3FC0C12003: gnome_desktop_thumbnail_factory_lookup (in /usr/lib64/libgnome-desktop-2.so.11.2.6)
==11710==    by 0xE92D5B5: load_pixbuf_thread (shell-texture-cache.c:427)
==11710==    by 0x5E9AFEB: run_in_thread (gsimpleasyncresult.c:676)
==11710==    by 0x5E8B904: io_job_thread (gioscheduler.c:182)
==11710==    by 0x6BF737C: g_thread_pool_thread_proxy (gthreadpool.c:265)
==11710== 
==11710== 
==11710== 29,808 bytes in 207 blocks are definitely lost in loss record 286 of 312
==11710==    at 0x4A0763E: malloc (vg_replace_malloc.c:207)
==11710==    by 0x6BCE575: g_malloc (gmem.c:131)
==11710==    by 0x57BC777: cogl_texture_new_from_data (cogl-texture.c:1330)
==11710==    by 0xE92DBFE: pixbuf_to_cogl_handle (shell-texture-cache.c:653)
==11710==    by 0xE92E88B: on_pixbuf_loaded (shell-texture-cache.c:721)
==11710==    by 0x5E9AD52: g_simple_async_result_complete (gsimpleasyncresult.c:588)
==11710==    by 0x5E9AF06: complete_in_idle_cb_for_thread (gsimpleasyncresult.c:650)
==11710==    by 0x6BC8CBE: g_idle_dispatch (gmain.c:4065)
==11710==    by 0x6BC500C: g_main_dispatch (gmain.c:1960)
==11710==    by 0x6BC648C: g_main_context_dispatch (gmain.c:2513)
==11710==    by 0x6BC692D: g_main_context_iterate (gmain.c:2591)
==11710==    by 0x6BC7080: g_main_loop_run (gmain.c:2799)

Comment 3 Colin Walters 2009-09-16 00:14:29 UTC

Created attachment 143256 [details] [review]
[ShellTextureCache] Fix memory leaks

Need to free the returned thumbnail path, and in the CachePolicy.NONE
case we also need to unref the cogl handle.

Comment 4 Colin Walters 2009-09-16 00:18:53 UTC

Ok so moving on, what we see here are our two main big uses of memory; the JS heap and the GObject instance allocator.  What we need to know here is basically, what JS objects are live and more interestingly if they're a proxy object, what GObjects are they pointing to?

==11710== 811,312 bytes in 22,363 blocks are possibly lost in loss record 304 of 312
==11710==    at 0x4A0763E: malloc (vg_replace_malloc.c:207)
==11710==    by 0x4A077C7: realloc (vg_replace_malloc.c:429)
==11710==    by 0x3FB201B91A: JS_realloc (in /usr/lib64/xulrunner-1.9.1/libmozjs.so)
==11710==    by 0x3FB2059CA8: (within /usr/lib64/xulrunner-1.9.1/libmozjs.so)
==11710==    by 0x3FB203E666: js_PutCallObject (in /usr/lib64/xulrunner-1.9.1/libmozjs.so)
==11710==    by 0x3FB20448C6: (within /usr/lib64/xulrunner-1.9.1/libmozjs.so)
==11710==    by 0x3FB20505AA: js_Invoke (in /usr/lib64/xulrunner-1.9.1/libmozjs.so)
==11710==    by 0x3FB203D5A6: (within /usr/lib64/xulrunner-1.9.1/libmozjs.so)
==11710==    by 0x3FB2049EB3: (within /usr/lib64/xulrunner-1.9.1/libmozjs.so)
==11710==    by 0x3FB20505AA: js_Invoke (in /usr/lib64/xulrunner-1.9.1/libmozjs.so)
==11710==    by 0x3FB203D7FC: (within /usr/lib64/xulrunner-1.9.1/libmozjs.so)
==11710==    by 0x3FB2049EB3: (within /usr/lib64/xulrunner-1.9.1/libmozjs.so)
==11710== 
==11710== 
==11710== 17,634,208 bytes in 6,306 blocks are possibly lost in loss record 312 of 312
==11710==    at 0x4A05260: memalign (vg_replace_malloc.c:460)
==11710==    by 0x4A05317: posix_memalign (vg_replace_malloc.c:569)
==11710==    by 0x6BE87F0: allocator_memalign (gslice.c:1136)
==11710==    by 0x6BE82FA: allocator_add_slab (gslice.c:1007)
==11710==    by 0x6BE8517: slab_allocator_alloc_chunk (gslice.c:1055)
==11710==    by 0x6BE760A: magazine_cache_pop_magazine (gslice.c:661)
==11710==    by 0x6BE7976: thread_memory_magazine1_reload (gslice.c:736)
==11710==    by 0x6BE7C20: g_slice_alloc (gslice.c:813)
==11710==    by 0x6BE7CE5: g_slice_alloc0 (gslice.c:833)
==11710==    by 0x633E205: g_type_create_instance (gtype.c:1698)
==11710==    by 0x6322C63: g_object_constructor (gobject.c:1338)
==11710==    by 0x63223B3: g_object_newv (gobject.c:1215)

Comment 5 Dan Winship 2009-09-16 01:31:08 UTC

(In reply to comment #4)
> ==11710== 17,634,208 bytes in 6,306 blocks are possibly lost in loss record 312
> of 312
> ==11710==    at 0x4A05260: memalign (vg_replace_malloc.c:460)
> ==11710==    by 0x4A05317: posix_memalign (vg_replace_malloc.c:569)
> ==11710==    by 0x6BE87F0: allocator_memalign (gslice.c:1136)
> ==11710==    by 0x6BE82FA: allocator_add_slab (gslice.c:1007)
> ==11710==    by 0x6BE8517: slab_allocator_alloc_chunk (gslice.c:1055)
> ==11710==    by 0x6BE760A: magazine_cache_pop_magazine (gslice.c:661)
> ==11710==    by 0x6BE7976: thread_memory_magazine1_reload (gslice.c:736)
> ==11710==    by 0x6BE7C20: g_slice_alloc (gslice.c:813)

You need to run with G_SLICE=always-malloc

Comment 6 Colin Walters 2009-09-16 23:45:31 UTC

Created attachment 143309 [details] [review]
[ShellAppMonitor] Squash small memory leak

Comment 7 Colin Walters 2009-09-16 23:45:34 UTC

Created attachment 143310 [details] [review]
[ShellAppSystem] Unref GIcon we use internally

Comment 8 Colin Walters 2009-09-17 00:52:50 UTC

(In reply to comment #5)
> (In reply to comment #4)
> > ==11710== 17,634,208 bytes in 6,306 blocks are possibly lost in loss record 312
> > of 312
> > ==11710==    at 0x4A05260: memalign (vg_replace_malloc.c:460)
> > ==11710==    by 0x4A05317: posix_memalign (vg_replace_malloc.c:569)
> > ==11710==    by 0x6BE87F0: allocator_memalign (gslice.c:1136)
> > ==11710==    by 0x6BE82FA: allocator_add_slab (gslice.c:1007)
> > ==11710==    by 0x6BE8517: slab_allocator_alloc_chunk (gslice.c:1055)
> > ==11710==    by 0x6BE760A: magazine_cache_pop_magazine (gslice.c:661)
> > ==11710==    by 0x6BE7976: thread_memory_magazine1_reload (gslice.c:736)
> > ==11710==    by 0x6BE7C20: g_slice_alloc (gslice.c:813)
> 
> You need to run with G_SLICE=always-malloc

Oops, durrr, yes.  

Ok so rerunning, the most interesting one is:

==27651== 2,116,627 (771,408 direct, 1,345,219 indirect) bytes in 15,528 blocks are definitely lost in loss record 298 of 307
==27651==    at 0x4A0763E: malloc (vg_replace_malloc.c:207)
==27651==    by 0x60DB575: g_malloc (gmem.c:131)
==27651==    by 0x60F4CA3: g_slice_alloc (gslice.c:824)
==27651==    by 0xD9FF3BA: shell_app_info_new_from_tree_item (shell-app-system.c:131)
==27651==    by 0xD9FFACD: gather_entries_recurse (shell-app-system.c:342)
==27651==    by 0xDA00133: shell_app_system_get_applications_for_menu (shell-app-system.c:557)
==27651==    by 0x561CEBB: ffi_call_unix64 (in /usr/lib64/libffi.so.5.0.6)
==27651==    by 0x7FEFFE33F: ???
==27651==    by 0x7FEFFE16F: ???
==27651==    by 0x561CC43: ffi_call (in /usr/lib64/libffi.so.5.0.6)
==27651==    by 0x7FEFFE0CF: ???
==27651==    by 0xFFFFFFFF: ???

Comment 9 Dan Winship 2009-09-21 19:56:19 UTC

Comment on attachment 143310 [details] [review]
[ShellAppSystem] Unref GIcon we use internally

>     case SHELL_APP_INFO_TYPE_ENTRY:
>-      return themed_icon_from_name (gmenu_tree_entry_get_icon ((GMenuTreeEntry*)info->entry));
>+      {
>+        return themed_icon_from_name (gmenu_tree_entry_get_icon ((GMenuTreeEntry*)info->entry));
>+      }

This seems unnecessary and inconsistent with the rest of the code.

>@@ -988,10 +990,12 @@ shell_app_info_create_icon_texture (ShellAppInfo *info, float size)
>     {
>       ret = clutter_texture_new ();
>       g_object_set (ret, "opacity", 0, "width", size, "height", size, NULL);
>-      return ret;
>     }
>+  else
>+    ret = shell_texture_cache_load_gicon (shell_texture_cache_get_default (), icon, (int)size);
> 
>-  return shell_texture_cache_load_gicon (shell_texture_cache_get_default (), icon, (int)size);
>+  g_object_unref (icon);
>+  return ret;
> }

The line cut off before the start of this chunk is "if (!icon)", so won't removing the "return ret" from that if cause a g_warning when you try to g_object_unref NULL?

Comment 10 Colin Walters 2009-09-22 16:48:59 UTC

Created attachment 143714 [details] [review]
[ShellAppSystem] Fix memory leak in getting applications from menu

Comment 11 Colin Walters 2009-09-22 16:51:52 UTC

Next up:

==9726== 19,584 (9,792 direct, 9,792 indirect) bytes in 68 blocks are definitely lost in loss record 268 of 302
==9726==    at 0x4A0763E: malloc (vg_replace_malloc.c:207)
==9726==    by 0x60DB575: g_malloc (gmem.c:131)
==9726==    by 0x4EE7777: cogl_texture_new_from_data (cogl-texture.c:1330)
==9726==    by 0xDA0C5D1: pixbuf_to_cogl_handle (shell-texture-cache.c:656)
==9726==    by 0xDA0C767: on_pixbuf_loaded (shell-texture-cache.c:724)
==9726==    by 0x53A7D52: g_simple_async_result_complete (gsimpleasyncresult.c:588)
==9726==    by 0x53A7F06: complete_in_idle_cb_for_thread (gsimpleasyncresult.c:650)
==9726==    by 0x60D5CBE: g_idle_dispatch (gmain.c:4065)
==9726==    by 0x60D200C: g_main_dispatch (gmain.c:1960)
==9726==    by 0x60D348C: g_main_context_dispatch (gmain.c:2513)
==9726==    by 0x60D392D: g_main_context_iterate (gmain.c:2591)
==9726==    by 0x60D4080: g_main_loop_run (gmain.c:2799)

Comment 12 Owen Taylor 2009-09-22 20:20:19 UTC

Review of attachment 143714 [details] [review]:

Looks good

Comment 13 Colin Walters 2009-09-22 22:02:39 UTC

Created attachment 143757 [details] [review]
[ShellTextureCache] Don't leak when loading an already-cached item

A previous patch fixed a leak when loading items which shouldn't
be cached, but we also had a leak if two requests for the same
item were outstanding.  In that case we load the pixbuf twice,
but should discard subsequent loads when we notice we've already
cached it.

Comment 14 Owen Taylor 2009-09-22 22:04:24 UTC

(In reply to comment #13)
> Created an attachment (id=143757) [details]
> [ShellTextureCache] Don't leak when loading an already-cached item
> 
> A previous patch fixed a leak when loading items which shouldn't
> be cached, but we also had a leak if two requests for the same
> item were outstanding.  In that case we load the pixbuf twice,
> but should discard subsequent loads when we notice we've already
> cached it.

Is this happening frequently? Is this a problem? It sounds potentially like we might be doing a ton of extra io.

Comment 15 Colin Walters 2009-09-22 22:06:20 UTC

(In reply to comment #14)
> (In reply to comment #13)
> > Created an attachment (id=143757) [details] [details]
> > [ShellTextureCache] Don't leak when loading an already-cached item
> > 
> > A previous patch fixed a leak when loading items which shouldn't
> > be cached, but we also had a leak if two requests for the same
> > item were outstanding.  In that case we load the pixbuf twice,
> > but should discard subsequent loads when we notice we've already
> > cached it.
> 
> Is this happening frequently? Is this a problem? It sounds potentially like we
> might be doing a ton of extra io.

Not frequently, I think only a few times.  It was a constant 21k leak, not big.

Comment 16 Owen Taylor 2009-09-23 20:11:06 UTC

Review of attachment 143757 [details] [review]:

Can I suggest:

 - Moving the unref to after out and guarding with a if (texdata)
 - Just ref'ing again when inserting and unreffing unconditionally

Your patch looks like it would work fine, but I think standard cleanup patterns are better.

In terms of whether we need to de-duplicate requests, what I'd suggest is adding some quick print debug logging to see when this is happening, and if it is happening in cases that are non-fringe (say, happens when multiple documents have the same document icon), then file a bug so we remember the issue.

Comment 17 Colin Walters 2009-09-23 21:21:58 UTC

(In reply to comment #16)
> Review of attachment 143757 [details] [review]:
> 
> Can I suggest:
> 
>  - Moving the unref to after out and guarding with a if (texdata)
>  - Just ref'ing again when inserting and unreffing unconditionally
> 
> Your patch looks like it would work fine, but I think standard cleanup patterns
> are better.

Good point, fixed.

> 
> In terms of whether we need to de-duplicate requests, what I'd suggest is
> adding some quick print debug logging to see when this is happening, and if it
> is happening in cases that are non-fringe (say, happens when multiple documents
> have the same document icon), then file a bug so we remember the issue.

Uuuugh.  Ok no it turns out we are doing it a lot.  So um, horrible mess, I'll file a separate bug.

Comment 18 Dan Winship 2009-11-12 22:52:34 UTC

closable?

Comment 19 drago01 2009-11-13 16:32:12 UTC

(In reply to comment #18)
> closable?

No, some leaks are still present, dunno if we want to add future patches to this bug or a new bug for each leak/fix.

Comment 20 Owen Taylor 2009-11-24 22:32:48 UTC

*** Bug 602853 has been marked as a duplicate of this bug. ***

Comment 21 Dan Winship 2010-01-14 20:17:28 UTC

i don't think there's any benefit to keeping this open, since it has been untouched for two months. if we find more memory leaks we can file more bugs. if someone goes on another memory-leak-fixing spree, they can open a bug for that spree.