GNOME Bugzilla – Bug 685959
[Win32] Memory leak on every redraw of a widget
Last modified: 2015-06-13 17:44:49 UTC
Created attachment 226253 [details] A simple example which shows the memory leak On the windows platform, every redraw of a Gtk widget causes a memory leak. This can be tested by creating a small program with a single window (like the one attached) and simply resizing the window. This causes a memory leak in the application and the memory is not released until the process is terminated. The leak occurs when calling gtk_widget_queue_draw on the widget or even when calling gdk_window_invalidate_rect on the corresponding Gdk window. The size of the memory leak depends on the size of the redrawn area, but for each redraw it can be between a couple of kilobytes and several megabytes. The problem seems to occur whenever an expose event is received by the widget, even if the widget's expose_event handler doesn't do anything. This was tested on Windows 7 both 32/64 bit and on Windows Server 2008 with gtk 2.24.10. The issue doesn't occur on Linux, only on Windows. To verify the memory leak, compile and run the attached example and watch the memory grow.
I've managed to build gtk on windows and do a `git bisect` and the issue seems to be introduced by commit 2ae574ab6dd1f9810c4667920f9984ed1e95d0f7.
*** Bug 679312 has been marked as a duplicate of this bug. ***
CC'ing Alex as he's the author of the patch. He may find more easily if there's something going wrong. Sorry for the noise if you already receive notifications from the gtk-win32-maint address.
Why this bug is still unconfirmed? As the poster says the size of the leak may be small but also huge, and will hurt the reputation of GTK as cross-platform enviroment for every program that ships with this faulty GTK versions (if you use any GTK application for some hours it become a bloatware). Actually the only workaround is to ship with GTK 2.16.x on WIN32.
(In reply to comment #4) > Why this bug is still unconfirmed? because we don't use the 'NEW' state. > Actually the only workaround is to ship with GTK 2.16.x on WIN32. no, there is another workaround: identify and fix the leak. this workaround has the benefit of not shipping an ancient version of gtk.
I checked the commit that Luis found, there are a few changes: http://git.gnome.org/browse/gtk+/commit/?h=gtk-2-24&id=2ae574ab6dd1f9810c4667920f9984ed1e95d0f7 I think the problem is introduced from this part of the commit: @@ -109,9 +110,10 @@ gdk_pixmap_impl_win32_finalize (GObject *object) GDK_NOTE (PIXMAP, g_print ("gdk_pixmap_impl_win32_finalize: %p\n", GDK_PIXMAP_HBITMAP (wrapper))); - _gdk_win32_drawable_finish (GDK_DRAWABLE (object)); + if (!impl->is_foreign) + GDK_DRAWABLE_IMPL_WIN32 (impl)->hdc_count--; - GDI_CALL (DeleteObject, (GDK_PIXMAP_HBITMAP (wrapper))); + _gdk_win32_drawable_finish (GDK_DRAWABLE (object)); ... the new code removed the GDI call to DeleteObject and I've looked inside _gdk_win32_drawable_finish and there is no trace of it also inside that. So I suppose that adding to gdk_pixmap_impl_win32_finalize() in gdkpixmap-win32.c: GDI_CALL (DeleteObject, (GDK_PIXMAP_HBITMAP (wrapper))); ... after _gdk_win32_drawable_finish, like it was in the old code may help. Actually I have problem to access gtk GIT, git.gnome.org does not accept https as protocol and my company has firewalled the GIT service port, so I'll try to apply the patch on 2.24.13 tarball I got through http, I hope to be able to compile it using the other dependencies from the win32 bundle version. Maybe Luis that already has an environment able to compile GTK may succeed in ths before I do...
(In reply to comment #6) > Actually I have problem to access gtk GIT, git.gnome.org does not accept https > as protocol and my company has firewalled the GIT service port I'm in the same situation, something like this works just fine (from an msysgit bash session): export HTTP_PROXY=http://userid:password@proxy.your.company.com:80 git clone http://git.gnome.org/browse/gtk+ More info on https://live.gnome.org/Git/Developers#Anonymous_Access
I've been able to compile gdk/gtk with the fix i proposed in the previous comment but the leak is still there... I'll do some more analysis on this but sadly I've not enough time for the deep analysis that this problem needs.
Ok, it seems I've been able to fix the problem with some printf debugging I've find the place where the leak was. I'm not sure this fix is 100% correct because I don't know the internals of GTK. I'll attach a patch for this, I hope someone can review it and tell me if it's wrong, I've tested it with the debug program attached to this bug, to the program attached to https://bugzilla.gnome.org/show_bug.cgi?id=679312 and, for regression, with gtk-demo.exe.
Created attachment 226646 [details] [review] An experimental patch to fix this memory leak. The patch uses cairo_surface_destroy() instead that cairo_surface_finish() in gdk_win32_drawable_finish(), as it seems is correct from cairo documentation.
(In reply to comment #6) > I checked the commit that Luis found, there are a few changes: Well, that's Jurica Bradaric that identified it in comment #1 ;-) > Maybe Luis that already has an environment able to compile GTK may succeed in > ths before I do... I don't work on Windows these days, Linux only, but I help on some GTK forums and someone complained there about the leak...
Created attachment 226691 [details] gtk-memleak-testcase.c: another test program
Created attachment 226692 [details] plot.py: create graph from output of previous test program
Created attachment 226693 [details] memory usage for GTK+ 2.24.10 as included with the bundle on www.gtk.org
Created attachment 226694 [details] memory usage for patched GTK+ Thank you Gabriele. As we can see, this graph showing the behavior with patch #226646 from comment #10 applied is a clear improvement! Fixes the issue as far as I can see and the patch does look correct to me. I've asked Alexander Larsson (the author of commit 2ae574ab6dd1f9810c4667920f9984ed1e95d0f7) on #gtk+ if he can take a look too, just to be sure we're not missing anything here.
I don't think the patch is right. For instance, it references impl->cairo_surface after having called cairo_surface_set_user_data(), but that call will cause gdk_win32_cairo_surface_destroy to be called, and thus iml->cairo_surface to be set to NULL (unless i'm missing something). Additionally, its just not right. The way things are supposed to happen is that someone calls gdk_win32_ref_cairo_surface(), they will get a a ref to the cairo surface, and when they are done with the surface they need to unref it and that will cause the surface to go away. GdkDrawableWin32 owns no ref to the surface at all. Instead, it owns a kind of "weak" ref. I.e. it has a pointer to it so that further calls can reuse the same object, but as soon as the real owners unref the surface we drop the pointer via gdk_win32_cairo_surface_destroy. If the GdkWindow or GdkPixmap is destroyed we now know that the surface is useless, so we _finish() it meaning that all further use of it is just an error, and we clear the user data so that we free that dc. However, we don't own a reference to the surface so we can't just unref (surface_destroy) it. Each owner of a ref has to unref the surface themselves, and if GdkDrawableWin32 unreffed it too there would be an extra unref. So, the problem is that some caller of gdk_win32_ref_cairo_surface() is not correctly freeing the surface when not needed anymore.
Ah, so the issue is that _gdk_pixmap_new() does: /* No need to create a new surface when needed, as we have one already */ drawable_impl->cairo_surface = dib_surface; drawable_impl->handle = hbitmap; pixmap_impl->bits = bits; Which actually *is* an owning ref. So, gdk_pixmap_impl_win32_finalize() needs to do something like: if (!impl->is_foreign) { /* Tell outstanding owners that the surface is useless */ cairo_surface_finish (impl->cairo_surface); /* Drop our reference */ cairo_surface_destroy (impl->cairo_surface); impl->cairo_surface = NULL; }
I think the cairo_surface_finish call is not needed since cairo_surface_destroy performs a finish itself if the reference count of the surface is 0.
(In reply to comment #14) > Created an attachment (id=226693) [details] > memory usage for GTK+ 2.24.10 as included with the bundle on www.gtk.org Thanks for the data Dieter. Would it be possible for you to test it against a recent version of GTK 3 to see if it is affected by the same bug ?
> I think the cairo_surface_finish call is not needed since cairo_surface_destroy > performs a finish itself if the reference count of the surface is 0. Obviously, yes, but in the case where the reference count is not 0 after the destroy we want to finish it so that other references to the surface don't start doing operations to a destroyed window.
Created attachment 226983 [details] [review] Proposed fix (In reply to comment #17) > if (!impl->is_foreign) { > /* Tell outstanding owners that the surface is useless */ > cairo_surface_finish (impl->cairo_surface); > > /* Drop our reference */ > cairo_surface_destroy (impl->cairo_surface); > impl->cairo_surface = NULL; > } is_foreign is part of the pixmap implementation. Did you perhaps mean something like the attached?
No, the code i talked about should go in gdk_pixmap_impl_win32_finalize
Created attachment 227002 [details] [review] Proposed fix (In reply to comment #22) > No, the code i talked about should go in gdk_pixmap_impl_win32_finalize I see. Presumably we still want to finish the surface exactly once no matter if it's a pixmap or not, so perhaps something like this?
Review of attachment 227002 [details] [review]: I don't understand much in that, but it's clear it won't work... ::: ../gtk+-2.24.13/gdk/win32/gdkpixmap-win32.c @@ +118,3 @@ + if (!pixmap_impl->is_foreign) + { You forgot to call cairo_surface_finish...
(In reply to comment #24) > You forgot to call cairo_surface_finish... What I did may not be right, but I didn't forget. Just above the code segment you quote is a call to _gdk_win32_drawable_finish() which calls cairo_surface_finish(). If we added a call within gdk_pixmap_impl_win32_finalize() as well, then cairo_surface_finish() would get called twice under some circumstances. If we move the call from _gdk_win32_drawable_finish() to gdk_pixmap_impl_win32_finalize(), then it won't get called at all under some other circumstances. I don't _think_ that's what we want, but I could easily be wrong.
Oh, sorry, looks you are right. I just looked at comment 17 and couldn't see that call in your patch so I thought it had been forgotten.
Review of attachment 227002 [details] [review]: ::: ../gtk+-2.24.13/gdk/win32/gdkpixmap-win32.c @@ +117,3 @@ _gdk_win32_drawable_finish (GDK_DRAWABLE (object)); + if (!pixmap_impl->is_foreign) I'd rather put this in the check above so we finish and destroy the pixmap cairo_surface before reaching _gdk_win32_drawable_finish, because we're conceptually using a different cairo_surface, i.e. one allocated in the pixmap constructor, not in gdk_win32_ref_cairo_surface, and thus it should not be freed by _gdk_win32_drawable_finish. For instance, they differ because there is no gdk_win32_cairo_key set on the surface.
And yes, we should leave in the call to _gdk_win32_drawable_finish, its needed for foreign pixmaps which get their cairo_surface allocated by gdk_win32_ref_cairo_surface().
Created attachment 227018 [details] [review] win32: Don't leak the pixmap cairo_surface GdkPixmapWin32 allocates a cairo_surface manually for non-foreign pixmaps, instead of letting GdkDrawableWin32 create on on-demand. However, the pixmap created surface is a strong ref, rather than the weak ref created by gdk_win32_ref_cairo_surface() so we can't rely on _gdk_win32_drawable_finish to actually free it. So, we have to manually free it when we finalize or we leak it.
something like this (this is untested)
I've compiled GTK 2.24.13 with the patch https://bugzilla.gnome.org/attachment.cgi?id=227018, and I've run the proposed test program and the more complex one from https://bugzilla.gnome.org/show_bug.cgi?id=679312. It seems that the leak is solved. I fear that we may still leak some bytes since releasing the cairo surface in gdk_win32_pixmap_finish instead of gdk_win32_drawable_finish cause cairo_surface_set_user_data() not to be called, maybe also a call to: cairo_surface_set_user_data (drawable_impl->cairo_surface, &gdk_win32_cairo_key, NULL, NULL); is needed before cairo_surface_destroy()?
I can confirm a dc leak without freeing user data (gtk 2.24.13 with the patch https://bugzilla.gnome.org/attachment.cgi?id=227018). The leak is very small and you need the program from https://bugzilla.gnome.org/show_bug.cgi?id=679312 running for a few minutes to see it. It's strange since cairo documentation of cairo_surface_set_user_data (http://www.cairographics.org/manual/cairo-cairo-surface-t.html#cairo-surface-set-user-data) says: destroy : a cairo_destroy_func_t which will be called when the surface is destroyed or when new user data is attached using the same key. At least with the cairo version got from the latest GTK 2.24.10 bundle from GTK.org it seems that the callback is called when doing set_user_data -> NULL, but not when destroying the cairo surface. So to prevent the dc leak I've moved again the patch in gdkdrawable-win32.c since gdk_win32_cairo_key is static and not visible in gdkpixmap-win32.c
> I fear that we may still leak some bytes since releasing the cairo surface in > gdk_win32_pixmap_finish instead of gdk_win32_drawable_finish cause > cairo_surface_set_user_data() not to be called, maybe also a call to: I don't understand this. The cairo_surface is allocated in the pixmap constructor, so gdk_win32_ref_cairo_surface() will never allocate it, and thus never call cairo_surface_set_user_data(). There is no need to unset it as it is never set, and doing so will not cause any code to run. However, it might be that we're leaking drawable_impl->hdc in the 16bpp pixmap case, as we're manually creating a Dc there which is not owned by cairo, so it will not be DeleteDC:ed with the surface. Are you using 16bpp?
(In reply to comment #32) > I can confirm a dc leak without freeing user data (gtk 2.24.13 with the patch > https://bugzilla.gnome.org/attachment.cgi?id=227018). > > The leak is very small and you need the program from > https://bugzilla.gnome.org/show_bug.cgi?id=679312 running for a few minutes to > see it. I previously let bug 671538 alive because it covered the GDI leak, even if very similar to this one. So the root causes are the same, and it's a duplicate of this bug?
(In reply to comment #29) > Created an attachment (id=227018) [details] [review] > win32: Don't leak the pixmap cairo_surface > I tested this patch with gtk+-2.24.10 on Windows7 (mingw). I think that the leak is solved. Thanks!
(In reply to comment #33) > However, it might be that we're leaking drawable_impl->hdc in the 16bpp pixmap > case, as we're manually creating a Dc there which is not owned by cairo, so it > will not be DeleteDC:ed with the surface. Are you using 16bpp? I can confirm a memory leak when using 16 BPP. Tested using remote connection to Windows XP, 7 and Server 2008 after applying patch 227018. There's no leak when using 24 BPP or 8 BPP.
Comment on attachment 227018 [details] [review] win32: Don't leak the pixmap cairo_surface Attachment 227018 [details] pushed as bb867e9 - win32: Don't leak the pixmap cairo_surface
Enough people have verified that the patch works, so i pushed it. However, we should also look at the 16bpp leak.
The 16bpp leak seems addressed in bug 671538, so I'm resolving this. Feel free to reopen if that's not the case. Thanks for having fixed that, Alexander.
*** Bug 681934 has been marked as a duplicate of this bug. ***
*** Bug 683246 has been marked as a duplicate of this bug. ***