GNOME Bugzilla – Bug 695636
_gdk_windowing_create_cairo_surface() may return already finished surfaces
Last modified: 2015-03-08 20:05:56 UTC
In the win32 implementation, _gdk_windowing_create_cairo_surface() will return a new reference to the already available cairo surface in the GdkDrawable, but this surface may already be finished (i.e. in CAIRO_STATUS_SURFACE_FINISHED), and therefore is quite unusable.
Instead, _gdk_windowing_create_cairo_surface() should always create a fresh cairo surface.
Created attachment 238586 [details] [review]
Patch for the issue.
Created attachment 238590 [details] [review]
Same patch, without g_message() :)
I don't think it should clear the previous surface, no other backend
is doing that. And we should probably avoid assertions in libraries.
My first approach was actually to avoid clearing the previous surface, but I ended up doing it in order to properly balance out the HDC count (impl->hdc_count), which is win32-specific.
Basically, when you create a new surface we need to increase the HDC count, and with cairo_surface_set_user_data() we add a callback to get the count decreased when the GdkDrawing no longer makes use of the surface. If we just created a new surface and returned it without clearing the one in the GdkDrawing, we end up with the HDC count acquired for the new surface never released, which ends up assert-ing in _gdk_win32_drawable_finish(). That's why I added the logic to clear up the previous surface and keeping the new one internally instead of just returning it.
If we cannot clear the previous surface, we may end up needing to keep track of all created surfaces in the GdkDrawing, so that we balance the HDC count properly. I'll think about the issue a bit further, anyway.
I see, but I think we definitely can't just kill the drawable's surface.
Created attachment 238966 [details] [review]
So, this new patch avoids to clear up the internal surface. Instead, we just create a new surface and return it.
In order to balance out the HDC count, I added a new user_data key which will just release the acquired DC. Still, for the case where the cairo surface is kept in impl->cairo_surface, we keep the previous user_data key to clear that internal pointer.
I also removed the assert() checking that impl->hdc_count is zero when we call _gdk_win32_drawable_finish(), as there may still be surfaces created with _gdk_windowing_create_cairo_surface() out there which will release the DC count themselves when they're destroyed.
Created attachment 238993 [details]
monodevelop logs when triggering the issue
Some additional info regarding the original place where this issue was found...
The bug is found in XamarinStudio/Monodevelop under win32 (so GTK#). To reproduce it, just open a project in monodevelop, click on the 'Solution' dock in the left side of the window to open it, hide it, and when back in the main window the tabs in the custom notebook widget stop working.
On mouse move events, these tabs will create a cairo context just to see if the pointer is within the area of each tab, and the root problem is that when these contexts are created, the underlying cairo surface of the window is already finished. Showing and hiding the 'Solution' dock triggers the surface finish (see attached logs), mainly because _gdk_window_destroy_hierarchy() calls gdk_window_drop_cairo_surface() and this last one calls cairo_surface_finish().
Created attachment 239273 [details] [review]
Tester to reproduce the issue
Owen asked me a tester to reproduce the issue and I came up with this one. The tester, attached, basically does this:
* Create a GtkEventBox
* Get a cairo context (ctx#1) from the GtkEventBox
* Destroy the GtkEventBox
* Create a new GtkEventBox
* Get a cairo context (ctx#2) from the GtkEventBox
* This context is already in SURFACE_FINISHED state
* Destroy both cairo contexts ctx#2 and ctx#1
From now on any the context that is retrieved from the GtkEventBox (e.g. through clicking on the label in the app window) will also be in SURFACE_FINISHED state.
So it seems an issue with destroying the widget from where we got the first cairo context while a reference to the context is still valid, even if we destroy that context afterwards.
The previous patch I suggested seems to fix the issue, btw.
I run the same reproducer under gtk+3/win32 and there is no issue. So the problem really seems only to be in gtk-2-24 *and* the win32 backend.
Comment on attachment 238966 [details] [review]
I've reviewed and tested this patch extensively, and it does look completely right to me, it does bring the same surface lifetime semantics on win32 than on x11.
Before the patch:
- _gdk_windowing_create_cairo_surface() returns a new, Xlib surface, attaches no data to it, so destroying it doesn't trigger any GDK code.
- ref_cairo_surface() returns a ref to impl->cairo_surface, if that doesn't exist, it is created through the former and surface data is set so impl->cairo_surface is reset back to NULL when the surface is destroyed
- _gdk_windowing_create_cairo_surface() merely calls ref_cairo_surface(), which,
- ref_cairo_surface() creates impl->cairo_surface, in a similar way to X11
So the bug really is that both calls return impl->cairo_surface, even on callers that expect a surface of their own, and destroying those surfaces will then finish/erase the drawable surface at odd times.
With aleksander's patch the call order is inverted, just like in X11, and HDC refcount handling is separated from other surface data. It looks like the current code attempts to unify HDC refcount on both calls, or tries really hard to keep HDC refcount on control of GDK, the patch may help a bit less on the second (eg. leaking windowing_create_cairo_surface() surfaces), but returning impl->cairo_surface anytime sounds like a misbehavior
(In reply to comment #10)
> second (eg. leaking windowing_create_cairo_surface() surfaces), but returning
^^ "if you leak", to avoid confusions, this patch itself doesn't leak anything
Thanks for the testing, Carlos. I'm not really in a good position to ack win32 patches, but if this works, is tested and fixes a real issue, we should get it in.
Created attachment 268127 [details]
Screenshot with and without this patch on windows xp
The use of this patch leads to various visual artifacts on win32 platform. GTK version 2.24.22 with removed patch (left side of screenshot) does not have this problem, while the original version 2.24.22 (right side) has.
Builded with MinGW GCC 4.7.2 and
Pixman lib - 0.32.4,
Cairo lib - 1.12.16,
Gdk-pixbuf lib - 2.26.5,
Pango lib - 1.30.1,
Atk lib - 2.4.0.
This patch causes rendering issues for me as well.
Apparent knock-on problems from this fix raised separately as bug #744169.
I am reopening this since it is reverted in msys2 and reverted on the msvc binaries distribution. TingPing can you please give some more info about the rendering problems here?
(In reply to Ignacio Casal Quinteiro (nacho) from comment #16)
> TingPing can you please give some more info about the rendering problems here?
It has been a while but if you build HexChat with this commit the treeview
doesn't full render or something like that.
Ah the picture in comment #13 is what I remember, the text in the treeview did not render or had glitching rendering.
Bug #744169 links to the problems we had in Freeciv, including screenshots. We've been stuck on Gtk+ 2.24.20 due to this, but now it's reverted we can consider using the latest version (2.24.26).
I believe bug #741060 is where this fix was reverted, and gives rationale.
Just for reference this seems to be gtk2 specific since in gtk3 there is no such patch: https://github.com/Alexpux/MINGW-packages/tree/master/mingw-w64-gtk3 and I haven't experienced those glitches.
TingPing, can you double check 2.24.26 fixes this problem for you too?
Hey! Sorry my original patch made so much noise :/ Looks like alexl didn't revert it, but pushed a fix on top of it? https://mail.gnome.org/archives/commits-list/2014-December/msg01151.html
Hey, yeah it seems alex made a fix. I asked the hexchat guys to test the new tarball so let's see what they reply. I've also asked msys2 maintainer to update it so once it is updated I can also update it. I'll leave this bug opened until we get a confirmation.
Closing this since it was indeed fixed by alex.