After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 727452 - Wayland windows with non-trivial alpha get new content drawn over old
Wayland windows with non-trivial alpha get new content drawn over old
Status: RESOLVED OBSOLETE
Product: gtk+
Classification: Platform
Component: Backend: Wayland
3.12.x
Other Linux
: Normal normal
: ---
Assigned To: gtk-bugs
gtk-bugs
Depends on:
Blocks:
 
 
Reported: 2014-04-01 18:27 UTC by Simon McVittie
Modified: 2015-11-17 20:36 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
gdk_window_impl_wayland_begin_paint_region: always use an intermediate buffer (1.96 KB, patch)
2014-04-01 18:28 UTC, Simon McVittie
none Details | Review
[doesn't work] gdk_window_clear_backing_region: clear to rgba(0,0,0,0) before filling (2.16 KB, patch)
2014-04-01 18:30 UTC, Simon McVittie
none Details | Review
gdk_window_impl_wayland_begin_paint_region: clear the buffer for each frame (1.77 KB, patch)
2014-04-02 09:39 UTC, Simon McVittie
rejected Details | Review

Description Simon McVittie 2014-04-01 18:27:11 UTC
GtkWindowImplWayland tries to optimize by doing the Gtk drawing directly
into the same buffer it gives to Wayland, as long as Wayland is not
using that buffer right now.

Normally, this would be fine, because we clear that buffer to the
window's background pattern with gdk_window_clear_backing_region()
before painting it.

However, if you're making a fancy composited shell with semitransparent
bits, the background pattern might itself have an alpha channel. If it
does, at the moment we just paint it over the top of whatever happens
to be left in the buffer from the previous frame, which will shine
through any areas with non-trivial alpha.
Comment 1 Simon McVittie 2014-04-01 18:28:34 UTC
Created attachment 273425 [details] [review]
gdk_window_impl_wayland_begin_paint_region: always use an  intermediate buffer

GtkWindowImplWayland tries to optimize by doing the Gtk drawing directly
into the same buffer it gives to Wayland, as long as Wayland is not
using that buffer right now.

Normally, this would be fine, because we clear that buffer to the
window's background pattern with gdk_window_clear_backing_region()
before painting it.

However, if you're making a fancy composited shell with semitransparent
bits, the background pattern might itself have an alpha channel. If it
does, at the moment we just paint it over the top of whatever happens
to be left in the buffer from the previous frame, which will shine
through any areas with non-trivial alpha.

As a quick hack around this, return TRUE, which means Gtk will
allocate a new, blank (rgba(0,0,0,0)) image surface with
gdk_wayland_window_create_similar_image_surface(), do the Gtk-level
drawing into that, and copy its contents into impl->cairo_surface
when painting has finished.

---

I know this is a hack, but it works, and might point someone towards a better-quality solution.
Comment 2 Simon McVittie 2014-04-01 18:30:06 UTC
Created attachment 273427 [details] [review]
[doesn't work] gdk_window_clear_backing_region: clear to  rgba(0,0,0,0) before filling

I hoped this would make the previous patch unnecessary, but it doesn't seem to work. I always get confused by alpha channels in Cairo; perhaps someone who understands this stuff could salvage it.
Comment 3 Simon McVittie 2014-04-01 18:33:17 UTC
I suppose explicitly memset()ing the entire buffer to rgba(0,0,0,0) in gdk_window_impl_wayland_begin_paint_region() might also do the trick.
Comment 4 Daniel Stone 2014-04-01 21:44:46 UTC
Review of attachment 273427 [details] [review]:

::: gdk/gdkwindow.c
@@ +3064,3 @@
+      cairo_save (cr);
+      cairo_set_source_rgb (cr, 0, 0, 0);
+       * the Wayland backend.

Does CAIRO_OPERATOR_SOURCE work better?
Comment 5 Daniel Stone 2014-04-01 21:55:59 UTC
Actually, all that should be unnecessary - try removing, well, all of it, and just changing the operator to CAIRO_OPERATOR_SOURCE (i.e. memcpy) before the cairo_fill(). No sense blending with undefined content (as the default operator is OVER).
Comment 6 Matthias Clasen 2014-04-02 01:27:37 UTC
a testcase demonstrating the problem would be great
Comment 7 Kristian Høgsberg 2014-04-02 04:29:11 UTC
(In reply to comment #6)
> a testcase demonstrating the problem would be great

What I've seen is that if you switch between focused and non-focused, the entire window gets repainted, which illustrates the problem described above as the drop shadows gets darker and darker.
Comment 8 Simon McVittie 2014-04-02 09:38:04 UTC
(In reply to comment #5)
> Actually, all that should be unnecessary - try removing, well, all of it, and
> just changing the operator to CAIRO_OPERATOR_SOURCE (i.e. memcpy) before the
> cairo_fill().

I feel as though that ought to work, so I tried this patch, but it doesn't seem to work:

--- a/gdk/gdkwindow.c
+++ b/gdk/gdkwindow.c
@@ -3055,6 +3055,12 @@ gdk_window_clear_backing_region (GdkWindow *window,
   cairo_region_intersect (clip, region);
 
   gdk_cairo_region (cr, clip);
+  /* If the background pattern is not fully opaque, the default OVER operator
+   * would leave old content visible through the non-opaque regions. In
+   * practice the X11 backend is unaffected because it uses a new surface
+   * every time, but the Wayland backend is susceptible to this since
+   * it re-uses the same Cairo surface as much as possible (#727452). */
+  cairo_set_operator (cr, CAIRO_OPERATOR_SOURCE);
   cairo_fill (cr);
 
   cairo_destroy (cr);

(In reply to comment #6)
> a testcase demonstrating the problem would be great

Sorry, I'm doing drive-by bugfixing in some code that is not currently public, and I don't know enough about Wayland to construct a more minimal test case. I believe drawing a GtkProgressBar in gtk_progress_bar_pulse() mode, on a transparent toplevel window, would illustrate the bug?

The darkening drop-shadow that Kristian described sounds like it might well be another symptom of the same thing. Kristian, does Attachment #273425 [details] fix that for you?
Comment 9 Simon McVittie 2014-04-02 09:39:33 UTC
Created attachment 273458 [details] [review]
gdk_window_impl_wayland_begin_paint_region: clear the buffer  for each frame

GtkWindowImplWayland tries to optimize by doing the Gtk drawing directly
into the same buffer it gives to Wayland, as long as Wayland is not
using that buffer right now.

Normally, this would be fine, because we clear that buffer to the
window's background pattern with gdk_window_clear_backing_region()
before painting it.

However, if you're making a fancy composited shell with semitransparent
bits, the background pattern might itself have an alpha channel. If it
does, at the moment we just paint it over the top of whatever happens
to be left in the buffer from the previous frame, which will shine
through any areas with non-trivial alpha.

---

This also works, and might be a bit nicer for performance than Attachment #273425 [details].
Comment 10 Matthias Clasen 2014-04-02 13:58:59 UTC
Hmm, I see the shadow problem with weston, but not with mutter-wayland. I guess the buffer handling is slightly different between the two ?
Comment 11 Simon McVittie 2014-04-02 14:07:11 UTC
Our environment has Weston; I'd be somewhat surprised if that makes a difference, but you never know...
Comment 12 Matthias Clasen 2014-04-02 15:32:49 UTC
In fact, weston --use-pixman does not exhibit the blackening shadow issue either.
Comment 13 Benjamin Otte (Company) 2014-04-03 00:43:43 UTC
That's because the behavior is specific to reusing surfaces. And in mutter and pixman-weston, the compositor doesn't release the surface immediately so GDK marks it as busy and falls back to the way we do it in X by using a similar surface.

I believe regular weston releases the buffer pretty much immediately (as its contents are uploaded to a GL texture). So GDK reuses the buffer and because it isn't cleared, the bug happens.

I'm not sure memsetting the whole buffer is correcct for the cases where we only redraw parts of the window. 
Also, if we wonder about performance, CAIRO_OPERATOR_CLEAR'ing its cairo surface should be faster as cairo has optimizations for cleared surfaces.


I think the problem is that we have no definition inside GDK if begin_paint clears the cairo surface or returns a cairo surface with undefined contents.
Comment 14 Simon McVittie 2014-04-03 10:04:02 UTC
(In reply to comment #13)
> That's because the behavior is specific to reusing surfaces. And in mutter and
> pixman-weston, the compositor doesn't release the surface immediately so GDK
> marks it as busy and falls back to the way we do it in X by using a similar
> surface.

That makes sense - so Mutter and pixman-weston are basically in the same situation as when I patched out the fast-path altogether (Attachment #273425 [details]).

If the slow-path is the only thing that gets tested in practice, perhaps it would be better to use that as a short-term solution.

> I'm not sure memsetting the whole buffer is correcct for the cases where we
> only redraw parts of the window. 

Hmm, good point. I did think "it's fine, if anything its semantics are closer to the slow path", but I think that may have been wrong reasoning - the slow path only copies the redrawn region from the similar surface (overwriting the corresponding part of the reusable surface), not the entire similar surface.

> I think the problem is that we have no definition inside GDK if begin_paint
> clears the cairo surface or returns a cairo surface with undefined contents.

The comments, and the implementation in GdkWindow, suggest that it's meant to be OK for it to have undefined contents - so the patch from Comment #8 would be a more correct approach, if it worked (unfortunately it doesn't, and I still can't see why not).
Comment 15 Benjamin Otte (Company) 2014-04-04 12:22:07 UTC
(In reply to comment #14)
> The comments, and the implementation in GdkWindow, suggest that it's meant to
> be OK for it to have undefined contents - so the patch from Comment #8 would be
> a more correct approach, if it worked (unfortunately it doesn't, and I still
> can't see why not).
>
I think this is the approach we should take at least in the stable branch.

But why doesn't this work? Your comment above says you don't use it because you feared it'd be slow.
Comment 16 Simon McVittie 2014-04-04 12:32:51 UTC
(In reply to comment #15)
> (In reply to comment #14)
> > The comments, and the implementation in GdkWindow, suggest that it's meant to
> > be OK for it to have undefined contents - so the patch from Comment #8 would be
> > a more correct approach, if it worked (unfortunately it doesn't, and I still
> > can't see why not).
> >
> I think this is the approach we should take at least in the stable branch.
> 
> But why doesn't this work? Your comment above says you don't use it because you
> feared it'd be slow.

Sorry, I might have been unclear. I've put four patches on this bug:

Attachment #273425 [details] ("#if 0" the fast-path to make it work like the X11 backend) works, but I worried that it would hurt performance by adding a copy into the most common case. On the other hand, it sounds as though in practice there's always a copy on Mutter-Wayland anyway. I think this is the only one so far that both works, and is theoretically correct.

Attachment #273427 [details] doesn't actually seem to prevent the symptom for me, and I haven't been able to work out why not.

The pseudo-patch in the text of Comment #8 is a more refined version of Attachment #273427 [details] in response to Daniel's comments, but also doesn't seem to work. Again, I haven't been able to work out why it doesn't do what I think it should.

Attachment #273458 [details] prevented the symptom when I tried it, but you pointed out that it was wrong for the cases where we only redraw parts of the window, and I agree.
Comment 17 Marco Barisione 2014-04-16 21:01:11 UTC
The mysterious shell where we had this problem is <http://blog.barisione.org/2014-04/maynard/>. We went, as a temporary measure, for Simon's first patch.
Comment 18 Simon McVittie 2014-04-17 10:06:18 UTC
Review of attachment 273458 [details] [review]:

Marking this patch as rejected. Benjamin Otte pointed out that it is incorrect, and I agree with his analysis.
Comment 19 Simon McVittie 2014-04-17 10:07:25 UTC
Comment on attachment 273425 [details] [review]
gdk_window_impl_wayland_begin_paint_region: always use an  intermediate buffer

Un-obsoleting this patch. It's the one we ended up using for Maynard, and it works.
Comment 20 Matthias Clasen 2015-11-17 20:36:01 UTC
I am not seeing any problems with the alpha window example in testgtk with todays gtk, either with mutter or with weston.