Bug 739178 – background corrupted on resume from suspend with nvidia blob driver

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 739178 - background corrupted on resume from suspend with nvidia blob driver


Summary:	background corrupted on resume from suspend with nvidia blob driver


Status:	RESOLVED FIXED

Product:	gnome-shell
Classification:	Core
Component:	drivers
Version:	3.14.x
Hardware:	Other Linux

Importance:	Normal major
Target Milestone:	---
Assigned To:	gnome-shell-maint
QA Contact:	gnome-shell-maint

URL:
Whiteboard:

Duplicates:	739203 740299 741457 744889 748785 (view as bug list)
Depends on:
Blocks:

Reported:	2014-10-25 18:13 UTC by taniglu.budikov
Modified:	2019-05-17 07:49 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
corrupted gnome shell background (1.55 MB, image/png) 2014-10-25 18:13 UTC, taniglu.budikov		Details
meta-background: Add a function to refresh all background instances (2.10 KB, patch) 2015-03-19 14:54 UTC, Rui Matos	committed	Details \| Review
Refresh all background instances after suspend if needed (3.29 KB, patch) 2015-03-19 14:57 UTC, Rui Matos	none	Details \| Review
Refresh all background instances after suspend if needed (3.42 KB, patch) 2015-03-19 17:02 UTC, Rui Matos	none	Details \| Review
Refresh all background instances after suspend if needed (3.44 KB, patch) 2015-03-20 09:43 UTC, Rui Matos	committed	Details \| Review
Refresh background on resume (1.16 KB, patch) 2015-03-20 10:32 UTC, Alex Hultman	none	Details \| Review
cogl-winsys-glx: Allow creating a GL context with robustness enabled (4.23 KB, patch) 2016-06-03 16:39 UTC, Rui Matos	none	Details \| Review
cogl-context: Add a cogl_get_graphics_reset_status API (2.88 KB, patch) 2016-06-03 16:39 UTC, Rui Matos	none	Details \| Review
cogl: Ignore GL_CONTEXT_LOST when checking for GL errors (19.01 KB, patch) 2016-06-03 16:40 UTC, Rui Matos	none	Details \| Review
compositor: Handle GL context reset errors (4.94 KB, patch) 2016-06-03 16:42 UTC, Rui Matos	none	Details \| Review
MetaBackground: invalidate contents on loss of graphics context (1.61 KB, patch) 2016-06-03 16:42 UTC, Rui Matos	none	Details \| Review
MetaSurfaceActorX11: invalidate the stex on loss of graphics context (1.69 KB, patch) 2016-06-03 16:43 UTC, Rui Matos	none	Details \| Review
Revert "meta-background: Add a function to refresh all background instances" (2.21 KB, patch) 2016-06-03 16:43 UTC, Rui Matos	rejected	Details \| Review
Revert "Refresh all background instances after suspend if needed" (3.51 KB, patch) 2016-06-03 16:44 UTC, Rui Matos	rejected	Details \| Review
main: Reload theme on loss of graphics context (820 bytes, patch) 2016-06-03 16:44 UTC, Rui Matos	none	Details \| Review
cogl-winsys-glx: Allow creating a GL context with robustness enabled (4.97 KB, patch) 2016-06-14 13:34 UTC, Rui Matos	none	Details \| Review
cogl-context: Add a cogl_get_graphics_reset_status API (2.88 KB, patch) 2016-06-14 13:34 UTC, Rui Matos	none	Details \| Review
cogl: Ignore GL_CONTEXT_LOST when checking for GL errors (19.01 KB, patch) 2016-06-14 13:34 UTC, Rui Matos	none	Details \| Review
restart: Make meta_restart() work without a message (925 bytes, patch) 2016-06-14 13:34 UTC, Rui Matos	none	Details \| Review
compositor: Handle GL context reset errors (5.54 KB, patch) 2016-06-14 13:36 UTC, Rui Matos	none	Details \| Review
MetaBackground: invalidate contents on loss of graphics context (1.61 KB, patch) 2016-06-14 13:36 UTC, Rui Matos	none	Details \| Review
MetaSurfaceActorX11: invalidate the stex on loss of graphics context (1.69 KB, patch) 2016-06-14 13:36 UTC, Rui Matos	none	Details \| Review
screenShield: Stop using an offscreen buffer for the arrow actor (1.03 KB, patch) 2016-06-14 15:17 UTC, Rui Matos	none	Details \| Review
screenShield: Make the arrow redraw on style-changed (858 bytes, patch) 2016-06-14 16:40 UTC, Rui Matos	none	Details \| Review
screenShield: Chain up Arrow's vfuncs (1.11 KB, patch) 2016-06-15 13:44 UTC, Rui Matos	none	Details \| Review
cogl-winsys-glx: Add support for NV_robustness_video_memory_purge (5.96 KB, patch) 2016-06-21 19:16 UTC, Rui Matos	none	Details \| Review
cogl-context: Add a cogl_get_graphics_reset_status API (4.03 KB, patch) 2016-06-21 19:17 UTC, Rui Matos	committed	Details \| Review
cogl: Ignore GL_CONTEXT_LOST when checking for GL errors (19.01 KB, patch) 2016-06-21 19:18 UTC, Rui Matos	committed	Details \| Review
clutter/x11: Add API to request video memory purges to be reported (2.84 KB, patch) 2016-06-21 19:19 UTC, Rui Matos	committed	Details \| Review
restart: Make meta_restart() work without a message (1.24 KB, patch) 2016-06-21 19:20 UTC, Rui Matos	committed	Details \| Review
compositor: Handle GL video memory purged errors (5.11 KB, patch) 2016-06-21 19:22 UTC, Rui Matos	committed	Details \| Review
MetaBackground: invalidate contents on video memory purged errors (1.58 KB, patch) 2016-06-21 19:23 UTC, Rui Matos	committed	Details \| Review
MetaSurfaceActorX11: invalidate the stex on video memory purged errors (1.81 KB, patch) 2016-06-21 19:24 UTC, Rui Matos	committed	Details \| Review
main: Reload theme on video memory purge errors (832 bytes, patch) 2016-06-21 19:26 UTC, Rui Matos	committed	Details \| Review
screenShield: Stop using an offscreen buffer for the arrow actor (1.03 KB, patch) 2016-06-21 19:27 UTC, Rui Matos	committed	Details \| Review
screenShield: Chain up Arrow's style_changed vfunc (802 bytes, patch) 2016-06-21 19:29 UTC, Rui Matos	committed	Details \| Review
cogl-winsys-glx: Add support for NV_robustness_video_memory_purge (8.13 KB, patch) 2016-06-27 18:24 UTC, Rui Matos	committed	Details \| Review

Description taniglu.budikov 2014-10-25 18:13:33 UTC

Created attachment 289315 [details]
corrupted gnome shell background

on resume from suspend gnome-shell background and lock screen background are corrupted with random pixels.

cf attachment

this bug is  open on archlinux bug tracker and debian bug tracker

https://bugs.archlinux.org/task/42511
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=765436

it seems to be nvidia proprietary driver related

Comment 1 Rui Matos 2014-10-27 13:24:33 UTC

*** Bug 739203 has been marked as a duplicate of this bug. ***

Comment 2 taniglu.budikov 2014-10-27 20:42:40 UTC

Sorry, i forgot to mention :

- Archlinux with kernel 3.17.1
- Xorg server 1.16.1
- nvidia binary driver 340.46

nothing suspicious in logs and as said in the other report, atl-r + r restore the backgrounds.

Comment 3 taniglu.budikov 2014-10-29 20:46:52 UTC

one more update. it seams that the background picture is literaly erased from memory and not restored.

by using the nvidia-smi utility

1- before sleep : memory usage 115MiB /  1023MiB
2- after sleep : 45MiB /  1023MiB
3- after sleep + reload shell : 97MiB /  1023MiB

i've done this three steps a couple of time and the result is always the same a decrease of video memory consumption.

Comment 4 Bastien Nocera 2014-11-07 10:50:47 UTC

Any chance for you to reproduce this with the nouveau driver?

Comment 5 taniglu.budikov 2014-11-07 11:38:43 UTC

i will try.

one workaround for the moment is having an animated wallpaper (default gnome one) so when a wakeup occured, the wallpapper will change if the wallpapper delay is reached :-)

Comment 6 taniglu.budikov 2014-11-07 12:47:36 UTC

ok, so i installed nouveau and the background is not corrupted after wakeup. 
it's only with nvidia binary driver.

i will give nouveau a try.

Comment 7 taniglu.budikov 2014-11-07 16:37:43 UTC

conclusion, after three or four sleep/wakeup since i installed the nouveau driver i don't have any backgrounds corruption (in screen shield and gnome-shell).

plus things is that wakeup is almost instantaneous with "nouveau" and performances for standard usage are decent.

Comment 8 Bastien Nocera 2014-11-07 16:59:57 UTC

Closing this as not a gnome bug then. Thank you for testing.

Comment 9 Jasper St. Pierre (not reading bugmail) 2014-11-07 18:49:15 UTC

We usually leave bugs in the driver open until the underlying driver bug is fixed and released.

Comment 10 Daniel Korostil 2014-11-17 19:07:47 UTC

Is it filed upstream? It's a common bug among nvidia blob users.

Comment 11 Rui Matos 2014-11-18 09:39:50 UTC

*** Bug 740299 has been marked as a duplicate of this bug. ***

Comment 12 taniglu.budikov 2014-11-18 19:28:17 UTC

here is the same question on nvidia forum

https://devtalk.nvidia.com/default/topic/787748/linux/-nvidia340xx-archlinux64-gnome3-14-the-background-of-desktop-and-lockscreen-mess-after-resume-from-/

Comment 13 Mark Pariente 2014-11-30 21:05:01 UTC

The nvidia forum post by sandipt mentions:

"Our current software architecture doesn't preserve the contents of FBOs across modeswitches or power events. As a result, our implementation relies on applications re-rendering to their FBOs every frame. This is a fundamental limitation of our design that will be fixed in the future."

Is is possible for gnome-shell to somehow work around this, by redrawing the background on resume from suspend?

Also, this code path was working fine on 3.12, what changed on the gnome-shell side?

Comment 14 Rui Matos 2014-12-16 16:19:46 UTC

*** Bug 741457 has been marked as a duplicate of this bug. ***

Comment 15 Hugo Poi 2014-12-23 09:20:24 UTC

I have the same probleme under Archlinux with AMD catalyst proprietary driver.

And same probleme here https://bbs.archlinux.org/viewtopic.php?pid=1486134

Comment 16 José J 2015-02-03 09:28:40 UTC

Same here, under SolydX with Gnome, after updating from Gnome Shell 3.12 to Gnome Shell 3.14.

My graphic card is Nvidia, with propietary drivers.

Comment 17 Dominique Leuenberger 2015-02-03 11:06:07 UTC

I ust had a report of the same case for openSUSE, where NVidia also chimed in

(see https://bugzilla.suse.com/show_bug.cgi?id=914149#c18 )

### quote ###
My best guess would be that the desktop background picture is being stored as an OpenGL Frame Buffer Object (FBO), and that whatever program is responsible for drawing it is counting on the FBO's contents to be preserved across the suspend/resume cycle.

FBO contents aren't guaranteed to persist across modeswitch events (suspend/resume will incur a VT switch away from X and back to it, and the VT switches in turn will incur modesets). If an FBO is being reused across a modeswitch, its contents need to be refreshed.
### end quote ###

Comment 18 Alex Hultman 2015-02-18 17:57:41 UTC

I'm having this issue on Fedora 21 with NVIDIA driver, two monitors. From what I have read, it's clearly a GNOME bug.

https://devtalk.nvidia.com/default/topic/795854/linux/background-gets-corrupted-after-suspend-on-gnome-3-14-346-22-gtx-750-ti-/

Comment 19 Jasper St. Pierre (not reading bugmail) 2015-02-18 19:01:39 UTC

The whole reason we use an FBO is so that we don't have to repaint it on every frame. That's considerably expensive. We weren't expecting drivers to lose FBO contents at arbitrary points without a single way of letting us know that that was the case.

If the NVIDIA driver team can give us a workaround other than "paint the FBO every frame", then I'd be fine to implement it.

Comment 20 Alex Hultman 2015-02-18 19:25:30 UTC

Wouldn't it be a simple fix to just repaint the FBO after resume?

Comment 21 Jasper St. Pierre (not reading bugmail) 2015-02-18 19:29:45 UTC

Well, mode switches, but we don't know when that happens.

Comment 22 region51 2015-02-20 16:50:06 UTC

I have the same probleme under :

-Distro: Fedora release 21 (Twenty One)
-Display Server: Fedora X.org 116.3
-Kernel: 3.18.7-200.fc21.x86_64
-Desktop: Gnome 3.14.3
-GLX Renderer: GeForce GT 330/PCIe/SSE2 GLX Version: 3.3.0 NVIDIA 340.76
atl-r + r restore the backgrounds. Что за беда?

Comment 23 Rui Matos 2015-02-21 16:06:35 UTC

*** Bug 744889 has been marked as a duplicate of this bug. ***

Comment 24 mikey 2015-03-18 08:16:05 UTC

Just to confirm that I have been experiencing the same issue with GeForce GTX 660 Ti, driver version 346.47 and Fedora 21.

I have been experiencing this issue for some time and the only time I'm aware of having seen it is after a resume. Could I suggest building in a repaint of the buffer on resume as it is likely to resolve 99% of the issues people are experiencing.

Comment 25 Alex Hultman 2015-03-18 09:36:16 UTC

(In reply to mikey from comment #24)
> Just to confirm that I have been experiencing the same issue with GeForce
> GTX 660 Ti, driver version 346.47 and Fedora 21.
> 
> I have been experiencing this issue for some time and the only time I'm
> aware of having seen it is after a resume. Could I suggest building in a
> repaint of the buffer on resume as it is likely to resolve 99% of the issues
> people are experiencing.

I'm in favor of this. It's a hack but a good one. My background is so depressing right now. Don't go all Bug 650371 on this one, please - it's clearly broken.

Comment 26 Bastien Nocera 2015-03-18 09:41:29 UTC

(In reply to Alex Hultman from comment #25)
> (In reply to mikey from comment #24)
> > Just to confirm that I have been experiencing the same issue with GeForce
> > GTX 660 Ti, driver version 346.47 and Fedora 21.
> > 
> > I have been experiencing this issue for some time and the only time I'm
> > aware of having seen it is after a resume. Could I suggest building in a
> > repaint of the buffer on resume as it is likely to resolve 99% of the issues
> > people are experiencing.
> 
> I'm in favor of this.

There's no polls here.

> It's a hack but a good one.

It's also not possible. See comment 21.

> My background is so
> depressing right now. Don't go all Bug 650371 on this one, please - it's
> clearly broken.

It's a bug in the NVidia driver. I would suggest you contact them and point them here instead of trying to get gnome-shell developers to implement bug fixes which really are work-arounds.

Comment 27 mck 2015-03-18 10:08:28 UTC

> It's a bug in the NVidia driver.

As a suffering user that sentiment is discouraging and lowers my respect for the gnome community, despite knowing it holds truth.

As far as i've understood it, the nvidia driver has always behaved that way, and it is gnome that made the change that broke things.

I certainly appreciate all efforts to report the bug upstream to nvidia and push for them to fix it, but any temporary hack to address the worse aspect of the bug: refresh after resume; i'm betting would be hugely appreciated by all effected gnome users.

Even a simple recipe that users can add to a resume event to force the refresh would be appreciated.

Comment 28 Bastien Nocera 2015-03-18 10:12:52 UTC

(In reply to Mick Wever from comment #27)
> As far as i've understood it, the nvidia driver has always behaved that way,
> and it is gnome that made the change that broke things.

There are some contracts drawn up when you advertise particular OpenGL features. They're the only ones that don't implement them properly.

> I certainly appreciate all efforts to report the bug upstream to nvidia and
> push for them to fix it, but any temporary hack to address the worse aspect
> of the bug:

We're waiting on NVidia to tell us what that work-around should be (comment 19).

> refresh after resume; i'm betting would be hugely appreciated by
> all effected gnome users.

Refresh after resume still isn't possible to detect from the graphics stack (see comment 21, again).

> Even a simple recipe that users can add to a resume event to force the
> refresh would be appreciated.

Use the nouveau drivers, they don't have that problem (comment 5).

Comment 29 mck 2015-03-18 10:39:03 UTC

> Even a simple recipe that users can add to a resume event to force the
> refresh would be appreciated.

Would a custom hook in systemd be a possibility for users to add themselves?
 ref https://wiki.archlinux.org/index.php/Power_management#Hooks_in_.2Fusr.2Flib.2Fsystemd.2Fsystem-sleep

Comment 30 Alex Hultman 2015-03-18 13:01:19 UTC

(In reply to Bastien Nocera from comment #28)
> (In reply to Mick Wever from comment #27)
> > As far as i've understood it, the nvidia driver has always behaved that way,
> > and it is gnome that made the change that broke things.
> 
> There are some contracts drawn up when you advertise particular OpenGL
> features. They're the only ones that don't implement them properly.
> 
> > I certainly appreciate all efforts to report the bug upstream to nvidia and
> > push for them to fix it, but any temporary hack to address the worse aspect
> > of the bug:
> 
> We're waiting on NVidia to tell us what that work-around should be (comment
> 19).
> 
> > refresh after resume; i'm betting would be hugely appreciated by
> > all effected gnome users.
> 
> Refresh after resume still isn't possible to detect from the graphics stack
> (see comment 21, again).
> 
> > Even a simple recipe that users can add to a resume event to force the
> > refresh would be appreciated.
> 
> Use the nouveau drivers, they don't have that problem (comment 5).

I posted a link with a quote from NVIDIA's Aaron Plattner (comment 18). In case you haven't read it:

This is a known interaction problem between the way GNOME stores its backgrounds and the fact that FBO contents are not preserved across suspend / resume cycles. In general, applications should avoid relying on FBO contents being preserved across mode switches or VT switches.

----

So the workaround is already out there, I have found similar answers from NVIDIA in other threads. Don't rely on FBO contents being preserved across mode switches.

And no - nouveau is not an option, at all. GNOME 3.12 worked fine.

Comment 31 Jasper St. Pierre (not reading bugmail) 2015-03-18 16:13:33 UTC

(In reply to Alex Hultman from comment #30)
> NVIDIA in other threads. Don't rely on FBO contents being preserved across
> mode switches.

I'd love to do that, but I don't know when modeswitches happen. If NVIDIA gives me a way to detect when modeswitches happen, even if it's super-specific to their architecture and driver, I will be more than delighted to write some code to repaint the FBO as a workaround.

Keep in mind that this is a blatant violation of the OpenGL specification, who claim that FBOs should survive with valid contents for the duration of the FBO. But I understand that writing OpenGL drivers is legitimately hard, and I have a good understanding of why this happens in their architecture. I'd be more than happy to add a hack workaround once I have a way to do it.

If there's anything I can do to help this out from the GNOME side, please let me know.

> And no - nouveau is not an option, at all. GNOME 3.12 worked fine.

GNOME 3.12 had major performance issues with certain background setups, which is why we went to using an FBO to store precached background pixels.

Comment 32 Josselin Mouette 2015-03-18 16:31:25 UTC

Thanks Jasper for all the explanations. I’ll try to get in touch with nVidia through our hardware support.

Comment 33 Rui Matos 2015-03-19 14:54:50 UTC

Created attachment 299832 [details] [review]
meta-background: Add a function to refresh all background instances

We need to reload the FBOs under some circumstances, this adds a way
to easily do so.

Comment 34 Rui Matos 2015-03-19 14:57:16 UTC

Created attachment 299833 [details] [review]
Refresh all background instances after suspend if needed

NVIDIA drivers don't preserve FBO contents across suspend / resume
cycles which results in broken backgrounds. We can work around that by
forcing a refresh when coming out of suspend.
--

There, this hack works fine in my testing. Also, despite what Aaron
Plattner says, IME VT switching doesn't wipe the background and mode
switches are taken care of anyway since we already reload the FBOs on
monitors-changed.

Comment 35 Jasper St. Pierre (not reading bugmail) 2015-03-19 15:38:17 UTC

Review of attachment 299832 [details] [review]:

Sure.

Comment 36 Jasper St. Pierre (not reading bugmail) 2015-03-19 15:39:21 UTC

Review of attachment 299833 [details] [review]:

OK, I figured it was also on VT switches, which is why I hesitated a bit more. I don't have an NVIDIA machine to actually test with myself, but this looks fine from a blind review.

I would add a comment above the hack with a link to this bug or the NVIDIA forums or something to let people know why we're doing this.

Comment 37 Rui Matos 2015-03-19 17:02:34 UTC

Created attachment 299853 [details] [review]
Refresh all background instances after suspend if needed

--

Ok, added a reference to this bug in a comment to make it clear what's
going on.

So, how do we want to deal with this? I can ask for a code freeze
break if it's deemed worth it and safe enough, otherwise it can go
into 3.16.1.

This also applies cleanly to the 3.14 branches, so I'll push it there
as well if I get an ack.

Comment 38 Jasper St. Pierre (not reading bugmail) 2015-03-19 18:21:32 UTC

Review of attachment 299853 [details] [review]:

OK. Let's put this in 3.16.1.

Comment 39 Florian Müllner 2015-03-19 20:02:26 UTC

(In reply to Rui Matos from comment #37)
> This also applies cleanly to the 3.14 branches, so I'll push it there
> as well if I get an ack.

You don't need an ack for 3.14 - as some stuff has accumulated on the stable branch lately, I'm planning to do a final release these days, so please push it there before 3.16.1.

Comment 40 Debarshi Ray 2015-03-20 06:12:06 UTC

Review of attachment 299853 [details] [review]:

::: src/shell-util.c
@@ +387,3 @@
+    return FALSE;
+
+  if (g_strcmp0 (get_gl_vendor (), "NVIDIA"))

Aren't we missing a '== 0' here?

Comment 41 Rui Matos 2015-03-20 09:43:25 UTC

Created attachment 299931 [details] [review]
Refresh all background instances after suspend if needed

--

(In reply to Debarshi Ray from comment #40)
> Aren't we missing a '== 0' here?

strcmp, The Deceiver, strikes again. And of course, this means that
the vendor string was wrong too, is should be "NVIDIA Corporation".

Thanks for catching it!

Comment 42 Alex Hultman 2015-03-20 10:16:52 UTC

Wouldn't it be better to always redraw the FBO on resumes, regardless if we are using NVIDIA or not. String comparison is maybe a little too specific and haxxy. I mean - a redraw per resume isn't that much of an overhead.

Comment 43 Alex Hultman 2015-03-20 10:20:51 UTC

That makes the patch a lot simpler too:

14	const LoginManager = imports.misc.loginManager;

253	        // NVIDIA drivers don't preserve FBO contents across
254	        // suspend/resume, see
255	        // https://bugzilla.gnome.org/show_bug.cgi?id=739178
257	        LoginManager.getLoginManager().connect('prepare-for-sleep',
258	                                                   function(lm, suspending) {
259	                                                       if (suspending)
260	                                                           return;
261	                                                       Meta.Background.refresh_all();
262	                                                   });

Comment 44 Alex Hultman 2015-03-20 10:32:42 UTC

Created attachment 299935 [details] [review]
Refresh background on resume

Comment 45 Rui Matos 2015-03-20 10:40:08 UTC

(In reply to Alex Hultman from comment #42)
> Wouldn't it be better to always redraw the FBO on resumes, regardless if we
> are using NVIDIA or not.

Why do more work than we need to?

> String comparison is maybe a little too specific
> and haxxy.

It's what everyone does, including game engines, etc.

Comment 46 Alex Hultman 2015-03-20 12:17:52 UTC

I'm not arguing. Your patch will be much welcomed to my workstation.

But no - string comparison is not whats done in game engines or what "everyone" does (I'm most certainly not that kind of "everyone"). Anyhow - you know more GNOME than I do and this patch of yours works so lets just commit it and be happy.

Comment 47 Ignacio Casal Quinteiro (nacho) 2015-03-20 12:22:08 UTC

afaik when it comes to match driver vendors, that is done with string comparisons in the opengl world...

Comment 48 Rui Matos 2015-03-23 17:39:44 UTC

Will push to master after 3.16 is out.

   46d57bf..6b3be51  gnome-3-14 -> gnome-3-14

Attachment 299832 [details] pushed as 6b3be51 - meta-background: Add a function to refresh all background instances

Comment 49 Rui Matos 2015-03-23 17:40:49 UTC

4b0cbaf..d19e78a  gnome-3-14 -> gnome-3-14

Attachment 299931 [details] pushed as d19e78a - Refresh all background instances after suspend if needed

Comment 50 Rui Matos 2015-03-25 10:49:41 UTC

Comment on attachment 299832 [details] [review]
meta-background: Add a function to refresh all background instances

Attachment 299832 [details] pushed as 7c5fe42 - meta-background: Add a function to refresh all background instances

Comment 51 Rui Matos 2015-03-25 10:50:01 UTC

Attachment 299931 [details] pushed as c3bf4a3 - Refresh all background instances after suspend if needed

Comment 52 region51 2015-04-04 16:41:44 UTC

My problem is solved
[maxim@localhost ~]$ inxi -S -G
System:    Host: localhost.localdomain Kernel: 3.19.3-200.fc21.x86_64 x86_64 (64 bit) Desktop: Gnome 3.14.4
           Distro: Fedora release 21 (Twenty One)
Graphics:  Card: NVIDIA G92 [GeForce GT 330]
           Display Server: Fedora X.org 116.3 driver: N/A Resolution: 1280x1024@60.02hz
           GLX Renderer: GeForce GT 330/PCIe/SSE2 GLX Version: 3.3.0 NVIDIA 340.76

Comment 53 John Chadwick 2015-04-06 12:53:30 UTC

The patch for 3.14.4 does fix the problem for images as wallpapers, but for colors as wallpapers I'm still getting garbage. Just thought I'd throw that in here, I haven't looked into why or anything.

Comment 54 mck 2015-04-13 07:29:26 UTC

works a charm. (also on an integrated 'Intel HD Graphics 5500', from which i suffered the same problem (which doesn't match the discussions above))

Comment 55 Rui Matos 2015-04-14 17:50:41 UTC

(In reply to John Chadwick from comment #53)
> The patch for 3.14.4 does fix the problem for images as wallpapers, but for
> colors as wallpapers I'm still getting garbage. Just thought I'd throw that
> in here, I haven't looked into why or anything.

Commenting in closed bugs isn't a good way to attract attention.

Anyway, I couldn't reproduce this. How are you setting the background color? Which GPU and driver version are you using?

Comment 56 Rui Matos 2015-05-12 11:17:56 UTC

*** Bug 748785 has been marked as a duplicate of this bug. ***

Comment 57 nicolasvila 2015-05-23 08:40:43 UTC

Can someone give instructions to apply the workaround patches and rebuild the gnome packages. Thanks.

Comment 58 Hussam Al-Tayeb 2015-07-19 08:38:34 UTC

This appears to not work if I have 'systemctl hibernate' scheduled from a /etc/cron.d/hibernate-at-4am script. I still see the desktop corruption.


However, if I alt-f2 then 'systemctl hibernate', I see no corruption.

Comment 59 nicolasvila 2015-07-21 11:02:16 UTC

I upgraded my Debian from Wheezy to Jessie and the problem has gone with
the upgraded Gnome. Not sure the fix will be backported

Comment 60 nicolasvila 2015-07-21 11:04:32 UTC

For your information, here are the Gnome Shell versions for Debian:
Debian Wheezy : 3.4.2 (bug)
Debian Jessie : 3.14.2 (fixed)

Comment 61 Julian Sikorski 2015-07-24 06:19:40 UTC

Did the fix make it to 3.16.3 as well? I am seeing this again in Fedora 22.

Comment 62 Florian Müllner 2015-07-24 07:37:56 UTC

(In reply to Julian Sikorski from comment #61)
> Did the fix make it to 3.16.3 as well?

Yes, 3.16.1 to be exact.

Comment 63 Hussam Al-Tayeb 2015-07-24 07:55:32 UTC

(In reply to Florian Müllner from comment #62)
> (In reply to Julian Sikorski from comment #61)
> > Did the fix make it to 3.16.3 as well?
> 
> Yes, 3.16.1 to be exact.

Then it is still broken. I have to alt+f2 -> r
or set a new background wallpaper after resuming from hibernate.

Comment 64 Rui Matos 2015-07-24 13:31:02 UTC

(In reply to Hussam Al-Tayeb from comment #63)
> Then it is still broken. I have to alt+f2 -> r
> or set a new background wallpaper after resuming from hibernate.

Does it work if instead of 'systemctl hibernate' you call the logind dbus interface? Something like:

gdbus call --system --dest org.freedesktop.login1 --object-path /org/freedesktop/login1 --method org.freedesktop.login1.Manager.Hibernate false

Comment 65 Hussam Al-Tayeb 2015-07-24 14:41:56 UTC

Yes. That worked. The background looked fine. The up arrow in the
lockscreen looked fuzzy but after unlocking, the background looked
fine.

Is this way of hibernating something I can use through a cron job in
(/etc/cron.d/) without requiring my user is logged on?

Comment 66 Rui Matos 2015-07-24 15:01:32 UTC

(In reply to Hussam Al-Tayeb from comment #65)
> Is this way of hibernating something I can use through a cron job in
> (/etc/cron.d/) without requiring my user is logged on?

I'm actually not sure. Try it?

Anyway, at this point, we're getting offtopic. This is how systemd works. The API we use to know that we're resuming from suspend/hibernate is provided by logind and its docs[1] say:

"Note that this will only be sent out for suspend/resume cycles done via logind, i.e. generally only for high-level user-induced suspend cycles, and not automatic, low-level kernel induced ones which might exist on certain devices with more aggressive power management."

I don't know of any lower level API that we could use to know when the machine is resuming.

[1] http://www.freedesktop.org/wiki/Software/systemd/inhibit/

Comment 67 Julian Sikorski 2015-07-25 07:46:53 UTC

(In reply to Florian Müllner from comment #62)
> (In reply to Julian Sikorski from comment #61)
> > Did the fix make it to 3.16.3 as well?
> 
> Yes, 3.16.1 to be exact.

For me it still happens with 3.16.3 - not every time, however. I suspend my laptop (suspend to RAM) either by closing the lid, or by letting it run until it suspends.

Comment 68 Hussam Al-Tayeb 2015-09-11 17:35:20 UTC

Would it be possible to tell we resumed from hibernate when clock changes?
For example, I hibernated at 11:55 and resumed at 12:55. The kernel is reporting a different date or time so redraw the background.

Comment 69 Ray Strode [halfline] 2015-09-15 12:36:56 UTC

yes, timerfd() will fire any time the system is resumed if the TFD_TIMER_CANCEL_ON_SET flag is used. Though, it's clearly cleaner to use the systemd apis (except I guess they don't cover all cases)

Comment 70 Hussam Al-Tayeb 2015-09-28 19:29:54 UTC

Does only the user background get refreshed or the greeter one as well?

Comment 71 Hussam Al-Tayeb 2015-11-08 14:12:11 UTC

Hi again. There is another place which needs to be refreshed upon resume from hibernate. The up-arrow in the lockscreen gets corrupted.
Should I open a new bug report?
Thank you!

Comment 72 Alex Hultman 2015-12-13 16:55:34 UTC

It doesn't work that well for me. I would re-open but I cannot. I see many corruptions in background and the login shield even when I haven't suspended. After the monitor blanks for a while I can get corruptions.

Comment 73 Alex Hultman 2015-12-13 16:59:23 UTC

My login shield is completely corrupted and won't update even if I pull it up and down and login and logout. I use 358.16 currently.

Comment 74 Rui Matos 2015-12-14 15:46:39 UTC

Please report it to nvidia, especially if it worked with previous driver versions.

Comment 75 Hussam Al-Tayeb 2015-12-15 15:23:55 UTC

(In reply to Rui Matos from comment #74)
> Please report it to nvidia, especially if it worked with previous driver
> versions.

I think he is referring to the background in the gdm session. That one still doesn't get refreshed. Only the user session's backgrounds are refreshed.

Comment 76 Arthur Huillet 2016-05-09 11:57:03 UTC

A new extension, NV_robustness_video_memory_purge, was designed and implemented by NVIDIA to notify applications of events where the contents of video memory were lost.
This shall enable the use of a workaround as described in comment #19:
"If NVIDIA gives me a way to detect when modeswitches happen, even if it's super-specific to their architecture and driver, I will be more than delighted to write some code to repaint the FBO as a workaround."

This extension will be implemented in the upcoming r367 beta driver version (expected to be released within 2 weeks).
The text of the extension is pending official publication by Khronos, and has been provided to RedHat through separate channels.

-- 
Arthur Huillet
NVIDIA Linux graphics

Comment 77 Hussam Al-Tayeb 2016-05-10 21:54:07 UTC

(In reply to Arthur Huillet from comment #76)
> A new extension, NV_robustness_video_memory_purge, was designed and
> implemented by NVIDIA to notify applications of events where the contents of
> video memory were lost.
> This shall enable the use of a workaround as described in comment #19:
> "If NVIDIA gives me a way to detect when modeswitches happen, even if it's
> super-specific to their architecture and driver, I will be more than
> delighted to write some code to repaint the FBO as a workaround."
> 
> This extension will be implemented in the upcoming r367 beta driver version
> (expected to be released within 2 weeks).
> The text of the extension is pending official publication by Khronos, and
> has been provided to RedHat through separate channels.
> 
> -- 
> Arthur Huillet
> NVIDIA Linux graphics

Perhaps nvidia can post a patch for mutter/gnome-shell to use the new function so users such as myself try it? :)

Comment 78 Hussam Al-Tayeb 2016-05-20 06:09:05 UTC

(In reply to Arthur Huillet from comment #76)
> A new extension, NV_robustness_video_memory_purge, was designed and
> implemented by NVIDIA to notify applications of events where the contents of
> video memory were lost.
> This shall enable the use of a workaround as described in comment #19:
> "If NVIDIA gives me a way to detect when modeswitches happen, even if it's
> super-specific to their architecture and driver, I will be more than
> delighted to write some code to repaint the FBO as a workaround."
> 
> This extension will be implemented in the upcoming r367 beta driver version
> (expected to be released within 2 weeks).
> The text of the extension is pending official publication by Khronos, and
> has been provided to RedHat through separate channels.
> 
> -- 
> Arthur Huillet
> NVIDIA Linux graphics

By the way, I tried the nvidia 367.18 driver on my nvidia 630 GT (a fermi GF108 card) and it was locking up with XID errors 5 minutes after booting. reverting to the 364.19 driver fixed it.
But I did see the new extension in the release notes.

Comment 79 André Klapper 2016-05-21 14:08:05 UTC

[When replying, please avoid full quotes of previous comments for no reason, but instead only keep relevant lines. Thank you!]

Comment 80 Rui Matos 2016-06-03 16:39:19 UTC

Created attachment 329063 [details] [review]
cogl-winsys-glx: Allow creating a GL context with robustness enabled

This API allows callers to specify that they're able to handle a GL
context with robustness enabled. We make this opt-in since robustness
contexts change how glGetError() behaves.
--

The context creating bits aren't right according to the spec[0] yet
because the currently available NVIDIA driver that implements this
doesn't work like that yet. I'm told it will change to the spec'ed API
on a future release.

I'm attaching this as is now for review of the general approach and
for anyone that wants to test with the 367.18 driver.

[0] https://www.opengl.org/registry/specs/NV/robustness_video_memory_purge.txt

Comment 81 Rui Matos 2016-06-03 16:39:34 UTC

Created attachment 329064 [details] [review]
cogl-context: Add a cogl_get_graphics_reset_status API

If the driver supports the GL_ARB_robustness extension we can expose
the graphics reset status this way.

Comment 82 Rui Matos 2016-06-03 16:40:43 UTC

Created attachment 329065 [details] [review]
cogl: Ignore GL_CONTEXT_LOST when checking for GL errors

When using a context with robustness, glGetError() may return
GL_CONTEXT_LOST at any time and this error doesn't get cleared until
the application calls glGetGraphicsResetStatus() . This means that our
error checking can't call glGetError() in a loop without checking for
that return value and returning in that case.

Comment 83 Rui Matos 2016-06-03 16:42:30 UTC

Created attachment 329066 [details] [review]
compositor: Handle GL context reset errors

Emit a signal so that interested parties can recreate their FBOs and
queue a full scene graph redraw to ensure we don't end up showing
graphical artifacts.

This relies on the GL driver supporting the ARB_robustness extension
and cogl creating a suitable GL context. For now we only make use of
it with the X backend since the only driver with which this is useful
is NVIDIA.

Comment 84 Rui Matos 2016-06-03 16:42:46 UTC

Created attachment 329067 [details] [review]
MetaBackground: invalidate contents on loss of graphics context

We use FBOs so we need to cause them to be recreated if we get a GL
context lost error.

Comment 85 Rui Matos 2016-06-03 16:43:03 UTC

Created attachment 329068 [details] [review]
MetaSurfaceActorX11: invalidate the stex on loss of graphics context

MetaShapedTexture uses FBOs when mipmapping so we need to cause them
to be recreated if we get a GL context lost error.

Comment 86 Rui Matos 2016-06-03 16:43:35 UTC

Created attachment 329071 [details] [review]
Revert "meta-background: Add a function to refresh all background instances"

We now handle GL context lost errors properly so we don't need this
anymore.

This reverts commit 7c5fe42835ea303a44a837646cfb8f7e1a72e681.

Comment 87 Rui Matos 2016-06-03 16:44:21 UTC

Created attachment 329073 [details] [review]
Revert "Refresh all background instances after suspend if needed"

This is now handled entirely in cogl and mutter.

This reverts commit c3bf4a325d5b0c2b0b0115e8177d62e33c847749.

Comment 88 Rui Matos 2016-06-03 16:44:38 UTC

Created attachment 329074 [details] [review]
main: Reload theme on loss of graphics context

The theme machinery uses FBOs in some cases (mainly for shadows) which
need to be reloaded if we get a GL context lost error.

Comment 89 Hussam Al-Tayeb 2016-06-09 12:13:21 UTC

Will those patches add a dependency on the new driver for people using nvidia cards?
Pre-Fermi cards are on a legacy driver branch that is at feature freeze and won't get the new extension.

Comment 90 Hussam Al-Tayeb 2016-06-13 18:27:55 UTC

Nvidia fixed my xorg driver crash. I will try those patches.

Comment 91 Rui Matos 2016-06-14 13:34:02 UTC

Created attachment 329790 [details] [review]
cogl-winsys-glx: Allow creating a GL context with robustness enabled

This API allows callers to specify that they're able to handle a GL
context with robustness enabled. We make this opt-in since robustness
contexts change how glGetError() behaves.

Comment 92 Rui Matos 2016-06-14 13:34:17 UTC

Created attachment 329791 [details] [review]
cogl-context: Add a cogl_get_graphics_reset_status API

If the driver supports the GL_ARB_robustness extension we can expose
the graphics reset status this way.

Comment 93 Rui Matos 2016-06-14 13:34:33 UTC

Created attachment 329792 [details] [review]
cogl: Ignore GL_CONTEXT_LOST when checking for GL errors

When using a context with robustness, glGetError() may return
GL_CONTEXT_LOST at any time and this error doesn't get cleared until
the application calls glGetGraphicsResetStatus() . This means that our
error checking can't call glGetError() in a loop without checking for
that return value and returning in that case.

Comment 94 Rui Matos 2016-06-14 13:34:48 UTC

Created attachment 329793 [details] [review]
restart: Make meta_restart() work without a message

In some cases there's no meaningful message to show.

Comment 95 Rui Matos 2016-06-14 13:36:15 UTC

Created attachment 329794 [details] [review]
compositor: Handle GL context reset errors

--

As discussed with Owen on irc, I changed this to restart the shell on
any other robustness errors since there's isn't much we can do
according to the spec as we can't easily tear down a GL context and
create a new one.

Comment 96 Rui Matos 2016-06-14 13:36:27 UTC

Created attachment 329795 [details] [review]
MetaBackground: invalidate contents on loss of graphics context

We use FBOs so we need to cause them to be recreated if we get a GL
context lost error.

Comment 97 Rui Matos 2016-06-14 13:36:44 UTC

Created attachment 329796 [details] [review]
MetaSurfaceActorX11: invalidate the stex on loss of graphics context

MetaShapedTexture uses FBOs when mipmapping so we need to cause them
to be recreated if we get a GL context lost error.

Comment 98 Rui Matos 2016-06-14 13:38:14 UTC

Review of attachment 329071 [details] [review]:

let's keep this in for users who are stuck with older drivers

Comment 99 Rui Matos 2016-06-14 13:38:31 UTC

Review of attachment 329073 [details] [review]:

ditto

Comment 100 Rui Matos 2016-06-14 13:41:58 UTC

Ok, the 367.27 driver is out, complying with the spec, and these rebased patches now work with it.

Comment 101 Hussam Al-Tayeb 2016-06-14 14:54:18 UTC

Ok, I tried the patches. The background refreshes after resuming.
The floating arrow in the lock screen is still corrupted though.
Can that be refreshed as well or is it a different issue?

Comment 102 Rui Matos 2016-06-14 15:17:04 UTC

Created attachment 329802 [details] [review]
screenShield: Stop using an offscreen buffer for the arrow actor

This isn't a performance critical actor and the NVIDIA driver discards
offscreen buffers in some cases which would require us to go through
extra hoops to handle here which isn't worth it.
--

(In reply to Hussam Al-Tayeb from comment #101)
> The floating arrow in the lock screen is still corrupted though.

Try adding this gnome-shell patch too.

Comment 103 Hussam Al-Tayeb 2016-06-14 15:41:22 UTC

I applied that patch, restarted, and the floating arrow is still corrupted after resuming from hibernate. So this particular issue is still there.

However, the refreshed background much faster now after resuming from hibernate.

Comment 104 Rui Matos 2016-06-14 16:40:44 UTC

Created attachment 329812 [details] [review]
screenShield: Make the arrow redraw on style-changed

We need to invalidate the underlying ClutterCanvas for it to re-create
the PBO and emit the draw signal.
--

Can you test with this one on top too?

Note that I can't reproduce this particular issue here, but I can see
why it's failing.

Comment 105 Rui Matos 2016-06-14 16:48:00 UTC

Review of attachment 329812 [details] [review]:

Wait, StDrawingArea already does this on its style_changed() vfunc so this really shouldn't be needed

Comment 106 Hussam Al-Tayeb 2016-06-14 17:06:00 UTC

(In reply to Rui Matos from comment #105)
> Review of attachment 329812 [details] [review] [review]:
> 
> Wait, StDrawingArea already does this on its style_changed() vfunc so this
> really shouldn't be needed

It did however fix my issue.

Comment 107 Rui Matos 2016-06-15 13:44:57 UTC

Created attachment 329855 [details] [review]
screenShield: Chain up Arrow's vfuncs

Style changes weren't reaching the drawing area widget. Paint was
being chained directly which works but the correct thing to do is
chain up and let St.Widget's implementation do the propagation.
--

Thanks for testing. This should be the right fix.

Comment 108 Hussam Al-Tayeb 2016-06-15 14:07:24 UTC

(In reply to Rui Matos from comment #107)
> Created attachment 329855 [details] [review] [review]
> screenShield: Chain up Arrow's vfuncs
> 
> Style changes weren't reaching the drawing area widget. Paint was
> being chained directly which works but the correct thing to do is
> chain up and let St.Widget's implementation do the propagation.
> --
> 
> Thanks for testing. This should be the right fix.

I reverted the old patch in my local tree and applied this one. The arrow shows correctly after resuming from hibernate. Looks perfect!

Comment 109 Owen Taylor 2016-06-15 23:56:37 UTC

Review of attachment 329855 [details] [review]:

I don't think chaining up the paint is correct - if we look at st-widget.c, we see:

static void
st_widget_paint (ClutterActor *actor)
{
  st_widget_paint_background (ST_WIDGET (actor));

  /* Chain up so we paint children. */
  CLUTTER_ACTOR_CLASS (st_widget_parent_class)->paint (actor);
}

So if you chain up afterwards, you draw the background on *top* of whatever drawing you are doing. Compare st-label.c or st-icon.c - their paint implementations call st_widget_draw_background() as the first thing, then they do custom drawing, then they call paint on their child.

Comment 110 Owen Taylor 2016-06-16 11:51:35 UTC

Review of attachment 329795 [details] [review]:

Looks fine

Comment 111 Owen Taylor 2016-06-16 11:56:30 UTC

Review of attachment 329793 [details] [review]:

Should add (allow none) annotation to meta_restart(), good to commit with that.

Comment 112 Owen Taylor 2016-06-16 11:56:49 UTC

Review of attachment 329793 [details] [review]:

[ I mean allow-none with the dash ]

Comment 113 Owen Taylor 2016-06-16 12:15:09 UTC

Review of attachment 329794 [details] [review]:

::: src/compositor/compositor.c
@@ +1103,3 @@
 
+static void
+redraw_all_children (ClutterActor *actor)

It would be quite useful to document why this rather than clutter_actor_queue_redraw(stage); - my thought is that this will cause all the FBO caches for offscreen effects to be redrawn - but that's not sufficient - the fbo's actually need to be destroyed and recreated which this won't cause.

@@ +1149,3 @@
+         since we don't enable robustness in that case. */
+      g_assert (!meta_is_wayland_compositor ());
+      meta_restart (NULL);

Suggest a break; here - we typically use a break; on the last statement of the case, even if it's a default; statement that will stay last. (Without a break; it reads to me like meta_restart() is a non-returning function - and the restart is async.)

::: src/core/display.c
@@ +345,3 @@
 
+  display_signals[GL_CONTEXT_LOST] =
+    g_signal_new ("gl-context-lost",

I think this name is now wrong, since it doesn't apply to a generic context that you have to recreate, it refers to a purged context that you can recover by following certain steps. Suggest gl-video-memory-purged since that describes what those who connect need to do. (gl-context-purged also would make sense in the context of the extension but is less descriptive.)

Comment 114 Owen Taylor 2016-06-16 12:21:31 UTC

Review of attachment 329802 [details] [review]:

I can't figure out why Giovanni added this in the first place - we already cache both the cairo drawing for the arrow and the shadow - so drawing this widget should be about as cheap as possible. As far as I can tell, things should work fine without, but just make sure that you've checked that the arroow shadow draws correctly now.

Comment 115 Owen Taylor 2016-06-16 12:37:56 UTC

Review of attachment 329796 [details] [review]:

::: src/compositor/meta-surface-actor-x11.c
@@ +431,3 @@
+
+  meta_shaped_texture_set_texture (stex, NULL);
+  meta_shaped_texture_set_texture (stex, priv->texture);

Conceptually, this at the wrong level - MetaSurfaceActorX11 doesn't "know" that MetaShapedTexture uses a MetaTextureTower or that MetaTextureTower uses a FBO. *With* a comment added here along the lines of:

 /* Setting the texture to NULL will cause all the FBO's cached by the
  * shaped texture's MetaTextureTower to be discarded and recreated.
  * 

I'm OK with this because we're really just working around something. If FBO's had this behavior per the GL spec then I think the callback would be best on CoglOffscreen or CoglContext and not on the MetaDisplay.

Comment 116 Owen Taylor 2016-06-16 14:52:32 UTC

Review of attachment 329790 [details] [review]:

::: cogl/cogl/cogl-glx.h
@@ +78,3 @@
 
+void
+cogl_glx_set_allow_robustness (CoglBool allow);

This obviously is a low-overhead way of arranging things, and avoids patching Clutter, but I don't think it fits in well with how Cogl works - there are no similar global functions that affect the behavior.

I'd expect cogl_[xlib?]_renderer_set_allow_robustness()

or something like that.

::: cogl/cogl/winsys/cogl-winsys-glx.c
@@ +1042,3 @@
+                                     GL_TRUE,
+      GLX_CONTEXT_RESET_NOTIFICATION_STRATEGY_ARB,
+                                     GLX_LOSE_CONTEXT_ON_RESET_ARB,

Hmm, so this is going to X error unless *both* robustness and GL_NV_robustness_video_memory_purge are present and can be enabled. This makes the name set_allow_robustness() questionable. I'm a bit confused about whether enabling robustness is useful without the video_memory_purge() 

I'd suggest either:

 * Only request reset_video_memory_purge if the extension is available
 * Or rename enable_robustness to request_reset_on_video_memory_purge() - but then it's unclear why you get all the other context-loss notifications - so I like this less.

In general, I think I'd rather see the code check for the robustness extension before trying to enable this stuff - generating an X error on every startup is harmless but ugly.

Comment 117 Owen Taylor 2016-06-16 14:54:37 UTC

Review of attachment 329791 [details] [review]:

::: cogl/cogl/cogl-context.c
@@ +792,3 @@
+
+CoglGraphicsResetStatus
+cogl_get_graphics_reset_status (CoglContext *context)

Needs some basic docs, at least to describe that this requires enabling the extension.

::: cogl/cogl/cogl-context.h
@@ +380,3 @@
+  COGL_GRAPHICS_RESET_STATUS_INNOCENT_CONTEXT_RESET,
+  COGL_GRAPHICS_RESET_STATUS_UNKNOWN_CONTEXT_RESET,
+  COGL_GRAPHICS_RESET_STATUS_PURGED_CONTEXT_RESET,

Needs docs

Comment 118 Rui Matos 2016-06-21 19:16:48 UTC

Created attachment 330156 [details] [review]
cogl-winsys-glx: Add support for NV_robustness_video_memory_purge

This adds API to allow callers to specify that they're interested in
video memory purge errors.
--

(In reply to Owen Taylor from comment #116)
> I'd expect cogl_[xlib?]_renderer_set_allow_robustness()

Ok, changed to something similar

> Hmm, so this is going to X error unless *both* robustness and
> GL_NV_robustness_video_memory_purge are present and can be enabled. This
> makes the name set_allow_robustness() questionable. I'm a bit confused about
> whether enabling robustness is useful without the video_memory_purge()

I was confused too and asked Arthur who clarified that indeed as of
the latest spec version we don't need to enable robust buffer
access. We only need to check for the availability of the
GetGraphicsResetStatusARB api and enable both
GLX_CONTEXT_RESET_NOTIFICATION_STRATEGY_ARB and
GLX_GENERATE_RESET_ON_VIDEO_MEMORY_PURGE_NV.

> I'd suggest either:
>
>  * Only request reset_video_memory_purge if the extension is available

This is what I did here, and I also changed the cogl api as you
suggest to reflect the clarification above.

Comment 119 Rui Matos 2016-06-21 19:17:57 UTC

Created attachment 330158 [details] [review]
cogl-context: Add a cogl_get_graphics_reset_status API

If the driver supports the GL_ARB_robustness extension we can expose
the graphics reset status this way.
--

Added documentation

Comment 120 Rui Matos 2016-06-21 19:18:20 UTC

Created attachment 330159 [details] [review]
cogl: Ignore GL_CONTEXT_LOST when checking for GL errors

--
rebased

Comment 121 Rui Matos 2016-06-21 19:19:23 UTC

Created attachment 330161 [details] [review]
clutter/x11: Add API to request video memory purges to be reported

--

This is the clutter call

Comment 122 Rui Matos 2016-06-21 19:20:15 UTC

Created attachment 330162 [details] [review]
restart: Make meta_restart() work without a message

--

Added allow-none

Comment 123 Rui Matos 2016-06-21 19:22:55 UTC

Created attachment 330163 [details] [review]
compositor: Handle GL video memory purged errors

--

(In reply to Owen Taylor from comment #113)
> +static void
> +redraw_all_children (ClutterActor *actor)
>
> It would be quite useful to document why this rather than
> clutter_actor_queue_redraw(stage); - my thought is that this will cause all
> the FBO caches for offscreen effects to be redrawn - but that's not
> sufficient - the fbo's actually need to be destroyed and recreated which
> this won't cause.

This was actually a leftover of an earlier version where I was still
experimenting. I don't think it's needed and replaced it with a simple
full stage redraw.

Also addressed your other points.

Comment 124 Rui Matos 2016-06-21 19:23:30 UTC

Created attachment 330164 [details] [review]
MetaBackground: invalidate contents on video memory purged errors

--
rebased

Comment 125 Rui Matos 2016-06-21 19:24:11 UTC

Created attachment 330165 [details] [review]
MetaSurfaceActorX11: invalidate the stex on video memory purged errors

--

Added a comment as suggested

Comment 126 Rui Matos 2016-06-21 19:26:09 UTC

Created attachment 330166 [details] [review]
main: Reload theme on video memory purge errors

--
rebased

Comment 127 Rui Matos 2016-06-21 19:27:30 UTC

Created attachment 330168 [details] [review]
screenShield: Stop using an offscreen buffer for the arrow actor

--
(In reply to Owen Taylor from comment #114)
> I can't figure out why Giovanni added this in the first place - we already
> cache both the cairo drawing for the arrow and the shadow - so drawing this
> widget should be about as cheap as possible. As far as I can tell, things
> should work fine without, but just make sure that you've checked that the
> arroow shadow draws correctly now.

Yes, it seems to work fine without this, including the shadow.

Comment 128 Rui Matos 2016-06-21 19:29:17 UTC

Created attachment 330169 [details] [review]
screenShield: Chain up Arrow's style_changed vfunc

--
(In reply to Owen Taylor from comment #109)
> Review of attachment 329855 [details] [review] [review]:
>
> I don't think chaining up the paint is correct

Right. Changed it to only chain up style_changed

Comment 129 Rui Matos 2016-06-22 19:21:36 UTC

(In reply to Rui Matos from comment #127)
> Yes, it seems to work fine without this, including the shadow.

Actually, after a closer look, the arrow shadows weren't being painted properly, see bug 767954. It still works fine with this patch though.

Comment 130 Owen Taylor 2016-06-27 14:27:05 UTC

Review of attachment 330165 [details] [review]:

Looks good

Comment 131 Owen Taylor 2016-06-27 14:29:42 UTC

Review of attachment 330166 [details] [review]:

Looks fine. (I shudder at the thought of doing more IO on resume from suspend, but it's nicely encapsulated to only NVIDIA driver versions that need it.)

Comment 132 Owen Taylor 2016-06-27 14:30:13 UTC

Review of attachment 330168 [details] [review]:

OK

Comment 133 Owen Taylor 2016-06-27 14:31:54 UTC

Review of attachment 330169 [details] [review]:

OK

Comment 134 Owen Taylor 2016-06-27 14:32:33 UTC

Review of attachment 330162 [details] [review]:

OK

Comment 135 Owen Taylor 2016-06-27 14:34:41 UTC

Review of attachment 330161 [details] [review]:

Looks fine

Comment 136 Owen Taylor 2016-06-27 14:47:02 UTC

Review of attachment 330163 [details] [review]:

Looks good

Comment 137 Owen Taylor 2016-06-27 14:50:50 UTC

Review of attachment 330164 [details] [review]:

Looks good

Comment 138 Owen Taylor 2016-06-27 14:53:25 UTC

Review of attachment 330158 [details] [review]:

Looks good

Comment 139 Owen Taylor 2016-06-27 15:42:37 UTC

Review of attachment 330156 [details] [review]:

Looks good except for docs/comments

::: cogl/cogl/cogl-xlib-renderer.h
@@ +176,3 @@
+void
+cogl_xlib_renderer_request_reset_on_video_memory_purge (CoglRenderer *renderer,
+                                                        CoglBool enable);

I'd like to see docs here. I think this is the proper place to explain:

 - that certain NVIDIA drivers lose frame buffer contents on suspend, etc.
 - that the NV_robustness_video_memory_purge extension allows getting notification
 - that if the extension isn't present, the driver doesn't have that issue

ANd also describe how you find out about the reset. If enabling this can cause
other resets to be reported, that should be mentioned here as something that
a user needs to be aware of.

::: cogl/cogl/winsys/cogl-winsys-glx.c
@@ +1031,3 @@
       None
     };
+  static const int attrib_list_reset_on_purge[] =

I'd like a comment here that has a short explanation of the relationship between NV_robustness_video_memory_purge and the robustness extension - basically what you found out from ahuillet

@@ +1050,3 @@
 
+  if (display->renderer->xlib_want_reset_on_video_memory_purge &&
+      strstr (glx_renderer->glXQueryExtensionsString (xlib_renderer->xdpy,

Need to comment why you aren't using COGL_WINSYS_FEATURE

Comment 140 Rui Matos 2016-06-27 18:24:44 UTC

Created attachment 330457 [details] [review]
cogl-winsys-glx: Add support for NV_robustness_video_memory_purge

--

Added the requested docs and comments.

Comment 141 Rui Matos 2016-06-27 18:27:05 UTC

(In reply to Rui Matos from comment #120)
> Created attachment 330159 [details] [review] [review]
> cogl: Ignore GL_CONTEXT_LOST when checking for GL errors

This one doesn't have the review flag yet, please don't forget about it

Comment 142 Owen Taylor 2016-06-28 17:28:38 UTC

Review of attachment 330457 [details] [review]:

Looks good

Comment 143 Owen Taylor 2016-06-28 17:32:07 UTC

Review of attachment 330159 [details] [review]:

Looks good

Comment 144 Rui Matos 2016-06-28 17:55:03 UTC

Attachment 330158 [details] pushed as 0f2be43 - cogl-context: Add a cogl_get_graphics_reset_status API
Attachment 330159 [details] pushed as d4d2bf0 - cogl: Ignore GL_CONTEXT_LOST when checking for GL errors
Attachment 330161 [details] pushed as 3691eb6 - clutter/x11: Add API to request video memory purges to be reported
Attachment 330162 [details] pushed as 7ed14e0 - restart: Make meta_restart() work without a message
Attachment 330163 [details] pushed as 7f6bcea - compositor: Handle GL video memory purged errors
Attachment 330164 [details] pushed as 53993ba - MetaBackground: invalidate contents on video memory purged errors
Attachment 330165 [details] pushed as cc6efeb - MetaSurfaceActorX11: invalidate the stex on video memory purged errors
Attachment 330457 [details] pushed as 87f9927 - cogl-winsys-glx: Add support for NV_robustness_video_memory_purge

Comment 145 Rui Matos 2016-06-28 17:56:50 UTC

Attachment 330166 [details] pushed as 358f64d - main: Reload theme on video memory purge errors
Attachment 330168 [details] pushed as a7562b4 - screenShield: Stop using an offscreen buffer for the arrow actor
Attachment 330169 [details] pushed as b5dd4d1 - screenShield: Chain up Arrow's style_changed vfunc

Comment 146 Hussam Al-Tayeb 2016-07-20 17:24:55 UTC

I hate to mention bad news but this is broken again after today's mutter checkins.
Background gets corrupted but floating arrow still renders correctly.

Comment 147 Jonas Ådahl 2016-07-26 23:29:09 UTC

Reopening given the previous comment.

Comment 148 aaguillon 2016-09-21 17:13:17 UTC

System:    Host:  Kernel: 3.16.0-4-amd64 x86_64 (64 bit) Desktop: Gnome 3.14.4 Distro: Debian GNU/Linux 8 
Graphics:  Card: NVIDIA GK107 [GeForce GTX 650]
           Display Server: X.Org 1.16.4 driver: nvidia Resolution: 1440x900@59.89hz, 1680x1050@59.88hz
           GLX Renderer: GeForce GTX 650/PCIe/SSE2 GLX Version: 4.4.0 NVIDIA 340.96

I have this problem. Any advice about the current status?

Thanks in advance,
AA

Comment 149 Hussam Al-Tayeb 2016-10-22 06:57:26 UTC

NVIDIA driver 375.10 seems to fix the corrupted background on switching VT.
I tried like a dozen times already.

The background corruption still happens however on resuming from hibernate.

Comment 150 Mario Sánchez Prada 2017-04-17 10:28:19 UTC

FWIW, the following two commits broke the build when using the GLES2 driver, as neither GL_CONTEXT_LOST nor GL_GUILTY_CONTEXT_RESET_ARB are defined in the headers (at least not in the headers provided by mesa 13.0.4):

* cogl: Ignore GL_CONTEXT_LOST when checking for GL errors
https://git.gnome.org/browse/mutter/commit/?id=d4d2bf0f6c1737256b921c4f1dedd3a95138cab9

* cogl-context: Add a cogl_get_graphics_reset_status API
https://git.gnome.org/browse/mutter/commit/?id=0f2be43af479f5bae41b00fc857324b27550e410

Comment 151 Mario Sánchez Prada 2017-04-17 11:42:23 UTC

I've filed bug 781398 to deal with the GLES2 issue. See patches attached there, thanks!

Comment 152 Hussam Al-Tayeb 2017-05-25 15:39:08 UTC

Right now only the animated arrow is corrupted. Wallpaper refreshes correctly.
So some improvement :)

Comment 153 Jonas Ådahl 2018-02-02 06:41:10 UTC

Created https://gitlab.gnome.org/GNOME/mutter/merge_requests/13 that should fix the regression.

Comment 154 Hussam Al-Tayeb 2018-02-02 07:12:53 UTC

(In reply to Jonas Ådahl from comment #153)
> Created https://gitlab.gnome.org/GNOME/mutter/merge_requests/13 that should
> fix the regression.

Wow. Thank you very very much for getting back to this bug. I'm cloning mutter sources now and will try the patch.

Comment 155 Hussam Al-Tayeb 2018-02-02 07:58:59 UTC

(In reply to Hussam Al-Tayeb from comment #154)
> (In reply to Jonas Ådahl from comment #153)
> > Created https://gitlab.gnome.org/GNOME/mutter/merge_requests/13 that should
> > fix the regression.
> 
> Wow. Thank you very very much for getting back to this bug. I'm cloning
> mutter sources now and will try the patch.

Looks fixed with the patch. Thanks again Jonas!

Comment 156 Jonas Ådahl 2018-02-02 08:03:44 UTC

(In reply to Hussam Al-Tayeb from comment #155)
> (In reply to Hussam Al-Tayeb from comment #154)
> > (In reply to Jonas Ådahl from comment #153)
> > > Created https://gitlab.gnome.org/GNOME/mutter/merge_requests/13 that should
> > > fix the regression.
> > 
> > Wow. Thank you very very much for getting back to this bug. I'm cloning
> > mutter sources now and will try the patch.
> 
> Looks fixed with the patch. Thanks again Jonas!

Thanks for testing, always appreciated!

Comment 157 Krystian 2018-03-03 07:44:23 UTC

Could you provide a release version / date for this fix?

Comment 158 Daniel van Vugt 2019-05-17 07:49:41 UTC

It appears the fix went into 3.28.0. But it's not working today in 3.32.x and 3.33.x so please see this new bug instead:

  https://gitlab.gnome.org/GNOME/gnome-shell/issues/1084

or if you are using Ubuntu:

  https://bugs.launchpad.net/bugs/1809407