Bug 703174 - Crash when switching user with gnome 3.8 using nvidia drivers
Status: RESOLVED OBSOLETE
Product: cogl
Classification: Platform
Component: CoglFramebuffer
Version: unspecified
Hardware: Other Linux
Importance: Normal critical
Target Milestone: ---
Assigned To: Cogl maintainer(s)
QA Contact: Cogl maintainer(s)
Depends on:
Blocks:
 
 
Reported: 2013-06-27 09:48 UTC by Valerio Mariani
Modified: 2021-06-10 11:20 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Session log from OpenSUSE 12.3 running Gnome 3.8, immediately after the crash (16.03 KB, text/plain)
2013-06-27 14:32 UTC, Valerio Mariani
Gdb backtrace for gnome-shell 3.8 on OpenSuse (44.28 KB, text/plain)
2013-06-28 09:19 UTC, Valerio Mariani
Session log from OpenSUSE 12.3 running Gnome 3.8, immediately after the crash (52.48 KB, text/plain)
2013-06-28 09:42 UTC, Valerio Mariani
Gdb backtrace for gnome-shell 3.8 on OpenSuse (59.12 KB, text/plain)
2013-06-28 12:33 UTC, Valerio Mariani
st: Explicilty allocate framebuffer for shadow material (1.51 KB, patch)
2013-06-28 19:58 UTC, drago01
Status: none
Backtrace of gnome shell after proposed patch (57.99 KB, text/plain)
2013-06-28 23:35 UTC, Valerio Mariani
st: Explicilty allocate framebuffer for shadow material (1.51 KB, patch)
2013-07-01 13:44 UTC, drago01
Status: none
Gdb backtrace for gnome-shell 3.8 on OpenSuse after patch (56.45 KB, text/plain)
2013-07-01 14:25 UTC, Valerio Mariani
offscreen-effect: Allocate the framebuffer explicilty (1.43 KB, patch)
2013-07-03 15:49 UTC, drago01
Status: none
offscreen: Allocate the framebuffer in cogl_offscreen_new_to_texture_full (1.23 KB, patch)
2013-07-03 16:59 UTC, drago01
Status: committed
Gdb backtrace for gnome-shell 3.8 on OpenSuse after cogl patch (57.51 KB, text/plain)
2013-07-04 23:04 UTC, Valerio Mariani
Hack (700 bytes, patch)
2013-08-07 11:47 UTC, drago01
Status: none
more detailed backtrace (7.97 KB, text/plain)
2013-08-07 16:38 UTC, leigh123linux
A hack to remove offending error log. (680 bytes, patch)
2013-08-08 21:21 UTC, Tapani Mattila
Status: none
Backtrace after cogl patch (47.28 KB, text/plain)
2013-08-09 12:54 UTC, Valerio Mariani

Description Valerio Mariani 2013-06-27 09:48:17 UTC
If I log in as a user, then select 'Switch User' to let another user
log in, and then switch back to the first user (either by selecting
the 'Switch User' entry in the menu again, or by going to the correct tty
with the keyboard), I see a message saying that something went wrong
and that all extensions have been turned off, and offering to log me out (I
am not using any extensions). This does not happen at every switch; sometimes it takes two or three switches, but it inevitably happens eventually.

This happens only with proprietary NVIDIA drivers and I witnessed this problem on the following distros: OpenSUSE (With Gnome 3.8 repository), Arch Linux, Fedora 19 (RC1), so I think it is an upstream bug. 

This effectively renders user switching, and hence multi-user workstations, unusable in Gnome 3.8 and prevents me effectively from using gnome (I need the proprietary drivers and the 3D acceleration performance for work reasons)

In detail, it seems that as soon as I leave the tty of my current user to go to the X session that runs the new GDM instance for user switching, the gnome shell in the old tty crashes. 

Here are the last few lines of /var/log/messages (On OpenSUSE). I am available to provide further information

2013-04-21T19:31:17.222982+02:00 linux-3xop dbus-daemon[568]:
dbus[568]: [system] Rejected send message, 2 matched rules;
type="method_return", sender=":1.0" (uid=0 pid=534
comm="/usr/lib/systemd/systemd-logind ") interface="(unset)"
member="(unset)" error name="(unset)" requested_reply="0"
destination=":1.48" (uid=1001 pid=1093 comm="/usr/bin/gnome-session ")
2013-04-21T19:31:17.222999+02:00 linux-3xop dbus[568]: [system]
Rejected send message, 2 matched rules; type="method_return",
sender=":1.0" (uid=0 pid=534 comm="/usr/lib/systemd/systemd-logind ")
interface="(unset)" member="(unset)" error name="(unset)"
requested_reply="0" destination=":1.48" (uid=1001 pid=1093
comm="/usr/bin/gnome-session ")
2013-04-21T19:31:17.263684+02:00 linux-3xop dbus-daemon[568]:
dbus[568]: [system] Rejected send message, 1 matched rules;
type="method_call", sender=":1.6" (uid=0 pid=632 comm="/usr/sbin/gdm
") interface="org.freedesktop.DBus.Properties" member="GetAll" error
name="(unset)" requested_reply="0" destination=":1.95" (uid=0 pid=2043
comm="/usr/lib/gdm/gdm-simple-slave --display-id /org/gn")
2013-04-21T19:31:17.263935+02:00 linux-3xop dbus[568]: [system]
Rejected send message, 1 matched rules; type="method_call",
sender=":1.6" (uid=0 pid=632 comm="/usr/sbin/gdm ")
interface="org.freedesktop.DBus.Properties" member="GetAll" error
name="(unset)" requested_reply="0" destination=":1.95" (uid=0 pid=2043
comm="/usr/lib/gdm/gdm-simple-slave --display-id /org/gn")
2013-04-21T19:31:17.742845+02:00 linux-3xop xdm[569]: Failed to give
slave programs access to the display. Trying to proceed.
2013-04-21T19:31:17.776668+02:00 linux-3xop gdm-launch-environment]:
pam_unix(gdm-launch-environment:session): session opened for user gdm
by (uid=0)
2013-04-21T19:31:17.777754+02:00 linux-3xop systemd-logind[534]: New
session 4 of user gdm.
2013-04-21T19:31:17.788569+02:00 linux-3xop systemd-logind[534]:
Linked /tmp/.X11-unix/X1 to /run/user/489/X11-display.
2013-04-21T19:31:18.064490+02:00 linux-3xop gnome-session[2062]:
Entering running state
2013-04-21T19:31:18.091559+02:00 linux-3xop rtkit-daemon[1020]:
Successfully made thread 2093 of process 2093 (/usr/bin/pulseaudio)
owned by 'gdm' high priority at nice level -11.
2013-04-21T19:31:18.091577+02:00 linux-3xop rtkit-daemon[1020]:
Supervising 5 threads of 2 processes of 2 users.
2013-04-21T19:31:18.331517+02:00 linux-3xop rtkit-daemon[1020]:
Successfully made thread 2105 of process 2105 (/usr/bin/pulseaudio)
owned by 'gdm' high priority at nice level -11.
2013-04-21T19:31:18.331533+02:00 linux-3xop rtkit-daemon[1020]:
Supervising 9 threads of 3 processes of 2 users.
2013-04-21T19:31:18.332104+02:00 linux-3xop pulseaudio[2105]:
[pulseaudio] pid.c: Daemon already running.
2013-04-21T19:31:18.396178+02:00 linux-3xop kernel: [ 317.259962]
traps: gnome-shell[1264] trap int3 ip:7f208af9308f sp:7fff51e501a0
error:0
2013-04-21T19:31:18.414595+02:00 linux-3xop dbus-daemon[568]:
dbus[568]: [system] Rejected send message, 2 matched rules;
type="method_return", sender=":1.0" (uid=0 pid=534
comm="/usr/lib/systemd/systemd-logind ") interface="(unset)"
member="(unset)" error name="(unset)" requested_reply="0"
destination=":1.48" (uid=1001 pid=1093 comm="/usr/bin/gnome-session ")
2013-04-21T19:31:18.414614+02:00 linux-3xop dbus[568]: [system]
Rejected send message, 2 matched rules; type="method_return",
sender=":1.0" (uid=0 pid=534 comm="/usr/lib/systemd/systemd-logind ")
interface="(unset)" member="(unset)" error name="(unset)"
requested_reply="0" destination=":1.48" (uid=1001 pid=1093
comm="/usr/bin/gnome-session ")
2013-04-21T19:31:18.416517+02:00 linux-3xop polkitd[629]: Unregistered
Authentication Agent for unix-session:2 (system bus name :1.73, object
path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale
en_US.UTF-8) (disconnected from bus)
2013-04-21T19:31:18.420581+02:00 linux-3xop gnome-session[1093]:
WARNING: Application 'gnome-shell.desktop' killed by signal 5
2013-04-21T19:31:18.719545+02:00 linux-3xop polkitd[629]: Registered
Authentication Agent for unix-session:4 (system bus name :1.103
[gnome-shell --mode=gdm], object path
/org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8)
2013-04-21T19:31:18.984581+02:00 linux-3xop polkitd[629]: Registered
Authentication Agent for unix-session:2 (system bus name :1.122
[/usr/bin/gnome-shell], object path
/org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_US.UTF-8)
2013-04-21T19:31:20.856180+02:00 linux-3xop kernel: [ 319.713669]
traps: gnome-shell[2119] trap int3 ip:7fb5f83b208f sp:7fff3bc3d1d0
error:0
2013-04-21T19:31:20.862822+02:00 linux-3xop polkitd[629]: Unregistered
Authentication Agent for unix-session:2 (system bus name :1.122,
object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale
en_US.UTF-8) (disconnected from bus)
2013-04-21T19:31:20.863059+02:00 linux-3xop gnome-session[1093]:
WARNING: Application 'gnome-shell.desktop' killed by signal 5
2013-04-21T19:31:20.863679+02:00 linux-3xop gnome-session[1093]:
WARNING: App 'gnome-shell.desktop' respawning too quickly
2013-04-21T19:31:20.864780+02:00 linux-3xop gnome-session[1093]:
Unrecoverable failure in required component gnome-shell.desktop
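
A note on reading the log above: on Linux, signal 5 is SIGTRAP, which matches the kernel's "trap int3" lines, since int3 is the x86 breakpoint instruction. That pattern is what one would expect from a process that stops itself on purpose rather than from a segfault. The sketch below decodes the signal number from one of the gnome-session warnings. It is only an illustration: the helper name killed_by is made up here and is not part of any GNOME tool.

```python
import re
import signal

# Sample line copied from the log above.
LOG_LINE = "WARNING: Application 'gnome-shell.desktop' killed by signal 5"


def killed_by(line):
    """Return the symbolic name of the signal named in the warning, or None."""
    match = re.search(r"killed by signal (\d+)", line)
    if match is None:
        return None
    # signal.Signals maps the platform's signal numbers to their names.
    return signal.Signals(int(match.group(1))).name


print(killed_by(LOG_LINE))
```

The signal numbering is platform specific, so the result is only meaningful on the same kind of system that produced the log.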
Comment 1 drago01 2013-06-27 11:05:03 UTC
There is not much we can do here. This is clearly a driver bug. Please report this to nvidia.
Comment 2 Valerio Mariani 2013-06-27 12:19:49 UTC
Thanks for your quick and prompt answer. 

It works correctly with exactly the same drivers (the same system, literally the same hardware) in Gnome 3.6 (for example in OpenSuse before adding the 3.8 repo). 

Since nothing changed in both the hardware and the driver, I think it must be something that Gnome 3.8 does differently from 3.6...

I think that this points to a bug in Gnome 3.8...

  Valerio
Comment 3 Timo Saarinen 2013-06-27 13:47:30 UTC
Hummmm, this bug has been reproduced on Arch Linux, OpenSuse and Fedora so far. More or less unanimous conclusion seems to be, that this is an upstream bug in Gnome 3.8. We know, that this is related to the nVidia binary driver, that some open-source lovers don't find as a recommended solution. It would be pity to switch to KDE or XFCE, because of this silly bug, which prevents us from having a reliable multi-user experience with Gnome.

Timo
Comment 4 drago01 2013-06-27 14:06:40 UTC
(In reply to comment #2)
> Thanks for your quick and prompt answer. 
> 
> It works correctly with exactly the same drivers (the same system, literally
> the same hardware) in Gnome 3.6 (for example in OpenSuse before adding the 3.8
> repo). 
> 
> Since nothing changed in both the hardware and the driver, I think it must be
> something that Gnome 3.8 does differently from 3.6...
> 
> I think that this points to a bug in Gnome 3.8...

We need a backtrace of the crash or at least the log output of the session (depending on the distro, for example ~/.cache/gdm/session.log).
Comment 5 Florian Müllner 2013-06-27 14:12:05 UTC
(In reply to comment #3)
> More or less unanimous conclusion seems to be, that this is an upstream bug in
> Gnome 3.8. We know, that this is related to the nVidia binary driver,


So - it is a bug in GNOME's binary nVidia driver?

We don't use driver-specific code paths in our code, so if the issue is limited to the binary nVidia driver, that's a strong indication that comment #1 is correct (in the end, only a backtrace will tell).

It's obviously possible that we only started to use features that expose said bug in 3.8, but that doesn't make it a GNOME issue - if the bug is in the driver, that's where it needs to be fixed; it is not feasible for us to maintain workarounds for an ever-growing list of known bugs in specific driver versions.
Comment 6 Jasper St. Pierre (not reading bugmail) 2013-06-27 14:14:53 UTC
Well, there's also the case of "we're actually using the features wrong, but other drivers have a bug so it actually works OK".

But it looks like things are segfaulting, so we'll need a backtrace to make sure.
Comment 7 drago01 2013-06-27 14:19:35 UTC
(In reply to comment #3)
> Hummmm, this bug has been reproduced on Arch Linux, OpenSuse and Fedora so far.
> More or less unanimous conclusion seems to be, that this is an upstream bug in
> Gnome 3.8.

No, as the reporter said, "it only happens with the proprietary driver". Testing with different distros does not tell us much, so your conclusion does not make much sense without any log files or traces that point to a bug / crash inside GNOME.

> We know, that this is related to the nVidia binary driver, that some
> open-source lovers don't find as a recommended solution.

I have no idea what this is supposed to mean.

> It would be pity to
> switch to KDE or XFCE, because of this silly bug, which prevents us from having
> a reliable multi-user experience with Gnome.

As stated above we need crash logs to check what / where the problem is.
Comment 8 Valerio Mariani 2013-06-27 14:32:09 UTC
Created attachment 247907 [details]
Session log from OpenSUSE 12.3 running Gnome 3.8, immediately after the crash
Comment 9 Valerio Mariani 2013-06-27 14:32:29 UTC
I attached the ~/.cache/gdm/session.log file. Please let me know if you need more info, and what you need, and I'll be happy to run all the tests that you need (at least for a couple of hours more; then I'll leave my workplace and we'll have to resume tomorrow).

Thank you for looking at this bug, it is very appreciated...


 Valerio
Comment 10 drago01 2013-06-27 15:22:42 UTC
(In reply to comment #9)
> I attached the ~/.cache/gdm/session.log file. Please let me know if you need
> more info, and what you need, and I'll be happy to run all the tests that you
> need (At least for a couple of hours more. Then I'll leave my workplace and
> we'll have to resume tomorrow)

Unfortunately the log isn't really helpful. Can you get a backtrace (attach gdb to the gnome-shell process, trigger the crash, and get the backtrace)?

You'd need a second computer to do that (over ssh).
Comment 11 Valerio Mariani 2013-06-27 16:13:07 UTC
Sorry, I left my workplace and I don't have an NVIDIA card at home. I will do this tomorrow as soon as I get back to work. Thank you again for looking at this.

  Valerio
Comment 12 Valerio Mariani 2013-06-28 00:17:03 UTC
Ok I got my hands on a desktop equipped with nvidia card and installed opensuse with gnome 3.8 to test...

I can't get the gdb traceback because attaching gdb to the gnome-shell process seems to make it unresponsive

1) On the first machine: open ssh connection
2) Start gdb
3) Type set logging file <filename>
4) Type set logging on
5) Type attach <pid of gnome shell>

At this point, on the other computer running gnome, I would like to trigger the crash, but gnome shell is completely unresponsive: I can click anywhere and nothing happens. If I type quit in gdb, the logging stops and the shell starts behaving normally again.

Am I doing something wrong? Also, can I provide you with any other log file?

Thank you for all the help

  Valerio
Comment 13 drago01 2013-06-28 06:15:35 UTC
(In reply to comment #12)
> Ok I got my hands on a desktop equipped with nvidia card and installed opensuse
> with gnome 3.8 to test...
> 
> I can't get the gdb traceback because attaching gdb to the gnome-shell process
> seems to make it unresponsive
> 
> 1) On the first machine: open ssh connection
> 2) Start gdb
> 3) Type set logging file <filename>
> 4) Type set logging on
> 5) Type attach <pid of gnome shell>
> 
> At this point, on the other computer running gnome I would like to trigger the
> crash, but gnome shell is completely unresponsive,

Type "continue" (or just c) into the gdb console.
Comment 14 Valerio Mariani 2013-06-28 09:19:03 UTC
Created attachment 247958 [details]
Gdb backtrace for gnome-shell 3.8 on OpenSuse

Dear Gnome Developers,

Thanks for the hint. I attach the log file. Unfortunately, at first sight it does not seem to be very informative, especially because I don't have debugging symbols installed. Should I install the debugging symbols for all gnome shell packages and try again?

Please ask me anything you need. I think debugging this will really make a lot of people happy...

  Valerio
Comment 15 Milan Bouchet-Valat 2013-06-28 09:26:32 UTC
Well, you also need to type
thread apply all bt
after the crash.

Maybe from the trace we will be able to tell you what debugging symbols you need so that you do not have to install all of them.
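
The steps collected so far (attach, enable logging, continue, reproduce the crash, then dump all threads) can also be run non-interactively. The sketch below only builds the gdb command line; it assumes gdb's -p, -batch and -ex options and reuses the gdb commands mentioned in this thread. The function name and the default log file name are made up for illustration, and 1264 is just the gnome-shell PID that appears in the log in the description.

```python
import shlex


def gdb_backtrace_command(pid, logfile="gnome-shell-backtrace.log"):
    """Build a gdb command line that waits for the crash and logs all backtraces."""
    commands = [
        "set pagination off",           # do not stop and wait for <return>
        f"set logging file {logfile}",  # where the output is written
        "set logging on",
        "continue",                     # let gnome-shell run until it crashes
        "thread apply all bt",          # executed once the process has stopped
    ]
    argv = ["gdb", "-p", str(pid), "-batch"]
    for command in commands:
        argv += ["-ex", command]
    return argv


# Print the command so it can be pasted into the ssh session.
print(shlex.join(gdb_backtrace_command(1264)))
```

Run over ssh from a second machine, as suggested above, this should leave the shell responsive until the crash, because "continue" is issued right after attaching.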
Comment 16 Valerio Mariani 2013-06-28 09:42:22 UTC
Created attachment 247960 [details]
Session log from OpenSUSE 12.3 running Gnome 3.8, immediately after the crash

Sorry, here it is!

  Valerio
Comment 17 Milan Bouchet-Valat 2013-06-28 10:52:27 UTC
OK, it looks like you will need these symbols:
zypper install -C "debuginfo(build-id)=cf4a52384e80228cada3276f44db5c5544a4933c"
zypper install -C "debuginfo(build-id)=c8509baf049a1169826becce1871bc09d20b0ea0"
zypper install -C "debuginfo(build-id)=1c5228e116ef78400df563c8e03c5cd48c9e1353"
zypper install -C "debuginfo(build-id)=3998884b9f8b8660cefd6fb80ccc823ab5e98dcc"
zypper install -C "debuginfo(build-id)=133bf18e2c6f95dde06067982fbcb76491d80367"
zypper install -C "debuginfo(build-id)=a18962cbf190a47b659d363b0daeffec19070afa"
zypper install -C "debuginfo(build-id)=b321b3170f5ff64f744f6d9f844cf9251943ecf7"
Comment 18 Valerio Mariani 2013-06-28 12:33:38 UTC
Created attachment 247975 [details]
Gdb backtrace for gnome-shell 3.8 on OpenSuse

Here is the traceback with the debugging symbols. Thanks a lot for bearing with me on this. I am not a noob, but definitely not an expert.

Please let me know if you need anything else

  Valerio
Comment 19 Milan Bouchet-Valat 2013-06-28 13:40:59 UTC
Looks good AFAICT, thanks!
Comment 20 drago01 2013-06-28 19:58:00 UTC
Created attachment 248034 [details] [review]
st: Explicilty allocate framebuffer for shadow material

When the opengl driver fails to allocate a framebuffer later on
cogl will abort. So try to catch the error upfront by doing
the allocation directly instead of letting cogl do it lazily.

---

Can you try this patch?
Your bug does indeed look like a driver bug (for some reason it fails to
allocate a framebuffer). This patch tries to avoid the crash in that
case. But I cannot guarantee that it won't just crash elsewhere later on
(it probably will).

Nevertheless can you try this one and see if it improves things?
Comment 21 Valerio Mariani 2013-06-28 23:35:00 UTC
Created attachment 248036 [details]
Backtrace of gnome shell after proposed patch

Thank you! The patch did not work. The shell still crashes, but the error changed. Instead of the "Log Out" screen, I now get the "Something went wrong, contact administrators" screen.

I attach the new backtrace

  Valerio
Comment 22 Artiom MOLCHANOV 2013-07-01 12:44:46 UTC
I have experienced this kind of crash once or twice even with the Nouveau driver. Not on user switch, but on shutdown with a second user logged in.
Comment 23 drago01 2013-07-01 13:44:52 UTC
Created attachment 248146 [details] [review]
st: Explicilty allocate framebuffer for shadow material

When the opengl driver fails to allocate a framebuffer later on
cogl will abort. So try to catch the error upfront by doing
the allocation directly instead of letting cogl do it lazily.


---

Sorry, there was a typo in the patch, please try this one instead.
Comment 24 Valerio Mariani 2013-07-01 14:25:48 UTC
Created attachment 248152 [details]
 Gdb backtrace for gnome-shell 3.8 on OpenSuse after patch

The patch did not work. It also had an interesting side effect. Now every time I log in and open a window, the shell (or whatever is handling the window decoration) seems to crash and restart by itself immediately. I see the window decoration of the opened window disappear and come back in a few seconds.

As usual I attach the backtrace.

  Valerio
Comment 25 Valerio Mariani 2013-07-03 10:51:37 UTC
Please let me know if you need more information or if I should test any new patch
Comment 26 drago01 2013-07-03 15:49:41 UTC
Created attachment 248325 [details] [review]
offscreen-effect: Allocate the framebuffer explicilty

Otherwise we might crash later when the lazy allocation fails.

--

Hi, please try this clutter patch in *addition* to the gnome-shell patch I attached earlier.
Comment 27 drago01 2013-07-03 16:59:05 UTC
Created attachment 248334 [details] [review]
offscreen: Allocate the framebuffer in cogl_offscreen_new_to_texture_full

The API says that it should return NULL on failure but it does not do that
due to the lazy allocation.

---

OK, use this patch instead of the last one (apply against cogl).
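
To illustrate the difference this patch is about, here is a toy model of lazy versus up-front allocation. It is not cogl code: the class and function names are invented, and the driver_ok flag simply stands in for whether the GL driver manages to create the framebuffer.

```python
class AllocationError(Exception):
    """Raised when the (simulated) driver cannot create the framebuffer."""


class Offscreen:
    def __init__(self, driver_ok):
        self._driver_ok = driver_ok  # stand-in for the real driver's behaviour
        self.allocated = False

    def allocate(self):
        if not self._driver_ok:
            raise AllocationError("framebuffer could not be allocated")
        self.allocated = True

    def draw(self):
        if not self.allocated:
            # Lazy path: a failure only surfaces here, far from creation.
            self.allocate()
        return "drawn"


def new_lazy(driver_ok):
    """Always returns an object; the caller cannot tell that it is unusable."""
    return Offscreen(driver_ok)


def new_eager(driver_ok):
    """Allocates immediately and returns None on failure, as the API docs promise."""
    offscreen = Offscreen(driver_ok)
    try:
        offscreen.allocate()
    except AllocationError:
        return None
    return offscreen
```

With the eager constructor a caller can check for None and skip the effect, whereas with the lazy one the failure shows up later inside draw(), where nobody is prepared to handle it.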
Comment 28 drago01 2013-07-03 17:01:15 UTC
Comment on attachment 248146 [details] [review]
st: Explicilty allocate framebuffer for shadow material

No longer needed with the cogl patch.
Comment 29 Valerio Mariani 2013-07-03 23:53:01 UTC
Sorry, with the latest cogl patch (248334), should I also use the gnome-shell patch or not?

I tried applying just the cogl patch, and the problem is not solved. The behavior is identical, with gnome-shell crashing.

Sorry, I forgot to generate the backtrace. I'll send it to you tomorrow.

  Valerio
Comment 30 drago01 2013-07-04 06:35:34 UTC
(In reply to comment #29)
> Sorry, with the latest cogl patch (248334), should I also use the gnome-shell
> patch or not.

You don't have to. But if you do it does not matter.

> I tried just by applying the cogl patch, and the problem is not solved. The
> behavior is identical, with the gnome-shell crashing
>
> Sorry, I forgot to generate the backtrace. I'll send it to you tomorrow

You should know by now that a report "it crashed" is useless without a backtrace.
Comment 31 Valerio Mariani 2013-07-04 23:04:58 UTC
Created attachment 248420 [details]
Gdb backtrace for gnome-shell 3.8 on OpenSuse after cogl patch

Sorry, I was traveling for work. Ok, so let me explain better. The symptoms changed. Right now, if I switch from user1 to user2 and then back to user1, instead of being faced with user1's session, I see a black screen with just a cursor.

Looking at the backtrace, it seems that cogl crashed. I attach it.

 Valerio
Comment 32 Valerio Mariani 2013-07-08 09:58:54 UTC
Just to clarify, the previous traceback refers to a patched cogl and an unpatched gnome-shell
Comment 33 drago01 2013-07-08 11:06:22 UTC
(In reply to comment #32)
> Just to clarify, the previous traceback refers to a patched cogl and an
> unpatched gnome-shell

This does not really matter. Did you report this to NVIDIA? Not being able to create FBOs after user switch *is* a driver bug. This does not seem to be limited to a specific instance but to every attempt to create an FBO.
Comment 34 Valerio Mariani 2013-07-08 11:10:02 UTC
Ok, I'll report this to NVIDIA. However, how is it possible that it was working in 3.6? Mind you, I am not being critical. I just would like to know what is technically different, so I can report it more correctly...
Comment 35 drago01 2013-07-08 12:21:43 UTC
(In reply to comment #34)
> Ok, I'll report this to NVIDIA. However, how is it possible that it was working
> in 3.6? 

I don't know; maybe the driver has some kind of upper limit on FBOs and the multiple instances of gnome-shell exceed that limit. But that is just a guess. I don't even know whether we use more FBOs than we did in 3.6 or not.

> I just would like to know what is technically different, so I can report it more correctly...

Just point them to that bug.
Comment 36 Valerio Mariani 2013-07-10 00:40:22 UTC
Ok, I reported the bug to NVIDIA (through the NVIDIA Bug Submissions portal). Let's see what happens.

Thank you for your help

  Valerio
Comment 37 Valerio Mariani 2013-07-29 13:44:15 UTC
Just a note. After more than 3 weeks, no acknowledgement or action from NVIDIA.

Even if this is a driver bug, I suggest that you try to find a workaround on your side (trying, for example, not to hit the maximum number of framebuffer objects that can be instantiated).

Mind you, I am not saying that you are not right. I am saying that NVIDIA will not fix this soon and that you cannot expect people to use only the nouveau driver...

Thank you

   Valerio
Comment 38 drago01 2013-07-29 15:02:21 UTC
(In reply to comment #37)
> Just a note. After more than 3 weeks, no acknowledgement or action from
> NVIDIA.

How / where did you report it?

I have added Aaron to CC maybe he knows what's going on or can point the right people to it.
Comment 39 Valerio Mariani 2013-08-02 00:15:21 UTC
Thank you drago01. It seems that you need to log in at the NVIDIA site to see the issue report. I copy and paste it below. No answer yet.

---

Reference Number 130709-000280
Status New
Created 07/09/2013 05:38 PM
Updated 07/09/2013 05:38 PM
Product GeForce graphics
File Attachment HTML document backtrace.log (59.12 KB)
Choose OS Linux/Other Unix
Product Name GT640-2GD3
Driver Version 319.17-12.1 (64bit)

When using gnome 3.8, the gnome shell crashes at user switching (specifically, at tty switching). The crash seems to be caused by the failure of the NVIDIA driver to initialize a framebuffer object. The Gnome developers already looked at the issue and believe it is a NVIDIA driver bug. The bug report and analysis can be followed here:

https://bugzilla.gnome.org/show_bug.cgi?id=703174

I attach a traceback of the gnome-shell process when the crash happens.

Please let me know if you need more information

Valerio
Comment 40 drago01 2013-08-02 06:20:55 UTC
"There is an NVIDIA Linux Driver web forum. You can access it by going to
http://devtalk.nvidia.com and following the "Linux" link in the "GPU Unix
Graphics" section. This is the preferable tool for seeking help; users can
post questions, answer other users' questions, and search the archives of
previous postings.

If all else fails, you can contact NVIDIA for support at:
linux-bugs@nvidia.com. But please, only send email to this address after you
have explored the Chapter 7 and Chapter 8 chapters of this document, and asked
for help on the devtalk.nvidia.com web forum. When emailing
linux-bugs@nvidia.com, please include the 'nvidia-bug-report.log.gz' file
generated by the 'nvidia-bug-report.sh' script (which is installed as part of
driver installation), along with a detailed description of your problem."
Comment 41 Valerio Mariani 2013-08-02 16:02:16 UTC
Ok, thanks for heading me in the right direction. I posted the report at the correct place. The attachments are still scanning, but the thread is here:

https://devtalk.nvidia.com/default/topic/570981/linux/using-gnome-shell-3-8-nvidia-driver-fails-to-create-framebuffer-object/

  Valerio
Comment 42 Valerio Mariani 2013-08-04 23:55:10 UTC
Well, it seems that the attached files show the [SCANNING... PLEASE WAIT] message forever, so it seems I cannot attach files. I tried with both Chrome and Firefox. 
Does anyone know why?

It seems, however, that someone else could upload his files...
Comment 43 drago01 2013-08-05 20:32:39 UTC
Can you try with the newly released driver? https://devtalk.nvidia.com/default/topic/571558 ?
Comment 44 Valerio Mariani 2013-08-06 10:12:33 UTC
I tested it right now, I get exactly the same behavior....
Comment 45 Valerio Mariani 2013-08-06 10:14:26 UTC
FYI, NVIDIA apparently opened an internal bug:

https://devtalk.nvidia.com/default/topic/570981/linux/using-gnome-shell-3-8-nvidia-driver-fails-to-create-framebuffer-object/

They say, however, that it is not clear to them why this is an NVIDIA driver bug. You might want to post on the thread with more information....

Thank you all for your help

  Valerio
Comment 46 drago01 2013-08-07 11:47:45 UTC
Created attachment 251058 [details] [review]
Hack

---

Can you try this (ugly) hack? (apply against cogl).
Comment 47 Valerio Mariani 2013-08-07 12:27:35 UTC
Yes, give me a few hours. Is this an answer to this:

"The OpenGL experts got back to me already. The graphics driver is not allowed to touch the hardware during a VT switch, so it's expected for FBOs to be incomplete for the duration. libcogl needs to handle this more gracefully than just calling abort()."

(From the Nvidia forum?)
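
One way to read the quoted advice is that an incomplete framebuffer should be treated as a temporary, recoverable condition: skip the affected drawing and try again on a later frame. The toy model below sketches that idea only. It is not how libcogl is structured; the class name is invented and the status callable stands in for whatever query the real driver exposes.

```python
class ShadowPainter:
    """Hypothetical painter that skips work while the framebuffer is unusable."""

    def __init__(self, framebuffer_is_complete):
        # framebuffer_is_complete: callable standing in for a driver status query.
        self._is_complete = framebuffer_is_complete
        self.skipped_frames = 0

    def paint(self):
        if not self._is_complete():
            # Recoverable: note the skipped frame and try again next time.
            self.skipped_frames += 1
            return False
        return True


# Simulate two frames during a VT switch followed by one after switching back.
status = iter([False, False, True])
painter = ShadowPainter(lambda: next(status))
results = [painter.paint() for _ in range(3)]
print(results, painter.skipped_frames)
```

The design choice here is simply that the failure is reported to the caller as a return value instead of terminating the process.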
Comment 48 Valerio Mariani 2013-08-07 13:38:04 UTC
Someone tested the patch before me and reported in the NVIDIA forum that it did not work. I asked him to post here as well...
Comment 49 drago01 2013-08-07 13:43:41 UTC
"did not work" is too vague ...
Comment 51 leigh123linux 2013-08-07 16:38:48 UTC
Created attachment 251089 [details]
more detailed backtrace

more detailed backtrace
Comment 52 Tapani Mattila 2013-08-08 21:17:25 UTC
I simply took the error log off to prevent the abort and with a few user switches it looks like it worked. I'm not sure if the behaviour is perfect now, but my sessions survive the switches now.

I attached a quick patch for cogl/driver/gl/cogl-framebuffer-gl.c
Comment 53 Tapani Mattila 2013-08-08 21:21:41 UTC
Created attachment 251209 [details] [review]
A hack to remove offending error log.

Should prevent cogl abort because of a fatal (error-level) log entry during user switching. Very quickly tested and not a perfectly formed patch.
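
The mechanism described here, an error-level log entry being fatal, can be modelled in a few lines. This is a simplified sketch and not GLib or cogl code: the function names, the message text and the Aborted exception (which stands in for the process being terminated) are all made up for illustration.

```python
FATAL_LEVELS = {"error"}  # levels that terminate the process in this model


class Aborted(Exception):
    """Stands in for the process being killed after a fatal log message."""


def log(level, message, sink):
    sink.append((level, message))
    if level in FATAL_LEVELS:
        raise Aborted(message)


def allocate_framebuffer(driver_ok, failure_level, sink):
    """Return True on success; on failure, log at failure_level and return False."""
    if not driver_ok:
        log(failure_level, "framebuffer allocation failed", sink)
        return False
    return True


messages = []
# Logging the failure as a warning lets the caller carry on.
survived = allocate_framebuffer(False, "warning", messages)
print(survived, messages)
```

Calling allocate_framebuffer(False, "error", messages) instead raises Aborted, which corresponds to the session dying during the user switch. Lowering or removing the log entry avoids the abort, but as noted it does not make the framebuffer usable.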
Comment 54 Valerio Mariani 2013-08-09 12:54:08 UTC
Created attachment 251233 [details]
Backtrace after cogl patch

I tried the cogl patch and it did not work for me. After a few user switches I see the black screen with the mouse pointer at login that I have been seeing recently. This is on OpenSUSE 12.3 with Gnome 3.8.3.

I generated a backtrace. However, I must say that when I attach gdb to the gnome-shell process and try to login, I immediately see the black screen with the pointer, and I cannot login. So the bug seems to show up at login and not on user switching, when gdb is attached.

Could this be another bug, given the slightly different symptoms? In any case, I attach the backtrace....
Comment 55 Valentin 2013-08-14 06:15:26 UTC
Hi all, I have exactly the same problem on Gentoo. If you need some information from me, I can provide it.
Comment 56 Robert Bragg 2013-08-20 12:14:22 UTC
Comment on attachment 248334 [details] [review]
offscreen: Allocate the framebuffer in cogl_offscreen_new_to_texture_full

For reference; we've landed Adel's patch to revert the behaviour of cogl_offscreen_new_to_texture().
Comment 57 Dan Hansen 2013-08-31 05:01:49 UTC
What is the status of this bug? Is more information needed or is a fix in place? I'm running into this same issue on my Fedora 19 desktop at home. If more info is needed I'm glad to help.
Comment 58 Igor Gnatenko 2013-09-12 10:57:47 UTC
Had this problem with GNOME 3.8.2. Will retest tonight.
Comment 59 Aniruddha 2013-09-17 05:34:49 UTC
Same problem here with gnome 3.8.4 on Fedora 19.
Comment 60 Tapani Mattila 2013-09-18 19:09:50 UTC
I don't get the crashes anymore with the patch I posted, but I only now realized that I have disabled screen saving and screen locking as well. So when I switch between users, the one that was in the background doesn't greet me with an unlock dialog. This could change everything in terms of why my patch works for me.
Comment 61 Valerio Mariani 2013-09-25 23:24:33 UTC
For me, the shell does not crash anymore when switching users, but the user I switch to does not show me a lock screen or anything, just a black screen with a cursor. So yes, the fact that your setup skips the lock screen might be important.

Valerio
Comment 65 Pacho Ramos 2013-11-05 21:37:57 UTC
(In reply to comment #56)
> (From update of attachment 248334 [details] [review])
> For reference; we've landed Adel's patch to revert the behaviour of
> cogl_offscreen_new_to_texture().

This appears to solve the issue on the 4 machines with NVIDIA drivers I have tried.
Comment 66 scattol 2014-01-24 01:35:37 UTC
Has this been released in Fedora 19? I still see the problem.
Comment 67 bbigby 2014-03-17 00:36:26 UTC
I get the "black screen with mouse cursor" problem after switching users and logging into an already active session.  However, I can make my desktop reappear by pressing the Windows key, which forces Gnome Shell to show a mosaic of the actively running programs/windows, and by pressing the Windows key again, which causes the desktop to return to normal.
Comment 68 bbigby 2014-03-17 00:39:28 UTC
I forgot to mention that I'm running Fedora 20 with the latest updates as of March 16, 2014 sans the latest kernel update.  I'm running gnome-shell 3.10.4, kernel 3.12.10-300.fc20.x86_64, kmod-nvidia 331.49.
Comment 69 André Klapper 2021-06-10 11:20:12 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version of cogl, then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a ticket at
  https://gitlab.gnome.org/GNOME/cogl/-/issues/

Thank you for your understanding and your help.