GNOME Bugzilla – Bug 784470
X11 sessions fail to start when KMS is enabled in the nvidia driver
Last modified: 2017-10-05 13:58:24 UTC
Due to a current limitation in the nvidia binary driver, when enabling KMS (to use Wayland), any X11 sessions will crash, and the user will be brought back to the login screen. Ideally, GDM would check if modesetting is enabled in the nvidia driver, and blacklist the X11 sessions, so that only the Wayland session is available. This would be a nice improvement in user experience.
I am available to work on this, if the maintainers are interested, but I am going to need a little guidance: in order to check if KMS is enabled, we are going to need root privileges (or we won't have read access to /sys/module/nvidia_drm/parameters/modeset). Any ideas?
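A minimal sketch of the check being discussed, in plain C. It is not code from GDM or from any patch on this bug: the helper names `parse_modeset_value` and `read_modeset_param` are hypothetical, and it assumes the kernel prints a boolean module parameter as "Y" or "N" followed by a newline.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: interpret the text of a boolean module parameter.
 * Assumption: the kernel prints "Y" or "N"; "1"/"0" are accepted too.
 * Returns 1 for on, 0 for off, -1 for anything unrecognised. */
int
parse_modeset_value (const char *text)
{
        assert (text != NULL);

        switch (text[0]) {
        case 'Y': case 'y': case '1':
                return 1;
        case 'N': case 'n': case '0':
                return 0;
        default:
                return -1;
        }
}

/* Hypothetical helper: returns 1 if KMS is on, 0 if it is off, and -1 if
 * the file cannot be opened or parsed (for example because the module is
 * not loaded, or the caller has no read permission). */
int
read_modeset_param (const char *path)
{
        char  buf[8] = "";
        FILE *file;

        assert (path != NULL);

        file = fopen (path, "r");
        if (file == NULL)
                return -1;

        if (fgets (buf, sizeof buf, file) == NULL)
                buf[0] = '\0';
        fclose (file);

        return parse_modeset_value (buf);
}
```

A caller would pass /sys/module/nvidia_drm/parameters/modeset as the path. Treating -1 the same as "KMS off" keeps the X11 sessions visible whenever the state cannot be determined.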
What exactly is crashing?
Created attachment 354936 [details] Failure from journalctl
Created attachment 354937 [details] nvidia-bug-report tarball
> Jul 05 17:52:41 gnome-box /usr/lib/gdm3/gdm-x-session[1583]: (II) NVIDIA(0): Validated MetaModes:
> Jul 05 17:52:41 gnome-box /usr/lib/gdm3/gdm-x-session[1583]: (II) NVIDIA(0): "NULL"
> Jul 05 17:52:41 gnome-box /usr/lib/gdm3/gdm-x-session[1583]: (II) NVIDIA(0): Virtual screen size determined to be 640 x 480
Seems like the X driver is starting up without a valid output? Then gnome-shell crashes. Can you get the stack trace for that crash?
I can, but it's a well-known issue in the driver. It's not really gnome-specific. I was hoping we could work around it, since we cannot fix it.
Hi Alberto, another reason to enable modeset with nvidia is to use the new Prime sync mode with X on an Optimus laptop. I'm not technical, but "Ideally, GDM would check if modesetting is enabled in the nvidia driver, and blacklist the X11 sessions, so that only the Wayland session is available. This would be a nice improvement in user experience." would actually not be what I want.
(In reply to tim from comment #7)
> Hi Alberto, another reason to enable modeset with nvidia is to use the new
> Prime sync mode with X if using an Optimus laptop. I'm not technical but
> "Ideally, GDM would check if modesetting is enabled in the nvidia driver,
> and blacklist the X11 sessions, so that only the Wayland session is
> available. This would be a nice improvement in user experience."
> would actually not be what I want.
Tim, the change would not affect the systems that use PRIME. Those systems use the Intel GPU to drive the screen, so we will simply check that, and we won't break hybrid graphics. This is for systems with only NVIDIA GPUs.
Thanks Alberto. So perhaps there is no upstream bug yet for the current Ubuntu gnome desktop problem affecting hybrid users who want to use Prime sync (nvidia modeset=1 under X) with gdm?
Hi Alberto, this is the same bug I have. But your suggested fix won't help Nvidia hybrid laptop owners, unless wayland will do the new Prime Synchronisation (which is not the same as PRIME). This crash happens even when laptops are running in discrete graphics mode (no intel). So it is exactly the same bug.
To reproduce, put this line in /etc/modprobe.d/zz-nvidia.conf:
options nvidia_384_drm modeset=1
then run
sudo update-initramfs -u
and make sure you are using gdm3. You can't start a gnome session. There's a libmutter crash in syslog.
Your diagnosis is that this is a failure to start wayland, and therefore wayland must be started (I think this is what you are saying). Such a fix would be no good to hybrid laptop owners, many of whom do not have a BIOS option for discrete mode (their laptops don't have a hardware mux for the internal display). Such users want nvidia modeset without wayland (or they would want it if they knew that it finally fixes the ugly screen tearing of PRIME on the intel display). So the change would affect such users: they could either fall back to modeset=0 and have the same old tearing which has plagued this hardware for years, or they would have to use wayland, which I assume will also not have the tearing fix (although I have no idea about that). Or we don't use gdm, which is the best option in 17.04, but lightdm is not in the default ubuntu-desktop for 17.10; gdm3 is.
The most perplexing thing: what is gdm3 doing that lightdm does not do?
By the way, just so that it's clear: exactly the same crash happens when the laptop is started in hybrid mode. At least, it seems to be the same crash. This observation is the basis for me linking this to nvidia modeset=1, and therefore concluding that PRIME is not related to the bug; it is just a victim of the bug.
@Tim: I haven't tested that use case yet. Please file a separate bug report about it, so that we can keep track of it and think of a solution.
Created attachment 358254 [details] [review] Hide X11 sessions when NVIDIA KMS is on
Note: this requires /sys/module/nvidia_drm/parameters/modeset to be user readable, as gnome-shell (which, from what I understand, uses libgdm) doesn't run as root. Adding a udev rule with the following line works here:
ACTION=="add", DEVPATH=="/module/nvidia_drm", SUBSYSTEM=="module", RUN+="/bin/chmod 0444 /sys/module/nvidia_drm/parameters/modeset"
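To tell whether the permission change above has taken effect for an unprivileged process, the caller can distinguish "file missing" from "permission denied". The following is only a sketch under that assumption; `ParamAccess` and `check_param_access` are hypothetical names, not part of GDM or of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical result type for the sketch below. */
typedef enum {
        PARAM_READABLE,
        PARAM_MISSING,        /* e.g. the module is not loaded */
        PARAM_NO_PERMISSION,  /* e.g. the udev rule has not been applied */
        PARAM_OTHER_ERROR
} ParamAccess;

/* Check read access to a sysfs parameter file. Note that access(2)
 * tests against the real user ID of the process, not the effective one. */
ParamAccess
check_param_access (const char *path)
{
        assert (path != NULL);

        if (access (path, R_OK) == 0)
                return PARAM_READABLE;

        switch (errno) {
        case ENOENT:
        case ENOTDIR:
                return PARAM_MISSING;
        case EACCES:
                return PARAM_NO_PERMISSION;
        default:
                return PARAM_OTHER_ERROR;
        }
}
```

Logging the PARAM_NO_PERMISSION case separately would make a missing udev rule easier to spot than a silent fallback.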
For me, nvidia support is going backwards; I can't even get to the greeter, even with a fresh install (Ubuntu 17.10 with development packages) and my BIOS set to Discrete graphics. I'll try again in a few days because there is not much point reporting a bug at the moment. This is happening with lightdm too, so it's not a gdm problem.
I just updated everything on a stock Ubuntu 17.10 with pre-release updates, and I get to a gnome session with gdm on a W520 in hybrid mode (nvidia profile, i.e. Optimus). It even works with prime-sync, which is the first time I've seen this working with gdm3. I will later test this on my more modern Thinkpad.
Review of attachment 358254 [details] [review]:
(Alberto asked me to review this for Ubuntu inclusion; I think it'll be helpful to do it upstream. Obviously if a maintainer reviews their comments should supersede mine where they conflict.)
I tried the patch, and although nvidia/wayland is quite buggy for me (for example it locks up when the screen blanks), it seems to do the job.
I think maybe you should only do the work of this patch if wayland support is compiled in. Otherwise AFAICS you'll end up with no selectable sessions?
::: common/gdm-common.c
@@ +48,3 @@
G_DEFINE_QUARK (gdm-common-error, gdm_common_error);
+int nvidia_has_kms = -1;
I'd rather this used static storage if you are going to do it this way - is there a reason you can't?
@@ +824,3 @@
+ fclose(file);
+
+ return status;
IMO this function should use GFile APIs, mainly so you can have nicer error handling - I think you should warn on error. g_file_new_for_path g_file_read g_input_stream_read (g_strsplit; g_strcmp0) or g_io_channel_new_file, g_io_channel_read_line_string.
@@ -800,0 +802,57 @@
+
+static gboolean
+is_module_loaded (const char *module) {
... 54 more ...
This is probably better as a g_warning
@@ +858,3 @@
+ g_debug ("is_nvidia_kms_available: Failed to parse '%s': %s",
+ kms_file,
+is_nvidia_kms_available (void)
I don't think you need to check for error existing here; g_file_get_contents' documentation says you are guaranteed that error is set if FALSE is returned.
@@ +865,3 @@
+ g_debug ("is_nvidia_kms_available: contents, \n%s\n", contents);
+
+ gboolean status = FALSE;
Use g_strcmp0
::: daemon/gdm-display.c
@@ +1453,3 @@
+#ifdef ENABLE_WAYLAND_SUPPORT
+ /* Disable X11 sessions if KMS is enabled in the nvidia driver
+ * See LP: #1697882.
Probably better to provide a full link to a Launchpad bug.
::: daemon/gdm-local-display-factory.c
@@ +461,3 @@
is_initial = FALSE;
}
Maybe #ifdef ENABLE_WAYLAND_SUPPORT here too?
@@ +462,3 @@
}
+ if (is_nvidia_kms_available ())
This should be commented too IMO (and the other similar cases).
::: daemon/gdm-session.c
@@ +360,3 @@
+ if (!is_nvidia_kms_available ()) {
+ g_array_append_vals (search_array, x_search_dirs, G_N_ELEMENTS (x_search_dirs));
+ }
Put the if inside an #ifdef ENABLE_WAYLAND_SUPPORT too?
::: libgdm/gdm-sessions.c
@@ +211,3 @@
}
+ if (!is_nvidia_kms_available ()) {
ENABLE_WAYLAND_SUPPORT also?
...
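The body of is_module_loaded is elided in the review above, so the following is not the patch's implementation. It is a sketch of one way to do such a check, assuming the caller has already read a /proc/modules-style listing into memory, where each line begins with the module name followed by a space. The name `module_in_listing` is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if `module` is the first field of any line in `listing`,
 * 0 otherwise. A bare prefix match is not enough: "nvidia" must not
 * match a line that starts with "nvidia_drm". */
int
module_in_listing (const char *listing, const char *module)
{
        size_t      name_len;
        const char *line;

        assert (listing != NULL);
        assert (module != NULL);

        name_len = strlen (module);
        if (name_len == 0)
                return 0;

        for (line = listing; line != NULL && *line != '\0'; ) {
                const char *next = strchr (line, '\n');

                /* The name must be followed by a field separator or line end. */
                if (strncmp (line, module, name_len) == 0 &&
                    (line[name_len] == ' ' ||
                     line[name_len] == '\n' ||
                     line[name_len] == '\0'))
                        return 1;

                line = (next != NULL) ? next + 1 : NULL;
        }

        return 0;
}
```

Separating the parsing from the file read keeps the matching logic testable without a loaded kernel module.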
Created attachment 359351 [details] [review] Hide X11 sessions when NVIDIA KMS is on @Iain: I think I have addressed most of your concerns in the latest patch. The reason why I didn't make "int nvidia_has_kms" static is that I have to share it with the other files (hence the "extern int nvidia_has_kms").
(In reply to Alberto Milone from comment #6)
> I can, but it's a well-known issue in the driver. It's not really
> gnome-specific. I was hoping we could work around it, since we cannot fix it.
Do you have a link to more info about this?
I'm glad to report that the new nvidia 384 series fixed this issue. We can close this bug report.