Bug 784470 - X11 sessions fail to start when KMS is enabled in the nvidia driver
Status: RESOLVED NOTGNOME
Product: gdm
Classification: Core
Component: general
Version: 3.25.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: GDM maintainers
QA Contact: GDM maintainers
Depends on:
Blocks:
Reported: 2017-07-03 09:36 UTC by Alberto Milone
Modified: 2017-10-05 13:58 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Failure from journalctl (176.20 KB, text/plain)
2017-07-05 16:00 UTC, Alberto Milone
nvidia-bug-report tarball (128.67 KB, application/gzip)
2017-07-05 16:00 UTC, Alberto Milone
Hide X11 sessions when NVIDIA KMS is on (8.49 KB, patch)
2017-08-23 16:52 UTC, Alberto Milone
Hide X11 sessions when NVIDIA KMS is on (9.97 KB, patch)
2017-09-07 13:24 UTC, Alberto Milone

Description Alberto Milone 2017-07-03 09:36:37 UTC
Due to a current limitation in the nvidia binary driver, when enabling KMS (to use Wayland), any X11 sessions will crash, and the user will be brought back to the login screen.

Ideally, GDM would check if modesetting is enabled in the nvidia driver, and blacklist the X11 sessions, so that only the Wayland session is available. This would be a nice improvement in user experience.
Comment 1 Alberto Milone 2017-07-03 09:39:58 UTC
I am available to work on this if the maintainers are interested, but I am going to need a little guidance: checking whether KMS is enabled requires root privileges, because otherwise we won't have read access to /sys/module/nvidia_drm/parameters/modeset.

Any ideas?
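
For reference, the check itself is small. The following is a minimal sketch in plain C, not code from GDM. It assumes the parameter file holds a single Y or N, which is how the kernel usually prints boolean module parameters. The name read_modeset_param is hypothetical, and the path is passed in so the function can be tried on any file.

```c
#include <stdio.h>

/* Hypothetical helper, not GDM code.
 * Returns 1 if the file starts with 'Y' or '1', 0 if it starts with
 * anything else, and -1 if the file cannot be opened or is empty
 * (for example, when the caller has no read permission). */
int
read_modeset_param (const char *path)
{
        FILE *file = fopen (path, "r");
        int   first;

        if (file == NULL)
                return -1;

        first = fgetc (file);
        fclose (file);

        if (first == EOF)
                return -1;

        return (first == 'Y' || first == '1') ? 1 : 0;
}
```

Calling read_modeset_param ("/sys/module/nvidia_drm/parameters/modeset") as an unprivileged user and getting -1 would correspond to the permission problem described above.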
Comment 2 Rui Matos 2017-07-05 15:42:57 UTC
What exactly is crashing?
Comment 3 Alberto Milone 2017-07-05 16:00:14 UTC
Created attachment 354936 [details]
Failure from journalctl
Comment 4 Alberto Milone 2017-07-05 16:00:59 UTC
Created attachment 354937 [details]
nvidia-bug-report tarball
Comment 5 Rui Matos 2017-07-05 16:29:13 UTC
> Jul 05 17:52:41 gnome-box /usr/lib/gdm3/gdm-x-session[1583]: (II) NVIDIA(0): Validated MetaModes:
> Jul 05 17:52:41 gnome-box /usr/lib/gdm3/gdm-x-session[1583]: (II) NVIDIA(0):     "NULL"
> Jul 05 17:52:41 gnome-box /usr/lib/gdm3/gdm-x-session[1583]: (II) NVIDIA(0): Virtual screen size determined to be 640 x 480

Seems like the X driver is starting up without a valid output?

Then gnome-shell crashes. Can you get the stack trace for that crash?
Comment 6 Alberto Milone 2017-07-05 16:39:23 UTC
I can, but it's a well-known issue in the driver. It's not really gnome-specific. I was hoping we could work around it, since we cannot fix it.
Comment 7 tim 2017-07-27 10:36:25 UTC
Hi Alberto, another reason to enable modeset with nvidia is to use the new PRIME sync mode with X on an Optimus laptop. I'm not technical, but "Ideally, GDM would check if modesetting is enabled in the nvidia driver, and blacklist the X11 sessions, so that only the Wayland session is available. This would be a nice improvement in user experience." would actually not be what I want.
Comment 8 Alberto Milone 2017-07-27 12:50:58 UTC
(In reply to tim from comment #7)
> HI Alberto, another reason to enable modeset with nvidia is to use the new
> Prime sync mode with X if using an Optimus laptop. I'm not technical but
> "Ideally, GDM would check if modesetting is enabled in the nvidia driver,
> and blacklist the X11 sessions, so that only the Wayland session is
> available. This would be a nice improvement in user experience." 
> would actually not be what I want.

Tim, the change would not affect the systems that use PRIME. Those systems use the Intel GPU to drive the screen, so we will simply check that, and we won't break hybrid graphics. This is for systems with only NVIDIA GPUs.
Comment 9 tim 2017-07-28 03:22:36 UTC
Thanks Alberto. So perhaps there is no upstream bug yet for the current Ubuntu GNOME desktop problem affecting hybrid users who want to use PRIME sync (nvidia modeset=1 under X) with gdm?
Comment 10 tim 2017-08-10 23:57:36 UTC
Hi Alberto, this is the same bug I have. But your suggested fix won't help Nvidia hybrid laptop owners, unless Wayland can do the new PRIME Synchronisation (which is not the same as PRIME).

This crash happens even when laptops are running in discrete graphics mode (no intel). So it is exactly the same bug. 

Just set
/etc/modprobe.d/zz-nvidia.conf
to
options nvidia_384_drm modeset=1

then 
sudo update-initramfs -u

make sure you are using gdm3

You can't start a GNOME session. There's a libmutter crash in syslog.

Your diagnosis is that this is a failure to start Wayland, and that Wayland must therefore be started (I think this is what you are saying).

Such a fix would be no good to hybrid laptop owners, many of whom do not have a BIOS option for discrete mode (their laptops don't have a hardware mux for the internal display). Such users want nvidia modeset without Wayland (or would want it if they knew that it finally fixes the ugly screen tearing of PRIME on the Intel display).

So the change would affect such users: they would either have to fall back to modeset=0 and live with the same old tearing that has plagued this hardware for years, or use Wayland, which I assume will also lack the tearing fix (although I have no idea about that).

Or we don't use gdm, which is the best option in 17.04. But lightdm is not in the default ubuntu-desktop for 17.10; gdm3 is.

The most perplexing thing: what is gdm3 doing that lightdm does not do?
Comment 11 tim 2017-08-11 00:00:45 UTC
By the way, just so that it's clear: exactly the same crash happens when the laptop is started in hybrid mode. At least, it seems to be the same crash. This observation is my basis for linking this to nvidia modeset=1 and for concluding that PRIME is not related to the bug; it is just a victim of it.
Comment 12 Alberto Milone 2017-08-23 16:42:54 UTC
@Tim: I haven't tested that use case yet. Please file a separate bug report about it, so that we can keep track of it and think of a solution.
Comment 13 Alberto Milone 2017-08-23 16:52:42 UTC
Created attachment 358254 [details] [review]
Hide X11 sessions when NVIDIA KMS is on

Note: this requires /sys/module/nvidia_drm/parameters/modeset to be user readable, as gnome-shell (which, from what I understand, uses libgdm) doesn't run as root.

Adding a udev rule with the following line works here:

ACTION=="add", DEVPATH=="/module/nvidia_drm", SUBSYSTEM=="module", RUN+="/bin/chmod 0444 /sys/module/nvidia_drm/parameters/modeset"
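
Whether the rule took effect can be checked from an unprivileged process. This is a minimal sketch, assuming POSIX access(); the name can_read_param is hypothetical and is not part of the patch.

```c
#include <unistd.h>

/* Hypothetical helper, not GDM code.
 * Returns 1 if the calling process may read path, 0 otherwise
 * (missing file, or no read permission for the real user ID). */
int
can_read_param (const char *path)
{
        return access (path, R_OK) == 0 ? 1 : 0;
}
```

With the rule applied, can_read_param ("/sys/module/nvidia_drm/parameters/modeset") would be expected to return 1 for a non-root user once the nvidia_drm module is loaded.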
Comment 14 tim 2017-08-26 04:12:12 UTC
For me, nvidia support is going backwards; I can't even get to the greeter, even with a fresh install (Ubuntu 17.10 with development packages) and my BIOS set to discrete graphics. I'll try again in a few days because there is not much point reporting a bug at the moment.
This is happening with lightdm too so it's not a gdm problem.
Comment 15 tim 2017-09-02 03:02:11 UTC
I just updated everything on a stock Ubuntu 17.10 with pre-release updates, and I now get to a GNOME session with gdm on a W520 in hybrid mode (nvidia profile, i.e. Optimus). It even works with prime-sync, which is the first time I've seen this working with gdm3. I will test this on my more modern ThinkPad later.
Comment 16 Iain Lane 2017-09-04 16:07:23 UTC
Review of attachment 358254 [details] [review]:

(Alberto asked me to review this for Ubuntu inclusion; I think it'll be helpful to do it upstream. Obviously, if a maintainer reviews this, their comments should supersede mine where they conflict.)

I tried the patch, and although nvidia/wayland is quite buggy for me (for example it locks up when the screen blanks), it seems to do the job.

I think maybe you should only do the work of this patch if wayland support is compiled in. Otherwise AFAICS you'll end up with no selectable sessions?

::: common/gdm-common.c
@@ +48,3 @@
 G_DEFINE_QUARK (gdm-common-error, gdm_common_error);
 
+int nvidia_has_kms = -1;

I'd rather this used static storage if you are going to do it this way - is there a reason you can't?

@@ +824,3 @@
+        fclose(file);
+
+                return status;

IMO this function should use GFile APIs, mainly so you can have nicer error handling - I think you should warn on error.

g_file_new_for_path
g_file_read
g_input_stream_read
(g_strsplit; g_strcmp0)

or g_io_channel_new_file, g_io_channel_read_line_string.

@@ -800,0 +802,57 @@
+
+static gboolean
+is_module_loaded (const char *module) {
... 54 more ...

This is probably better as a g_warning

@@ +858,3 @@
+                g_debug ("is_nvidia_kms_available: Failed to parse '%s': %s",
+                         kms_file,
+is_nvidia_kms_available (void)

I don't think you need to check for error existing here; g_file_get_contents' documentation says you are guaranteed that error is set if FALSE is returned.

@@ +865,3 @@
+        g_debug ("is_nvidia_kms_available: contents, \n%s\n", contents);
+
+        gboolean          status = FALSE;

Use g_strcmp0

::: daemon/gdm-display.c
@@ +1453,3 @@
+#ifdef ENABLE_WAYLAND_SUPPORT
+        /* Disable X11 sessions if KMS is enabled in the nvidia driver
+         * See LP: #1697882.

Probably better to provide a full link to a Launchpad bug.

::: daemon/gdm-local-display-factory.c
@@ +461,3 @@
                         is_initial = FALSE;
                 }
 

Maybe #ifdef ENABLE_WAYLAND_SUPPORT here too?

@@ +462,3 @@
                 }
 
+                if (is_nvidia_kms_available ())

This should be commented too IMO (and the other similar cases).

::: daemon/gdm-session.c
@@ +360,3 @@
+        if (!is_nvidia_kms_available ()) {
+                g_array_append_vals (search_array, x_search_dirs, G_N_ELEMENTS (x_search_dirs));
+        }

Put the if inside an #ifdef ENABLE_WAYLAND_SUPPORT too?

::: libgdm/gdm-sessions.c
@@ +211,3 @@
         }
 
+        if (!is_nvidia_kms_available ()) {

ENABLE_WAYLAND_SUPPORT also? ...
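
The recurring ENABLE_WAYLAND_SUPPORT point can be illustrated in isolation. The following is a minimal sketch, not the code from the patch: should_hide_x11_sessions is a hypothetical helper, and only the macro name ENABLE_WAYLAND_SUPPORT comes from the hunks above.

```c
/* Hypothetical helper, not the code from the patch.
 * kms_enabled: nonzero if the nvidia driver has modesetting turned on.
 * Returns 1 if X11 sessions should be hidden, 0 otherwise. */
int
should_hide_x11_sessions (int kms_enabled)
{
#ifdef ENABLE_WAYLAND_SUPPORT
        /* A Wayland session exists as a fallback, so X11 can be hidden. */
        return kms_enabled != 0;
#else
        /* Without Wayland support, hiding X11 would leave no selectable
         * session at all, so never hide it. */
        (void) kms_enabled;
        return 0;
#endif
}
```

Keeping the decision in one guarded helper would mean the call sites in the daemon and in libgdm need no #ifdef of their own.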
Comment 17 Alberto Milone 2017-09-07 13:24:28 UTC
Created attachment 359351 [details] [review]
Hide X11 sessions when NVIDIA KMS is on

@Iain: I think I have addressed most of your concerns in the latest patch. The reason why I didn't make "int nvidia_has_kms" static is that I have to share it with the other files (hence the "extern int nvidia_has_kms").
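
One way to reconcile static storage with sharing across files is to keep the variable private to one translation unit and export an accessor function instead of the variable. This is a minimal sketch under that assumption, not the code from either attachment; all names are hypothetical, and example_probe stands in for the real sysfs check.

```c
/* Cached result: -1 = not probed yet, 0 = KMS off, 1 = KMS on.
 * static keeps the variable private to this file. */
static int nvidia_kms_cache = -1;

/* Counts probe calls so that the caching can be observed. */
static int probe_calls = 0;

/* Stand-in for the real check (hypothetical; always reports KMS on). */
static int
example_probe (void)
{
        probe_calls++;
        return 1;
}

/* Other files call this function instead of reading an extern global.
 * The probe runs only on the first call; later calls reuse the cache. */
int
nvidia_kms_enabled (void)
{
        if (nvidia_kms_cache < 0)
                nvidia_kms_cache = example_probe () ? 1 : 0;

        return nvidia_kms_cache;
}
```

Only the prototype of nvidia_kms_enabled would need to go into the shared header, so no extern declaration of the cache variable is required.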
Comment 18 cc rstrode@redhat.com not me 2017-09-07 18:03:43 UTC
(In reply to Alberto Milone from comment #6)
> I can, but it's a well-known issue in the driver. It's not really
> gnome-specific. I was hoping we could work around it, since we cannot fix it.
Do you have a link to more info about this?
Comment 19 Alberto Milone 2017-10-05 13:58:24 UTC
I'm glad to report that the new nvidia 384 series fixed this issue. We can close this bug report.