GNOME Bugzilla – Bug 754814
gnome-shell 3.17.91 stop loading after enter password
Last modified: 2017-02-28 13:31:33 UTC
gnome-shell 3.17.91 stops loading after I enter my password, until I press Escape or switch to another VT. With gnome-shell 3.17.90 everything worked normally.
This happens with the NVIDIA proprietary driver. On VirtualBox everything is fine. gnome-shell 3.18 behaves the same way.
Please look at your journal (journalctl) around the time this happens and attach any relevant messages here.
Maybe this:
pc1 systemd[1]: Started User Manager for UID 1000.
Sep 30 06:31:43 pc1 /usr/lib/gdm/gdm-x-session[385]: (II) systemd-logind: got pause for 13:66
Sep 30 06:31:43 pc1 /usr/lib/gdm/gdm-x-session[385]: (II) systemd-logind: got pause for 13:64
Sep 30 06:31:43 pc1 /usr/lib/gdm/gdm-x-session[385]: (II) systemd-logind: got pause for 13:65
Sep 30 06:31:43 pc1 /usr/lib/gdm/gdm-x-session[385]: (II) systemd-logind: got pause for 226:0
Sep 30 06:31:43 pc1 /usr/lib/gdm/gdm-x-session[385]: (II) systemd-logind: got pause for 13:67
Sep 30 06:31:43 pc1 /usr/lib/gdm/gdm-x-session[385]: (II) systemd-logind: got pause for 13:68
Sep 30 06:31:43 pc1 /usr/lib/gdm/gdm-x-session[674]: _XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
Sep 30 06:31:43 pc1 /usr/lib/gdm/gdm-x-session[674]: _XSERVTransMakeAllCOTSServerListeners: server already running
Sep 30 06:31:43 pc1 /usr/lib/gdm/gdm-x-session[674]: X.Org X Server 1.17.2
After I enter the password, the screen turns gray and the icon animation doesn't work.
I had a similar issue as well. I entered the login password, and GNOME paused logging in until I pressed Escape. I posted to the mailing list but got little help: https://mail.gnome.org/archives/gnome-list/2015-September/msg00003.html
This issue happened exactly every other try: it failed once, worked the next time, failed once, worked the next time, and so on (then started failing again after a reboot). Somehow something is not getting set/unset correctly? Xorg did not start until I pressed Escape at the blank screen; I could tell that from the timing in the log.
I'm using something other than GNOME right now until this gets fixed, because I didn't know how long the Escape workaround would keep working. I was scared it could eventually stop logging me in altogether. Anyway, I am glad I'm not the only person seeing this...
For what it's worth, I have the 355.11 nvidia driver and the same Xorg server version as Yurg. I just need something that reliably logs me in so I can get some work done :)
Yurg, did you by any chance find out why this is happening, or perhaps a workaround? What were your configure options when compiling gnome-shell? Perhaps we are compiling with the same options and a particular one is breaking things?
I think this began after this patch in 3.17.91: "Fix login screen spinner causing wakeups while VT-switched away [Ray, Rui; #753891]". In 3.17.90 everything was fine. This happens on Arch Linux and Gentoo. I just press the Escape key :)
Ok, I will post in bug 753891. Thank you.
so hitting escape does nothing if verification has already entered the succeeded state. This means GDM hasn't yet said verification is successful, I guess. Since you've already entered your password, one possible explanation is gnome-shell is holding on to the answer and not delivering it to GDM yet. gnome-shell has code to do that if there are pending messages to display to the user. The code is here:
    answerQuery: function(serviceName, answer) {
        if (!this.hasPendingMessages) {
            this._userVerifier.call_answer_query(serviceName, answer, this._cancellable, null);
        } else {
            let signalId = this.connect('no-more-messages',
                                        Lang.bind(this, function() {
                                            this.disconnect(signalId);
                                            this._userVerifier.call_answer_query(serviceName, answer, this._cancellable, null);
                                        }));
        }
    },
The other thing hitting escape does is call the "reset" function which will ultimately clear the message queue. It could be the act of clearing the message queue makes the answer go through. If this theory is right, there are two questions in my head:
1) why isn't the message queue getting cleared on its own?
2) reset cancels the user verifier before clearing the message queue. why does call_answer_query () work at all?
These two questions are reason enough to give me doubts about the theory, but I figured I'd post it for now until I have something better.
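To make the theory in this comment concrete, here is a minimal, self-contained sketch of the same deferral pattern in plain JavaScript. It is not gnome-shell code: the `Verifier` class, the `sent` array, `clearMessageQueue()` and the service name are hypothetical stand-ins for the user verifier, the `call_answer_query` call and the `no-more-messages` signal. It only shows that an answer queued behind pending messages stays undelivered until something drains the queue.

```javascript
// Hypothetical stand-in for the shell's user verifier (not the real class).
class Verifier {
    constructor() {
        this.hasPendingMessages = false;
        this.sent = [];      // answers actually delivered (stand-in for the D-Bus call)
        this._waiters = [];  // callbacks waiting for 'no-more-messages'
    }

    answerQuery(serviceName, answer) {
        const deliver = () => this.sent.push({ serviceName, answer });
        if (!this.hasPendingMessages)
            deliver();                   // nothing queued: deliver immediately
        else
            this._waiters.push(deliver); // hold the answer until the queue drains
    }

    // Stand-in for whatever empties the queue and emits 'no-more-messages'.
    clearMessageQueue() {
        this.hasPendingMessages = false;
        const waiters = this._waiters;
        this._waiters = [];
        waiters.forEach((deliver) => deliver());
    }
}

const verifier = new Verifier();
verifier.hasPendingMessages = true;              // a message is still queued
verifier.answerQuery('gdm-password', 'secret');  // illustrative values only
const deliveredBeforeClear = verifier.sent.length;
verifier.clearMessageQueue();                    // e.g. what "reset" might trigger
const deliveredAfterClear = verifier.sent.length;
console.log(deliveredBeforeClear, deliveredAfterClear);
```

In this model the answer is held back for as long as `hasPendingMessages` stays true, which is the situation the theory assumes.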
so one possible answer for 1) is "an exception is getting thrown in the show message timeout, preventing it from getting requeued" . I think we sometimes hide exceptions when they happen in callbacks, so if you did get one that might explain why it's not in the log.
Ray, this happens exactly every other login as long as gdm is not restarted. First login to user shows bug. Second login shows no bug. Third login shows bug, etc... Does that give any clue?
and what do you see? do you see a spinner that's frozen? a gear menu?
The regular grey greeter background but nothing else. I remember I could actually still blindly right-click where the password text box was (although I couldn't see the text box) and get a greyed-out context menu or something like that.
(In reply to Yurg from comment #6)
> I think this is begin after patch in 3.17.91:Fix login screen spinner
> causing wakeups while VT-switched away
> [Ray, Rui; #753891]
> In 3.17.90 all was fine.
>
> This happening on Arch Linux and Gentoo.
> I just press escape key :)
What are the configure options you are using on Gentoo's gnome-shell?
(In reply to Hussam Al-Tayeb from comment #13)
> (In reply to Yurg from comment #6)
> > I think this is begin after patch in 3.17.91:Fix login screen spinner
> > causing wakeups while VT-switched away
> > [Ray, Rui; #753891]
> > In 3.17.90 all was fine.
> >
> > This happening on Arch Linux and Gentoo.
> > I just press escape key :)
>
> What are the configure options you are using on Gentoo's gnome-shell?
./configure --prefix=/usr --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc --localstatedir=/var/lib --disable-dependency-tracking --disable-silent-rules --libdir=/usr/lib64 --docdir=/usr/share/doc/gnome-shell-3.18.0 --enable-compile-warnings=minimum --disable-schemas-compile --disable-maintainer-mode --disable-gtk-doc --enable-browser-plugin --enable-man --enable-systemd --with-bluetooth --enable-networkmanager BROWSER_PLUGIN_DIR=/usr/lib64/nsbrowser/plugins
Thank you. That's more or less the same configuration as arch. At least this rules out any configuration issues.
interesting, if you're seeing nothing but a gray background then you're much further along in the authentication process than I thought. Can you put Enable=true in the [debug] section of /etc/gdm/custom.conf, reproduce and then post journal output ?
journal output: http://pastebin.com/raw.php?i=Fcr0GGgz
xorg log file says:
[ 119.062] (==) Log file: "/home/hussam/.local/share/xorg/Xorg.0.log", Time: Mon Oct 5 18:10:28 2015
(perhaps the timing when xorg started helps)
I had to press Escape to get it to log on.
So in the log we see session-opened is getting emitted:
    Oct 05 18:09:59 hades gdm[405]: GdmSession: Emitting 'session-opened' signal
and then after you got tired of waiting you hit escape:
    Oct 05 18:10:27 hades gdm[405]: GdmSession: external connection closed
and from there things kick into gear. That means something is going wrong in the handling of the session-opened signal. The code for that is here:
    _onSessionOpened: function(client, serviceName) {
        this._authPrompt.finish(Lang.bind(this, function() {
            this._startSession(serviceName);
        }));
    },
That code says "when session-opened is emitted call authPrompt.finish(). When it finishes, call startSession". My first guess would be that the auth prompt isn't finishing, but that doesn't seem to be the case here, since you do see the auth prompt fading out, and going to gray. The fade out happens in startSession, so we know we got past authPrompt finishing. The code for startSession is this:
    Tweener.addTween(this.actor,
                     { opacity: 0,
                       time: _FADE_ANIMATION_TIME,
                       transition: 'easeOutQuad',
                       onUpdate: function() {
                           let children = Main.layoutManager.uiGroup.get_children();

                           for (let i = 0; i < children.length; i++) {
                               if (children[i] != Main.layoutManager.screenShieldGroup)
                                   children[i].opacity = this.actor.opacity;
                           }
                       },
                       onUpdateScope: this,
                       onComplete: function() {
                           let id = Mainloop.idle_add(Lang.bind(this, function() {
                               this._greeter.call_start_session_when_ready_sync(serviceName, true, null);
                               return GLib.SOURCE_REMOVE;
                           }));
                           GLib.Source.set_name_by_id(id, '[gnome-shell] this._greeter.call_start_session_when_ready_sync');
                       },
                       onCompleteScope: this });
We know we're getting at least mostly through the onUpdate handlers since the fade out appears to be finished. When they're done we're supposed to call the onComplete handler which will call_start_session_when_ready on the greeter.
That's supposed to leave the message:
    GdmManager: Will start session when ready
which we do see in the log, but not until after the user hits escape. It could be we're getting inundated with high priority idle handlers so start_session_when_ready never gets dispatched. I tried to reproduce this problem but couldn't on my f23 machine. If someone with a second machine could:
1) debuginfo-install gnome-shell (or equivalent for non-fedora distros)
2) ssh into the problem machine
3) gdb attach $(pidof gnome-shell)
4) (gdb) continue
5) reproduce the problem
6) hit control-c on the machine you're ssh'd from
7) (gdb) thread apply all backtrace full
8) (gdb) call gjs_dumpstack()
9) (gdb) continue
10) hit control-c again
11) (gdb) thread apply all backtrace full
12) (gdb) call gjs_dumpstack()
13) (gdb) continue
and then attach the full journal, with debug on in gdm, that would be fantastic.
I only have one computer :/
okay i'll try to reproduce more vigorously
if anyone could verify that the problem does indeed go away with gnome-shell 3.17.90 that'd be great
(In reply to Ray Strode [halfline] from comment #21)
> if anyone could verify that the problem does indeed go away with gnome-shell
> 3.17.90 that'd be great
I can try that. I'll post back with results.
(In reply to Ray Strode [halfline] from comment #21)
> if anyone could verify that the problem does indeed go away with gnome-shell
> 3.17.90 that'd be great
I can log in successfully every time with 3.17.90 so it appears that the bug showed up in 3.17.91 :/
(In reply to Ray Strode [halfline] from comment #20)
> okay i'll try to reproduce more vigorously
I think this happens with nvidia drivers.
(In reply to Yurg from comment #24)
> (In reply to Ray Strode [halfline] from comment #20)
> > okay i'll try to reproduce more vigorously
>
> I think this happens with nvidia drivers.
There were only three non-translation checkins between 3.17.90 and 3.17.91
yea but it's not clear at all why stopping a spinner when verification completes could somehow delay an idle handler (that's run later on) from getting called. I basically need to reproduce this and instrument the code to get any traction i think.
You could also provide an instrumentation patch here for the people who can reproduce it.
I run Gentoo with the nouveau driver, and gnome-shell 3.18 loads successfully, but without the spinner animation.
Ray, if there are no reports of this happening on Fedora (there are probably lots of people there using nvidia), is it possible that one of the xorg-server patches in Fedora is preventing this bug?
Another observation: gnome-shell/xorg use too much cpu until I ctrl+alt+f1 to greeter then ctrl+alt+f2 again so bug 753891 fixes are not working for me :s
*** Bug 756039 has been marked as a duplicate of this bug. ***
I can confirm this problem on a desktop computer with nvidia geforce 660. If the debug info with the 13 steps is still needed I should be able to provide it sometime next week when I get some more free time on my hands. I've downgraded to 3.16 for the time being.
Arch Linux with the 4.2.2 kernel, GTX 670, nvidia driver 355.11, GNOME 3.18.
After entering the password in GDM, a gray background is displayed until you press the "ESC" key. After pressing "ESC", login proceeds normally.
Not sure if relevant but autologin works around this bug. my hard disk is luks encrypted so no major security risk while computer is shut down.
A couple more pieces of information to add. After typing my password, if I click the login button instead of pressing enter, the login process works fine. If I press enter, I can confirm that pressing escape after continues the login process. Also, after pressing enter, I can still activate the no longer visible UI elements. For example, I can use the tab key to navigate to the session chooser (Gnome, Gnome Classic, Gnome Wayland, etc).
i think i'll have access to an nvidia laptop in a couple of days, will reproduce, and debug then.
(In reply to Ray Strode [halfline] from comment #36)
> i think i'll have access to an nvidia laptop in a couple of days, will
> reproduce, and debug then.
Great. Thank you. Perhaps this is also a chance to look at other nvidia related oddities (for example, ctrl+alt+f3 then ctrl+alt+f1 back to gnome session causes desktop corruption and memory spikes).
The issue is also happening here with the latest NVIDIA proprietary drivers and the latest gnome-shell. The temporary workaround was to hit Ctrl+Alt+F2 after the login. Please let me know if I can provide any logs or package versions, or run any command that could provide relevant information for you guys.
Saw the comment https://bugzilla.gnome.org/show_bug.cgi?id=754814#c18. I will try to rebuild everything and catch these logs from my end.
Created attachment 313598
Debug info
Here is the debug info as requested. From what I can see it isn't that useful. If you know which libraries/packages need to be recompiled with debug info to dig deeper, write it here and I'll repeat the same procedure.
NOTE: I'm on Arch, using the ABS to rebuild each package with options=(debug !strip). It seems to have worked for gnome-shell, though, as it did load the symbols for it; they just didn't seem too useful. Also, the program crashed after I pressed return; it never got to the escape part for me in gdb.
This is also happening for me on Arch with gdm/gnome-shell 3.18.0 and the Nvidia 355.11 drivers. Attempting to get debug output results in the same as comment #40, a segfault when calling gjs_dumpstack().
Getting the same problem here. Arch gdm 3.18. Dual GTX 550ti's using base mosaic.
_XSERVTransMakeAllCOTSServerListeners: server already running
_XSERVTransSocketUNIXCreateListener: ...SocketCreateListener() failed
(II) systemd-logind: got pause for 226:2
(II) systemd-logind: got pause for 13:65
(II) systemd-logind: got pause for 13:68
(II) systemd-logind: got pause for 13:70
(II) systemd-logind: got pause for 13:67
(II) systemd-logind: got pause for 226:0
(II) systemd-logind: got pause for 13:66
(II) systemd-logind: got pause for 13:64
(II) systemd-logind: got pause for 13:69
(II) systemd-logind: got pause for 226:1
So I tried with the laptop I alluded to in comment 36 and it seems to work fine. There's a big caveat, though, that that laptop is optimus hardware and so xorg (after significant configuration on my part) is using prime support to mate the nvidia proprietary driver with the modesetting driver. I found an old laptop in a drawer that seems to be nvidia. i'll try that one next.
Perhaps NVIDIA should be contacted as well? They probably have a wider variety of hardware to test too.
looks like nvidia was a red herring. from irc today:
<halfline> roshi: do you have nvidia with proprietary drivers?
<halfline> oh
<halfline> you said vm above
<roshi> yeah
<roshi> but yeah, it's a vm
<halfline> interesting
<nirik> but it's an installed vm? so the live booted ok to install?
<halfline> what do you see after you tpye the password?
<halfline> roshi: just the gray texture ?
<roshi> yep
<kparal> nirik: I always tried with installed
<halfline> roshi: when that happens if you hit "escape" does it log you in?
<halfline> (when you're sitting at the gray texture)
<roshi> the live booted fine and installed
<roshi> uh
<roshi> yeah
<roshi> lol
<halfline> it does?
<roshi> yeah, it did
<roshi> and the enxt login worked fine too
<halfline> \o/
<halfline> can i get a copy of your vm ?
<halfline> been trying to reproduce that bug for weeks
<halfline> it's actually what i was looking at this morning that was taking up my time
<roshi> 3rd I had to hit escape again
<halfline> yea every other time
<adamw> huh, i haven't seen that one yet
<roshi> ha, yeah, every other time
<halfline> adamw: https://bugzilla.gnome.org/show_bug.cgi?id=754814
<roshi> but I don't get the black screen after logout
<roshi> not once
<adamw> halfline: i mean, i haven't hit it myself
<adamw> yay for heisenbugs
* adamw sticks 4GB in his VM and tries again
<halfline> roshi: this is a throw away vm ?
<halfline> roshi: can i get a copy?
<roshi> I did see this "got pause for" stuff
<roshi> you mean of the xml? or the whole kit and caboodle?
<nirik> so where are we on this bug? I guess it's a blocker unless we can come up with a workaround or find it's hard to hit?
<halfline> roshi: EVERYTHING
On Arch Linux, I can confirm this issue is triggered when using the linux-lts and nvidia-lts packages, whereas using xf86-video-intel works fine.
so i got roshi's vm and it works fine for me. this is very frustrating.
What VM software? Don't those simulate a graphics adapter with open specs?(usually not nvidia)
yea, he sees it with virt-manager. I don't know. I'm going to buy an nvidia card today and will report back.
I followed the steps mentioned in comment 18. Since I have no debugging experience, I am also listing what I did. I apologize if I've done something wrong and wasted your time Q_Q.
I use a GTX 980 with nvidia driver & nvidia-libgl, on Arch Linux with gdm and gnome-shell 3.18.1.
I used abs, modified the PKGBUILD to add options=(debug !strip), then ran makepkg and installed the package. Then I followed the steps with logging on; here is the output file: https://ptpb.pw/zHiJ
The program crashed at step 10, so I couldn't continue. Then I ran journalctl -xe > journal.txt; here is the file (I repeated the process about 3 times): https://ptpb.pw/BBHu
I also found that it only happens if I install the nvidia driver instead of nouveau. Replacing gdm with sddm or lightdm also avoids the problem, which is the temporary workaround I am using.
okay the video card i bought today reproduces the problem. will investigate in more detail now.
(In reply to Ray Strode [halfline] from comment #51)
> okay the video card i bought today reproduces the problem. will investigate
> in more detail now.
Ok, thank you very much for your time.
So I have a somewhat better understanding of what's going on here, though I'm still a little shaky on the full picture.
Basically, the fix in bug 753891 doesn't work. The verification-complete signal isn't emitted by GDM for normal login operations, just for unlock operations. That's a bug in GDM we should probably fix, but it means we're not stopping the spinner. Furthermore, one of the patches on that bug fixed a logic error that had inadvertently caused the spinner to get stopped in situations where it now isn't! So the fix had just the opposite effect it was supposed to: the spinner now spins in cases where it wouldn't have before.
That wouldn't be a big deal (just increased resource usage), but since bug 753064 the timeout for the spinner is very fast (every 14ms, faster than 60fps). The nvidia driver makes the redraw operation (buffer swap) stall to rate limit screen updates to 60 fps. This buffer swap happens in a high priority source. Bottom line is, as soon as the update finishes, the spinner timeout is already expired and ready to run again, which leads to another update. It's a constant stream of:
    timeout since more than 14ms has passed -> redraw and wait 16 ms -> timeout since more than 14ms has passed -> redraw and wait 16 ms
This means low priority main loop sources aren't serviced.
Things we can do (one or more of):
1) fix gdm to emit verification-complete for successful logins, not just successful unlocks
2) make sure the spinner gets stopped when the auth prompt is finished
3) ensure the timer for the spinner is run at a low enough priority that low priority sources get a chance to run
4) call start_session_when_ready directly instead of from an idle handler, since there's really no point to having the idle handler
Any of these changes should fix the problem.
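The starvation described in this comment can be reproduced in a toy model. The sketch below is plain JavaScript, not gnome-shell or GLib code, and `msUntilIdleRuns` with its parameters is a hypothetical name made up for illustration. It assumes a loop that runs the idle handler only when the spinner timeout is not ready, and a redraw that blocks for a fixed time, as in the "redraw and wait 16 ms" description.

```javascript
// Toy model of the loop described above. Assumptions (not real GLib code):
// - the spinner timeout outranks the idle handler,
// - the timeout has already expired when the idle handler is queued,
// - each timeout dispatch triggers a redraw that blocks for redrawBlockMs.
function msUntilIdleRuns({ timerIntervalMs, redrawBlockMs, horizonMs }) {
    let now = 0;
    let timerDue = 0;
    while (now < horizonMs) {
        if (now >= timerDue) {
            // Spinner timeout fires: re-arm it, then redraw. The redraw
            // blocks in the buffer swap, so the clock advances meanwhile.
            timerDue = now + timerIntervalMs;
            now += redrawBlockMs;
        } else {
            // Nothing more urgent is ready: the idle handler gets its turn.
            return now;
        }
    }
    return null; // the idle handler never ran within the horizon
}

// Redraw blocks longer than the timer interval: the idle handler starves.
const stalled = msUntilIdleRuns({ timerIntervalMs: 14, redrawBlockMs: 16, horizonMs: 10000 });
// Redraw returns quickly (no rate limiting): the idle handler runs at once.
const healthy = msUntilIdleRuns({ timerIntervalMs: 14, redrawBlockMs: 1, horizonMs: 10000 });
console.log(stalled, healthy);
```

Under these assumptions the outcome depends only on whether the redraw blocks at least as long as the timer interval, which would fit the observation that the hang showed up mainly with a driver that stalls in the buffer swap.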
Created attachment 313774
gdm-session: emit verification-complete even for logins
Right now we only emit verification-complete when a user successfully reauthenticates. We should also do it when they successfully initially authenticate. This commit fixes that.
Created attachment 313777
gdm: don't emit start-session-when-ready from idle function
There's no point in delaying the emission. We should do it right away.
Created attachment 313778
animation: Run every 16ms not ever 14ms
Right now the spinner animation updates every 14ms. 60 frames per second would be one frame per 16.667ms, so we're waking up more frequently than we need to. This commit changes the wakeup to happen after 16ms.
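The arithmetic in this commit message can be checked directly. The snippet below is only a worked calculation in plain JavaScript; the helper name `wakeupsPerSecond` is made up for illustration.

```javascript
// One frame at 60 frames per second, in milliseconds.
const frameIntervalMs = 1000 / 60;

// How often a repeating timeout with the given interval wakes up per second.
const wakeupsPerSecond = (intervalMs) => 1000 / intervalMs;

console.log(frameIntervalMs.toFixed(3));       // 16.667
console.log(wakeupsPerSecond(14).toFixed(1));  // 71.4
console.log(wakeupsPerSecond(16).toFixed(1));  // 62.5
```

So a 14ms timeout wakes up about 71 times per second, while a 16ms timeout wakes up 62.5 times per second, much closer to the 60 fps refresh rate.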
Created attachment 313779
animation: do spinner animation with low priority
It's very unexpected that a spinner animation would preempt idles from running. This commit runs the spinner animation with a low priority to ensure it doesn't take over the main loop.
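Why lowering the priority helps can be sketched with a tiny dispatcher that always runs the most urgent ready source. This is a simplified model in plain JavaScript, not the patch itself: `dispatchOrder` and the source objects are hypothetical. The numbers 0, 200 and 300 match GLib's `G_PRIORITY_DEFAULT`, `G_PRIORITY_DEFAULT_IDLE` and `G_PRIORITY_LOW` (lower number means more urgent), but which priority the actual commit uses is not stated in this report, so treat 300 as an assumption.

```javascript
// Simplified model: every source is always ready, and each iteration
// dispatches the one with the numerically lowest priority value.
function dispatchOrder(sources, iterations) {
    const order = [];
    let pending = sources.slice();
    for (let i = 0; i < iterations && pending.length > 0; i++) {
        // Lower number wins; on a tie the earlier source is kept.
        const next = pending.reduce((a, b) => (b.priority < a.priority ? b : a));
        order.push(next.name);
        if (!next.repeats)
            pending = pending.filter((s) => s !== next); // one-shot source is removed
    }
    return order;
}

const idle = { name: 'start-session idle', priority: 200, repeats: false };

// Spinner more urgent than the idle handler: the idle handler never runs.
const before = dispatchOrder([{ name: 'spinner', priority: 0, repeats: true }, idle], 5);

// Spinner less urgent than the idle handler: the idle handler runs first.
const after = dispatchOrder([{ name: 'spinner', priority: 300, repeats: true }, idle], 5);

console.log(before);
console.log(after);
```

In the first run the output contains only `spinner`; in the second the idle handler is dispatched on the first iteration and the spinner keeps animating afterwards.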
Review of attachment 313779: This is a good idea regardless.
Review of attachment 313777: Hm. I can't find why this was added. If you think it's safe, then sure.
Review of attachment 313774: Hm, yeah, this seems to be one of the only places we're changing behavior based on reauth vs. regular auth. I looked for more and couldn't find any.
Review of attachment 313778: OK.
Attachment 313777 pushed as 489b96a - gdm: don't emit start-session-when-ready from idle function
Attachment 313778 pushed as 6f26e39 - animation: Run every 16ms not ever 14ms
Attachment 313779 pushed as 5b7a052 - animation: do spinner animation with low priority
Attachment 313774 pushed as 76e2a54 - gdm-session: emit verification-complete even for logins
Thank you for fixing this!
76e2a54 seems to crash gdm on every unlock, though
eek, i'll revert that for now until i can investigate then.
(In reply to Mantas Mikulėnas from comment #65)
> 76e2a54 seems to crash gdm on every unlock, though
Not for me. I've been running it patched with these exact patches since they came out. Just did a lock/unlock right now, no problems.
(In reply to Ales Katona from comment #67)
> (In reply to Mantas Mikulėnas from comment #65)
> > 76e2a54 seems to crash gdm on every unlock, though
>
> Not for me. I've been running it patched with these exact patches since they
> came out. Just did a lock/unlock right now, no problems.
Slightly off-topic question: does the lock screen allow you a fourth chance if you intentionally input the password incorrectly three times in a row?
(In reply to Hussam Al-Tayeb from comment #68)
> slightly offtopic question, does the lock screen allow you a fourth chance if
> you intentionally input the password incorrectly three times in a row?
Seems to work fine. I did four bad ones and then unlocked. It did change the "submit" button into "next" for a second there.
It occasionally gets stuck on 'next' here after the third failed attempt. I filed a bug about it.
Hello, I apologize for asking a stupid question, but when do the fixes mentioned above make it into Fedora 23? I upgraded from f22 to f23 and run into this exact issue with gdm and MATE Desktop (installed via "dnf group install 'MATE Desktop'" so it's not custom built). Would the fixes above help when logging in to MATE as opposed to GNOME 3? If not I'll open an issue with the MATE folks and reference this ticket. Thank you so much for all the work everyone does on GNOME, I use it every day and this is the first bug I've noticed!
Just wanted to add that I am also having this issue. Using extra/nvidia 355.11-4 and 4.2.5-1-ARCH x86_64. Also running extra/gdm 3.18.0-1 and extra/gnome-shell 3.18.1-2.
(In reply to Anthony Clark from comment #71)
> Hello,
>
> I apologize for asking a stupid question, but when do the fixes mentioned
> above make it into Fedora 23?
Never, it was reverted because:
(In reply to Mantas Mikulėnas from comment #65)
> 76e2a54 seems to crash gdm on every unlock, though
So we need a better fix.
To be clear, there were four independent fixes, each of which solves the problem on its own. One of them crashed and got reverted. The other three should still work just fine.
The reason people are still seeing this is it hasn't made it into a release yet. 3.18.2 tarballs are due on monday. Fix should be in fedora 23 updates-testing by wednesday or thursday.
At this time gnome-shell 3.18.3 fixes this issue.
I just upgraded from 22 and am facing the same issue, but I'm on gnome-shell 3.18.3.
(In reply to Mantas Mikulėnas from comment #65)
> 76e2a54 seems to crash gdm on every unlock, though
I debugged this with grawity today
Created attachment 319571
session: keep session object alive while establishing credentials
The only reference to session objects gets cleaned up when verification-complete is emitted, which happens in the middle of the establish_credentials handler. This commit makes sure the session object stays alive until the handler completes to prevent a crash.
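The lifetime problem described in this commit message can be illustrated with a small reference-counting model. The sketch is plain JavaScript with made-up names (`Session`, `establishCredentials`, `holdRef`); GDM itself is written in C, and the report does not show how the real patch takes the extra reference, so this only demonstrates the general technique of holding a reference for the duration of a handler.

```javascript
// Minimal reference-counted object (hypothetical, for illustration only).
class Session {
    constructor() {
        this.refCount = 1;     // the single reference held by the owner
        this.disposed = false;
    }
    ref() {
        this.refCount++;
        return this;
    }
    unref() {
        if (--this.refCount === 0)
            this.disposed = true; // last reference gone: object is torn down
    }
}

// Models a handler that emits a signal midway and then keeps using the session.
// Returns true if the session was already disposed when the handler resumed.
function establishCredentials(session, onVerificationComplete, holdRef) {
    if (holdRef)
        session.ref();                  // keep the session alive for the whole handler
    onVerificationComplete();           // listener drops the owner's only reference
    const usedAfterDispose = session.disposed;
    if (holdRef)
        session.unref();                // release our temporary reference
    return usedAfterDispose;
}

const unprotected = new Session();
const crashed = establishCredentials(unprotected, () => unprotected.unref(), false);

const guarded = new Session();
const safe = !establishCredentials(guarded, () => guarded.unref(), true);

console.log(crashed, safe, guarded.disposed);
```

Without the temporary reference the handler resumes on a disposed object, which in C would be a use-after-free; with it, disposal is simply deferred until the handler returns.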
Created attachment 319572
session: free conversation hash when disposing session
We're currently leaking the hash table when disposing the session, this commit fixes that.
Attachment 319571 pushed as 0254f33 - session: keep session object alive while establishing credentials
Attachment 319572 pushed as 4788e92 - session: free conversation hash when disposing session
*** Bug 761962 has been marked as a duplicate of this bug. ***
I am still experiencing this issue on GNOME 3.22.3 with Arch Linux. I cannot re-login and I am getting: (II) systemd-logind: got pause for 13:80 ... repeatedly. I don't think this issue has been fixed or at least something else is causing it. Also, ESC key doesn't seem to do anything for me.