GNOME Bugzilla – Bug 757942
Input event (e. g. typing on keyboard) is sent repeatedly during high load
Last modified: 2019-11-05 20:52:21 UTC
On Fedora 23 with gnome-shell-3.18.1-1.fc23.x86_64 under Wayland, when computer is under heavy load (for example, several VMs are running), if computer "freezes" because of that heavy load, input event is send repeatedly. E. g. when I had several VMs running and I was typing in Firefox, characters were appearing repeatedly (I was trying to type "properly", but I have instead typed "prrrrrrrrrrrrrrrrrrrrroperly"). On another occasion, I wanted to start Emacs from Overview, I have clicked its icon and it started five times.
Not only keyboard, but often during even a light load, mouse pointer gets frozen temporarily (even though I'm moving the mouse constantly, the pointer gets stuck and then jumps to a different location shortly afterwards). It's just tens or hundreds of milliseconds, but it's noticeable.
(In reply to boloomka from comment #0) > On Fedora 23 with gnome-shell-3.18.1-1.fc23.x86_64 under Wayland, when > computer is under heavy load (for example, several VMs are running), if > computer "freezes" because of that heavy load, input event is send > repeatedly. > > E. g. when I had several VMs running and I was typing in Firefox, characters > were appearing repeatedly (I was trying to type "properly", but I have > instead typed "prrrrrrrrrrrrrrrrrrrrroperly"). On another occasion, I wanted > to start Emacs from Overview, I have clicked its icon and it started five > times. The keyboard problem is likely some problem with autorepeat working off of wall time rather than event timestamps. That doesn't explain clicking - are you sure you didn't click the icon 5 times because it wasn't responding?
(In reply to Kamil Páral from comment #1) > Not only keyboard, but often during even a light load, mouse pointer gets > frozen temporarily (even though I'm moving the mouse constantly, the pointer > gets stuck and then jumps to a different location shortly afterwards). It's > just tens or hundreds of milliseconds, but it's noticeable. This is not related - I think there's an existing bug talking about mouse smoothness.
(In reply to Owen Taylor from comment #2) > That doesn't explain clicking - are > you sure you didn't click the icon 5 times because it wasn't responding? I'm pretty sure that I run it only once. It's possible though, now that I think about it, that I've used "Enter" to run application rather than mouse clicking.
(In reply to Owen Taylor from comment #2) > The keyboard problem is likely some problem with autorepeat working off of > wall time rather than event timestamps. That doesn't explain clicking - are > you sure you didn't click the icon 5 times because it wasn't responding? How would key repeat 'working off event timestamps' look like for client-side key repeat ? The current code in gdkdevice-wayland.c simply uses a timeout. I guess we could force a roundtrip before each synthetic key event to ensure the key release event is not waiting on the compositor side.
(In reply to Matthias Clasen from comment #5) > (In reply to Owen Taylor from comment #2) > > > The keyboard problem is likely some problem with autorepeat working off of > > wall time rather than event timestamps. That doesn't explain clicking - are > > you sure you didn't click the icon 5 times because it wasn't responding? > > How would key repeat 'working off event timestamps' look like for > client-side key repeat ? The current code in gdkdevice-wayland.c simply uses > a timeout. I guess we could force a roundtrip before each synthetic key > event to ensure the key release event is not waiting on the compositor side. Hmmm - tricky. If repeat was done compositor side, I'd say that it was necessary to make sure that the priority of the event FD is higher than the priority of the timeout, so we read all pending events from the kernel before we generate a synthetic key event. That could still be racy, but would turn a likely problem into a rare problem. But making the priority of the client side timeout lower than the priority of the Wayland FD doesn't give you anything because after things freeze for a while, the kernel might schedule GTK+ first before it schedules the compositor in,. So you need some sort of "send me an event after you've read from the event FD and dispatched me all pending events" request. If in the compositor the wayland FD priority is lower than the priority of the event FD, that might be the same as just a round trip to the compositor, but that's making assumptions about compositor implementation. Would be better if the wayland server just handled repeat!
well two things we know: 1) key repeat is usually more than a frame 2) key events usually result in drawing So one idea, maybe as a near term workaround until the wayland protocol is improved, would be to wait for the frame clock to fire before doing a repeat if it hasn't fired yet by the time a repeat should happen. Doing that would affect the precision of the key repeat interval somewhat, but that probably doesn't really matter (at least it's far better than comment 0!)
Created attachment 322707 [details] [review] wayland: tie key repeat to frame clock key repeat is handled client side, which means stalls in the compositor dispatching key release events can lead to fictious repeat events. This commit ties key repeat to the frame clock, so the compositor and client are synchronized.
One case where this would break is when erasing a wrongly typed password in gnome-terminal, For example when stumbling upon a key late in a long password, I may hold down the backspace key for some time to erase what I typed. If we tie the key repeat to the frame rate, I'd just erase one letter. So I don't think we can tie it to the frame rate. Maybe we can require a round trip per key repeat event? Instead of emitting the gdk key event in the timer handler, we do wl_display_sync() and emit on the callback. A key-release event would not only destroy the timer source, but also the callback.
Comment on attachment 322707 [details] [review] wayland: tie key repeat to frame clock yea, indeed.
Created attachment 322769 [details] [review] wayland: make deliver_key_event return void deliver_key_event is sometimes called from a timeout handler and sometimes called directly. We currently erroneously return TRUE (G_SOURCE_CONTINUE) in the case where it's called directly, but to no ill effect, since we ignore that return value. In the future, we're going to need to call it directly in other parts of the code where the return value would be relevant and handling TRUE, would require adding redundant code. Instead, this commit just changes the code to always reset the timer manually, and never rely on glib's ability to automatically reset the timer by returning TRUE. This makes the code smaller, too, since there's less special casing required.
Created attachment 322770 [details] [review] wayland: handle key up events earlier in deliver_key_event We don't need the key repeat rate or anything like that when handling key up events, so do key up events first before querying for that information.
Created attachment 322771 [details] [review] wayland: synchronize key repeat with server key repeat is handled client side, which means stalls in the compositor dispatching key release events can lead to fictious repeat events. This commit ties key repeat to a server roundtrip to ensure the client and server are in sync.
(I do think handling key repeat in the server makes sense, but this, or something like it, should go in in the meantime)
Review of attachment 322771 [details] [review]: Since the server is likely not swamped, I don't see any reason to wait for both asynchronously -- I would just do a wait in deliver_key_event after we get the timeout, to simplify code. ::: gdk/wayland/gdkdevice-wayland.c @@ +1795,3 @@ + * key repeat until both the server responds and the timer fires. + */ + device->repeat_callback = wl_display_sync (display->wl_display); Can deliver_key_event get called while we have an active callback? Won't we leak it the existing sync in this case, then?
(In reply to Jasper St. Pierre from comment #15) > Since the server is likely not swamped, I don't see any reason to wait for > both asynchronously -- I would just do a wait in deliver_key_event after we > get the timeout, to simplify code. That means unnecessarily waiting for the roundtrip in the normal, non-swamped case, though. And I don't think doing the two waits serially instead of in parallel, practically speaking, would actually make the code simpler ? > Can deliver_key_event get called while we have an active callback? Won't we > leak it the existing sync in this case, then? Top of deliver_key_event calls stop_key_repeat, which cleans up the callback.
(In reply to Ray Strode [halfline] from comment #16) > That means unnecessarily waiting for the roundtrip in the normal, > non-swamped case, though. ... which is exactly what we want, no? I don't think it would cause that much additional latency. > And I don't think doing the two waits serially > instead of in parallel, practically speaking, would actually make the code > simpler ? Maybe not. It was suggested as a code cleanup, so if it doesn't clean up code and make it easier to understand, then don't do it.
(In reply to Jasper St. Pierre from comment #17) > (In reply to Ray Strode [halfline] from comment #16) > > That means unnecessarily waiting for the roundtrip in the normal, > > non-swamped case, though. > > ... which is exactly what we want, no? I don't think it would cause that > much additional latency. Well, if we do the roundtrip up front, then in the normal, non-swamped case the roundtrip should complete before the repeat timer fires, so we won't have to wait for it. We'll only have to wait if the server is lagging, which is close to the ideal way to behave I think. > Maybe not. It was suggested as a code cleanup, so if it doesn't clean up > code and make it easier to understand, then don't do it. Yea, doing it in parallel doesn't really add any extra state, so the code doesn't really get messier. Actually we're already getting a pretty good legibility clean from attachment 322769 [details] [review] by manually resetting the timer in all cases instead of just some cases.
I think Ray's approach is fine. Jonas, do you agree ?
(In reply to Matthias Clasen from comment #19) > I think Ray's approach is fine. Jonas, do you agree ? Yes. I do suspect that it'll sometimes result in one unexpected key repeat, since this way (calling wl_display_sync() on key-pressed) the synchronization point in the server could possibly be placed too early. A more reliable way would be for the client to do: -> wl_keyboard.key(pressed) * schedule repeat * repeat timer callback <- wl_display.sync -> wl_callback.done * emit key repeat event But this will always add unpredictable latency, since we'd always have to wait, but it would catch more incorrect repeats. Maybe the approach in the patch works in enough situations though; I think we should try it before considering a later-roundtrip approach.
oh I see what you're saying. Since we schedule the roundtrip right away, it's totally possible that it could come back before the server stalls, leading to a double character to come out. I guess that's what Jasper's intuition was telling him in the back of his head when he wrote comment 15, too, maybe. I think I'm going disagree with you on it being good enough. If the user hits a key once and it comes out twice that's just as bad as if the user hits it once and it comes out 15 times.
(In reply to Ray Strode [halfline] from comment #21) > oh I see what you're saying. Since we schedule the roundtrip right away, > it's totally possible that it could come back before the server stalls, > leading to a double character to come out. I guess that's what Jasper's > intuition was telling him in the back of his head when he wrote comment 15, > too, maybe. > > I think I'm going disagree with you on it being good enough. If the user > hits a key once and it comes out twice that's just as bad as if the user > hits it once and it comes out 15 times. Indeed that would not be good enough. My reason for thinking that it could possibly be good enough is if the stalling usually happens on the client, and if the server usually happens to read key events before reading client requests. If that's the case, an early sync would often be enough. Maybe its too fragile though. If sync on repeat-timer-callback is smooth enough, maybe it's the best solution for now then.
Created attachment 322876 [details] [review] wayland: synchronize key repeat with server key repeat is handled client side, which means stalls in the compositor dispatching key release events can lead to fictious repeat events. This commit ties key repeat to a server roundtrip to ensure the client and server are in sync.
(In reply to Jonas Ådahl from comment #20) > But this will always add unpredictable latency, since we'd always have to > wait, but it would catch more incorrect repeats. What I will say is that if the Wayland compositor is blocked, we already have unpredictable latency, in that our characters won't be shown for a few frames anyway. If we're already in the worst case scenario, I don't think we need to care too much about appearances then.
I guess the point is this: the roundtrip will vary in duration a little from one repeat event to another. If the roundtrips vary too much from each other, we're going to get dropped frames anyway, because the compositor must be stalled. If we aren't getting dropped frames, then the roundtrips are only varying within a few milliseconds of each other we're within the tolerance of perception, so the repeat won't look jittery.
Matthias asked me to push this now, but if anyone has any review comments on the patch let me know, and I'll do follow up work as neccessary. Attachment 322769 [details] pushed as 619799b - wayland: make deliver_key_event return void Attachment 322770 [details] pushed as 551f174 - wayland: handle key up events earlier in deliver_key_event Attachment 322876 [details] pushed as b528183 - wayland: synchronize key repeat with server
I cannot find those commits in git after a pull and the sha1 given (619799b 551f174 and b528183) point to bad object ids, are we sure it's been pushed?
I can see them from here.
ah right, in gtk, not mutter.
Although these patches have landed in 3.19.x, the issue is still present with Gtk+ 3.20.6 on Fedora 24 releases. Can you please reopen this bug or shall I report a new one?
I reckon commit b528183 should fix the key repeat issue as it syncs with the compositor before repeating the keys. If the problem still occurs in 3.20 then it means that either the patch is incomplete or this is something else (thinking of the similar Xwayland issue https://bugzilla.gnome.org/show_bug.cgi?id=762618) Can you give the reproducer steps?
No, no reproducer. I only noticed this issue with Firefox (so XWayland) so far, but I'm not sure it doesn't affect other applications too.
(In reply to Christian Stadelmann from comment #32) > No, no reproducer. I only noticed this issue with Firefox (so XWayland) so > far, but I'm not sure it doesn't affect other applications too. Ah right, if Xwayland this is https://bugzilla.gnome.org/show_bug.cgi?id=762618 for which a fix has landed upstrean in Xserver master.
More specifically these 4 commits: https://cgit.freedesktop.org/xorg/xserver/commit/?id=fda5675 https://cgit.freedesktop.org/xorg/xserver/commit/?id=239705a https://cgit.freedesktop.org/xorg/xserver/commit/?id=26ad25a https://cgit.freedesktop.org/xorg/xserver/commit/?id=88e981e But a backport of those in Xserver 1.18.3 is not completely straightforward (i.e. it won't apply as-is on top of xserver 1.18.x).
It's happening rarely, so I can just wait for xserver 1.19 to be released with Fedora 25
Hm, I'm still seeing this issue¹ rarely, and it is not limited to XWayland applications. Can you please reopen this bug or shall I create a new one? ¹ to be more specific: under high load, letters typed into text fields of any application get repeated. I'm running Fedora 25 with xorg-x11-server-Xwayland-1.19.0-1.fc25.x86_64 gnome-shell-3.22.2-2.fc25.x86_64 mutter-3.22.2-1.fc25.x86_64 libwayland-client-1.12.0-1.fc25.x86_64
(In reply to Christian Stadelmann from comment #36) > Hm, I'm still seeing this issue¹ rarely, and it is not limited to XWayland > applications. Can you please reopen this bug or shall I create a new one? > > ¹ to be more specific: under high load, letters typed into text fields of > any application get repeated. Any application? This is odd, I have reduced the key repeat delay to the minimum (usable) and increased the repeat rate to the maximum, while stressing the CPU, i/o and malloc ("stress --cpu 8 --io 4 --vm 2" on a 10 years old intel core2 duo), load avg peaks at more than 30, the mouse becomes very jerky and yet I am still able to type within gedit, firefox, etc. But of course, keys being repeated can happen, as it depends on the delay set by the user. What gives: $ gsettings get org.gnome.desktop.peripherals.keyboard repeat-interval $ gsettings get org.gnome.desktop.peripherals.keyboard delay On your setup?
(In reply to Olivier Fourdan from comment #37) > (In reply to Christian Stadelmann from comment #36) > > Hm, I'm still seeing this issue¹ rarely, and it is not limited to XWayland > > applications. Can you please reopen this bug or shall I create a new one? > > > > ¹ to be more specific: under high load, letters typed into text fields of > > any application get repeated. > > Any application? Multiple ones. I'm not quite sure which. I'm not 100% sure so I'll wait for the next time this happens, then I'll report back. > This is odd, I have reduced the key repeat delay to the minimum (usable) and > increased the repeat rate to the maximum, while stressing the CPU, i/o and > malloc ("stress --cpu 8 --io 4 --vm 2" on a 10 years old intel core2 duo), > load avg peaks at more than 30, the mouse becomes very jerky and yet I am > still able to type within gedit, firefox, etc. > > But of course, keys being repeated can happen, as it depends on the delay > set by the user. > > What gives: > > $ gsettings get org.gnome.desktop.peripherals.keyboard repeat-interval uint32 30 > $ gsettings get org.gnome.desktop.peripherals.keyboard delay uint32 500
(In reply to Olivier Fourdan from comment #37) > (In reply to Christian Stadelmann from comment #36) > > Hm, I'm still seeing this issue¹ rarely, and it is not limited to XWayland > > applications. Can you please reopen this bug or shall I create a new one? > > > > ¹ to be more specific: under high load, letters typed into text fields of > > any application get repeated. > > Any application? I ran into this again today with Firefox (running on XWayland)
(In reply to Christian Stadelmann from comment #38) > (In reply to Olivier Fourdan from comment #37) > > (In reply to Christian Stadelmann from comment #36) > > > Hm, I'm still seeing this issue¹ rarely, and it is not limited to XWayland > > > applications. Can you please reopen this bug or shall I create a new one? > > > > > > ¹ to be more specific: under high load, letters typed into text fields of > > > any application get repeated. > > > > Any application? Texstudio, a Qt5 application running on XWayland. So it seems I'm wrong and this issue is limited to XWayland applications.
(In reply to Olivier Fourdan from comment #37) > (In reply to Christian Stadelmann from comment #36) > > Hm, I'm still seeing this issue¹ rarely, and it is not limited to XWayland > > applications. Can you please reopen this bug or shall I create a new one? > > > > ¹ to be more specific: under high load, letters typed into text fields of > > any application get repeated. > > Any application? Also happens in gnome-shell activities overview during search.
On Fedora 25 Gnome, I am having this double-character problem with no significant loading on an HP Proliant N36L Microserver using KVM over an HP remote access card. Uncommenting #WaylandEnable=false in /etc/gdm/custom.conf solves the problem, so it is definitely a Wayland issue. I suspect that the timing of KVM events (relative to a directly-attached keyboard) is exposing a race in Wayland event handling. Please reopen bug.
I forgot to mention that Wayland's response to keyboard input is also sluggish. The statement about "no significant loading" is based on not specifically running an processor-hungry application; not measurement.
I reopened this bug as #777693 against gnome-shell.