GNOME Bugzilla – Bug 740424
Frame clock synchronization broken
Last modified: 2021-07-05 13:48:02 UTC
It seems that at some point the frame-clock synchronization broke. I'm attaching a patch that makes the alpha window test in testgtk draw really slowly (using the magic of sleep(1)), which makes this very visible. The window resizes once a second, but the actual window contents lag one frame behind.
Created attachment 291087 [details] [review] TEST ONLY: Make alpha window testgtk test redraw slowly Testcase for
Any idea owen?
I tried to bisect, but this seems to happen back to at least gtk 3.12.0
Also tried with an older cairo (first snapshot with hidpi support), same results. Maybe we regressed in gnome-shell? (Using F21 here)
alexl: Can this problem still be seen? Asking as this is set as a 3.16 "GNOME Target"...
still happening, yes
Created attachment 298088 [details]
standalone testcase

I'm attaching a standalone testcase that allows modifying the sleep() duration. Local experimentation (changing the timer and resizing the window continuously) showed some things:

- For "frequent" refresh values (below ~540ms here, though this barrier is blurry), frame sync seems to work all right: contents and window frame always match in size, even with noticeable delays.
- From there on, window resizes quite frequently start out of sync after I change the timer, but if I stop moving the pointer and let the window size catch up, resizes come out in sync after that.
- I hit another behavior change at ~980ms, where frame sync becomes broken as described here.

In these last 2 stages, I occasionally saw mutter temporarily allowing free resize of the frame, which seems to hint that it's fooled into thinking frame sync doesn't apply.

Probably also worth mentioning: this doesn't happen on Wayland, so it seems to be something in the X11 backends.
So, the ~1s behavior barrier is apparently a timeout in mutter, which blacklists the window as "irresponsive". However, I'm attaching a mutter patch that improves the behavior seen in case #2. As for the 1s timeout itself, we might perhaps just make it a bit longer; a 1s delay is not completely unreasonable for clients over the wire. Moving to mutter for the rest of the discussion, since everything is apparently fine on the GTK+ side.
Created attachment 298273 [details] [review]
x11: Cut some slack to clients doing slow draws/resizes

The timer to blacklist the window from frame sync is set at the time of issuing the sync request, but not removed until the client replies to the most recent wait serial.

This means that if the client is slowly catching up, the timeout would fire up regardless of the client slowly updating the alarm to older values.

Fix this by ensuring the timeout is reset everytime the sync request counter is updated, to acknowledge the client is not irresponsive, just slow.
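The timing issue described in this patch can be illustrated with a toy model. This is only a sketch under stated assumptions, not mutter's actual code: the names `SyncModel`, `model_send_request`, `model_counter_updated` and `simulate_slow_client` are hypothetical, the 1000 ms value is taken from the ~1s barrier mentioned above, and the model assumes the timer is armed only when no timer is already pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SYNC_TIMEOUT_MS 1000   /* assumption: the ~1s barrier from this thread */
#define TIMER_UNARMED   (-1)

/* Hypothetical model of the window manager's per-window sync state. */
typedef struct {
    int64_t wait_serial;      /* serial of the most recent sync request */
    int64_t deadline_ms;      /* time the timeout fires, or TIMER_UNARMED */
    bool    reset_on_update;  /* true models the behavior the patch proposes */
    bool    blacklisted;      /* window dropped from frame sync */
} SyncModel;

static void model_init(SyncModel *m, bool reset_on_update)
{
    m->wait_serial = 0;
    m->deadline_ms = TIMER_UNARMED;
    m->reset_on_update = reset_on_update;
    m->blacklisted = false;
}

/* Fire the timeout if its deadline has passed. */
static void model_advance(SyncModel *m, int64_t now_ms)
{
    if (m->deadline_ms != TIMER_UNARMED && now_ms >= m->deadline_ms) {
        m->blacklisted = true;
        m->deadline_ms = TIMER_UNARMED;
    }
}

/* Issue a sync request; the timer is armed only if none is pending. */
static void model_send_request(SyncModel *m, int64_t now_ms, int64_t serial)
{
    assert(serial > m->wait_serial);  /* serials only grow in this model */
    model_advance(m, now_ms);
    m->wait_serial = serial;
    if (m->deadline_ms == TIMER_UNARMED)
        m->deadline_ms = now_ms + SYNC_TIMEOUT_MS;
}

/* The client updated its counter to `value`. */
static void model_counter_updated(SyncModel *m, int64_t now_ms, int64_t value)
{
    model_advance(m, now_ms);
    if (m->deadline_ms == TIMER_UNARMED)
        return;
    if (value >= m->wait_serial)
        m->deadline_ms = TIMER_UNARMED;             /* caught up: drop timer */
    else if (m->reset_on_update)
        m->deadline_ms = now_ms + SYNC_TIMEOUT_MS;  /* slow but alive */
}

/* Two requests 600 ms apart; the client answers each one 700 ms later. */
static bool simulate_slow_client(bool reset_on_update)
{
    SyncModel m;
    model_init(&m, reset_on_update);
    model_send_request(&m, 0, 1);
    model_send_request(&m, 600, 2);
    model_counter_updated(&m, 700, 1);   /* reply to the older serial */
    model_counter_updated(&m, 1300, 2);  /* reply to the latest serial */
    return m.blacklisted;
}
```

In this model, `simulate_slow_client(false)` ends with the window blacklisted, because the timer armed at 0 ms fires at 1000 ms even though the client replied at 700 ms. `simulate_slow_client(true)` does not, because the reply at 700 ms pushes the deadline out to 1700 ms.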
Review of attachment 298273 [details] [review]: Makes sense to me
Attachment 298273 [details] pushed as 94c3c8f - x11: Cut some slack to clients doing slow draws/resizes
(In reply to Carlos Garnacho from comment #9)
> Created attachment 298273 [details] [review]
> x11: Cut some slack to clients doing slow draws/resizes
>
> The timer to blacklist the window from frame sync is set at the time of
> issuing the sync request, but not removed until the client replies to
> the most recent wait serial.
>
> This means that if the client is slowly catching up, the timeout would
> fire up regardless of the client slowly updating the alarm to older
> values.
>
> Fix this by ensuring the timeout is reset everytime the sync request
> counter is updated, to acknowledge the client is not irresponsive,
> just slow.

This sounds off to me. The entire point of frame sync is to avoid "catch up" - there should be only *one* in-flight sync request at a time.

I'm wondering if what you are *actually* doing is resetting the counter on the start of the frame - when GTK+ updates the counter to an odd value before redrawing the window. I don't think there's much justification for that.

Maybe this is NOTABUG? or maybe there is some problem with the handling of "marked broken clients" that use extended frame sync - the idea is that once we mark a client broken we should be resizing the frame fluidly - not slowly but out of sync. (Of course, marking a CSD client as broken probably doesn't make sense.)
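The odd/even convention mentioned in this comment can be sketched as follows. This is an illustration only: it assumes the extended frame sync counter is odd while a frame is being drawn and even once the frame is complete, and the helper names (`frame_in_progress`, `begin_frame`, `end_frame`, `update_finishes_frame`) are hypothetical, not taken from GTK+ or mutter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: an odd counter value means a frame is currently being drawn. */
static bool frame_in_progress(int64_t counter)
{
    return counter % 2 != 0;
}

/* Client side: bump the counter to an odd value before redrawing. */
static int64_t begin_frame(int64_t counter)
{
    assert(!frame_in_progress(counter));
    return counter + 1;
}

/* Client side: bump the counter to an even value once drawing is done. */
static int64_t end_frame(int64_t counter)
{
    assert(frame_in_progress(counter));
    return counter + 1;
}

/*
 * Hypothetical window manager side check: only an update to an even value
 * says that a frame was actually finished. An update to an odd value only
 * says that the client started drawing.
 */
static bool update_finishes_frame(int64_t new_value)
{
    return !frame_in_progress(new_value);
}
```

Under this reading, a policy that resets the timeout on every counter update would also reset it on the odd, start-of-frame update, which is the case questioned above.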
I've reverted the patch and am reopening this bug. Something might still be off, as per case #2 in comment #7. It seemed to me as if 2 requests (accounting for more than 1s together) would accumulate. This case deserves some more investigation. Still, I don't think this needs to stay on the 3.16 target; this is after all a feature meant to trigger, it just sometimes triggers unintentionally.
moving off target
Patch is still marked as ACN. I assume this should be rejected.
(In reply to Jasper St. Pierre (not reading bugmail) from comment #15) > Patch is still marked as ACN. I assume this should be rejected. carlosg: Any opinion / action?
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/mutter/-/issues/ Thank you for your understanding and your help.