Bug 740424 - Frame clock synchronization broken
Status: RESOLVED OBSOLETE
Product: mutter
Classification: Core
Component: general
Version: 3.13.x
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: mutter-maint
QA Contact: mutter-maint
Depends on:
Blocks:
Reported: 2014-11-20 11:59 UTC by Alexander Larsson
Modified: 2021-07-05 13:48 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
TEST ONLY: Make alpha window testgtk test redraw slowly (952 bytes, patch)
2014-11-20 12:00 UTC, Alexander Larsson
Status: none
standalone testcase (1.40 KB, text/plain)
2015-02-27 14:02 UTC, Carlos Garnacho
x11: Cut some slack to clients doing slow draws/resizes (5.13 KB, patch)
2015-03-02 12:02 UTC, Carlos Garnacho
Status: accepted-commit_now

Description Alexander Larsson 2014-11-20 11:59:22 UTC
It seems that at some point the frame-clock synchronization broke.
I'm attaching a patch that makes the alpha window test in testgtk draw really slowly (using the magic of sleep(1)), which makes this very visible. The window resizes once a second, but the actual window contents lag one frame behind.
Comment 1 Alexander Larsson 2014-11-20 12:00:26 UTC
Created attachment 291087 [details] [review]
TEST ONLY: Make alpha window testgtk test redraw slowly

Testcase for
Comment 2 Alexander Larsson 2014-11-20 12:02:42 UTC
Any idea, Owen?
Comment 3 Alexander Larsson 2014-11-20 13:06:11 UTC
I tried to bisect, but this seems to happen back to at least gtk 3.12.0
Comment 4 Alexander Larsson 2014-11-20 13:13:22 UTC
Also tried with an older cairo (first snapshot with hidpi support), same results. Maybe we regressed in gnome-shell? (Using F21 here)
Comment 5 André Klapper 2015-02-22 21:10:36 UTC
alexl: Can this problem still be seen?

Asking as this is set as a 3.16 "GNOME Target"...
Comment 6 Matthias Clasen 2015-02-26 01:24:26 UTC
still happening, yes
Comment 7 Carlos Garnacho 2015-02-27 14:02:19 UTC
Created attachment 298088 [details]
standalone testcase

I'm attaching a standalone testcase that allows modifying the sleep() duration. Local experimentation (changing the timer and resizing the window continuously) showed a few things:

- For "frequent" refresh values (below ~540ms here, this barrier is blurry though), frame sync seems to work alright, contents and window frame always match size, even on noticeable delays.
- From there on, it is quite frequent here that window resizes start out of sync after I change the timer, but if I stop moving the pointer and let window size catch up, resizes come out in-sync after that.
- I hit another behavior change at ~980ms, where frame sync becomes broken as described here.

In these last 2 stages, I occasionally saw mutter temporarily allowing free resize of the frame, which seems to hint that it's fooled into thinking frame sync doesn't apply.

Probably also worth mentioning: this doesn't happen on Wayland, so it seems to be something in the X11 backends.
Comment 8 Carlos Garnacho 2015-03-02 12:02:01 UTC
So, the ~1s behavior barrier is apparently a timeout in mutter that blacklists the window as "unresponsive". I'm however attaching a mutter patch that improves the behavior seen in case #2 of comment #7.

As for the 1s timeout itself, we might perhaps just make it a bit longer; a 1s delay is not completely unreasonable for clients running over the wire.

Moving to mutter for the rest of the discussion; it all appears fine on the GTK+ side.
Comment 9 Carlos Garnacho 2015-03-02 12:02:56 UTC
Created attachment 298273 [details] [review]
x11: Cut some slack to clients doing slow draws/resizes

The timer to blacklist the window from frame sync is set at the time of
issuing the sync request, but not removed until the client replies to
the most recent wait serial.

This means that if the client is slowly catching up, the timeout would
fire up regardless of the client slowly updating the alarm to older
values.

Fix this by ensuring the timeout is reset every time the sync request
counter is updated, to acknowledge the client is not unresponsive,
just slow.
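
The timer behavior described in this commit message can be sketched as a small simulation. This is only an illustrative model, not mutter's code: the function name `blacklist_time`, its parameters, and the `(time, reaches_latest_serial)` event format are hypothetical, and the 1 s default is taken from the timeout mentioned in comment 8.

```python
from typing import Iterable, Optional, Tuple


def blacklist_time(request_time: float,
                   counter_updates: Iterable[Tuple[float, bool]],
                   timeout: float = 1.0,
                   reset_on_update: bool = False) -> Optional[float]:
    """Return when the window would be blacklisted, or None if it never is.

    counter_updates holds (time, reaches_latest_serial) pairs: the moment the
    client updated its sync counter, and whether that update satisfied the
    most recent wait serial. All names here are hypothetical.
    """
    deadline = request_time + timeout
    for when, reaches_latest in sorted(counter_updates):
        if when >= deadline:
            return deadline          # the timer fired before this update arrived
        if reaches_latest:
            return None              # client caught up, so the timer is removed
        if reset_on_update:
            deadline = when + timeout  # patched behavior: any update restarts the timer
    return deadline                  # the latest serial was never reached


# A client that answers an older serial at 0.6 s and the latest one at 1.2 s.
slow_but_alive = [(0.6, False), (1.2, True)]
print(blacklist_time(0.0, slow_but_alive))                        # timer set only at request time
print(blacklist_time(0.0, slow_but_alive, reset_on_update=True))  # timer reset on each update
```

In this model the reset only helps a client that sends intermediate updates; a single redraw that takes longer than the timeout is still blacklisted either way.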
Comment 10 Florian Müllner 2015-03-04 16:52:20 UTC
Review of attachment 298273 [details] [review]:

Makes sense to me
Comment 11 Carlos Garnacho 2015-03-05 16:35:32 UTC
Attachment 298273 [details] pushed as 94c3c8f - x11: Cut some slack to clients doing slow draws/resizes
Comment 12 Owen Taylor 2015-03-05 19:44:56 UTC
(In reply to Carlos Garnacho from comment #9)
> Created attachment 298273 [details] [review]
> x11: Cut some slack to clients doing slow draws/resizes
> 
> The timer to blacklist the window from frame sync is set at the time of
> issuing the sync request, but not removed until the client replies to
> the most recent wait serial.
> 
> This means that if the client is slowly catching up, the timeout would
> fire up regardless of the client slowly updating the alarm to older
> values.
> 
> Fix this by ensuring the timeout is reset every time the sync request
> counter is updated, to acknowledge the client is not unresponsive,
> just slow.

This sounds off to me. The entire point of frame sync is to avoid "catch up" - there should be only *one* in-flight sync request at a time. I'm wondering if what you are *actually* doing is resetting the counter on the start of the frame - when GTK+ updates the counter to an odd value before redrawing the window. I don't think there's much justification for that.

Maybe this is NOTABUG? Or maybe there is some problem with the handling of "marked broken" clients that use extended frame sync. The idea is that once we mark a client as broken, we should be resizing the frame fluidly, rather than slowly but out of sync. (Of course, marking a CSD client as broken probably doesn't make sense.)
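
The concern about odd counter values can be illustrated with a rough sketch. It assumes, beyond what this comment states, that in extended frame sync an odd counter value marks a frame in progress and the following even value marks its completion; the function names and the example counter values are hypothetical and this is not mutter's implementation.

```python
from typing import List, Sequence


def frame_in_progress(counter: int) -> bool:
    # Assumption: odd value = client has started drawing, even value = frame done.
    return counter % 2 == 1


def updates_resetting_timer(counter_values: Sequence[int],
                            only_completed_frames: bool) -> List[int]:
    """List the counter updates that would restart the blacklist timer."""
    resets = []
    for value in counter_values:
        if only_completed_frames and frame_in_progress(value):
            continue  # a frame *start* is not proof that the resize was handled
        resets.append(value)
    return resets


# One frame as seen by the window manager: start (odd), then completion (even).
updates = [11, 12]
print(updates_resetting_timer(updates, only_completed_frames=False))  # reset on any update
print(updates_resetting_timer(updates, only_completed_frames=True))   # reset on finished frames only
```

Under these assumptions, resetting on every counter change also resets when the client merely begins a frame, which is the behavior this comment questions.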
Comment 13 Carlos Garnacho 2015-03-12 17:50:34 UTC
I've reverted the patch and am reopening this bug. Something might still be off, as per case #2 in comment #7. It seemed to me as if 2 requests (accounting for more than 1s together) would accumulate. This case deserves some more investigation.

Still, I don't think this is something to put back on the 3.16 target. This is, after all, a feature that is meant to trigger; it just sometimes triggers unintentionally.
Comment 14 Matthias Clasen 2015-03-12 20:59:09 UTC
moving off target
Comment 15 Jasper St. Pierre (not reading bugmail) 2015-12-21 21:19:52 UTC
Patch is still marked as ACN (accepted-commit_now). I assume this should be rejected.
Comment 16 André Klapper 2017-08-16 20:28:53 UTC
(In reply to Jasper St. Pierre (not reading bugmail) from comment #15)
> Patch is still marked as ACN. I assume this should be rejected.

carlosg: Any opinion / action?
Comment 17 GNOME Infrastructure Team 2021-07-05 13:48:02 UTC
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org.
As part of that, we are mass-closing older open tickets in bugzilla.gnome.org
which have not seen updates for a longer time (resources are unfortunately
quite limited so not every ticket can get handled).

If you can still reproduce the situation described in this ticket in a recent
and supported software version, then please follow
  https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines
and create a new ticket at
  https://gitlab.gnome.org/GNOME/mutter/-/issues/

Thank you for your understanding and your help.