GNOME Bugzilla – Bug 719368
Don't queue redraws when reallocating actor that haven't moved
Last modified: 2021-06-10 11:32:06 UTC
When support for implicit animation of actor position was added, the optimization for not queueing when allocating an actor back to the same location was lost. This optimization is important since when we are hierarchically allocating down from the top of the stage we constantly reallocate the actors at the top of the hierarchy back to the same place.
Created attachment 262873 [details] [review] Don't queue redraws when reallocating actor that haven't moved
Review of attachment 262873 [details] [review]: looks good to me. ::: clutter/clutter-actor.c @@ +9737,3 @@ klass->allocate (self, allocation, flags); CLUTTER_UNSET_PRIVATE_FLAGS (self, CLUTTER_IN_RELAYOUT); I would probably add a comment that says something to the effect of: /* callers should ensure to call queue_redraw() if needed */ at the end of the allocate_internal(), so that it doesn't get re-added in the future.
Pushed to clutter-1.16 and clutter-1.18 Attachment 262873 [details] pushed as 2e85269 - Don't queue redraws when reallocating actor that haven't moved
This makes entering the overview considerably more stutter-y to me. Removing the early return in clutter_actor_allocate seems to fix it, but I can't figure out why.
(In reply to comment #4) > This makes entering the overview considerably more stutter-y to me. Removing > the early return in clutter_actor_allocate seems to fix it, but I can't figure > out why. Can you make a screencast and examine it frame-by-frame and see if there's any misdrawing that contributes to this subjective experience?
My laptop stutters when recording a screencast no matter what (the GPU just isn't powerful enough), so the two screencasts were effectively identical. I'll have to show you in person next Thursday or something.
(In reply to comment #6) > My laptop stutters when recording a screencast no matter what (the GPU just > isn't powerful enough), so the two screencasts were effectively identical. I'll > have to show you in person next Thursday or something. Can you describe this "stuttering" behavior in more detail? Do you just mean a low frame rate, or do you mean something more complicated? (Misdrawing, uneven frame rate, animation going backward.) Does the output of: gnome-shell-perf-tool --perf=core --perf-iters=3 show signicant differences for the FPS going into the overview with and without the patch?
It's just a low frame rate. It seems to draws at 60fps before the patch, and after it feels like 2fps.
This completely breaks gdm drawing in Continuous; I don't get any visible user picker, and mousing over the top bar causes the old menu popups to stay visible.
I've frozen clutter for now in Continuous to the previous commit: https://git.gnome.org/browse/gnome-continuous/commit/?id=d29b8f592951772e14ebf2fb3cfb4f8c825b7c2a
(In reply to comment #9) > This completely breaks gdm drawing in Continuous; I don't get any visible user > picker, and mousing over the top bar causes the old menu popups to stay > visible. (In reply to comment #10) > I've frozen clutter for now in Continuous to the previous commit: > > https://git.gnome.org/browse/gnome-continuous/commit/?id=d29b8f592951772e14ebf2fb3cfb4f8c825b7c2a I suspect what you are actually seeing is Clutter/Cogl ABI problems and unrelated, but if you have a build now that works, we can unfreeze continuous and see if there are still issues once we sort out known issues.
(In reply to comment #11) > I suspect what you are actually seeing is Clutter/Cogl ABI problems You're right, sorry for the noise. Untagged clutter, and Continuous will resume tracking clutter-1.18.
I mostly understand the performance differences now, but the understanding raises more questions than answers. The way it was supposed to work: Buffer swapping: Process events Allocate Redraw Wait for Swap event for completion of swap X server waits for GPU to finish X server waits for VBlank X server swaps X server sends swap event Start new cycle Clipped redraw (without buffer age extension) Process events Allocate Redraw Wait for GPU to finish Wait for VBlank Copy Start new cycle But due to a bug in the handling of winsys features, Cogl wasn't noticing the presence of swap events. That meant that the "Wait for the Swap event" step above was skipped, and: With triple buffering (default for Intel driver), we had almost complete parallelization of work on the CPU and work on the GPU With double buffering (xorg.conf option for Intel driver, normal for other drivers), we had parallelization of steps before redraw with work on the GPU - when we got to actual drawing we'd block for completion of the previous frame. There are also of couple contributing factors beyond GPU/CPU parallelization. First there is power saving - the power saving code that's in the wild for Intel has a pathology - if rendering in a low power state would give you 50fps normally but you lock to VBlank and only do 30fps, then because the GPU utilization is small (around 60%) the power-saving code never realizes that it needs to kick up to a higher gear and you stay at 30fps. Chris Wilson has been working on patches to improve this - due to the complexity in the area, I'm unsure if they actually will help or not for us - I haven't had a chance to test. (Search for the patchset - "drm/i915: Boost RPS frequency for CPU stalls") Our usage of clutter_stage_set_sync_delay() also is hurting when the scene complexity and drawing time gets long. We have the situation: Draw the frame Wait for vblank Copy Wait for sync_delay Draw the next frame The sync delay really should vanish when we aren't going idle and aren't hitting 60fps. That's basically *why* - but what to do is less clear. The two contributing factors are things we should fix: * We should investigate the power saving situation and help the driver developers actually come up with a working solution. * We should adapt the set_sync_delay() code to work as above But if we fix the issue in Cogl with swap events being ignored, we actually make things worse - we'll bring the performance of buffer swapping (including partial buffer swapping with the buffer age extension) *down* to what we're seeing with the partial screen updates - by eliminating pipelining between frames. This may be the right anyways - the reason we *do throttle* that we *don't want triple buffering* is that we want the compositor to be responsive and to have consistent low latency. If we fix power management to not penalize us, then this may be acceptable. Performance improvements to the work we are doing on the CPU (optimizing allocation) and to the work we are doing on the GPU (perhaps finally use the depth buffer to reduce overdraw going to the overview) also obviously help. If we can hit 60fps without parallelizing the GPU of one frame with the CPU of the next, we don't need to pipeline frames.
See bug 719741 for the Cogl feature reporting issue.
I think we should either retitle/reassign this bug, or close it and open a new one against the shell.
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version of clutter, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a ticket at https://gitlab.gnome.org/GNOME/clutter/-/issues/ Thank you for your understanding and your help.