GNOME Bugzilla – Bug 782344
Reduce CPU usage while rendering frequently-updating windows
Last modified: 2021-07-05 13:50:08 UTC
I'm well past the curve of diminishing returns with these patches, so I guess it's time to put these for review. I've been testing these patches on i7 (intel gpu), mainly with the following procedure: 1) start xterm, appears close to the top-left edge of the desktop area 2) launch "glxgears >/dev/null &" from there, move glxgears window to the bottom-right edge of the screen 3) measure with top/perf/etc from the xterm window As per top on an otherwise idle computer, CPU% usage has gone down from ~18%(±2) to ~11%(±2). From here on, I've found it hard to make a further dent. Next "low hanging" fruits might be: - Ensuring the MetaCullable cull areas are reusable across frequent repaints of the same area, might cut some 3-4% of total execution time - Enabling the "no error bit" as per https://www.khronos.org/registry/OpenGL/extensions/KHR/KHR_no_error.txt (requires mesa support for EGL). Might cut some 1-2% of total execution time in mesa validation, at the expense of making GL misuse crash-happy. But those start getting too complicated for a barely noticeable gain, so I stopped here.
Created attachment 351360 [details] [review] ClutterActor: Avoid frequent signal emission if possible Avoid signal emission for ::paint/::pick if no handlers are connected, which is the most frequent case.
Created attachment 351361 [details] [review] cogl: Mark vertex buffers as dynamic Those are cached and reused across runs, which doesn't qualify to mesa as "static" indeed. Properly marking those as dynamic is more true, and brings in slight performance benefits just by avoiding the resulting (and later silenced) mesa warning.
Created attachment 351362 [details] [review] ClutterActor: Preserve valid paint volumes till the next relayout/repaint Cuts down approximately 55% of paint volumes calculation when there's windows that redraw frequently, but don't move.
Created attachment 351363 [details] [review] ClutterActor: Call queue_redraw vfunc directly if possible Reduces some signal emission overhead.
Created attachment 351364 [details] [review] backends/native: Avoid generic closure marshaler for page flip handling This turns out more expensive than necessary.
Created attachment 351365 [details] [review] ClutterActor: Optimize away idempotent scale/position updates If the actor results in the same scale/position, there's no need to trigger a transition.
Created attachment 351366 [details] [review] wayland: Collect frame completion time once for all surfaces Dispatch all surface frames with the same monotonic time to avoid querying it too often.
Created attachment 351367 [details] [review] ClutterStage: Use non-generic marshaller for ::presented signal This signal runs often, so it's better to use a direct marshaler.
Created attachment 351368 [details] [review] ClutterStage: Store clip area as a region This will allow drawing optimizations as a region is more concrete than the bounding rectangle.
Created attachment 351369 [details] [review] compositor: Use redraw clip region to cull out children This will avoid repainting too much of the background if the bounding box turned out to be too large.
Created attachment 351370 [details] [review] clutter: Avoid rounding compensation when invalidating 2D actors This allows the redraw clip to be more constrained, so MetaCullable doesn't end up rendering portions of window shadows, frame and background when a window invalidates (part of) its contents.
Created attachment 351371 [details] [review] compositor: Ensure to clip partial shadow redraws Otherwise we end up drawing too far outside the clip area, which brings in artifacts now that we have tighter regions instead of an overlapping rect.
Created attachment 351372 [details] [review] backends: Store MUTTER_STAGE_VIEWS envvar content just once No need to poke this regularly while drawing the stage.
Created attachment 351373 [details] [review] wayland: Use weak ref to keep track of buffers Instead of using a one-shot signal meant to happen then the buffer is being destroyed.
Created attachment 351374 [details] [review] wayland: Use notify::allocation notification to update surface outputs Instead of updating the surface outputs on each actor ::paint.
Created attachment 351375 [details] [review] compositor: Remove MetaSurfaceActorWayland::painting signal It's now unused.
Created attachment 351376 [details] [review] clutter: Add ClutterPaintVolume argument to ClutterActor::queue_redraw This is an ABI break, hopefully an unimportant one since this signal/vmethod is barely overridden. The signal has been added an extra ClutterPaintVolume argument, and has been given a boolean return value. The recursion to the parents has been taken out of the default implementation and into the caller, using the returned boolean parameter to control further propagation. Passing the ClutterPaintVolume is easier on performance, as we don't need setting this pointer as gobject data just to retrieve/unset it further in propagation.
Created attachment 351377 [details] [review] compositor: Avoid changing pipeline/source if shadow is not being painted Avoids some context invalidations in cogl.
Created attachment 351378 [details] [review] compositor: Use queue_redraw vmethod over repaint-scheduled signal
Created attachment 351379 [details] [review] compositor: Remove MetaSurfaceActor::repaint-scheduled signal It's unused now.
Created attachment 351380 [details] [review] cogl: Ensure to only clear the depth buffer if depth testing is enabled The depth buffer is marked as invalid when 1) the framebuffer is just created, and 2) whenever GL_DEPTH_TEST is enabled on it. This will ensure the framebuffers attached depth buffer (if any) is properly cleared before it's actually used, while saving needless clears while depth testing is disabled (the default).
Review of attachment 351373 [details] [review]: This one doesn't look right. The 'resource-destroyed' signal is not emitted when the MetaWaylandBuffer is destroyed but the wl_resource of the wl_buffer is destroyed. Each MetaWaylandSurface who has a committed active buffer has its own reference to the MetaWaylandBuffer, and we still access the "buffer" via the texture associated with it, possibly after the wl_resource is destroyed (e.g. when it was a shm buffer, but we still fetch the texture). FWIW, this signal is only ever listened to during the time the buffer is pending, and the 'resource-destroy' handling is for handling situations where clients destroy the buffer after it was attached but before it was committed.
Review of attachment 351364 [details] [review]: looks good to me.
Review of attachment 351366 [details] [review]: lgtm.
Review of attachment 351372 [details] [review]: lgtm. should get rid of MUTTER_STAGE_VIEWS one day too.
Review of attachment 351374 [details] [review]: ::: src/wayland/meta-wayland-surface.c @@ +1286,3 @@ +surface_actor_property_notify (MetaSurfaceActorWayland *surface_actor, + GParamSpec *pspec, + MetaWaylandSurface *surface) not a fan of catch-all style functions with no information about the intention or activity in the name. It might be fewer lines, but looking at the function alone I'll have no idea why it does what it does.
Review of attachment 351367 [details] [review]: Lgtm.
Review of attachment 351375 [details] [review]: Still one user left: MetaWaylandPointerConstraint. But it can probably also use allocation.
(In reply to Jonas Ådahl from comment #22) > Review of attachment 351373 [details] [review] [review]: > > This one doesn't look right. The 'resource-destroyed' signal is not emitted > when the MetaWaylandBuffer is destroyed but the wl_resource of the wl_buffer > is destroyed. > > Each MetaWaylandSurface who has a committed active buffer has its own > reference to the MetaWaylandBuffer, and we still access the "buffer" via the > texture associated with it, possibly after the wl_resource is destroyed > (e.g. when it was a shm buffer, but we still fetch the texture). I see. It seems I missed the extra refs being taken. > > FWIW, this signal is only ever listened to during the time the buffer is > pending, and the 'resource-destroy' handling is for handling situations > where clients destroy the buffer after it was attached but before it was > committed. Hmm, it's not callback execution itself, but the frequent g_signal_connect/disconnect looked mildly hot on profiles. As per my testcase it probably comes from the xterm window (which barely redraws when running perf), so I'm not totally clear why. (In reply to Jonas Ådahl from comment #26) > Review of attachment 351374 [details] [review] [review]: > > ::: src/wayland/meta-wayland-surface.c > @@ +1286,3 @@ > +surface_actor_property_notify (MetaSurfaceActorWayland *surface_actor, > + GParamSpec *pspec, > + MetaWaylandSurface *surface) > > not a fan of catch-all style functions with no information about the > intention or activity in the name. It might be fewer lines, but looking at > the function alone I'll have no idea why it does what it does. Sure, will add 2 callbacks here.
Attachment 351364 [details] pushed as d0bfb94 - backends/native: Avoid generic closure marshaler for page flip handling Attachment 351366 [details] pushed as d6d01c8 - wayland: Collect frame completion time once for all surfaces Attachment 351367 [details] pushed as 3887d25 - ClutterStage: Use non-generic marshaller for ::presented signal Attachment 351372 [details] pushed as f2309cd - backends: Store MUTTER_STAGE_VIEWS envvar content just once
Created attachment 351421 [details] [review] wayland: Use notify::allocation notification to update surface outputs Instead of updating the surface outputs on each actor ::paint.
Created attachment 351422 [details] [review] wayland: Use notify::allocation to update pointer as per confinements There is no need to constraint the pointer to the confinement on each redraw if the surface actor didn't move/resize.
Review of attachment 351422 [details] [review]: lgtm.
Review of attachment 351421 [details] [review]: lgtm.
Review of attachment 351375 [details] [review]: lgtm, now that the constraints are updated.
Review of attachment 351373 [details] [review]: changing this to 'needs-work', as to get rid of the connect/disconnect overhead, we can use something like wl_signal (or bake our own).
Review of attachment 351360 [details] [review]: lgtm.
Review of attachment 351361 [details] [review]: makes sense.
Review of attachment 351363 [details] [review]: lgtm.
Review of attachment 351365 [details] [review]: Makes sense.
Review of attachment 351378 [details] [review]: ::: src/compositor/meta-window-actor.c @@ +203,3 @@ + + priv->repaint_scheduled = TRUE; + return CLUTTER_ACTOR_CLASS (meta_window_actor_parent_class)->queue_redraw (actor, leaf, paint_volume); nit: very long line, and (super nit) wouldn't mind a empty line before return. @@ -410,3 @@ g_object_ref_sink (priv->surface); - priv->repaint_scheduled_id = g_signal_connect (priv->surface, "repaint-scheduled", - G_CALLBACK (surface_repaint_scheduled), self); Are we really guaranteed to have meta_window_actor_queue_redraw() called each time a surface redraw is queued? If it'd be the other way around, I would understand, as queuing a redraw on an actor also queues it on all its children, but not the other way around right?
(In reply to Jonas Ådahl from comment #41) > Review of attachment 351378 [details] [review] [review]: > > ::: src/compositor/meta-window-actor.c > @@ +203,3 @@ > + > + priv->repaint_scheduled = TRUE; > + return CLUTTER_ACTOR_CLASS (meta_window_actor_parent_class)->queue_redraw > (actor, leaf, paint_volume); > > nit: very long line, and (super nit) wouldn't mind a empty line before > return. Oh sure, can change this in place. > > @@ -410,3 @@ > g_object_ref_sink (priv->surface); > - priv->repaint_scheduled_id = g_signal_connect (priv->surface, > "repaint-scheduled", > - G_CALLBACK > (surface_repaint_scheduled), self); > > Are we really guaranteed to have meta_window_actor_queue_redraw() called > each time a surface redraw is queued? If it'd be the other way around, I > would understand, as queuing a redraw on an actor also queues it on all its > children, but not the other way around right? queue_redraw is propagated all the way up to the stage, which then uses the "source" actor to determine the repaint area. As the surfaces are children of the window actor, I expect it to get notified along the way. I also played with removing this recursive behavior and invalidating the right portions in the stage right away, although the signal removed in this patch would be necessary then.
Attachment 351360 [details] pushed as 4b4c2b1 - ClutterActor: Avoid frequent signal emission if possible Attachment 351361 [details] pushed as bc041e0 - cogl: Mark vertex buffers as dynamic Attachment 351363 [details] pushed as d620189 - ClutterActor: Call queue_redraw vfunc directly if possible Attachment 351365 [details] pushed as 5cb5baa - ClutterActor: Optimize away idempotent scale/position updates Attachment 351375 [details] pushed as 27b949d - compositor: Remove MetaSurfaceActorWayland::painting signal Attachment 351421 [details] pushed as cf1edff - wayland: Use notify::allocation notification to update surface outputs Attachment 351422 [details] pushed as 27ea62a - wayland: Use notify::allocation to update pointer as per confinements
(In reply to Carlos Garnacho from comment #42) > (In reply to Jonas Ådahl from comment #41) > > Review of attachment 351378 [details] [review] [review] [review]: > > > > @@ -410,3 @@ > > g_object_ref_sink (priv->surface); > > - priv->repaint_scheduled_id = g_signal_connect (priv->surface, > > "repaint-scheduled", > > - G_CALLBACK > > (surface_repaint_scheduled), self); > > > > Are we really guaranteed to have meta_window_actor_queue_redraw() called > > each time a surface redraw is queued? If it'd be the other way around, I > > would understand, as queuing a redraw on an actor also queues it on all its > > children, but not the other way around right? > > queue_redraw is propagated all the way up to the stage, which then uses the > "source" actor to determine the repaint area. As the surfaces are children > of the window actor, I expect it to get notified along the way. > True. Next concern: AFAICS the ->queue_redraw() will happen on next _clutter_stage_do_update() as a side effect of the clutter_stage_maybe_finish_queue_redraws() call. This means that before update_texture() -> meta_window_actor_queue_frame_drawn() would work as expected, but now there is a a race where its not guaranteed that _clutter_stage_do_update() will happen before meta_window_actor_queue_frame_drawn().
Curious: are those patches targetted to land/backport in the 3.24.x series or is that only 3.26.x? (secretly hoping to see that in Fedora 26 :)
(In reply to Jean-François Fortin Tam from comment #45) > Curious: are those patches targetted to land/backport in the 3.24.x series > or is that only 3.26.x? (secretly hoping to see that in Fedora 26 :) Sorry to crush your hopes :), the patches bringing the most substantial improvements involve quite hairy changes, including an ABI break. I think this is only master material...
These three commits breaks wl_surface.enter/leave when moving: wayland: Use notify::allocation notification to update surface outputs wayland: Use notify::allocation to update pointer as per confinements compositor: Remove MetaSurfaceActorWayland::painting signal (well the confinement one doesn't but it probably has the same issue) The issue is that allocation only changes on size changes, not on just moving. notify::position doesn't work either sadly. What can we listen on that is triggered when moving the window?
(In reply to Carlos Garnacho from comment #6) > Created attachment 351365 [details] [review] [review] > ClutterActor: Optimize away idempotent scale/position updates > > If the actor results in the same scale/position, there's no need to > trigger a transition. Looks like this commit broke Epiphany, bug #788140.
(In reply to Michael Catanzaro from comment #48) > (In reply to Carlos Garnacho from comment #6) > > Created attachment 351365 [details] [review] [review] [review] > > ClutterActor: Optimize away idempotent scale/position updates > > > > If the actor results in the same scale/position, there's no need to > > trigger a transition. > > Looks like this commit broke Epiphany, bug #788140. FWIW, there was bug #784314 before that :). It does seem to me like the commit uncovered a bug elsewhere though, I promise I'll revert on gnome-3-26 after branching if I don't get to a nice fix before.
(In reply to Carlos Garnacho from comment #49) > FWIW, there was bug #784314 before that :). It does seem to me like the > commit uncovered a bug elsewhere though, I promise I'll revert on gnome-3-26 > after branching if I don't get to a nice fix before. I'll look forward to seeing this bug reappear in GNOME 3.28 then. :P
*** Bug 788444 has been marked as a duplicate of this bug. ***
Created attachment 361181 [details] [review] wayland: Trigger wl_output updates on actor position changes Both notify::position on the surface actor and position-changed on MetaWindow are listened to, in order to trigger wl_output updates for wl_surfaces whenever the surfaces move across them. Both signals are necessary in order to cater for subsurface and toplevel relocations (Because it's the parent window actor what changes position in this case). Also, shuffle signal disconnection, so each signal goes away with the reference in MetaWaylandSurface to the connected object.
Created attachment 361182 [details] [review] wayland: Update pointer confinement on surface actor relocations In the unlikely case that a surface is moved by the compositor while holding a pointer confinement, we also need to update the pointer position when the surface actor gets moved.
These last 2 patches fix the issues pointed out on comment #47
Review of attachment 361182 [details] [review]: We previously did this on paint, meaning we also cought allocation changes.
Review of attachment 361181 [details] [review]: lgtm.
(In reply to Jonas Ådahl from comment #55) > Review of attachment 361182 [details] [review] [review]: > > We previously did this on paint, meaning we also cought allocation changes. FTR after some IRC chat we're fine with this patch. Allocation changes are already handled in current code. I'm pushing it as-is.
Attachment 361181 [details] pushed as 08e4cb5 - wayland: Trigger wl_output updates on actor position changes Attachment 361182 [details] pushed as 01de04d - wayland: Update pointer confinement on surface actor relocations
attachment 351362 [details] [review] sounds like it could have a serious impact on reducing CPU usage, based on what I've seen while profiling gnome-shell. Recalculating paint volumes uses a significant amount of CPU time. Maybe that and other patches need some attention again.
Carlos, everyone, Do you see the uncommitted patches here still being useful? And so should this bug stay open?
@Daniel van Vugt: Carlos just pointed me to attachment 351368 [details] [review] and attachment 351369 [details] [review] to reduce overdraw. I'll have a look into them as they could possibly hold quite substantial power savings, especially as more apps start to use the Wayland APIs to reduce their power usage (mainly thinking about Firefox here - the Webrender backend will hopefully make use of it sometime soon).
GNOME is going to shut down bugzilla.gnome.org in favor of gitlab.gnome.org. As part of that, we are mass-closing older open tickets in bugzilla.gnome.org which have not seen updates for a longer time (resources are unfortunately quite limited so not every ticket can get handled). If you can still reproduce the situation described in this ticket in a recent and supported software version, then please follow https://wiki.gnome.org/GettingInTouch/BugReportingGuidelines and create a new ticket at https://gitlab.gnome.org/GNOME/mutter/-/issues/ Thank you for your understanding and your help.