GNOME Bugzilla – Bug 309152
compositor speedup
Last modified: 2006-04-25 18:21:55 UTC
I'm trying to get metacity's compositor as fast as xcompmgr. the first patch and a description of it will follow.
Created attachment 48426 [details] [review] do not generate buffer_picture all the time this stores buffer_picture on screen, and only recomputes it when the root screen changes, just like xcompmgr does it. with this patch compositor runs visibly faster, but still slower than xcompmgr.
Created attachment 48427 [details] [review] correct expose region start coordinates set region start coordinates to event start coordinates, like xcompmgr does. it seems obvious, but maybe there is a reason why the region should start at (0, 0) that I didn't notice? this patch produces no visible speedup.
Created attachment 48479 [details] [review] do not generate buffer_picture all the time the previous patch was using screen without initializing it. This one changes it to go through all screens in order to find the correct screen.
Created attachment 48480 [details] [review] do not draw on idle, just on timeout this patch makes compositor as fast as xcompmgr. It turns out the g_idle_add was killing it: too much drawing slows it down. This patch makes it draw at 80Hz, that should be enough, and it really is fast now. If this is acceptable, the name of the function should be changed, and compositor->repair_idle and remove_repair_idle should be removed.
Comment on attachment 48480 [details] [review] do not draw on idle, just on timeout This makes sense but should clean out the references to the idle as you mention.
Thanks for these patches, I knew it was slow for some simple reasons but nobody had ever bothered to figure it out ;-)
Created attachment 48512 [details] [review] do not draw on idle, just on timeout OK, same as previous with references to idle removed
Just a guess... Duarte, you don't have a CVS account, right?
Right. Not that I would do much with one, probably.. This is the first set of non-trivial patches I'm trying to contribute.
A question for you: if we have the compositing manager working OK, we'd probably want to turn it on by default when the user's X server has the right extensions and they are fast enough to use. How can we tell whether they are fast enough? We need to ask the X.org guys what to key off of for "alpha blending is fast enough to be usable" or else metacity would have to do some crazy thing like run a benchmark on startup.
Soeren also has some fixes in the 'spiffifity' branch in CVS, it would be good to fold those together with these changes in the appropriate way and then land all of them.
I'm trying to run through the accepted patches and making sure they're committed (or marked as such in bugzilla if they have been), but I'm not sure whether we want this on HEAD, the spiffifity branch, or both. Soeren, Havoc: what would be best?
Created attachment 49075 [details] [review] correct expose region start coordinates This is the region done right, taken from xcompmgr.c It's a bit more complex, and has no speed improvement, so maybe that was the reason for exposing the whole screen in the first place? anyway, here goes.
in response to comment #10: I have no idea, as far as I know there is no API to do that... Is this 'spiffifity' branch also about making compositor fast, or about adding new features to it?
The spiffifity branch contains two things: A much faster version of the current metacity compositing manager. You can try it out by running cvs -z3 upd -r spiffifity -D 050225 I would appreciate comments on how this stacks up against metacity+your patches. One trick to get much much better Composite performance: Add Option "XaaNoOffScreenPixmaps" "Yes" to the "Device" section of your xorg.conf file. The other thing the spiffifity branch contains is a tower of hacks making it possible to draw the windows using GL. This is what you get if you do just "cvs upd -r spiffifity" without the "-D 050225". Some specific comments on your patches "Do not create buffer picture all the time": The only reason I can see that this would be faster is because you avoid filling the backing pixmap before drawing. But you need to fill it, because otherwise the areas that aren't covered by a window will be drawn with whatever the last thing you drew was. The reason you are probably not seeing any problems, is that you are likely running nautilus, which covers the entire desktop with a big window. The real fix to this problem is to carefully avoid drawing parts of windows that are covered by other (opaque) windows. The spiffifity code does this. "Timeout instead of idle" There is no reason this in itself should have any effect at all. It's not like you can get any more work done by throttling or anything. What is likely going on, is that a redraw is getting queued out of the idle handler so that metacity will keep spinning and process any incoming actual damage much more slowly as a result. I remember fixing something like this in the spiffifity branch, but I don't remember the details.
I checked out that branch (without the GL stuff), and tested both side by side, dragging a gvim window in front of some gnome-terminals. At first, metacity + my patches ran visibly faster and never used more than 8% CPU, while the spiffifity branch was using ~70% CPU and made windows lag behind. One particularly bad case is scrolling maximized gvim windows. Then I reversed the buffer_picture patch - you are right, I always used to test with nautilus on the background, and didn't notice that problem. Anyway, my advantage was gone: the performance dropped to one comparable to spiffifity, certainly lower because of the other fixes you mention. But the point is that generating the picture all the time is a real CPU hit. > The real fix to this problem is to carefully avoid drawing parts of windows > that are covered by other (opaque) windows. The spiffifity code does this. That's also a good improvement, but I think that's something else. If it's possible to just re-fill the damaged regions in buffer_picture, that would be the real fix. (I won't have much time to try it until next week, unfortunately) "Timeout instead of idle" Yes, that must be it. My patch was just a work around then, and I didn't see the real problem. I noticed there's only an idle and not a timeout in spiffifity, while in metacity they're both there - so maybe simply removing the timeout will do it. (for the record, just to make sure, I switched the idle to a timeout in spiffifity and sure enough, there were no performance gains ;)
> possible to just re-fill the damaged regions in buffer_picture, that would be > the real fix. (I won't have much time to try it until next week, unfortunately) After some investigation, I found the code that does this in xcompmgr, and copied it over to metacity + my patch: it solves the problem and the performance is still fine. I also tried to put it in spiffifity but I'm still unfamiliar with that code and couldn't get it to work. I'm confident it should make a very visible difference. Making the patch into acceptable code is going to take a while though. For reference, the relevant xcompmgr function is: root_tile()
Created attachment 49257 [details] [review] buffer_picture, fixed Here is the patch (for main branch) with the fix previously described. It's a bit ugly (static background_props and direct XGetProperty() calls), but it works. Should I target the spiffity branch instead from now on?
If there's only an idle and no timeout, isn't indefinite backlog/starvation a possibility if there's something constantly causing screen damage?
Comment on attachment 49257 [details] [review] buffer_picture, fixed Minor comment on this patch, root_tile could be bogus so you have to be sure there's an error trap when referring to it.
> If there's only an idle and no timeout, isn't indefinite backlog/starvation a > possibility if there's something constantly causing screen damage? Only if the damage is generated faster than metacity can read events. With an idle loop, all that happens is 1. read stuff off X connection, queue idle handler if necessary 2. run idle handler So if damage arrives faster than we can read() events, we will lose, but that doesn't happen in practice. And if it did, we would also lose with a timeout.
I got the insane idea to try to port XFWM4's composite manager to Metacity yesterday as I got some problems using the NET_WM_WINDOW_OPACITY property (see bug #150493). Davyd pointed me to the spif2 branch of MC, I built it, tried running, but it fcks up my screen (everything becomes green and pink). Do you think it's a good idea to attempt this? Or isn't it worth the effort, am I missing some great existing, working composite manager for MC?
Created attachment 56485 [details] [review] Combined composite patch for current HEAD Same patch as patch #56484, but against current HEAD. Some differences as HEAD != 2.12.2
Created attachment 56486 [details] [review] Composite patch for 2.12.2 Correction of patch #56484, sorry for the wrong upload.
Ok, I screwed up a little here, so I guess I should make clear what these lasts patches are: They both combine the patches provided here and in bug 310080. Next to this, some minor build fixes were added where necessary. Patch 56485 is against current HEAD Patch 56486 is against 2.12.2 Both versions were tested succesfully using closed nvidia drivers on xorg7.0. Speed is reasonable.
Reassigning to the compositor component so that Soeren can find these easier.
This is basically obsolete now that the compositor is based on GL.