GNOME Bugzilla – Bug 732350
Add perf/hwtest.js script and dependent facilities
Last modified: 2014-07-16 18:29:58 UTC
This adds a new performance test perf/hwtest.js and some facilities is depends upon. It differs from perf/core.js in that: * It tests gedit start times, and potentially in the future might script more parts of the GNOME desktop - it doesn't just test GNOME Shell. * It's meant to include a small carefully curated set of tests - instead of a broad-based set of numbers showing many things. This is so that perf.gnome.org will remain understandable. * It is better in some ways - since I wrote core.js I decided that measuring frame rates is a very coarse tool and the more interesting things is how long Mutter takes to draw a frame. My vague idea is that we should rework things in the future so that here is a "library" of tests that we can run in different combinations rather than two entirely separate scripts. But I do think having two different test sets on the toplevel makes sense.
Created attachment 279413 [details] [review] shell-perf-helper: Add an option for continual redraws Add an option for windows created with Scripting.createTestWindow() to continually redraw themselves; this is for testing performance of application updates.
Created attachment 279414 [details] [review] ShellPerfLog: Use monotonic timestamps for logging Use g_get_monotonic_time() (that is, CLOCK_MONOTONIC) timestamps, rather than gettimeofday() values.
Created attachment 279415 [details] [review] Use the EXT_timer_query GL extension to perf-log paint completion times The time from when we start processing a new frame to the time when the GPU is done drawing that frame determines our frame rate. The EXT_timer_query GL extension allows us to start an asynchronous query after drawing the frame and later collect a timestamp for when the GPU got to that point in the command stream. Perf log these timestamps to allow determining the frame drawing times.
Created attachment 279416 [details] [review] Add a new hwtest script Add a performance test script that is used by the GNOME Hardware Testing Project to measure metrics to be reported to perf.gnome.org.
Created attachment 279417 [details] [review] Add --hwtest option to gnome-shell-perf-helper Add a --hwtest option to gnome-shell-perf-helper which runs the tests in perf/hwtest.js with the appropriate environment, and then logs the results using the 'gnome-hwtest-log' utility which is available in the hardware testing environent. (For development of hwtest.js in a normal environment, run the tests as: gnome-shell-perf-tool --perf=hwtest --extra-filter=Gedit)
Created attachment 279418 [details] [review] Make frame-timestamp logging optional Instead of always logging frame timestamps for every frame - which even before the recent additions was using >26 bytes of memory per frame, or 5MB per hour of continuous redrawing - make frame timestamps something that is turned on using a new ShellGlobal::frame-timestamps property by the perf scripts.
Created attachment 279419 [details] [review] perf: Add fallback for paintCompletedTimestamp without ARB_timer_query If we don't have ARB_timer_query, then the only way to find out when GPU rendering finished is to glFinish(). Since this is better than nothing, add ShellGlobal::frame-timer-fallbacks property and enable it when figuring out rendering times in perf/hwtest.js. It should not be enabled when perf/core.js is measuring frame rates, since frame rates will be distorted.
Review of attachment 279413 [details] [review]: OK.
Review of attachment 279414 [details] [review]: OK.
Review of attachment 279414 [details] [review]: .
Review of attachment 279415 [details] [review]: ::: src/shell-global.c @@ +767,3 @@ + +#define TIMESTAMP 0x8E28 Is there some reference doc or header you can point to for this value?
Review of attachment 279416 [details] [review]: No detailed review, looks OK from a skim. Going to assume you got everything right.
Review of attachment 279417 [details] [review]: OK.
(In reply to comment #11) > Review of attachment 279415 [details] [review]: > > ::: src/shell-global.c > @@ +767,3 @@ > > + > +#define TIMESTAMP 0x8E28 > > Is there some reference doc or header you can point to for this value? Well its defined in the spec (https://www.opengl.org/registry/specs/ARB/timer_query.txt) but also in glext.h in (in a recent enough mesa) as GL_TIMESTAMP. So probably should just be: #include <GL/glext.h> ... #ifndef GL_TIMESTAMP #define GL_TIMESTAMP 0x8E28 #endif and then just use GL_TIMESTAMP
(In reply to comment #14) > (In reply to comment #11) > > Review of attachment 279415 [details] [review] [details]: > > > > ::: src/shell-global.c > > @@ +767,3 @@ > > > > + > > +#define TIMESTAMP 0x8E28 > > > > Is there some reference doc or header you can point to for this value? > > Well its defined in the spec > (https://www.opengl.org/registry/specs/ARB/timer_query.txt) but also in glext.h > in (in a recent enough mesa) as GL_TIMESTAMP. > > So probably should just be: > > #include <GL/glext.h> > ... > #ifndef GL_TIMESTAMP > #define GL_TIMESTAMP 0x8E28 > #endif > > and then just use GL_TIMESTAMP I don't think this has any advantage over what's in the code currently. It's not like GL_TIMESTAMP from glext.h could have some other value. On the other hand, just using glext.h is probably fine - GL_TIMESTAMP was added to glext.h just over 4 years ago, and the chances of anybody successfully compiling GNOME 3.14 against a system with an older mesa are low. Proprietary GL stacks will generally have GL_TIMESTAMP in the core (added as of GL 3.3) I'll go that way.
(In reply to comment #15) > > On the other hand, just using glext.h is probably fine - GL_TIMESTAMP was added > to glext.h just over 4 years ago, and the chances of anybody successfully > compiling GNOME 3.14 against a system with an older mesa are low. Proprietary > GL stacks will generally have GL_TIMESTAMP in the core (added as of GL 3.3) > > I'll go that way. OK that makes sense I didn't know when it was added but generally when extensions are added to mesa the constants are added as well (by updating the headers to the recent ones from Khronos).
Review of attachment 279418 [details] [review]: OK. You should probably mention that it defaults to FALSE now.
Review of attachment 279419 [details] [review]: Can you mention a few systems that don't have ARB_timer_query in the comment or commit message?
Created attachment 279639 [details] [review] Use the old-fashioned GL_EXTENSIONS way to check for extensions We were checking for ARB_timer_query using glGetStringi(GL_EXTENSIONS, i) but that is specific to GL 3, which meant that drivers without full GL 2 support showed up as not supporting the extension, even if they do. Use glGetString(GL_EXTENSIONS) instead to support all versions of GL.
Review of attachment 279639 [details] [review]: OK.
Review of attachment 279639 [details] [review]: I assume you mean "without full GL 3" support in the commit message.
Attachment 279413 [details] pushed as a7f8274 - shell-perf-helper: Add an option for continual redraws Attachment 279414 [details] pushed as f0d4260 - ShellPerfLog: Use monotonic timestamps for logging
Comment on attachment 279418 [details] [review] Make frame-timestamp logging optional Attachment 279418 [details] pushed as e309259 - Make frame-timestamp logging optional
Created attachment 280843 [details] [review] perf: add a ShellGlobal property to call glFinish and log the time ARB timer query turned out to be a bust - parallelization of adjacent frames means knowing when a frame completes on the GPU doesn't give you reliable benchmark information. Go to something simpler. ==== It's useful to know how long frames are taking to render on the GPU. This is impossible to measure in the normal case because frames may parallelize with previous frames, but by calling glFinish() at the end of the frame, we can create a (somewhat artificial, but useful) environment where we have a meaningful timestamp for the frame finishing drawing.
Created attachment 280844 [details] [review] Add a new hwtest script Add a performance test script that is used by the GNOME Hardware Testing Project to measure metrics to be reported to perf.gnome.org.
Created attachment 280845 [details] [review] Add --hwtest option to gnome-shell-perf-helper Add a --hwtest option to gnome-shell-perf-helper which runs the tests in perf/hwtest.js with the appropriate environment, and then logs the results using the 'gnome-hwtest-log' utility which is available in the hardware testing environent. (For development of hwtest.js in a normal environment, run the tests as: gnome-shell-perf-tool --perf=hwtest --extra-filter=Gedit)
Created attachment 280846 [details] [review] Connect to ClutterStage::after-paint with connect_after() Use connect_after() to accomodate code in GNOME Shell that when benchmarking drawing performance connects to ::after-paint and calls glFinish(). The timing information from that will be more accurate if we hold off until that completes before we signal apps to begin drawing the next frame. If there are no other connections to ::after-paint, connect() vs. connect_after() doesn't matter.
Review of attachment 280843 [details] [review]: OK.
Pushing hwtest.js patches based on reviews of previous versions; no significant changes here. Attachment 280843 [details] pushed as 9fff972 - perf: add a ShellGlobal property to call glFinish and log the time Attachment 280844 [details] pushed as 4e56af3 - Add a new hwtest script Attachment 280845 [details] pushed as c6350aa - Add --hwtest option to gnome-shell-perf-helper
Review of attachment 280846 [details] [review]: OK. ::: src/compositor/compositor.c @@ +631,3 @@ + /* We use connect_after() here to accomodate code in GNOME Shell that + * when benchmarking drawing performance connects to ::after-paint "that when benchmarking" is a bit awkward phrasing, and the newline in the way confused me a bit.
Review of attachment 280844 [details] [review]: Looks good, just one comment re journal parsing (see below). ::: js/perf/hwtest.js @@ +89,3 @@ + break; + + let m = /.*"__MONOTONIC_TIMESTAMP".*"([0-9]+)"/.exec(line); This is odd we have a JSON output and we parse it with a regex instead of simply using JSON.parse() ... it is more efficient this way if you stop reading after having what you wanted so probably should just terminate the loop after finding the timestamp. @@ +117,3 @@ + Scripting.scriptEvent('desktopShown'); + + Gtk.Settings.get_default().gtk_enable_animations = false; Any specific reason for this?
(In reply to comment #30) > Review of attachment 280846 [details] [review]: > > OK. > > ::: src/compositor/compositor.c > @@ +631,3 @@ > > + /* We use connect_after() here to accomodate code in GNOME Shell that > + * when benchmarking drawing performance connects to ::after-paint > > "that when benchmarking" is a bit awkward phrasing, and the newline in the way > confused me a bit. I'll add commas "that, when benchmarking drawing performance, .. ". Hopefully that will help :-)
(In reply to comment #31) > Review of attachment 280844 [details] [review]: > > Looks good, just one comment re journal parsing (see below). > > ::: js/perf/hwtest.js > @@ +89,3 @@ > + break; > + > + let m = /.*"__MONOTONIC_TIMESTAMP".*"([0-9]+)"/.exec(line); > > This is odd we have a JSON output and we parse it with a regex instead of > simply using JSON.parse() ... it is more efficient this way > if you stop reading after having what you wanted so probably should just > terminate the loop after finding the timestamp. At this point I have no idea why I wrote it this way. I may have temporarily forgot about JSON.parse() :-) Efficiency doesn't matter here - the journald command returns one line for each time the systemd graphical target was started in this boot - this is exactly one record in the benchmark case and would never be more than a few records. I'll look at cleaning this up to use JSON.parse() > @@ +117,3 @@ > + Scripting.scriptEvent('desktopShown'); > + > + Gtk.Settings.get_default().gtk_enable_animations = false; > > Any specific reason for this? We don't reliably have animations enabled - we turn them off for software rendering, in particular, so measuring the time to the overview or the time to applications showing doesn't given you comparable numbers if you have animations on. Also it's possible that the designers might tweak the animation lengths, and we don't want that to obscure performance changes. This means that we don't catch it if there's a regression where the animations themselves cause things to be slower or faster, but most things that will cause getting to the overview to be slow will show up with or without animations enabled. Basically just reducing extraneous noise in the numbers.
Created attachment 280879 [details] [review] hwtest.js: Don't parse JSON with regexps Use JSON.parse() to parse systemd log records, rather than using regexps.
(In reply to comment #33) > > @@ +117,3 @@ > > + Scripting.scriptEvent('desktopShown'); > > + > > + Gtk.Settings.get_default().gtk_enable_animations = false; > > > > Any specific reason for this? > > We don't reliably have animations enabled - we turn them off for software > rendering, in particular, so measuring the time to the overview or the time to > applications showing doesn't given you comparable numbers if you have > animations on. > > Also it's possible that the designers might tweak the animation lengths, and we > don't want that to obscure performance changes. > > This means that we don't catch it if there's a regression where the animations > themselves cause things to be slower or faster, but most things that will cause > getting to the overview to be slow will show up with or without animations > enabled. > > Basically just reducing extraneous noise in the numbers. OK makes sense.
Review of attachment 280879 [details] [review]: OK, that's better ;)
Comment on attachment 280846 [details] [review] Connect to ClutterStage::after-paint with connect_after() Attachment 280846 [details] pushed as 182a267 - Connect to ClutterStage::after-paint with connect_after()
Attachment 280879 [details] pushed as 46c86e0 - hwtest.js: Don't parse JSON with regexps