GNOME Bugzilla – Bug 618189
Framework for performance measurement
Last modified: 2010-05-22 12:53:49 UTC
We should have a way of quantifying the shell performance - for extracting reproducible numbers for the smoothness of operation, then latency when clicking on "Activities", the total memory usage, and so forth. It would also be nice, at that same time, to be able to a time series data over a single run of the shell that can be viewed and explored interactively or compared with other runs of the shell. This set of patches adds: A performance event log A facility for scripting the shell to generate performance events, and then replaying the log to collect "metrics" (This is closely related to bug 613048, but I'm filing it separately to avoid confusing two somewhat different approaches. I'm planning to extend this patch in the future to be able to register "statistics" - which are functions that measure parameters like memory usage and store the result into the event log.)
Created attachment 160645 [details] [review] Add "leisure function" capability To support scheduling performance-measurement scripts that want to run a number of actions in series, add shell_global_run_at_leisure() to run a callback when all work is finished. The initial implementation of this is not that accurate: we track business in Tweener.js via new shell_global_begin_work(), shell_global_end_work() functions, and we also handle the case where the main loop is continually busy.
Created attachment 160646 [details] [review] Add ShellPerfLog ShellPerfLog provides a way for different parts of the code to record information for subsequent analysis and interactive exploration. Events exist of a timestamp, an event ID, and arguments to the event.
Created attachment 160647 [details] [review] [perf] Add events before and after stage paint Add performance events: clutter.stagePageStart clutter.stagePageDone to track frame repaints.
Created attachment 160648 [details] [review] [perf] Add a facility for automated performance measurement We want to be able to summarize the behavior of the shell's performance in a series of "metrics", like the latency between clicking on the Activities button and seeing a response. This patch adds the ability to create a script under perf/ in a special format that automates a series of actions in the shell, writing events to the performance log, then collects statistics as the log as replayed and turns them into a set of metrics. The script is then executed by running as gnome-shell --perf=<script>. The 'core' script which is added here will be the primary performance measurement script that we use for shell performance regression testing; right now it has a couple of placeholder metrics.
Created attachment 160856 [details] [review] [perf] Add ShellPerfLog Various fixes and cleanups compared to the last version
Created attachment 160857 [details] [review] [perf] Add events before and after stage paint Reattaching to keep things in order, no changes
Created attachment 160858 [details] [review] [perf] Add a facility for automated performance measurement Small fix for scripts throwing exceptions compared to last version
Created attachment 160859 [details] [review] [perf] Add "statistics" functionality to ShellPerfLog Add a facility for defining and recording numeric statistics as performance events shell_perf_log_define_statistic() Define a statistic and the corresponding event type shell_perf_log_update_statistic_[ix]() - Update the stored statistic value shell_perf_log_add_statistics_callback() Add a function called before collecting statistics shell_perf_collect_statistics() Update statistics and record events for them In addition to the explicit collection, statistics are recorded at a periodic interval (currently 5s)
Created attachment 160860 [details] [review] [perf] Add simple malloc statistics and metrics Add some basic statistics for allocated memory based on mallinfo(), and use that to define two metrics: usedAfterOverview: bytes used after the overview is shown once leakedAfterOverview: additional bytes used when the overview is shown a second time.
Created attachment 160869 [details] [review] Move shell-running machinery to a function Move the code that starts gnome-shell and waits for it to exit to a function in preparation for running it multiple times when doing iterations of a performance test.
Created attachment 160870 [details] [review] [perf] Allow running multiple iterations of a performance test Add gnome-shell options: --perf-iters=ITERS" Numbers of iterations of performance module to run --perf-warmup Run a dry run before performance tests Make a successful run of a performance test return 0 not non-zero, and handle the difference between that and a 0-exit in normal usage (meaning replaced) in the wrapper.
Created attachment 160931 [details] [review] Add shell_write_string_to_stream() Add a helper function to write a string as UTF-8 to a GOutputStream. The signature of GOutputStream: gboolean g_output_stream_write_all (GOutputStream *stream, const void *buffer, gsize count, gsize *bytes_written, GCancellable *cancellable, GError **error); Can't currently be handled by GJS.
Created attachment 160932 [details] [review] [perf] Dump a complete report from a performance run, as JSON When SHELL_PERF_OUTPUT is set, instead of just dumping out the metrics, dump a more complete report with: - Event descriptions - Metric descriptions and value - Event log Helper functions shell_perf_log_dump_events() and shell_perf_log_dump_log() are added to ShellPerfLog to support this. The gnome-shell wrapper is adapted to deal with the changed report format.
Review of attachment 160931 [details] [review]: We should consider adding this to Gio, but looks fine.
Created attachment 160936 [details] [review] [perf] Add an option to dump a complete performance report Add --perf-output=<filename> option to gnome-shell that combines the reports written for each run by the C/Javascript code into a complete report. If this option is not specified, a brief human-readable summary is printed to stdout instead.
Review of attachment 160645 [details] [review]: This looks fine in general. ::: src/shell-global.c @@ +1418,3 @@ + * timed individually. Using this function for other purposes will + * interfere with the ability to use it for performance measurement so + * should be avoided. If it shouldn't be used for anything else, should it be called shell_global_queue_perf_function or something?
Review of attachment 160856 [details] [review]: Looks fine, just minor comments. ::: src/shell-perf-log.c @@ +269,3 @@ + + perf_log->last_time = event_time; + Should we be setting last_time after the check for oversize events? @@ +337,3 @@ + + record_event (perf_log, get_time(), event, + (const guchar *)&arg, 4); This is totally minor, but I think the "4" would be better written as sizeof(arg) to avoid any copy&paste issues later.
Review of attachment 160857 [details] [review]: Looks fine
Created attachment 161243 [details] [review] [perf] Add units to metrics definitions Switch from having separate METRICS and METRIC_DESCRIPTIONS objects in a perf module to a single METRICS array. This is done so the perf module can define the units for each metric. In addition to improving the output in the web interface, the purpose of having units is to give some clue about how to pick from multiple values from different runs. In particular, with the assumption that "noise" on the system will increase run times, for time values we want to pick the smallest values, while for "rate" values, we want to pick the largest value.
Review of attachment 160858 [details] [review]: ::: js/perf/core.js @@ +10,3 @@ + +const Main = imports.ui.main; +/* -*- mode: js2; js2-basic-offset: 4; indent-tabs-mode: nil -*- */ If we're going to be timing this, I'd like to get data on number of installed applications at least. ::: js/ui/scripting.js @@ +19,3 @@ + +const GLib = imports.gi.GLib; +/* -*- mode: js2; js2-basic-offset: 4; indent-tabs-mode: nil -*- */ This is quite clever! However unlike a lot of other things, this will mean tricky code reworking if we wanted to use an engine other than SpiderMonkey. Not something to worry about quite yet I guess though. @@ +138,3 @@ + + let contents = JSON.stringify(result); + GLib.file_set_contents(outputFile, contents, contents.length); This isn't actually right, since contents.length is the UTF-16 character count, not bytes. Use your ShellGlobal gio wrapper instead?
Review of attachment 160859 [details] [review]: ::: src/shell-perf-log.c @@ +114,3 @@ + * triggered. + * events are enabled. Statistics collection can also be explicitly +/* Number of milliseconds between periodic statistics collection when I like to add explicit unit suffixes to constants like this, i.e. _MS. Though you might consider changing it to 5 and using g_timeout_add_seconds below. @@ +193,3 @@ +{ +statistics_timeout (gpointer data) +static gboolean An interesting statistic would be how long statistics collection took, which could be hooked in here. @@ +220,3 @@ + perf_log->enabled = enabled; + { + if (enabled != perf_log->enabled) Would like a perf_log->statistics_timeout_id = 0 here even if correctness doesn't depend on it now.
Review of attachment 160860 [details] [review]: ::: src/gnome-shell-plugin.c @@ +241,3 @@ +{ + gpointer data) +malloc_statistics_callback (ShellPerfLog *perf_log, I don't know if this is implemented outside of glibc; may need an autoconf test.
Review of attachment 160869 [details] [review]: Looks fine.
Review of attachment 160870 [details] [review]: ::: js/ui/scripting.js @@ +204,3 @@ _collect(scriptModule, outputFile); function() { + Meta.exit(Meta.ExitCode.SUCCESS); Heh, I was going to comment on that earlier, but then thought "eh...doesn't matter"
Review of attachment 160932 [details] [review]: Looks good. ::: src/shell-perf-log.c @@ +774,3 @@ +escape_quotes (const char *input) +static char * + This isn't going to handle UTF-8, but oh well. I guess we can just say don't put any snowmen in your event description.
Review of attachment 160936 [details] [review]: Ok.
Review of attachment 161243 [details] [review]: Ok
Created attachment 161394 [details] [review] [perf] Include monitor layout in performance reports Add extra key, 'monitors' to performance reports which is an array of strings: *<width>x<height>+<x>+<y> Where * marks the primary monitor.
(In reply to comment #16) > Review of attachment 160645 [details] [review]: > > This looks fine in general. > > ::: src/shell-global.c > @@ +1418,3 @@ > + * timed individually. Using this function for other purposes will > + * interfere with the ability to use it for performance measurement so > + * should be avoided. > > If it shouldn't be used for anything else, should it be called > shell_global_queue_perf_function or something? You may have a point there, but I think it just gets too long-winded if we say *both* what the function does and what it is for, and I don't think it's an improvement to make it just say what it's for without describing what the function actually does. That is, I don't want to call it: shell_global_add_leisure_function_for_perf_script() And I don't want to call it: shell_global_queue_function_for_perf_script() So, I'm going to leave it as is.
(In reply to comment #17) > Review of attachment 160856 [details] [review]: > > Looks fine, just minor comments. > > ::: src/shell-perf-log.c > @@ +269,3 @@ > + > + perf_log->last_time = event_time; > + > > Should we be setting last_time after the check for oversize events? Good catch. > @@ +337,3 @@ > + > + record_event (perf_log, get_time(), event, > + (const guchar *)&arg, 4); > > This is totally minor, but I think the "4" would be better written as > sizeof(arg) to avoid any copy&paste issues later. Yeah, better that way.
(In reply to comment #21) > Review of attachment 160858 [details] [review]: > > ::: js/perf/core.js > @@ +10,3 @@ > + > +const Main = imports.ui.main; > +/* -*- mode: js2; js2-basic-offset: 4; indent-tabs-mode: nil -*- */ > > If we're going to be timing this, I'd like to get data on number of installed > applications at least. My current workaround is just to measure hot-cache numbers which takes most of the worst part of the delta out of the picture. But yes, there has to be some way of getting apples-to-apples comparisons with different numbers of applications and, as importantly, different numbers of files. > ::: js/ui/scripting.js > @@ +19,3 @@ > + > +const GLib = imports.gi.GLib; > +/* -*- mode: js2; js2-basic-offset: 4; indent-tabs-mode: nil -*- */ > > This is quite clever! However unlike a lot of other things, this will mean > tricky code reworking if we wanted to use an engine other than SpiderMonkey. > Not something to worry about quite yet I guess though. I'm not going to change it right now, but when I thought it over, I couldn't see any real problems with just recursing the main loop. You can't do that immediately from main.js, since we want to return and letter Mutter run the main loop after initialization finishes, but at initialization we could add a leisure function, then the leisure function could run the script, recursing the main loop as it needs to wait. Not as clever, but not as Spidermonkey specific. > @@ +138,3 @@ > + > + let contents = JSON.stringify(result); > + GLib.file_set_contents(outputFile, contents, contents.length); > > This isn't actually right, since contents.length is the UTF-16 character count, > not bytes. Use your ShellGlobal gio wrapper instead? This got rewritten to do that in a later patch, so I'm not going to bother fixing up the intermediate state here. (And it pretty much is always ASCII anyways.)
(In reply to comment #23) > Review of attachment 160860 [details] [review]: > > ::: src/gnome-shell-plugin.c > @@ +241,3 @@ > +{ > + gpointer data) > +malloc_statistics_callback (ShellPerfLog *perf_log, > > I don't know if this is implemented outside of glibc; may need an autoconf > test. Dan commented to that effect on your usage of mallinfo(), I think I'm going to let the person who hits the compilation failure write the checks - partly because I'm more worried about having struct mallinfo have different fields or different meaning for fields than I am about it being missing entirely. (It has some pre-glibc heritage. (The glibc manual describes it as "This function and its associated data type are declared in `malloc.h'; they are an extension of the standard SVID/XPG version")
(In reply to comment #22) > Review of attachment 160859 [details] [review]: > > ::: src/shell-perf-log.c > @@ +114,3 @@ > + * triggered. > + * events are enabled. Statistics collection can also be explicitly > +/* Number of milliseconds between periodic statistics collection when > > I like to add explicit unit suffixes to constants like this, i.e. _MS. Though > you might consider changing it to 5 and using g_timeout_add_seconds below. Fixed. > @@ +193,3 @@ > +{ > +statistics_timeout (gpointer data) > +static gboolean > > An interesting statistic would be how long statistics collection took, which > could be hooked in here. Yeah, it's probably good to catch if we start doing something too expensive there. Rather than make it a statistic, I made in a argument to the statisticsCollected event. Interestingly, even the single mallinfo() call is a bit "slow" - it takes about 50us to call mallinfo() on this system. > @@ +220,3 @@ > + perf_log->enabled = enabled; > + { > + if (enabled != perf_log->enabled) > > Would like a perf_log->statistics_timeout_id = 0 here even if correctness > doesn't depend on it now. I assume you mean in the else where we remove it. Yeah, definitely cleaner that way.
(In reply to comment #26) > Review of attachment 160932 [details] [review]: > > Looks good. > > ::: src/shell-perf-log.c > @@ +774,3 @@ > +escape_quotes (const char *input) > +static char * > + > > This isn't going to handle UTF-8, but oh well. I guess we can just say don't > put any snowmen in your event description. Handles UTF-8 fine. If you simply pass non-ASCII bytes through from input to output, UTF-8 will be preserved.
Attachment 160645 [details] pushed as a9a513c - Add "leisure function" capability Attachment 160869 [details] pushed as 5d0536d - Move shell-running machinery to a function Attachment 160931 [details] pushed as e4e92a2 - Add shell_write_string_to_stream()
Created attachment 161638 [details] [review] [perf] Add --perf-upload option to gnome-shell Add a command line option to upload the generated performance report to a web service. The options for the upload (url, system name, secret key) are read from ~/.config/gnome-shell/perf.ini.
Review of attachment 161394 [details] [review]: yeah, looks right
Review of attachment 161638 [details] [review]: ::: src/gnome-shell.in @@ +294,3 @@ + try: + config = RawConfigParser() + f = open(os.path.expanduser("~/.config/gnome-shell/perf.ini")) haven't we fixed the rest of shell to obey g_get_user_config_dir() now? @@ +302,3 @@ + secret_key = config.get('upload', 'key') + except Exception, e: + print "Can't read upload configuration from ~/.config/gnome-shell/perf.in: " + str(e) missing "i" at the end of "perf.ini" @@ +305,3 @@ + sys.exit(1) + + # Determine host, port and upload URL from provided data what is the point of all of this, as opposed to just doing upload_url = base_url + '/system/%s/upload' % system_name ? If we're worried that the user can't copy it correctly into perf.ini, then aren't they also going to mess up the private key? Also, we can just provide them a complete already-written perf.ini file (via email or download) rather than providing them with the data and making them write the file themselves. @@ +352,3 @@ + 'User-Agent': 'gnome-shell', + 'Content-Type': 'application/json', + 'X-Shell-Signature': 'HMAC-SHA1, ' + signature no comma please. ("Header: foo, bar" is defined to be equivalent to "Header: foo\r\nHeader: bar")
Created attachment 161659 [details] [review] [perf] Add --perf-upload option to gnome-shell New version: * Honors XDG_CONFIG_HOME (open coded since the glib function isn't bound in pygobject and dont' want to add a pyxdg dep) * Fixes perf.in typo * Removes comma from X-Shell-Signature header * Leaves over-complex URI manipulation in place, though simplified a bit by removing the half-baked HTTPS support. Some justification for complexity: - httplib requires host and port, so we have to parse them out - non-default ports and not / installs were useful in my testing. - The URL is in the signature so without normalization a trailing / or a stray :80 could leave the user banging their head against "Bad Signature" errors for no apparent reason (And yes, the instructions mail does include the full perf.ini file)
Attachment 161394 [details] pushed as e722059 - [perf] Include monitor layout in performance reports Attachment 161659 [details] pushed as 4859eb6 - [perf] Add --perf-upload option to gnome-shell
Created attachment 161716 [details] [review] Fix missing scripting.js in Makefile.am This file was introduced by commit e722059, but not added to the files to install, leading to an exception and failure to start.
I was bitten by this this morning. Commit at fault is 98c2247, not e722059 as I said. OK to commit?
(In reply to comment #43) > I was bitten by this this morning. Commit at fault is 98c2247, not e722059 as I > said. OK to commit? Note that this fix is already included in the patch in bug 619276, so I'd rather see that one committed.
Ah, OK, I dismissed that message too quickly. ;-)