GNOME Bugzilla – Bug 353942
New infrastructure to allow approximate timeouts (goal: Power saving)
Last modified: 2007-08-10 23:39:57 UTC
The patch that I will attach adds a variant of g_timeout_add to the glib API, g_timeout_add_approx(). The goal of the _approx version is to allow the application to indicate that the timeout is an approximate value, and that the application doesn't care a great deal about the exact time this timer will fire. The implementation in the patch uses this "no need to be exact" information to schedule the timer only on "whole wallclock seconds". The effect of this is that applications that have multiple such timers running in parallel will have less context switches/wakeups because the timers will naturally group together this way into one wakeup moment. The effect is bigger than just the single app; on a system level, all such timers from all glib using applications will wake up around the same time in the second. This is an advantage for the following reason: (ascii art of 1 second, the X denotes a wakeup of any app) <current situation; random wakeups during the second> 0 1 |-----X----X-------XX----X----X----| <with patch 0 1 |XX--------------------------------| in the <with patch> situation the processor and the system can sleep for a much longer time, which allows for much greater power savings than the continuously waking up case. This is useful for power savings, but also for the virtualization case where frequent wakeups are also a cause for performance costs.
Created attachment 72061 [details] [review] Patch to implement described functionality
A testcase with 3 timers (1200, 2100 and 2200) shows the following behavior without the patch: 10:32:11.372810 poll(0, 0, 1159) = 0 10:32:12.532066 poll(0, 0, 801) = 0 10:32:13.334045 poll(0, 0, 198) = 0 10:32:13.533012 poll(0, 0, 199) = 0 10:32:13.733045 poll(0, 0, 1199) = 0 10:32:14.933029 poll(0, 0, 402) = 0 10:32:15.336025 poll(0, 0, 396) = 0 10:32:15.733032 poll(0, 0, 400) = 0 10:32:16.134030 poll(0, 0, 1199) = 0 10:32:17.334027 poll(0, 0, 1) = 0 10:32:17.336022 poll(0, 0, 597) = 0 10:32:17.934080 poll(0, 0, 600) = 0 and with the patch: 10:33:11.001025 poll(0, 0, 999) = 0 10:33:12.001060 poll(0, 0, 999) = 0 10:33:13.001056 poll(0, 0, 999) = 0 10:33:14.001061 poll(0, 0, 999) = 0 10:33:15.001053 poll(0, 0, 999) = 0 10:33:16.001028 poll(0, 0, 999) = 0 10:33:17.001060 poll(0, 0, 999) = 0 the difference is clear: in the patched case there are fewer wakeups in total, and they are farther appart on average.
Arjan: the change to the header adds a prototype for g_source_new_approx whereas the body contains the definition for g_timeout_source_new_approx. And you need to add the functions to the list of exported symbols in glib.symbols
Created attachment 72072 [details] [review] updated patch based on the feedback good points; updated patch attached
I suspect we'd want 'approximate' rather than 'approx' for the naming. add_full_approximate() probably should be add_approximate_full(), or I'd probably just call it add_approximate(), if we aren't going to have a variant without the destroy notifier. + if (timeout_source->expiration.tv_usec >= 250000) + timeout_source->expiration.tv_sec++; + timeout_source->expiration.tv_usec = 0; That could use a comment explaining the 0.25. Of course, the docs really need a little more explaination of what the idea behind the function is; when would the caller use it? I think I'd actually want to explain a bit about the current algorithm as well, to let people make the tradeoff in a more informed fashion. If you know the algorithm, then it's clear it would be bad to use it for a cursor that was blinking every 1.5 seconds since it would go: blink blink blink blink blink blink But that wouldn't necessarily be obvious from a more generic description. And maybe a link from the normal g_timeout_add() if we think that people should frequently use approximate timeouts instead. [ I might imagine that long-term it would be good to have a per-computer random time within the second to fire timeouts rather than always firing at usecs=0, since a common thing that is going to be done in timeouts is access network resources. And NTP is pretty common these days. ]
I agree with Owen on the need for a more docs, and also on the idea to add a "phase" on which to synchronize the timers one.
If I were using this function I'd be nervous about the "1 second" part - is that guaranteed, or could it become 5 seconds; or should it be 5 seconds, since maybe my timeout is for 1 minute or 30 minutes and I'm willing to be even more approximate? Does it make any sense to have a "resolution" argument, or perhaps pick it from a percentage of timeout length: 0-10 second timeout - sync on the second boundary 10-100 second timeout - sync on the 10 second boundary etc. Then again, I can imagine that if a timeout only runs infrequently, it doesn't matter a lot anyway for CPU powersaving issues. Anyway, I would think the docs should offer some level of guarantee on "maximum approximation" i.e. a 1200 millisecond timeout shouldn't run every 10 seconds.
do we need to differentiate between allowing to approximate the phase, and allowing to approximate the frequency ? As in, "I need this to run once per second, but I don't care if it starts exactly one second from now"
I have made provisional patches for various gnome components to see/show how these would use it (and they make basically all recurring timers in my system aligned to the full second). There are two basic cases: the "10 second is my dbus link still alive" poll, which really couldn't care less if it was 9.1 or 10.99 seconds. The other case is the "roughly once a second" thing; they don't care about the phase at all, but want roughly at least second granularity. Examples are the gtk "poll for recent files" 2 second timer, the gnome-screensaver "once per second ask for the cursor position" and even the 10 second gtk clipboard timeout. I don't think we need to add a parameter for the resolution; all users really only care about full seconds so documenting this as "rounds to an appropriate second" would be more than enough. This isn't appropriate for cursors or other "I really care about 1.2 seconds and not 1 second" users, but that's fine, they can use the existing API. Many other users however really only want some sort of watchdog like behavior.
Created attachment 72084 [details] [review] dummy patch to show how the API would be used This patch shows some likely places on where to use the new API (as you can see there is only uses of the simple variant of the API)
(I agree about the per computer skew; that will also be useful for virtualization purposes. But it really ought to be per machine, not per process; something like a hash of the hostname or a hash of the IP or MAC addresses)
as for percentage, we could define this as "1 second or 10%, whichever is largest" so that the multi-minute timers (which I've not seen yet in the gnome stack) could be aligned more liberal.
One of the problems Arjan ran into was an application which uses the python bindings gobject that implements a timeout using a custom GSource. It seems that it won't be possible to integrate that with the proposed patch. I don't know GSource in detail, but it would help if this patch could be implemented in such a way that custom sources can say that they want to allow the timeouts to be synchronized/approximated. The source for the timeout source can be found here, http://cvs.gnome.org/viewcvs/pygobject/gobject/pygmainloop.c?rev=1.7&view=markup pyg_signal_watch_prepare & pyg_signal_watch_new are the interesting parts.
pyg_signal_watch_prepare() could do this logic itself; it may make sense to make a helper function for that I suppose; it's not rocket science though; gettimeofday() + simple math.
A couple asides not strictly related to this bug: - I don't get that "dbus watcher" thing in NetworkManager, a) dbus connection should only be dropped if the session dies (or for the system dbus, should never be dropped at all) b) in any case there's a notification if the connection is dropped, polling is not needed - a per-machine unique ID is something we really should have, even if it gets changed/reset on each boot it would still be useful; here's the necessary init script: test -e /etc/host-uuid || uuidgen > /etc/host-uuid That could then be hashed to get the timeout phase. For purposes of timeout phase a per-user value is probably good enough though, since most machines will be single-user. A simple solution you could add to this patch: const char *session_bus_address = getenv("DBUS_SESSION_BUS_ADDRESS"); int phase; if (session_bus_address) phase = g_str_hash(session_bus_address) % 1000; else phase = 0; That would work 99% of the time, with no huge catastrophe when it doesn't. (The session bus address can be used as a "session id" and has some random chars in it)
before respinning the patch based on this feedback I'd like closure on the naming. 1) Based on the usage I'm fine with not providing a "full" version of the API, it has no projeted users as far as I can see; all existing code that does long timeouts uses the simple API. 2) we have so far A) g_timeout_add_approx B) g_timeout_add_approximate C) g_timeout_add_seconds [*] [*] this variant would also take seconds not miliseconds as argument; it makes it specific that the granularity is seconds and we can document that the phase of the timer is up to glib.
Created attachment 72092 [details] [review] Patch that uses g_timeout_add_seconds() as name This patch implements the name g_timeout_add_seconds(), using whole second granularity. The gdoc comment documentation is made more explicit. The per system offset is implemented as per Havocs suggestion The internal implementation allows for variable granularity although only second granularity is exposed via the external API currently. But it's very easy to add different (or even variable) granularity APIs in the future should the need arise.
Generally looks fine to me; I slightly dislike the implicit dbus dependency, but its not really worth coming up with a different solution. I think we should avoid introducing too much unexplained terminology in the docs, like granularity and phase. Owen, Havoc, do you agree with this api ?
Created attachment 72121 [details] [review] better description How's this for a description? Changes since previous version: * fall back to hashing the hostname if the DBUS idenifier doesn't exist * reworded the g_timeout_add_seconds() gdoc comment
(as an aside, the NetworkManager "poll for dbus" disease seems to have spread to gnome-screensaver too; if this is really unnneeded someone needs to eradicate this before more projects copy this code in a "eh maybe we need this too" way)
Would checking for timeouts that are about to expire be useful? E.g. @@ -3385,6 +3385,8 @@ g_timeout_prepare (GSource *source, { msec = MIN (G_MAXINT, (guint)msec + 1000 * (guint)sec); } + if (msec < timeout_source->granularity) + msec = 0; } *timeout = (gint)msec; @@ -3399,6 +3401,8 @@ g_timeout_check (GSource *source) GTimeoutSource *timeout_source = (GTimeoutSource *)source; g_source_get_current_time (source, ¤t_time); + if (timeout_source->granularity) + g_time_val_add (¤t_time, -timeout_source->granularity) return ((timeout_source->expiration.tv_sec < current_time.tv_sec) || ((timeout_source->expiration.tv_sec == current_time.tv_sec) &&
you have to be really careful with that; consider the following scenario: A whole-second timer "A" set to once per second A normal timer "B" set to once every 10 msec. Both A and B fire at time 0 and both rearm then time is 10ms B fires again as it should but A is now within a second of it's supposed-to-fire time, so it fires again. Both rearm for their next time (20ms and 1 second 10ms) at time is 20ms B fires again as it should A is *again* within the range it should fire after the rearming at 10ms.. so this probably has the effect of changing the frequency of A dramatically. (in addition, while a nice idea in general, so far measurements I've done don't seem to indicate that there's any of this really going on in practice, just about all recurring timers on a normal gnome desktop at least end up being of the second class, and those group together already)
I'm OK with _seconds() as a name; it doesn't convey everything you might want to know about the function, but: g_timeout_add_i_only_care_about_approximate_seconds() Would be a little long :-).
Some questions that came up while reading this: This would probably be the API that the clock applet should use for updating the displayed time. 1) I know that some people are sensitive about the clock ticking at the exact second and not half a second late. Would that work? 2) How about ticking at the exact minute if the clock has seconds disabled?
Company: see https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=204862 for a solution for that issue
* glib/glib.symbols: * glib/gmain.[hc]: Add functions to create approximate timeouts. (#353942, Arjan van de Ven)
(In reply to comment #11) > (I agree about the per computer skew; that will also be useful for > virtualization purposes. But it really ought to be per machine, not per > process; something like a hash of the hostname or a hash of the IP or MAC > addresses) For virtualization purposes, it actually should be per the virtualization host machine, since that would allow the guests to wake up at the same time, meaning the host gets to sleep longer, consume less power and generate less heat.