GNOME Bugzilla – Bug 347406
temperature monitoring
Last modified: 2010-10-18 08:47:02 UTC
I think a logical extension of power monitoring is temperature monitoring. The two are directly related; the only unknown is the presence of cooling. So I think it might make sense for this to be part of gpm. A few use cases off the top of my head:
* Device is approaching a temperature limit: pop up a warning
* Device has reached a temperature limit: power down the device
Thoughts?
Sane. I've always thought stuff like this belonged in HAL, rather than the rather odd lm_sensors.conf thing.
Looking through the HAL spec, I see that we can tell whether a sensor exists, but I'm not sure how we can get data from it. Additionally, HAL doesn't expose any sensors on my Thinkpad T42 or my desktop with an nForce chipset, so I'm not sure how widespread those keys are.
http://webcvs.freedesktop.org/hal/hal/doc/spec/hal-spec.html?revision=1.78.2.1#device-properties-sensor
I'll take a look around and see how the temperature monitor applet does it, for instance.
Once we have the data, I'd think we could add a section under the Notification tab in the preferences where the user can opt in or out of "Temperature Warnings". Hopefully we'll be able to determine some sane defaults so that we don't have to ask people what temperature they would like to be warned at for each device. Perhaps we could take a running average from the device for the first few ticks, and then later warn if it varies by a certain factor.
Performing actions based on the temperature would be step two, so I think we'd want to get plenty of testing with just notifications before we bother implementing "shutdown on temp".
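To make the running-average idea concrete, here is a minimal sketch in Python. It is not taken from gpm or HAL: the function name, the `warmup` count and the `factor` of 1.25 are all assumptions for illustration, and it only flags readings above the baseline.

```python
def baseline_warnings(readings, warmup=5, factor=1.25):
    """Yield (index, reading) for readings above factor * baseline.

    The baseline is the mean of the first `warmup` readings. Both
    `warmup` and `factor` are hypothetical defaults, not tested values.
    """
    if warmup <= 0:
        raise ValueError("warmup must be positive")
    baseline_sum = 0.0
    for index, temp in enumerate(readings):
        if index < warmup:
            # Still learning what "normal" looks like for this device.
            baseline_sum += temp
            continue
        baseline = baseline_sum / warmup
        if temp > baseline * factor:
            yield index, temp
```

With a baseline of 40 C and a factor of 1.25, only readings above 50 C would trigger a warning. Whether a fixed factor gives sane defaults across devices is exactly the open question above.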
On certain well-supported hardware, this is low-hanging fruit: I don't know how I could safely test it, but my IBM exposes exactly what we need under /proc/acpi/thermal_zone/*, including current and unsafe temperatures and state ('ok', etc). See here: http://acpi.sourceforge.net/documentation/thermal.html . For a start, we could report whenever the state is not 'ok', or when the current temperature approaches or passes the unsafe temperature.
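A rough sketch of that check, in Python for brevity. It assumes each thermal zone directory has `state`, `temperature` and `trip_points` files whose lines look like `state: ok`, `temperature: 45 C` and `critical (S5): 95 C`; that layout is an assumption based on the linked documentation, and the function names and the 5-degree `margin` are hypothetical.

```python
import re


def celsius_on_line(text, label):
    """Return the Celsius value on the first line starting with label, or None."""
    for line in text.splitlines():
        if line.strip().startswith(label):
            match = re.search(r"(-?\d+)\s*C\b", line)
            if match:
                return int(match.group(1))
    return None


def check_zone(state_text, temperature_text, trip_points_text, margin=5):
    """Classify one thermal zone from the contents of its three files.

    Returns "over-limit", "near-limit", "state-not-ok" or "ok".
    The file formats are assumed, not verified against every kernel.
    """
    state = state_text.split(":", 1)[-1].strip()
    current = celsius_on_line(temperature_text, "temperature")
    critical = celsius_on_line(trip_points_text, "critical")
    if current is not None and critical is not None:
        if current >= critical:
            return "over-limit"
        if current >= critical - margin:
            return "near-limit"
    if state != "ok":
        return "state-not-ok"
    return "ok"
```

Because the function takes the file contents as strings, it can be exercised with sample text and never needs an overheating machine to test.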
(In reply to comment #2)
> Looking through the HAL spec, I see that we can tell whether a sensor exists,
> but I'm not sure how we can get data from it. Additionally, HAL doesn't expose
> any sensors on my Thinkpad T42 or my desktop with an nForce chipset, so I'm not sure how widespread those keys are.
I wrote that lump of spec :-) There aren't any probers or addons currently looking for sensors and adding them to the HAL db; it was just waiting for someone with a bit of free time. Does ACPI also expose the sensors in sysfs? /proc/acpi is going away real soon.
Richard.
The only mention of (thermal|temp|sensor) in my entire /sys/ directory is a section that lists 'thermal' as a module in /sys/module. It has some version information and a refcount, but no data. Is sysfs where HAL gets its data currently?
It should only use sysfs, but due to the kernel being broken we also use /proc/acpi and /dev/pmu etc. It might be worth having a look online and asking on LKML before we start hacking on this, or we could just hack up the addon and then change it later when the sysfs stuff comes online.
I was sitting in class looking at the HAL acpi addon, and I just want to make sure I understand: the addon monitors the /proc/acpi/events file or connects to acpid if it can be found. As new events come in, it looks for ones that it recognizes (battery, ac_adapter and button at the moment), and uses DBUS to add them to the HAL device list? Shouldn't it be as simple as also asking it to check for thermal_zone changes, which it can then update straight into HAL?
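As an illustration of the filtering described above (not the addon's actual code, which is C), here is a small Python sketch. It assumes each event line has four whitespace-separated fields (device class, bus id, and two hex numbers for type and data), as in `button/power PWRF 00000080 00000001`. Both that format and the `thermal_zone` class name are assumptions that would need checking against the kernel.

```python
# Classes the addon reportedly handles today, plus the proposed addition.
HANDLED_CLASSES = {"battery", "ac_adapter", "button", "thermal_zone"}


def parse_acpi_event(line):
    """Parse one event line into a dict, or return None if it is not handled.

    The four-field layout is an assumed format, not a verified one.
    """
    fields = line.split()
    if len(fields) != 4:
        return None
    device_class, bus_id, event_type, data = fields
    # "button/power" and "button/lid" both belong to the "button" class.
    kind = device_class.split("/", 1)[0]
    if kind not in HANDLED_CLASSES:
        return None
    try:
        return {
            "class": kind,
            "bus_id": bus_id,
            "type": int(event_type, 16),
            "data": int(data, 16),
        }
    except ValueError:
        return None
```

On a recognised `thermal_zone` event, the addon would then re-read that zone and push the new values into HAL.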
Looking through the issue some more, it became clear that the first step would be to get a spec drawn up for how these new keys would be added to HAL. Here is a first draft: http://www.hoodidge.net/development/thermal_zone_v1.html
It's separate from the sensors namespace that you created, Richard, because in the context of ACPI, a thermal zone is an abstract device, possibly made up of many sensors on many physical devices. I'm especially interested in figuring out how I could clarify the different trip_points. Since there are 7 possible actions, it's on the border of being excessive to have them all listed, but the alternative would be to have developers parse the strings to determine the S-level, temperature and action. Additionally, what other methods could be useful? I've been going off of the information on this page: http://acpi.sourceforge.net/documentation/thermal.html . Thanks!
Dude, post this to the HAL list, hal@lists.freedesktop.org. My initial impression is that it looks good, but the HAL list will give it a good grilling.
Richard.
I'll close this one for now. The gnome-sensors-applet seems to work well for me, and seeing that g-p-m is session based it's a poor choice for doing temperature policy. This is something that can be handled in OHM or PPM.
I was just looking for exactly this issue and I'm sad to see it is closed. My laptop shut off on me a few minutes ago, and it took me 10 minutes to dig through logs (as root!) to figure out what went wrong. The problem showed up in my kernel log:
Sep 17 15:47:44 horatio kernel: [ 364.622292] thinkpad_acpi: THERMAL EMERGENCY: a sensor reports something is extremely hot!
Sep 17 15:47:44 horatio kernel: [ 364.624287] thinkpad_acpi: temperatures (Celsius): 97 35 33 N/A 50 N/A 26 N/A 37 37 47 N/A N/A N/A N/A N/A
Sep 17 15:47:44 horatio kernel: [ 364.625663] Critical temperature reached (101 C), shutting down.
Could gnome-power-manager not detect that event somehow and pop up a message? For something that critical I think it might even warrant a whole-screen overlayed message telling me what is happening, seeing as the computer will only be on for another 5 seconds or so...
Is there any chance of reconsidering this feature? I don't necessarily want full temperature sensor support, but it would at least be nice to detect ACPI thermal emergency events.
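For what it's worth, spotting these two messages is simple once something can read the kernel log. The Python sketch below matches only the exact message texts quoted above; the function name and return shape are made up for illustration, and how a session daemon would get at the kernel log in time is left open.

```python
import re

# Patterns taken from the log lines quoted in this comment.
EMERGENCY_RE = re.compile(r"thinkpad_acpi: THERMAL EMERGENCY")
CRITICAL_RE = re.compile(r"Critical temperature reached \((\d+) C\)")


def thermal_alerts(lines):
    """Return a list of (kind, celsius_or_None) for alert lines found."""
    alerts = []
    for line in lines:
        critical = CRITICAL_RE.search(line)
        if critical:
            alerts.append(("critical", int(critical.group(1))))
        elif EMERGENCY_RE.search(line):
            alerts.append(("emergency", None))
    return alerts
```

Other drivers and kernel versions likely word these messages differently, so matching on log text is fragile compared with a proper event.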
(In reply to comment #11)
> Could gnome-power-manager not detect that event somehow and pop up a message?
> For something that critical I think it might even warrant a whole-screen
> overlayed message telling me what is happening, seeing as the computer will
> only be on for another 5 seconds or so...
>
> Is there any chance of reconsidering this feature? I don't necessarily want
> full temperature sensor support, but it would at least be nice to detect ACPI
> thermal emergency events.
I think this is exactly the kind of functionality you could add to gnome-settings-daemon as a plugin.
Richard.