After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 575572 - gnome-settings-daemon/libgdu notification support (S.M.A.R.T. warnings, RAID array degradation)
gnome-settings-daemon/libgdu notification support (S.M.A.R.T. warnings, RAID ...
Status: RESOLVED FIXED
Product: gnome-disk-utility
Classification: Core
Component: general
unspecified
Other Linux
: Normal enhancement
: ---
Assigned To: gnome-disk-utility-maint
Depends on:
Blocks:
 
 
Reported: 2009-03-16 15:59 UTC by Tomas Bzatek
Modified: 2009-04-12 01:26 UTC
See Also:
GNOME target: ---
GNOME version: Unversioned Enhancement


Attachments
gnome-settings-daemon-2.25.92-gdu-2.patch (26.98 KB, patch)
2009-03-16 16:00 UTC, Tomas Bzatek
needs-work Details | Review
0001-gnome-settings-daemon-plugin.patch (28.82 KB, patch)
2009-03-17 16:02 UTC, Tomas Bzatek
needs-work Details | Review
0001-gnome-settings-daemon-plugin.patch (29.14 KB, patch)
2009-03-19 13:28 UTC, Tomas Bzatek
needs-work Details | Review

Description Tomas Bzatek 2009-03-16 15:59:34 UTC
This patch adds libgdu (gnome-disk-utility) integration to Gnome desktop. It monitors hard drives and RAID arrays and notifies user when something goes wrong (hardware failure, mdadm kicking out mirror from array, etc.)

TODO:
 - delayed write alert (how can we detect that?)
 - the "Investigate" button should spawn palimpsest with the selected (failing) device -- needs support in palimpsest
Comment 1 Tomas Bzatek 2009-03-16 16:00:39 UTC
Created attachment 130757 [details] [review]
gnome-settings-daemon-2.25.92-gdu-2.patch

Proposed patch against final g-s-d-2.26.0
Requires libgdu-0.2
Comment 2 Tomas Bzatek 2009-03-16 16:31:08 UTC
Changing to gnome-disk-utility component, we agreed to make this as external g-s-d plugin. Will post new patch soon.
Comment 3 Tomas Bzatek 2009-03-17 16:02:38 UTC
Created attachment 130835 [details] [review]
0001-gnome-settings-daemon-plugin.patch

Ported to gnome-disk-utility source tree, g-s-d is an optional dependency now.
Comment 4 Matthias Clasen 2009-03-17 17:48:06 UTC
+  GsdGDUManager *manager;   /*  TODO: I don't like circular references  */
+} GsdGduWatch;

Minor nit: inconsistent capitalization, GsdGDU vs GsdGdu
Comment 5 Matthias Clasen 2009-03-17 17:49:59 UTC
+  if (program)
+    g_free (program);
+  if (icon_text)
+    g_free (icon_text);


Another pedantic comment: no need for the branches here, g_free handles NULL fine.
Comment 6 Matthias Clasen 2009-03-17 17:52:52 UTC
+static void
+do_investigate (GsdGduWatch *watch)
+{
+  g_spawn_command_line_async (DISK_MANAGEMENT_UTILITY, NULL);
+
+  if (watch->status_icon != NULL) {
+    g_object_unref (watch->status_icon);
+    watch->status_icon = NULL;
+  }
+}

Longer-term, it would be nice to launch the ui in a way that focuses the right disk/partition/whatever was causing the notification in the first place.
Comment 7 Matthias Clasen 2009-03-17 18:29:27 UTC
+static void
+monitored_list_populate (GsdGDUManager *manager)
+{
+  GsdGDUManagerPrivate *p = manager->priv;
+  GList *presentables, *l;
+
+  presentables = gdu_pool_get_presentables (p->pool);
+
+  for (l = presentables; l != NULL; l = l->next) {
+    if (l->data)
+      add_monitor (manager, l->data);
+  }
+
+  g_list_foreach (presentables, (GFunc) g_object_unref, NULL);
+  g_list_free (presentables);
+}

Is this something that takes time ? If so, we might consider delaying it somewhat to not slow down login...
Comment 8 Tomas Bzatek 2009-03-19 13:28:11 UTC
Created attachment 130964 [details] [review]
0001-gnome-settings-daemon-plugin.patch

(In reply to comment #6)
> Longer-term, it would be nice to launch the ui in a way that focuses the right
> disk/partition/whatever was causing the notification in the first place.

This was on my TODO list. Filed bug 575948 for tracking.
 
(In reply to comment #7)
> Is this something that takes time ? If so, we might consider delaying it
> somewhat to not slow down login...

Not necessarily, no signs of that in documentation (libgdu should keep this list cached). Anyway, it's a good idea to postpone monitoring start.

Forgot to mention last time: when plugin starts, it checks devices health upon start. So any present failures will show up one minute after desktop session starts.

(In reply to comment #4)
> Minor nit: inconsistent capitalization, GsdGDU vs GsdGdu
Fixed.

(In reply to comment #5)
> Another pedantic comment: no need for the branches here, g_free handles NULL
> fine.
Fixed.
Comment 9 Matthias Clasen 2009-03-22 13:37:49 UTC
You could use g_timeout_add_seconds for the STARTUP_DELAY

The 10 seconds seem a bit short for the duplicate warning check ?


We probably need to work on the wording of these messages a bit:

"S.M.A.R.T. subsystem reported hardware failure. There's high risk of data loss."

"The array is running in degraded mode. There's a chance of data loss when device fails."


I don't have a perfect wording right now, but maybe something like:

"One of your drives is reporting hardware probems. There is a high risk of data loss."

"Your RAID array is running in degraded mode. There is a chance of data loss when a drive fails."



Other than that, the patch looks good enough for an initial commit to me. David ?
Comment 10 David Zeuthen (not reading bugmail) 2009-03-24 00:51:19 UTC
Hi,

A few comments

 - coding style doesn't match the rest of the module and also appear
   to be inconsistent (in -plugin.c vs. -manager.c)

   - see src/gdu/*.[ch] for details and please copy the Modeline too, e.g.

   /* -*- Mode: C; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 8 -*- */

 - hmm, maybe this is a stupid question but how do you actually get the
   code to run? Does it involve logging in/out all the time? Sending a
   HUP signal to the gnome-settings-daemon process took down all apps
   in my session.

   So maybe we just want this to be a separate daemon instead since

    - libgdu may be slightly unstable in the next six months and
      we don't want to crash sessions; and

    - it's going to make hacking on this much harder.

   So let's go with a separate daemon for now. Suggest to put the code
   in the src/notification/ sub-directory.

 - I'm not a big fan of libnotify and I don't think it should be the
   primary UI for this. I think instead we want a notification icon
   instead (the base user experience) and then *maybe* add libnotify
   support later.

   The user experience we want should be

    - no failing devices => no icon

    - failing devices -> icon. Tooltip should say something like
      "One or more devices are failing" or maybe even special case
      it depending on what failures we have. I don't know.

      User can right or left click the icon and get a menu with

         "[Presentable_Icon] <Presentable_Name> - Disk is failing"
         "[Presentable_Icon] <Presentable_Name> - Array is degraded"

      and clicking the icon should launch Palimpsest. That's bug
      575948 right?

   Note that we already have icons for this (gdu-error, see data/icons/*)

  - Do we need a way for the user to ignore errors/devices? I don't
    know. But it seems like something we should at least consider...

You ask in the patch whether the smart data is automatically updated. Yes, it is automatically updated; we poll the disk every 30 minutes (might change that to 10 minutes) if the disk isn't sleeping. Ditto for RAID array status. So the logic in your code is right.

Also note that gnome-disk-utility HEAD and DeviceKit-disks HEAD has seen some changes to use libatasmart instead of parsing smartctl output. So the API has changed. This patch should fix it.

$ diff -up gsd-gdu-manager.c.orig gsd-gdu-manager.c
--- gsd-gdu-manager.c.orig	2009-03-23 20:07:57.000000000 -0400
+++ gsd-gdu-manager.c	2009-03-23 20:11:37.000000000 -0400
@@ -181,7 +181,6 @@ watch_check_state (GsdGduWatch *watch)
 {
   GduDevice *device;
   char *title;
-  GduSmartData *smart;
 
   g_return_if_fail (watch != NULL);
   g_return_if_fail (watch->presentable != NULL);
@@ -194,15 +193,12 @@ watch_check_state (GsdGduWatch *watch)
 
   /*  S.M.A.R.T.  */
   if (time (NULL) > watch->last_smart_warning + NOTIFICATION_DUPLICATION_TIMEOUT &&
-      gdu_device_drive_smart_get_is_capable (device) && gdu_device_drive_smart_get_is_enabled (device))
-  {
-    smart = gdu_device_get_smart_data (device);
-    if (smart) {
-      if (gdu_smart_data_get_is_failing (smart)) {
-        gdu_show_notification (watch, title, _("S.M.A.R.T. subsystem reported hardware failure. There's high risk of data loss."));
-        watch->last_smart_warning = time (NULL);
-      }
-      g_object_unref (smart);
+      gdu_device_drive_ata_smart_get_is_available (device)) {
+    if ((gdu_device_drive_ata_smart_get_is_failing (device) && gdu_device_drive_ata_smart_get_is_failing_valid (device)) ||
+        gdu_device_drive_ata_smart_get_has_bad_sectors (device) ||
+        gdu_device_drive_ata_smart_get_has_bad_attributes (device)) {
+      gdu_show_notification (watch, title, _("S.M.A.R.T. subsystem reported hardware failure. There's high risk of data loss."));
+      watch->last_smart_warning = time (NULL);
     }
   }
Comment 11 Matthias Clasen 2009-03-24 02:33:36 UTC
>   So let's go with a separate daemon for now. Suggest to put the code
>   in the src/notification/ sub-directory.

FWIW, I disagree with that idea. Yes, gnome-settings-daemon is not as stable as one would hope for right now. But putting our head in the sand and doing our own thing instead is not the right answer to that. 

I'm going to build a fixed PackageKit in rawhide in a bit that will make killing gnome-settings-daemon work without taking down everything. Unfortunately, even after that, restarting g-s-d still kills gnome-session, since it can't handle loading the bug-buddy module for some reason. That can easily worked around by 

gconftool-2 --set /apps/gnome_settings_daemon/gtk-modules/gnomebreakpad --type boolean false

bug-buddy is useless anyway
Comment 12 Tomas Bzatek 2009-03-24 14:02:46 UTC
(In reply to comment #9)
> You could use g_timeout_add_seconds for the STARTUP_DELAY
Makes sense, we could save a little power by reduce polling, especially with such always running things.

> The 10 seconds seem a bit short for the duplicate warning check ?
The original issue was to prevent burst of "changed" events spawned at the same time, e.g. when DK-disks daemon refreshed SMART data. Thinking about it now, we can extend this interval to say one minute.
Comment 13 Tomas Bzatek 2009-03-24 14:56:57 UTC
(In reply to comment #10)
>  - hmm, maybe this is a stupid question but how do you actually get the
>    code to run? Does it involve logging in/out all the time? Sending a
>    HUP signal to the gnome-settings-daemon process took down all apps
>    in my session.
> 
>    So maybe we just want this to be a separate daemon instead since
> 
>     - libgdu may be slightly unstable in the next six months and
>       we don't want to crash sessions; and
> 
>     - it's going to make hacking on this much harder.
> 
>    So let's go with a separate daemon for now. Suggest to put the code
>    in the src/notification/ sub-directory.
The reason why gnome-settings-daemon was chosen is to have the notification ability without needing to run extra command within gnome-session. As for XFCE, you can choose to start Gnome services on startup and this thing would be started automatically within g-s-d. But I understand your concern about stability, it's all in a single process.

For debugging/hacking, we can easily add a simple #ifdef around main() function, just like Vuntz did in http://bugzilla.gnome.org/attachment.cgi?id=124020&action=view
The plugin is quite independent, the only difference is how is it spawned.

>  - I'm not a big fan of libnotify and I don't think it should be the
>    primary UI for this. I think instead we want a notification icon
>    instead (the base user experience) and then *maybe* add libnotify
>    support later.
Status icon (stock exclamation mark) is already in the code and libnotify can be easily disabled. Do you think having status icon only would be enough for user to register that his disks are failing? I mean I often have about ten icons in the notification area, two blinking for incoming messages (e-mail, jabber, icq), three icons from PackageKit (a big red circle for critical updates, another exclamation mark that my session needs restart, ...). I'm not usability expert though...

Otherwise I agree with your proposal about the icon menu.

> You ask in the patch whether the smart data is automatically updated. Yes, it
> is automatically updated; we poll the disk every 30 minutes (might change that
> to 10 minutes) if the disk isn't sleeping. Ditto for RAID array status. So the
> logic in your code is right.
Thanks for the confirmation, we should note this to documentation (I'll file separate bug for doc issues later).
Comment 14 David Zeuthen (not reading bugmail) 2009-03-24 17:23:52 UTC
(In reply to comment #13)
> The reason why gnome-settings-daemon was chosen is to have the notification
> ability without needing to run extra command within gnome-session. As for XFCE,
> you can choose to start Gnome services on startup and this thing would be
> started automatically within g-s-d. But I understand your concern about
> stability, it's all in a single process.
> 
> For debugging/hacking, we can easily add a simple #ifdef around main()
> function, just like Vuntz did in
> http://bugzilla.gnome.org/attachment.cgi?id=124020&action=view
> The plugin is quite independent, the only difference is how is it spawned.

For the time being I'd like this to be a separate daemon; reasons include

 - it's much easier to hack on; things like running gdb, debug output
   and so forth

 - libgdu is going to be unstable for at least the next six months
   and I don't want to deal with bringing down gnome-settings-daemon
   (neither for users nor when I'm break API in libgdu and do 'make install')
   since that is going to cause a lot of trouble

When things are baked and all, we can make this into a gnome-settings-daemon plug-in. But for now I'd like a separate daemon. Simply because it makes things a lot easier for us as developers.

> >  - I'm not a big fan of libnotify and I don't think it should be the
> >    primary UI for this. I think instead we want a notification icon
> >    instead (the base user experience) and then *maybe* add libnotify
> >    support later.
> Status icon (stock exclamation mark) is already in the code and libnotify can
> be easily disabled. Do you think having status icon only would be enough for
> user to register that his disks are failing? I mean I often have about ten
> icons in the notification area, two blinking for incoming messages (e-mail,
> jabber, icq), three icons from PackageKit (a big red circle for critical
> updates, another exclamation mark that my session needs restart, ...). I'm not
> usability expert though...
> 
> Otherwise I agree with your proposal about the icon menu.

Sounds good.

     David
Comment 15 David Zeuthen (not reading bugmail) 2009-04-11 21:49:50 UTC
I added a notification daemon in this commit

 http://git.gnome.org/cgit/gnome-disk-utility/commit/?id=f5c0e92affa5ec515d03eb1d8abfcccd2b4fe3eb

since I just implemented the "show dialog for slow unmount operations" feature (bug 578722). I'll look into making it show ATA SMART warnings etc.
Comment 16 Matthias Clasen 2009-04-11 21:55:35 UTC
Note that Tomas was also working on the smart warnings again, recently. 
You may want to check with him to avoid duplicate work.
Comment 17 David Zeuthen (not reading bugmail) 2009-04-12 01:26:35 UTC
I've implemented a very simple status icon for ATA SMART status, see

 http://people.freedesktop.org/~david/gdu-ata-smart-warning.png
 http://git.gnome.org/cgit/gnome-disk-utility/commit/?id=f6c622352767cdd7ffe94b78f58749b9e76ba71a

There's room for improvement in what I committed (note the TODO item) but let's deal with such improvements in separate bugs.