Bug 701676 – gparted doesn't inhibit systemd mounting, leading to potential data loss

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 701676 - gparted doesn't inhibit systemd mounting, leading to potential data loss


Summary:	gparted doesn't inhibit systemd mounting, leading to potential data loss


Status:	RESOLVED FIXED

Product:	gparted
Classification:	Other
Component:	application
Version:	unspecified
Hardware:	Other All

Importance:	Normal critical
Target Milestone:	---
Assigned To:	gparted maintainers alias
QA Contact:	gparted maintainers alias

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2013-06-05 21:04 UTC by Allison Karlitskaya (desrt)
Modified:	2014-05-22 03:06 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
strace output (433.03 KB, text/plain) 2013-06-11 03:37 UTC, Allison Karlitskaya (desrt)		Details
Patch set to inhibit automount using systemctl runtime mask (3.08 KB, patch) 2013-06-11 19:28 UTC, Curtis Gedak	none	Details \| Review
Use systemctl runtime mask to prevent automounting (#701676) (2.03 KB, patch) 2013-06-11 20:49 UTC, Allison Karlitskaya (desrt)	none	Details \| Review
Patch to inhibit automount using systemctl runtime mask (3.16 KB, patch) 2013-06-11 21:35 UTC, Curtis Gedak	none	Details \| Review
Patch set to inhibit automount using systemctl runtime mask and limit GParted to one instance (4.02 KB, patch) 2013-06-13 16:26 UTC, Curtis Gedak	none	Details \| Review

Description Allison Karlitskaya (desrt) 2013-06-05 21:04:09 UTC

I wanted to resize my home partition (/home) using gparted, so I umounted it and created /home/desrt on the root partition so that I could login.

Once logged in, I had a new empty home directory, as expected.

I ran "gparted" from a terminal and it asked for the root password to acquire permissions.

I then tried to resize my home directory.  gparted refused to do that on the basis that the partition was mounted.

I was very very shocked to see this happen so I attempted to reproduce it.  The simple act of _starting_ gparted was enough to cause /home to be mounted.

This is a huge problem, of course.  The session doesn't normally expect the home directory to be suddenly remounted under it, and I lost a good deal of data because of this.  My dconf database was blown away and my tracker database corrupted (since these programs assume that their database won't suddenly be replaced when they're in the middle of accessing it).

This is gparted 0.14.1 from the Fedora 19 beta.

Comment 1 Curtis Gedak 2013-06-06 02:35:05 UTC

GParted uses tools such as udisks, devkit-disks, and hal-lock to prevent other utilities from automounting directories.

The script that invokes these protections can be viewed at the following link:

https://git.gnome.org/browse/gparted/tree/gparted.in


Perhaps these utilities are no longer installed by default on Fedora 19 beta?

To check if any of these are running, would you be able to provide the output from the following command?

   ps -ef | egrep 'hald|devkit-disks-da|udisks-daemon'

Comment 2 Allison Karlitskaya (desrt) 2013-06-06 04:40:57 UTC

This is all that matches:

desrt     1053     1  0 Jun05 ?        00:00:00 /usr/libexec/gvfs-udisks2-volume-monitor
root      1055     1  0 Jun05 ?        00:00:02 /usr/lib/udisks2/udisksd --no-debug

It's worth noting that I don't have any 'udisks' or 'hal-lock' binaries installed.  udisks2 seems to use a new binary called 'udisksctl'.

So let's assume that it's udisks2 that does this (since I don't have hal or devkit).  This even explains why knowledge of /dev/sda5 being associated with /home could persist (in the daemon) after I commented it out in fstab.  Why would it automount partitions in response to gparted merely being started?

Comment 3 Allison Karlitskaya (desrt) 2013-06-06 04:47:44 UTC

The systemd journal shows this activity around the time I ran gparted:


Jun 05 16:14:46 moonpix userhelper[3931]: running '/usr/sbin/gparted' with root privileges on behalf of 'desrt'
Jun 05 16:14:47 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
Jun 05 16:14:47 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
Jun 05 16:14:47 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
Jun 05 16:14:47 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
Jun 05 16:14:47 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
Jun 05 16:14:47 moonpix systemd[1]: Mounting /home...
Jun 05 16:14:47 moonpix systemd[1]: home.mount: Directory /home to mount over is not empty, mounting anyway.
Jun 05 16:14:47 moonpix kernel: EXT4-fs (sda5): mounted filesystem with ordered data mode. Opts: (null)
Jun 05 16:14:47 moonpix systemd[1]: Mounted /home.

*sigh*

Comment 4 Allison Karlitskaya (desrt) 2013-06-06 05:16:10 UTC

So http://www.freedesktop.org/software/systemd/man/systemd.mount.html tells an interesting story...

It seems that in the brave new world of systemd, /etc/fstab is semi-deprecated (which is why nothing changed when I modified it) and systemd is in charge of all the things.  It wants to mount partitions when it sees (what it assumes to be) hotplug events from the kernel -- which are generated as a side effect of gparted's probing (which I guess is why you already had problems with earlier systems).

You can mount and unmount with 'systemctl start/stop home.mount'.

You cannot disable the 'home.mount' unit because it doesn't truly exist:

[desrt@moonpix ~]$ sudo systemctl disable home.mount
Failed to issue method call: No such file or directory


But you can mask it:

[desrt@moonpix ~]$ sudo systemctl mask home.mount
ln -s '/dev/null' '/etc/systemd/system/home.mount'

but this seems like a pretty bad idea since it persists across boots...

Fortunately, we have this:

[desrt@moonpix ~]$ sudo systemctl --runtime mask home.mount
ln -s '/dev/null' '/run/systemd/system/home.mount'

So we can probably do something like so:

[desrt@moonpix ~]$ sudo systemctl --runtime mask `systemctl list-unit-files -t mount --no-legend | cut -f1 -d' '`
ln -s '/dev/null' '/run/systemd/system/dev-hugepages.mount'
ln -s '/dev/null' '/run/systemd/system/dev-mqueue.mount'
ln -s '/dev/null' '/run/systemd/system/proc-fs-nfsd.mount'
ln -s '/dev/null' '/run/systemd/system/proc-sys-fs-binfmt_misc.mount'
ln -s '/dev/null' '/run/systemd/system/sys-fs-fuse-connections.mount'
ln -s '/dev/null' '/run/systemd/system/sys-kernel-config.mount'
ln -s '/dev/null' '/run/systemd/system/sys-kernel-debug.mount'
ln -s '/dev/null' '/run/systemd/system/tmp.mount'
ln -s '/dev/null' '/run/systemd/system/var-lib-nfs-rpc_pipefs.mount'



but really, systemd should _not_ be mounting partitions based on the 'appearance' of a device that already existed and doubly so when said partition was already mounted at boot and the user has explicitly unmounted it since...

Comment 5 Curtis Gedak 2013-06-06 15:35:45 UTC

Thank you Ryan for all your research into why this problem is occurring.

Your solution proposal in comment #4 sounds reasonable, and has the added benefit that it should not persist beyond the runtime of GParted.

Are you interested in writing a patch?

If so, details on GParted development can be found at:
http://gparted.org/development.php

If not, then no worries.  I can set up a fedora 19 beta virtual machine and try to implement your proposed solution.

Comment 6 Curtis Gedak 2013-06-06 16:03:40 UTC

Hmm... after looking at udisks2, it does not appear to have the same functionality as udisks1.  By this I mean that I did not observe a udisks2 related command for disabling automount.


Regarding systemctl, it appears that the --runtime option will persist until the next reboot.  This is most likely longer than the runtime of gparted, but could be considered a better compromise than potentially losing data as you unfortunately experienced.


From the systemctl man page:
http://www.freedesktop.org/software/systemd/man/systemctl.html

  --runtime

      When used with enable, disable, is-enabled (and related commands),
      make changes only temporarily, so that they are lost on the next
      reboot. This will have the effect that changes are not made in
      subdirectories of /etc but in /run, with identical immediate
      effects, however, since the latter is lost on reboot, the changes
      are lost too.

      Similar, when used with set-cgroup-attr, unset-cgroup-attr,
      set-cgroup and unset-cgroup, make changes only temporarily, so
      that they are lost on the next reboot.


I will keep looking to see if there is another way to inhibit automounting only while GParted is running.

Comment 7 Curtis Gedak 2013-06-06 16:23:58 UTC

One option might be to store a copy of the "list-unit-files" that we mask.  Then we could issue an unmask command after GParted finishes execution.

For example:

1)  MOUNTLIST=`systemctl list-unit-files -t mount --no-legend | cut -f1 -d' '`

2)  sudo systemctl --runtime mask $MOUNTLIST

3)  Execute gparted

4)  sudo systemctl --runtime unmask $MOUNTLIST

Comment 8 Allison Karlitskaya (desrt) 2013-06-06 16:47:59 UTC

I considered something like this, but I think we would actually want a 'grep -v masked' in the first line so that we don't bother masking mounts that are already masked (and don't unmask them when we're done).

As to life-of-the-system vs. life-of-gparted, I don't think this is too much concern.  My concern with 'mask' (storing files in /etc) is if the system crashes or loses power while gparted is running.  As long as gparted exits cleanly we will get a chance to do the unmasking.


There is also a concern that the masking/unmasking is boolean and not using some sort of refcount semantics.  This means that if we run two copies of gparted:


  A                         B

  run gparted
  (mask a b c d)
                            run gparted
                            (mask nothing because it's already masked)


  exit gparted
  (unmask a b c d)


we can find ourselves in a case where gparted 'B' is still running, but mounts are unmasked again.

And of course this would apply not only to two copies of gparted but also to gparted running at the same time as any other program that engages in similar tricks.

Comment 9 Curtis Gedak 2013-06-06 16:55:04 UTC

Good points Ryan.

Regarding multiple instances of GParted, with hal-lock gparted was limited to running only one instance because the Device.Storage exclusive lock could not be acquired a second time.

Ideally we wish to prevent multiple instances of GParted.  We might consider adding a check to see if gpartedbin is already running.  If it is then the script should not invoke a second instance.

Comment 10 Lennart Poettering 2013-06-06 17:25:11 UTC

What's your precise fstab line?

Comment 11 Allison Karlitskaya (desrt) 2013-06-06 19:08:44 UTC

Nothing unusual; here's the whole thing.

#
# /etc/fstab
# Created by anaconda on Sat Jun  1 19:29:32 2013
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=dab3bdea-61e4-4533-9998-a2dbc0afa4ef /                       ext4    noatime         1 1
UUID=802d2024-dc02-4f45-814d-8756af26a040 /boot                   ext4    noatime         1 2
UUID=33DE-E1DE          /boot/efi               vfat    umask=0077,shortname=winnt 0 0
UUID=3f74ddf8-a052-4667-9320-3b42bebfe31a swap                    swap    defaults        0 0
UUID=1d4e32d6-e3bd-4331-9be5-149f3059a193 /home                   ext4    noatime         0 0

Comment 12 Curtis Gedak 2013-06-07 16:17:43 UTC

Hi Ryan,

I just finished a default install (with LVM) of Fedora 19 beta and so far have been unable to reproduce the problem you experienced.

Would you be able to list the exact steps to create the problem?

Curtis

Comment 13 Curtis Gedak 2013-06-10 16:40:43 UTC

I have still been unable to recreate the problem.

The steps I used were:

1)  Setup a VM with a default install of Fedora 19 beta

2)  Install gparted 0.14.1 package

3)  Create a 512 MiB ext2 partition /dev/sdb1

4)  Edit /etc/fstab to add the following entry:

    UUID={uuid-for-sdb1-ext2-file-system} /data ext2 noatime 1 3

5)  Reboot

6)  Start GParted

7)  With GParted, unmount /data (/dev/sdb1)

8)  Resize /dev/sdb1 from 512 MiB to 768 MiB and apply operation

9)  No problems encountered, and /data was not "automounted".


Ryan,

Would you be able to list the exact steps to create the problem?

Comment 14 Allison Karlitskaya (desrt) 2013-06-10 17:19:09 UTC

Sure.  Your setup above looks fine except you need to modify it like so:

 5)  Reboot
+
+5.5)  umount /data
+
 6)  Start GParted



and you will see that step 6 (merely _starting_ gparted) results in /data being automatically remounted.  That's the problem.

Comment 15 Curtis Gedak 2013-06-10 19:15:23 UTC

Thank you Ryan for the clarification on the steps.

When I perform the three steps from comment #14, /data remains unmounted when I start gparted.

Can you test this on your system as well?

Perhaps there is something else that we are missing in the steps.

Comment 16 Allison Karlitskaya (desrt) 2013-06-11 03:01:36 UTC

So a few ideas about what may be different:

a) I didn't use LVM

b) I'm outside of a VM (I mention this because this problem is fairly clearly related to hardware probing)

c) You mention that your partition is /dev/sdb1 which makes me suspect it's a different partition on a separate disk. Mine was on /dev/sda, so probing the partitions already on /dev/sda might have caused it to be remounted...

Other than that, I'm not sure what the differences might be. I am indeed able to reproduce this again and again, though.

Indeed. this is enough for me:

a) add this line to fstab, having never done so before:

UUID=45341d93-b5df-4321-88e5-521ef3997b5d /mnt ext2 noatime 0 0

where this UUID is my /dev/sda6 partition

then

b) sudo systemctl daemon-reload

then

c) gparted

These three steps are enough for /dev/sda6 to end up mounted on /mnt and I see this in the system journal:

Jun 10 22:59:01 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
Jun 10 22:59:01 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
Jun 10 22:59:01 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
Jun 10 22:59:01 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
Jun 10 22:59:01 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
Jun 10 22:59:01 moonpix systemd[1]: Mounting /mnt...

ie: I think those three "Found device INTEL_SSDSC2CW240A3" lines are in response to gparted probing the existing partitions causing some sort of hardware coldplug which systemd then picks up and mounts /mnt based on...

Comment 17 Allison Karlitskaya (desrt) 2013-06-11 03:02:05 UTC

s/three/five/ obviously.

Comment 18 Allison Karlitskaya (desrt) 2013-06-11 03:34:05 UTC

So this backtrace happens on startup:

ie: we are attempting to rewrite the partition table on startup.

+ Trace 232035

Thread 1 (Thread 0x7ffff7fbca40 (LWP 21111))

#0 gdk_pixbuf_get_rowstride
at gdk-pixbuf.c line 613
#1 IA__gdk_cairo_set_source_pixbuf
at gdkcairo.c line 198
#2 pixbuf_render
at pixbuf-render.c line 496
#3 theme_pixbuf_render
at pixbuf-render.c line 830
#4 draw_simple_image
at pixbuf-draw.c line 145
#5 draw_box
at pixbuf-draw.c line 668
#6 gtk_progress_bar_paint
at gtkprogressbar.c line 998
#7 gtk_progress_bar_expose
at gtkprogressbar.c line 542
#8 _gtk_marshal_BOOLEAN__BOXED
at gtkmarshalers.c line 86
#9 g_closure_invoke
at gclosure.c line 777
#10 signal_emit_unlocked_R
at gsignal.c line 3622
#11 g_signal_emit_valist
at gsignal.c line 3338
#12 g_signal_emit
#13 gtk_widget_event_internal
at gtkwidget.c line 5017
#14 IA__gtk_widget_send_expose
at gtkwidget.c line 4846
#15 IA__gtk_main_do_event
at gtkmain.c line 1610
#16 _gdk_window_process_updates_recurse
at gdkwindow.c line 5427
#17 _gdk_window_process_updates_recurse
at gdkwindow.c line 5400
#18 _gdk_windowing_window_process_updates_recurse
at gdkwindow-x11.c line 5643
#19 gdk_window_process_updates_internal
at gdkwindow.c line 5594
#20 IA__gdk_window_process_all_updates
at gdkwindow.c line 5702
#21 gtk_container_idle_sizer
at gtkcontainer.c line 1360
#22 gdk_threads_dispatch
at gdk.c line 512
#23 g_main_dispatch
at gmain.c line 3054
#24 g_main_context_dispatch
at gmain.c line 3630
#25 g_main_context_iterate
at gmain.c line 3701
#26 g_main_context_iteration
at gmain.c line 3762
#27 IA__gtk_main_iteration_do
at gtkmain.c line 1358
#28 Gtk::Main::iteration_impl
at main.cc line 549
#29 GParted::Win_GParted::show_pulsebar
at Win_GParted.cc line 629
#30 GParted::Win_GParted::menu_gparted_refresh_devices
at Win_GParted.cc line 1212
#31 Gtk::Widget_Class::show_callback
at widget.cc line 3855
#32 g_closure_invoke
at gclosure.c line 777
#33 signal_emit_unlocked_R
at gsignal.c line 3514
#34 g_signal_emit_valist
at gsignal.c line 3328
#35 g_signal_emit
at gsignal.c line 3384
#36 IA__gtk_widget_show
at gtkwidget.c line 3243
#37 Gtk::Main::run
at main.cc line 482
#38 main
at main.cc line 60

Comment 19 Allison Karlitskaya (desrt) 2013-06-11 03:37:54 UTC

Created attachment 246478 [details]
strace output

and here's a strace of what appears to be going on in that thread....

note the many many many calls like so:

23:11:38.175525 ioctl(5, BLKPG, {BLKPG_DEL_PARTITION, flags=0, datalen=152, {start=0, length=0, pno=106, devname="", volname=""}}) = -1 ENXIO (No such device or address)


and in particular this one:

23:11:38.190546 ioctl(5, BLKPG, {BLKPG_ADD_PARTITION, flags=0, datalen=152, {start=209182523392, length=30874869248, pno=6, devname="/dev/sda6", volname=""}}) = 0

Comment 20 Allison Karlitskaya (desrt) 2013-06-11 03:54:05 UTC

Looks like this behaviour of committing the partition table during get_devices() was introduced here:

https://git.gnome.org/browse/gparted/commit/?id=286579d5780099b8501fece8250473a1253ccf08

as a way to test if the kernel was capable of properly handling this rewrites.

Comment 21 Curtis Gedak 2013-06-11 19:28:20 UTC

Created attachment 246559 [details] [review]
Patch set to inhibit automount using systemctl runtime mask

Thank you Ryan for your the extra investigative work.

Would you be able to test this patch?

Since you provided a stack trace, I assume that you already know how to work with revision control systems and patches.  If not then please don't hesitate to ask and I will provide guidance.

Comment 22 Allison Karlitskaya (desrt) 2013-06-11 20:23:16 UTC

There are a few issues here:

1) The mask-and-then-unmask thing, as mentioned before, has some problems.
   Would be good to have a proper systemd solution here...

2) list-unit-files appears to only list unit _files_, and doesn't list the
   ones from the generator (ie: the ones from /etc/fstab).  You need to use
   list-units for this.  My bad.

3) This failed for me anyway because of a strange systemd issue: if you edit
   /etc/fstab and then do 'systemctl daemon-reload' it will create a unit
   for your new filesystem (which you can query) but this new unit will not
   appear in list-unit-files *or* list-units.  Obviously not the fault of
   this patch, and a reboot fixed systemd.

4) The mechanism used for ensuring uniqueness in the second patch would
   probably be better handled by becoming a GApplication and/or using a
   backend D-Bus service to actually manage the partitions.  Also better
   would be a systemd mechanism for ensuring exclusive access as for the
   various other backends you've had to support over the years...


Since two of these points are pie-in-the-sky and one is almost certainly a systemd bug, I guess all that's left is to change the 'list-unit-files' to 'list-units'....  At least that will be an improvement.

Comment 23 Allison Karlitskaya (desrt) 2013-06-11 20:35:02 UTC

Ah.  #3 is not a weird systemd issue, but rather goes straight to the heart of what we are trying to prevent: list-units, by default, only lists _active_ units (ie: mounted).  We need to add --all.

So, it should look like this, then:

systemctl list-units -t mount --no-legend --all | grep -v masked | cut -f1 -d' '

Comment 24 Allison Karlitskaya (desrt) 2013-06-11 20:41:19 UTC

We also need to use --full to prevent systemd from turning "sys-fs-fuse-connections.mount" into sys-fs-fu...onnections.mount

We also need to give "--" to the second command to prevent it from interpreting "-.mount" as an option.

Comment 25 Allison Karlitskaya (desrt) 2013-06-11 20:49:48 UTC

Created attachment 246564 [details] [review]
Use systemctl runtime mask to prevent automounting (#701676)

Updated with suggested changes.

This one seems to work when I test it here.

Comment 26 Curtis Gedak 2013-06-11 21:17:54 UTC

Thank you Ryan for the testing and the updated patch in comment #25.

So far I have not been able to replicate the problem.  Perhaps the difference is my default LVM install, or the fact I am using a Virtual Machine.  Though my thoughts are that this should not be the reason.

Regarding a systemd solution, one challenge we encounter is that systemd is not used by default on all GNU/Linux distributions.  For example neither Debian nor Ubuntu use systemd by default.

Also with GParted we try to come up with solutions that will run on all GNU/Linux distributions.  We also try to limit the complexity around preventing automounting to the gparted shell script.

I will combine your patch with the "prevent multiple GParted instances" patch and post the combined patch in this report.

Comment 27 Curtis Gedak 2013-06-11 21:35:30 UTC

Created attachment 246565 [details] [review]
Patch to inhibit automount using systemctl runtime mask

Ryan,

Would you be able to test this combined patch?

In the meantime I will test the patch with some other GNU/Linux distributions.

Thanks,
Curtis

Comment 28 Curtis Gedak 2013-06-11 22:36:03 UTC

I have successfully tested the patch from comment #27 on the following systems:

  Debian 6, 7
  Ubuntu 10.04, 12.04, 13.04
  Fedora 17, 18, 19 beta
  openSUSE 11.2, 12.2, 12.3


Mike Fleetwood,

When you have time, would you be able to review this patch set and test with CentOS?

Thanks,
Curtis

Comment 29 Mike Fleetwood 2013-06-12 22:08:42 UTC

Hi Curtis,


Tested successfully on CentOS 5.9 and Fedora 14.
Discussion of minor issues follows.


[PATCH] Use systemctl runtime mask to prevent automounting (#701676)

On Fedora 14 I get this:

    $ su root -c /usr/local/bin/gparted
    Password: 
    systemctl: unrecognized option '--no-legend'
    systemctl: unrecognized option '--runtime'
    ======================
    libparted : 2.3
    ======================
    systemctl: unrecognized option '--runtime'

1) Fedora 14 has an old version of systemctl which doesn't support
   those options.  May not matter as it's an unsupported OS now.

2) Fedora 14 only had systemd available as a preview and not the default
   init.  Debian 7 falls into this category too.  I think several other
   distributions did do / are doing the same thing and having systemd as
   an alternative init, before optionally making in the default in a
   later release.  Should we check that the init process is actually
   systemd?

A simple answer to both of these is just to ignore it.  Users who run
gparted from an icon never see this anyway and the failures cause no
harm.  Alternatively we could just redirect output from all 3 systemctl
commands to /dev/null.

I would probably keep this patch as it is.


[PATCH] Only permit one copy of GParted to execute at a time

This works but a user who runs GParted for a second time from an icon
will never see the error message.  Simple choice is to do nothing; bit
harder it to use zenity or xmessage to display an error dialog from the
gparted shell script; and the hardest would be to get gpartedbin exe to
display an error dialog like it does when the user isn't root.

I would personally do something about this because a user gets no
feedback as to why a second gparted didn't appear.  However I will take
the patch as it is if you want.


Thanks,
Mike

Comment 30 Curtis Gedak 2013-06-13 02:22:40 UTC

Hi Mike,

Thank you for the testing and for the thoughtful comments.

1) Fedora 14 unrecognized options.

I agree that we probably do not need to concern ourselves with an unsupported OS. Having said that, I can easily add a redirect to /dev/null so that the messages do not appear.

2) Should we check that the init process is actually systemd?

I could certainly add a check for both systemctl command and a systemd process such as the following:

HAVE_SYSTEMCTL=no
for k in '' `echo "$PATH" | sed 's,:, ,g'`; do
if test -x "$k/systemctl"; then
if test "z`ps -e | grep systemd`" != "z"; then
HAVE_SYSTEMCTL=yes
break
fi
fi
done

Or did you have something else in mind that would be a more robust check?

[PATCH] Only permit one copy of GParted to execute at a time

3) This works but a user who runs GParted for a second time from an icon
will never see the error message.

This is the current behaviour with the existing code. For example on systems that use HAL, invoking gparted from a menu icon will silently fail the second time . This is because HAL will not permit a second exclusive lock to be taken, and hence will not invoke gpartedbin.

I agree that having a graphical notification to the user would be nice. Unfortunately I am not aware of such a utility that is common across all desktop environments and GNU/Linux distros. We could try adding logic to test for the existence of some common ones.

A more complex solution would be to add the code to GParted. Of course this would not work for HAL systems because hal-lock would not invoke gpartedbin. Udisks and devkit-disks might respond in similar ways. This means that a gpartedbin solution has several shortcomings.

In summary I think a solution that tests for existence of several GUI notification tools might be the best approach.

I will work on an updated patch set that tests for utilities like xmessage and zenity.

Comment 31 Patrick Verner 2013-06-13 03:55:35 UTC

>> more robust check?

Doesn't every Linux distribution include which?

which systemctl
which xmessage
which zenity

etc...

Comment 32 Mike Fleetwood 2013-06-13 13:25:46 UTC

Hi Curtis,

[PATCH] Use systemctl runtime mask to prevent automounting (#701676)

1) Lets not redirect errors to /dev/null from systemctl.
   They aren't seen when running from an icon and might help future
   diagnosis when running gparted from the command line.

2) Lets do check for the systemd process running.  It's not a perfect
   check but is pretty good and seems unlikely to produce false
   positives coded as you did with `ps -e | grep systemd`.


[PATCH] Only permit one copy of GParted to execute at a time

3) Given the complexities you mentioned I'm OK with just having a
   second gpartedbin just silently doing nothing.  If we did display
   a message to the user with zenity/xmessage we ought to translate it
   and that seems too much effort.  Just go with the existing error
   written by the script.

Thanks,
Mike

Comment 33 Curtis Gedak 2013-06-13 16:26:36 UTC

Created attachment 246743 [details] [review]
Patch set to inhibit automount using systemctl runtime mask and limit GParted to one instance

Hi Mike,

I agree with all three of your points in comment #32.  I came to much the same conclusions after I posted comment #30 and had some more time to think.

Changes in this patch set:

[PATCH] Use systemctl runtime mask to prevent automounting (#701676)

2) Includes check for the systemd process running.


[PATCH] Only permit one copy of GParted to execute at a time

3) Logic left alone.  Comments have been updated.


Patrick,

I believe you are correct that many GNU/Linux distros include the "which" command.  If I recall correctly there is some advice for Portable Shell programming that recommends relying on as few outside utilities as possible to enable shell scripts to work on a broader range of platforms.  That is the reason why the "gparted" shell script checks the path for executables instead of using the "which" command.

Curtis

Comment 34 Mike Fleetwood 2013-06-13 19:22:20 UTC

Hi Curtis,

Patch set from comment #33 passed testing.

On Fedora 14 the errors don't occur because the systemctl command isn't
run as systemd isn't running.

Worked on Fedora 18 which uses systemd and CentOS 5.9 which doesn't.

I assume we are ready for committing upstream based on my testing and
Ryan's testing up to comment #24.  I'll commit tomorrow unless I hear
otherwise.

Mike

Comment 35 Curtis Gedak 2013-06-13 19:50:19 UTC

Hi Mike,

Thank you for your suggestions and reviewing and testing the patch set in comment #33.

I have successfully retested this patch set on:

  Debian 6
  Ubuntu 10.04
  Fedora 19 beta
  openSUSE 12.2

As such I believe this patch set is ready for commit to the git repository.

Curtis

Comment 36 Mike Fleetwood 2013-06-13 20:27:28 UTC

Patch set to resolve this bug has been commited to the GParted git
repository.  Patches can be see here:

Use systemctl runtime mask to prevent automounting (#701676)
https://git.gnome.org/browse/gparted/commit/?id=4c109df9b59e55699bd42023cf4007ee359793e9

Only permit one instance of GParted to execute at a time
https://git.gnome.org/browse/gparted/commit/?id=4c9c70d697ed728362a1cc74c35c0f76e3caf909

Thanks,
Mike

Comment 37 Allison Karlitskaya (desrt) 2013-06-13 20:39:34 UTC

btw: I like the fact that gparted now inhibits systemd mounting when it runs, but I think there's still a pretty major problem here:  gparted should not be writing to my partition table when I merely start it up.

Comment 38 Curtis Gedak 2013-06-14 15:50:55 UTC

Hi Ryan

> gparted should not be writing to my partition table when I merely
> start it up.

Do you have an alternate suggestion on how to approach this challenge?
If so, please create a new bug report and provide details on an alternate solution.

I agree with Bart's comment in the original commit noted in comment #20.  Namely that first testing the kernel can reread the partition table before enabling partition actions will help avoid data loss.

In fact any time that the kernel cannot reread the partition table can lead to many problems.  We experienced this first hand in late 2009 with the following forum reports:

WARNING! Problem Resizing File Systems with GParted
http://gparted-forum.surf4.info/viewtopic.php?id=13777

Since then we have improved the GParted logic.  However, the fact remains that actions performed on partitions when the kernel cannot re-read the partition table increase the likelihood of file system problems leading to data loss.

Curtis

Comment 39 Phillip Susi 2013-06-18 15:32:58 UTC

I'm still not clear why gparted does that sync on startup, but whether it happens on startup or after modifying some other, unrelated partition it is still not good.  I worked up a patch yesterday to parted to avoid removing and re-adding unmodified partitions to work around this little quirk causing Unity ( on Ubuntu ) to re-show partitions on the dock that you had removed.

If you removed the line from /etc/fstab and systemd is still mounting the partition in /home, that would be a bug in systemd.  At worst, it should mount it in /media instead.

Comment 40 Curtis Gedak 2013-06-18 15:48:06 UTC

Hi Phillip,

> I'm still not clear why gparted does that sync on startup

GParted checks to see if the kernel can re-read the partition table.  If not, then GParted disables some partition actions.  This is done to minimize the chance of data loss that can occur when the kernel has a differing view of the actual disk partition layout.

If there is another way to perform this "kernel can re-read partition table" check outside of libparted, then that would be another way to approach this problem.

Curtis

Comment 41 Phillip Susi 2013-06-18 15:59:06 UTC

Could you be more specific about what operations can not be allowed if updating won't work?  Also given the improvements to libparted over the years, this test will now never fail anyhow, so it seems kind of pointless.  It used to only fail if a partition was mounted, which you could test for by checking the busy flag, like we do for showing the locked icon.

Comment 42 Curtis Gedak 2013-06-18 16:21:13 UTC

> Could you be more specific about what operations can not be allowed if
> updating won't work?

Basically, if the kernel cannot re-read the partition table then GParted marks the device as "readonly".

If the device is marked by this check as readonly then the GUI will disable the "Resize/Move" menu option.

To see the code where this happens, search for "readonly" in the Win_GParted::set_valid_operations method.

https://git.gnome.org/browse/gparted/tree/src/Win_GParted.cc#n933

The link to the original commit that added this behaviour is in comment #20.


> It used to only fail if a partition was mounted, which you could test for
> by checking the busy flag, like we do for showing the locked icon.

Unfortunately this is not the only situation in which the kernel would fail to reread the partition table.  With some combinations of kernels and libparted, even with all partitions unmounted, we experienced situations where the kernel failed to reread the partition table.

See comment #38 for the link to several of these problems.

For this reason I believe we need to test for more than just mount status, but whether the kernel can actually re-read the partition table for a device.

Comment 43 Mike Fleetwood 2013-06-18 16:42:06 UTC

Also some current distributions don't use new versions of parted.

The latest RHEL/CentOS, release 6.4, still uses parted 2.1 which only
used the BLKRRPART ioctl so required all partitions to be unused for the
kernel to successfully re-read the partition table.  Otherwise you get
this infamous error:

    WARNING: the kernel failed to re-read the partition table on %s (%s).
    As a result, it may not reflect all of your changes until after reboot.

Comment 44 Phillip Susi 2013-06-18 17:15:58 UTC

What harm is there in detecting the error later?  And just because you can sync at startup doesn't mean it won't fail later.

Comment 45 Lennart Poettering 2013-06-18 17:32:17 UTC

I am pretty usre BTW that we should provide some built-in functionality in systemd that allows to temporarily disable auto-activation of mount points as soon as the device shows up. Or we maybe should remove that functionality entirely. Or in other words: I think this is a systemd problem that should be fixed in systemd, and without requiring manual masking.

I added this to the TODO list of systemd for now.

Comment 46 Curtis Gedak 2013-06-18 17:36:52 UTC

Phillip,

> What harm is there in detecting the error later?

The main difference I can see is in preventing the user from moving/resizing a
partition and potentially losing data.  The earlier we detect problems with the
kernel re-reading the partition table, the better we can try to prevent data
loss.

> And just because you can sync at startup doesn't mean it won't fail later.

I would agree.  There is a chance of losing data whenever the kernel fails to
re-read changes to the partition table.


Lennart,

Thank you for raising the issue with systemd.

We try to do what we can to prevent users from losing data when they edit partition tables with GParted.  Often this involves work-arounds because gparted is used on many different GNU/Linux distros using various versions of software.

Comment 47 Phillip Susi 2013-06-18 17:51:26 UTC

How can you lose data because syncing the partition table fails?  That's the part I'm not seeing.  If you try to move a partition, and syncing the temporary expanded partition table fails, then the operation fails, and the original partition table should be rolled back, with no harm done.  Same when enlarging a partition.

If you are shrinking a partition, then the fs will be shrunk, and if syncing the table fails, you just inform the user they have to reboot for the changes to take affect.

Comment 48 Curtis Gedak 2013-06-18 18:12:24 UTC

> How can you lose data because syncing the partition table fails?

The problem arises when file system actions are performed on the partition after the syncing fails.
  
The sequence of steps that would demonstrate the problem are:

1)  Change a partition boundaries
2)  Kernel fails to reread the changes to the partition table,
3)  Execute file system commands on the changed partition
    (for example maximize the file system to fit within a shrunken partition)

The partition tools see the disk view of the partition table, whereas the file system commands see the kernel's view of the partition table.

> If you try to move a partition, and syncing the temporary
> expanded partition table fails, then the operation fails, and the original
> partition table should be rolled back, with no harm done.

I agree that if the partition table is rolled back when syncing fails, then there should not be a problem.  However, this is not what actually happens -- the partition table is NOT rolled back.

I believe that not only GParted, but parted and fdisk leave the partition in the changed state when the kernel fails to reread the partition table.

Comment 49 Phillip Susi 2013-06-18 18:50:47 UTC

Well obviously you don't ignore the error and continue, and I'm pretty sure gparted already does halt if the sync fails rather than try to run tools on a partition that is not properly synced.  If it doesn't that would certainly be a bug that should be fixed, rather than trying to mask it by doing a "test run" sync first, since things can change between then and when you go to do the move.

It makes sense to leave the table in the new state if that is the final intended state, but not if it is only a temporary state, such as the union partition used during the move.  In that case, it makes sense to roll back, and that is what gparted currently does if there is an error during the actual data move, so it should do the same ( if it doesn't already ) if syncing the table prior to starting the move fails.

Comment 50 Curtis Gedak 2013-06-18 19:18:48 UTC

> Well obviously you don't ignore the error and continue, and I'm pretty
> sure gparted already does halt if the sync fails rather than try to run
> tools on a partition that is not properly synced.

Yes, I agree.  If a user disregards the kernel reread failure error and proceeds, then they must accept the consequences.

I also believe that GParted correctly halts when the kernel reread failure occurs.

I believe intention of first checking is to prevent users from resize/move actions on a device that demonstrates a kernel reread failure.  Without this prevention, a user could apply a resize/move operation and would not discover the problem until GParted runs into the error while applying the operation.

In this situation, no data should be lost, but the user does have to wait while GParted tries the operation and then finally fails.

I guess it boils down to how we want GParted to function.

Should gparted:

A) let the user queue a resize/move operation and wait until it fails
   while applying becuase the kernel cannot reread the partition table?

Or,

B) prevent the user from queuing a resize/move operation because we
   already know the kernel cannot reread the partition table?

Comment 51 Gabriel de Perthuis 2013-06-20 21:49:08 UTC

It's a big hammer, but wrapping some sections with:

udevadm control --stop-exec-queue
udevadm control --start-exec-queue

would work; once the device is locked by an invoked tool or the kernel you can leave the critical section.

Comment 52 Curtis Gedak 2013-09-18 16:47:09 UTC

The patches to address this report have been included in GParted 0.16.2 released on September 18, 2013.

Comment 53 Lennart Poettering 2014-05-22 03:06:22 UTC

There's now a sane API for this in current udev, and gparted should be adapted to make use of it: simply take a BSD file lock on the main device node of the device in LOCK_EX. udev will also lock the device, and will skip processing it if it cannot get it. Also, udev watches when a process closes the device node after write, and then retrigger the device. Putting this together this is a very simple and natural way to make sure partitioners and udev rule processing don't interfere.

Note that you need to take the lock on the "main" device node ("/dev/sda"), i.e. not the partition device nodes ("/dev/sda5").