GNOME Bugzilla – Bug 701676
gparted doesn't inhibit systemd mounting, leading to potential data loss
Last modified: 2014-05-22 03:06:22 UTC
I wanted to resize my home partition (/home) using gparted, so I unmounted it and created /home/desrt on the root partition so that I could log in. Once logged in, I had a new empty home directory, as expected.

I ran "gparted" from a terminal and it asked for the root password to acquire permissions. I then tried to resize my home directory. gparted refused to do that on the basis that the partition was mounted. I was very, very shocked to see this happen, so I attempted to reproduce it. The simple act of _starting_ gparted was enough to cause /home to be mounted.

This is a huge problem, of course. The session doesn't normally expect the home directory to be suddenly remounted under it, and I lost a good deal of data because of this. My dconf database was blown away and my tracker database corrupted (since these programs assume that their database won't suddenly be replaced while they're in the middle of accessing it).

This is gparted 0.14.1 from the Fedora 19 beta.
GParted uses tools such as udisks, devkit-disks, and hal-lock to prevent other utilities from automounting partitions while it runs. The script that invokes these protections can be viewed at the following link:

    https://git.gnome.org/browse/gparted/tree/gparted.in

Perhaps these utilities are no longer installed by default on Fedora 19 beta? To check if any of these are running, would you be able to provide the output from the following command?

    ps -ef | egrep 'hald|devkit-disks-da|udisks-daemon'
This is all that matches:

    desrt     1053     1  0 Jun05 ?  00:00:00 /usr/libexec/gvfs-udisks2-volume-monitor
    root      1055     1  0 Jun05 ?  00:00:02 /usr/lib/udisks2/udisksd --no-debug

It's worth noting that I don't have any 'udisks' or 'hal-lock' binaries installed. udisks2 seems to use a new binary called 'udisksctl'. So let's assume that it's udisks2 that does this (since I don't have hal or devkit). This even explains why knowledge of /dev/sda5 being associated with /home could persist (in the daemon) after I commented it out in fstab. Why would it automount partitions in response to gparted merely being started?
The systemd journal shows this activity around the time I ran gparted:

    Jun 05 16:14:46 moonpix userhelper[3931]: running '/usr/sbin/gparted' with root privileges on behalf of 'desrt'
    Jun 05 16:14:47 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
    Jun 05 16:14:47 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
    Jun 05 16:14:47 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
    Jun 05 16:14:47 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
    Jun 05 16:14:47 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
    Jun 05 16:14:47 moonpix systemd[1]: Mounting /home...
    Jun 05 16:14:47 moonpix systemd[1]: home.mount: Directory /home to mount over is not empty, mounting anyway.
    Jun 05 16:14:47 moonpix kernel: EXT4-fs (sda5): mounted filesystem with ordered data mode. Opts: (null)
    Jun 05 16:14:47 moonpix systemd[1]: Mounted /home.

*sigh*
So http://www.freedesktop.org/software/systemd/man/systemd.mount.html tells an interesting story... It seems that in the brave new world of systemd, /etc/fstab is semi-deprecated (which is why nothing changed when I modified it) and systemd is in charge of all the things. It wants to mount partitions when it sees (what it assumes to be) hotplug events from the kernel -- which are generated as a side effect of gparted's probing (which I guess is why you already had problems with earlier systems).

You can mount and unmount with 'systemctl start/stop home.mount'. You cannot disable the 'home.mount' unit because it doesn't truly exist:

    [desrt@moonpix ~]$ sudo systemctl disable home.mount
    Failed to issue method call: No such file or directory

But you can mask it:

    [desrt@moonpix ~]$ sudo systemctl mask home.mount
    ln -s '/dev/null' '/etc/systemd/system/home.mount'

but this seems like a pretty bad idea since it persists across boots... Fortunately, we have this:

    [desrt@moonpix ~]$ sudo systemctl --runtime mask home.mount
    ln -s '/dev/null' '/run/systemd/system/home.mount'

So we can probably do something like so:

    [desrt@moonpix ~]$ sudo systemctl --runtime mask `systemctl list-unit-files -t mount --no-legend | cut -f1 -d' '`
    ln -s '/dev/null' '/run/systemd/system/dev-hugepages.mount'
    ln -s '/dev/null' '/run/systemd/system/dev-mqueue.mount'
    ln -s '/dev/null' '/run/systemd/system/proc-fs-nfsd.mount'
    ln -s '/dev/null' '/run/systemd/system/proc-sys-fs-binfmt_misc.mount'
    ln -s '/dev/null' '/run/systemd/system/sys-fs-fuse-connections.mount'
    ln -s '/dev/null' '/run/systemd/system/sys-kernel-config.mount'
    ln -s '/dev/null' '/run/systemd/system/sys-kernel-debug.mount'
    ln -s '/dev/null' '/run/systemd/system/tmp.mount'
    ln -s '/dev/null' '/run/systemd/system/var-lib-nfs-rpc_pipefs.mount'

but really, systemd should _not_ be mounting partitions based on the 'appearance' of a device that already existed -- and doubly so when said partition was already mounted at boot and the user has explicitly unmounted it since...
Thank you Ryan for all your research into why this problem is occurring. Your solution proposal in comment #4 sounds reasonable, and has the added benefit that it should not persist beyond the runtime of GParted.

Are you interested in writing a patch? If so, details on GParted development can be found at:

    http://gparted.org/development.php

If not, then no worries. I can set up a Fedora 19 beta virtual machine and try to implement your proposed solution.
Hmm... after looking at udisks2, it does not appear to have the same functionality as udisks1. By this I mean that I did not observe a udisks2-related command for disabling automount.

Regarding systemctl, it appears that the --runtime option will persist until the next reboot. This is most likely longer than the runtime of gparted, but could be considered a better compromise than potentially losing data as you unfortunately experienced. From the systemctl man page:

    http://www.freedesktop.org/software/systemd/man/systemctl.html

    --runtime
        When used with enable, disable, is-enabled (and related commands), make
        changes only temporarily, so that they are lost on the next reboot. This
        will have the effect that changes are not made in subdirectories of /etc
        but in /run, with identical immediate effects, however, since the latter
        is lost on reboot, the changes are lost too. Similar, when used with
        set-cgroup-attr, unset-cgroup-attr, set-cgroup and unset-cgroup, make
        changes only temporarily, so that they are lost on the next reboot.

I will keep looking to see if there is another way to inhibit automounting only while GParted is running.
One option might be to store a copy of the "list-unit-files" that we mask. Then we could issue an unmask command after GParted finishes execution. For example:

    1) MOUNTLIST=`systemctl list-unit-files -t mount --no-legend | cut -f1 -d' '`
    2) sudo systemctl --runtime mask $MOUNTLIST
    3) Execute gparted
    4) sudo systemctl --runtime unmask $MOUNTLIST
I considered something like this, but I think we would actually want a 'grep -v masked' in the first line so that we don't bother masking mounts that are already masked (and don't unmask them when we're done). (A sketch combining both is given below.)

As to life-of-the-system vs. life-of-gparted, I don't think this is too much of a concern. My concern with 'mask' (storing files in /etc) is if the system crashes or loses power while gparted is running. As long as gparted exits cleanly we will get a chance to do the unmasking.

There is also a concern that the masking/unmasking is boolean and not using some sort of refcount semantics. This means that if we run two copies of gparted:

    A                              B
    run gparted (mask a b c d)
                                   run gparted (mask nothing because
                                                it's already masked)
    exit gparted (unmask a b c d)

we can find ourselves in a case where gparted 'B' is still running, but mounts are unmasked again. And of course this would apply not only to two copies of gparted but also to gparted running at the same time as any other program that engages in similar tricks.
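Putting the proposed mask/unmask steps together with the 'grep -v masked' refinement, a minimal wrapper might look like the following sketch (assumptions: a systemd-based system, and that the wrapper invokes the gpartedbin executable as the real gparted script does):

    #!/bin/sh
    # Sketch only: mask all not-yet-masked systemd mount units for the
    # duration of a gparted run, then unmask exactly the units we masked.
    MOUNTLIST=`systemctl list-unit-files -t mount --no-legend | \
               grep -v masked | cut -f1 -d' '`
    if test "x$MOUNTLIST" != "x"; then
        systemctl --runtime mask $MOUNTLIST
    fi
    gpartedbin "$@"
    status=$?
    if test "x$MOUNTLIST" != "x"; then
        systemctl --runtime unmask $MOUNTLIST
    fi
    exit $status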
Good points Ryan.

Regarding multiple instances of GParted: with hal-lock, gparted was limited to running only one instance because the Device.Storage exclusive lock could not be acquired a second time. Ideally we wish to prevent multiple instances of GParted. We might consider adding a check to see if gpartedbin is already running. If it is, then the script should not invoke a second instance.
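A minimal sketch of such a check, as it might appear near the top of the gparted shell script (assuming pidof is available, as it is on most GNU/Linux distros):

    # Sketch: refuse to start a second instance if gpartedbin is running.
    if pidof gpartedbin > /dev/null 2>&1; then
        echo "Another instance of gpartedbin is already running." 1>&2
        exit 1
    fi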
What's your precise fstab line?
Nothing unusual; here's the whole thing.

    #
    # /etc/fstab
    # Created by anaconda on Sat Jun  1 19:29:32 2013
    #
    # Accessible filesystems, by reference, are maintained under '/dev/disk'
    # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
    #
    UUID=dab3bdea-61e4-4533-9998-a2dbc0afa4ef /          ext4 noatime                    1 1
    UUID=802d2024-dc02-4f45-814d-8756af26a040 /boot      ext4 noatime                    1 2
    UUID=33DE-E1DE                            /boot/efi  vfat umask=0077,shortname=winnt 0 0
    UUID=3f74ddf8-a052-4667-9320-3b42bebfe31a swap       swap defaults                   0 0
    UUID=1d4e32d6-e3bd-4331-9be5-149f3059a193 /home      ext4 noatime                    0 0
Hi Ryan,

I just finished a default install (with LVM) of Fedora 19 beta and so far have been unable to reproduce the problem you experienced. Would you be able to list the exact steps to create the problem?

Curtis
I have still been unable to recreate the problem. The steps I used were:

    1) Set up a VM with a default install of Fedora 19 beta
    2) Install the gparted 0.14.1 package
    3) Create a 512 MiB ext2 partition /dev/sdb1
    4) Edit /etc/fstab to add the following entry:
       UUID={uuid-for-sdb1-ext2-file-system} /data ext2 noatime 1 3
    5) Reboot
    6) Start GParted
    7) With GParted, unmount /data (/dev/sdb1)
    8) Resize /dev/sdb1 from 512 MiB to 768 MiB and apply the operation
    9) No problems encountered, and /data was not "automounted"

Ryan, would you be able to list the exact steps to create the problem?
Sure. Your setup above looks fine except you need to modify it like so:

      5) Reboot
    +
    + 5.5) umount /data
    +
      6) Start GParted

and you will see that step 6 (merely _starting_ gparted) results in /data being automatically remounted. That's the problem.
Thank you Ryan for the clarification on the steps. When I perform the three steps from comment #14, /data remains unmounted when I start gparted. Can you test this on your system as well? Perhaps there is something else that we are missing in the steps.
So a few ideas about what may be different:

a) I didn't use LVM

b) I'm outside of a VM (I mention this because this problem is fairly clearly related to hardware probing)

c) You mention that your partition is /dev/sdb1, which makes me suspect it's a different partition on a separate disk. Mine was on /dev/sda, so probing the partitions already on /dev/sda might have caused it to be remounted...

Other than that, I'm not sure what the differences might be. I am indeed able to reproduce this again and again, though. Indeed, this is enough for me:

a) add this line to fstab, having never done so before:

    UUID=45341d93-b5df-4321-88e5-521ef3997b5d /mnt ext2 noatime 0 0

   where this UUID is my /dev/sda6 partition

then

b) sudo systemctl daemon-reload

then

c) gparted

These three steps are enough for /dev/sda6 to end up mounted on /mnt, and I see this in the system journal:

    Jun 10 22:59:01 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
    Jun 10 22:59:01 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
    Jun 10 22:59:01 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
    Jun 10 22:59:01 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
    Jun 10 22:59:01 moonpix systemd[1]: Found device INTEL_SSDSC2CW240A3.
    Jun 10 22:59:01 moonpix systemd[1]: Mounting /mnt...

ie: I think those three "Found device INTEL_SSDSC2CW240A3" lines are in response to gparted probing the existing partitions, causing some sort of hardware coldplug which systemd then picks up and mounts /mnt based on...
s/three/five/ obviously.
So this backtrace happens on startup: ie: we are attempting to rewrite the partition table on startup.
(Backtrace recorded as Trace 232035; Thread 1 (Thread 0x7ffff7fbca40 (LWP 21111)). The full trace is not reproduced here.)
Created attachment 246478 [details]
strace output

and here's a strace of what appears to be going on in that thread... note the many many many calls like so:

    23:11:38.175525 ioctl(5, BLKPG, {BLKPG_DEL_PARTITION, flags=0, datalen=152, {start=0, length=0, pno=106, devname="", volname=""}}) = -1 ENXIO (No such device or address)

and in particular this one:

    23:11:38.190546 ioctl(5, BLKPG, {BLKPG_ADD_PARTITION, flags=0, datalen=152, {start=209182523392, length=30874869248, pno=6, devname="/dev/sda6", volname=""}}) = 0
Looks like this behaviour of committing the partition table during get_devices() was introduced here:

    https://git.gnome.org/browse/gparted/commit/?id=286579d5780099b8501fece8250473a1253ccf08

as a way to test if the kernel was capable of properly handling these rewrites.
Created attachment 246559 [details] [review]
Patch set to inhibit automount using systemctl runtime mask

Thank you Ryan for the extra investigative work. Would you be able to test this patch? Since you provided a stack trace, I assume that you already know how to work with revision control systems and patches. If not, then please don't hesitate to ask and I will provide guidance.
There are a few issues here:

1) The mask-and-then-unmask thing, as mentioned before, has some problems. Would be good to have a proper systemd solution here...

2) list-unit-files appears to only list unit _files_, and doesn't list the ones from the generator (ie: the ones from /etc/fstab). You need to use list-units for this. My bad.

3) This failed for me anyway because of a strange systemd issue: if you edit /etc/fstab and then do 'systemctl daemon-reload' it will create a unit for your new filesystem (which you can query) but this new unit will not appear in list-unit-files *or* list-units. Obviously not the fault of this patch, and a reboot fixed systemd.

4) The mechanism used for ensuring uniqueness in the second patch would probably be better handled by becoming a GApplication and/or using a backend D-Bus service to actually manage the partitions. Also better would be a systemd mechanism for ensuring exclusive access, as for the various other backends you've had to support over the years...

Since two of these points are pie-in-the-sky and one is almost certainly a systemd bug, I guess all that's left is to change the 'list-unit-files' to 'list-units'... At least that will be an improvement.
Ah. #3 is not a weird systemd issue, but rather goes straight to the heart of what we are trying to prevent: list-units, by default, only lists _active_ units (ie: mounted). We need to add --all.

So, it should look like this, then:

    systemctl list-units -t mount --no-legend --all | grep -v masked | cut -f1 -d' '
We also need to use --full to prevent systemd from turning "sys-fs-fuse-connections.mount" into

    sys-fs-fu...onnections.mount

We also need to give "--" to the second command to prevent it from interpreting "-.mount" as an option.
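Folding all of those corrections into the earlier proposal gives something like this sketch:

    # --all:  include inactive (i.e. unmounted) mount units
    # --full: don't ellipsize unit names like sys-fs-fu...onnections.mount
    # '--':   stop option parsing so '-.mount' isn't treated as an option
    MOUNTLIST=`systemctl list-units --full --all -t mount --no-legend | \
               grep -v masked | cut -f1 -d' '`
    systemctl --runtime mask -- $MOUNTLIST
    # ... run gpartedbin here ...
    systemctl --runtime unmask -- $MOUNTLIST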
Created attachment 246564 [details] [review]
Use systemctl runtime mask to prevent automounting (#701676)

Updated with suggested changes. This one seems to work when I test it here.
Thank you Ryan for the testing and the updated patch in comment #25.

So far I have not been able to replicate the problem. Perhaps the difference is my default LVM install, or the fact that I am using a virtual machine, though my thought is that this should not be the reason.

Regarding a systemd solution, one challenge we encounter is that systemd is not used by default on all GNU/Linux distributions. For example, neither Debian nor Ubuntu uses systemd by default. With GParted we try to come up with solutions that will run on all GNU/Linux distributions, and we also try to limit the complexity around preventing automounting to the gparted shell script.

I will combine your patch with the "prevent multiple GParted instances" patch and post the combined patch in this report.
Created attachment 246565 [details] [review]
Patch to inhibit automount using systemctl runtime mask

Ryan, would you be able to test this combined patch? In the meantime I will test the patch with some other GNU/Linux distributions.

Thanks,
Curtis
I have successfully tested the patch from comment #27 on the following systems:

    Debian 6, 7
    Ubuntu 10.04, 12.04, 13.04
    Fedora 17, 18, 19 beta
    openSUSE 11.2, 12.2, 12.3

Mike Fleetwood, when you have time, would you be able to review this patch set and test with CentOS?

Thanks,
Curtis
Hi Curtis,

Tested successfully on CentOS 5.9 and Fedora 14. Discussion of minor issues follows.

[PATCH] Use systemctl runtime mask to prevent automounting (#701676)

On Fedora 14 I get this:

    $ su root -c /usr/local/bin/gparted
    Password:
    systemctl: unrecognized option '--no-legend'
    systemctl: unrecognized option '--runtime'
    ======================
    libparted : 2.3
    ======================
    systemctl: unrecognized option '--runtime'

1) Fedora 14 has an old version of systemctl which doesn't support those options. This may not matter as it's an unsupported OS now.

2) Fedora 14 only had systemd available as a preview, not as the default init. Debian 7 falls into this category too. I think several other distributions did / are doing the same thing, shipping systemd as an alternative init before optionally making it the default in a later release. Should we check that the init process is actually systemd?

A simple answer to both of these is just to ignore it. Users who run gparted from an icon never see this anyway, and the failures cause no harm. Alternatively we could just redirect output from all 3 systemctl commands to /dev/null. I would probably keep this patch as it is.

[PATCH] Only permit one copy of GParted to execute at a time

This works, but a user who runs GParted for a second time from an icon will never see the error message. The simple choice is to do nothing; a bit harder is to use zenity or xmessage to display an error dialog from the gparted shell script; and the hardest would be to get the gpartedbin exe to display an error dialog like it does when the user isn't root. I would personally do something about this, because a user gets no feedback as to why a second gparted didn't appear. However, I will take the patch as it is if you want.

Thanks,
Mike
Hi Mike,

Thank you for the testing and for the thoughtful comments.

1) Fedora 14 unrecognized options.
I agree that we probably do not need to concern ourselves with an unsupported OS. Having said that, I can easily add a redirect to /dev/null so that the messages do not appear.

2) Should we check that the init process is actually systemd?
I could certainly add a check for both the systemctl command and a systemd process, such as the following:

    HAVE_SYSTEMCTL=no
    for k in '' `echo "$PATH" | sed 's,:, ,g'`; do
        if test -x "$k/systemctl"; then
            if test "z`ps -e | grep systemd`" != "z"; then
                HAVE_SYSTEMCTL=yes
                break
            fi
        fi
    done

Or did you have something else in mind that would be a more robust check?

[PATCH] Only permit one copy of GParted to execute at a time

3) This works but a user who runs GParted for a second time from an icon will never see the error message.
This is the current behaviour with the existing code. For example, on systems that use HAL, invoking gparted from a menu icon will silently fail the second time. This is because HAL will not permit a second exclusive lock to be taken, and hence will not invoke gpartedbin.

I agree that having a graphical notification to the user would be nice. Unfortunately I am not aware of such a utility that is common across all desktop environments and GNU/Linux distros. We could try adding logic to test for the existence of some common ones. A more complex solution would be to add the code to GParted itself. Of course this would not work for HAL systems, because hal-lock would not invoke gpartedbin. Udisks and devkit-disks might respond in similar ways. This means that a gpartedbin solution has several shortcomings.

In summary, I think a solution that tests for the existence of several GUI notification tools might be the best approach. I will work on an updated patch set that tests for utilities like xmessage and zenity.
>> more robust check?

Doesn't every Linux distribution include which?

    which systemctl
    which xmessage
    which zenity

etc...
Hi Curtis,

[PATCH] Use systemctl runtime mask to prevent automounting (#701676)

1) Let's not redirect errors from systemctl to /dev/null. They aren't seen when running from an icon, and might help future diagnosis when running gparted from the command line.

2) Let's do check for the systemd process running. It's not a perfect check, but it is pretty good and seems unlikely to produce false positives coded as you did with `ps -e | grep systemd`.

[PATCH] Only permit one copy of GParted to execute at a time

3) Given the complexities you mentioned, I'm OK with just having a second gpartedbin silently do nothing. If we did display a message to the user with zenity/xmessage we ought to translate it, and that seems too much effort. Just go with the existing error written by the script.

Thanks,
Mike
Created attachment 246743 [details] [review]
Patch set to inhibit automount using systemctl runtime mask and limit GParted to one instance

Hi Mike,

I agree with all three of your points in comment #32. I came to much the same conclusions after I posted comment #30 and had some more time to think.

Changes in this patch set:

[PATCH] Use systemctl runtime mask to prevent automounting (#701676)
2) Includes a check for the systemd process running.

[PATCH] Only permit one copy of GParted to execute at a time
3) Logic left alone. Comments have been updated.

Patrick,

I believe you are correct that many GNU/Linux distros include the "which" command. If I recall correctly, there is some advice in portable shell programming that recommends relying on as few outside utilities as possible, to enable shell scripts to work on a broader range of platforms. That is the reason why the "gparted" shell script checks the path for executables instead of using the "which" command.

Curtis
Hi Curtis,

The patch set from comment #33 passed testing. On Fedora 14 the errors don't occur because the systemctl command isn't run, as systemd isn't running. It worked on Fedora 18, which uses systemd, and on CentOS 5.9, which doesn't.

I assume we are ready for committing upstream, based on my testing and Ryan's testing up to comment #24. I'll commit tomorrow unless I hear otherwise.

Mike
Hi Mike,

Thank you for your suggestions and for reviewing and testing the patch set in comment #33. I have successfully retested this patch set on:

    Debian 6
    Ubuntu 10.04
    Fedora 19 beta
    openSUSE 12.2

As such I believe this patch set is ready for commit to the git repository.

Curtis
The patch set to resolve this bug has been committed to the GParted git repository. The patches can be seen here:

Use systemctl runtime mask to prevent automounting (#701676)
https://git.gnome.org/browse/gparted/commit/?id=4c109df9b59e55699bd42023cf4007ee359793e9

Only permit one instance of GParted to execute at a time
https://git.gnome.org/browse/gparted/commit/?id=4c9c70d697ed728362a1cc74c35c0f76e3caf909

Thanks,
Mike
btw: I like the fact that gparted now inhibits systemd mounting when it runs, but I think there's still a pretty major problem here: gparted should not be writing to my partition table when I merely start it up.
Hi Ryan,

> gparted should not be writing to my partition table when I merely
> start it up.

Do you have an alternate suggestion on how to approach this challenge? If so, please create a new bug report and provide details on an alternate solution.

I agree with Bart's comment in the original commit noted in comment #20. Namely, first testing that the kernel can reread the partition table, before enabling partition actions, will help avoid data loss. In fact, any time the kernel cannot reread the partition table can lead to many problems. We experienced this first hand in late 2009 with the following forum reports:

WARNING! Problem Resizing File Systems with GParted
http://gparted-forum.surf4.info/viewtopic.php?id=13777

Since then we have improved the GParted logic. However, the fact remains that actions performed on partitions when the kernel cannot re-read the partition table increase the likelihood of file system problems leading to data loss.

Curtis
I'm still not clear why gparted does that sync on startup, but whether it happens on startup or after modifying some other, unrelated partition, it is still not good. I worked up a patch to parted yesterday to avoid removing and re-adding unmodified partitions, to work around this little quirk causing Unity (on Ubuntu) to re-show partitions on the dock that you had removed. If you removed the line from /etc/fstab and systemd is still mounting the partition in /home, that would be a bug in systemd. At worst, it should mount it in /media instead.
Hi Phillip,

> I'm still not clear why gparted does that sync on startup

GParted checks to see if the kernel can re-read the partition table. If not, then GParted disables some partition actions. This is done to minimize the chance of data loss that can occur when the kernel has a differing view of the actual disk partition layout.

If there is another way to perform this "kernel can re-read partition table" check outside of libparted, then that would be another way to approach this problem.

Curtis
Could you be more specific about what operations cannot be allowed if updating won't work? Also, given the improvements to libparted over the years, this test will now never fail anyhow, so it seems kind of pointless. It used to fail only if a partition was mounted, which you could test for by checking the busy flag, like we do for showing the locked icon.
> Could you be more specific about what operations can not be allowed if
> updating won't work?

Basically, if the kernel cannot re-read the partition table then GParted marks the device as "readonly". If the device is marked by this check as readonly then the GUI will disable the "Resize/Move" menu option. To see the code where this happens, search for "readonly" in the Win_GParted::set_valid_operations method:

https://git.gnome.org/browse/gparted/tree/src/Win_GParted.cc#n933

The link to the original commit that added this behaviour is in comment #20.

> It used to only fail if a partition was mounted, which you could test for
> by checking the busy flag, like we do for showing the locked icon.

Unfortunately this is not the only situation in which the kernel would fail to reread the partition table. With some combinations of kernels and libparted, even with all partitions unmounted, we experienced situations where the kernel failed to reread the partition table. See comment #38 for the link to several of these problems. For this reason I believe we need to test for more than just mount status: whether the kernel can actually re-read the partition table for a device.
Also, some current distributions don't use new versions of parted. The latest RHEL/CentOS release, 6.4, still uses parted 2.1, which only used the BLKRRPART ioctl and so required all partitions to be unused for the kernel to successfully re-read the partition table. Otherwise you get this infamous error:

    WARNING: the kernel failed to re-read the partition table on %s (%s).
    As a result, it may not reflect all of your changes until after reboot.
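As a side note, the BLKRRPART behaviour can be exercised directly with util-linux's blockdev, which issues the same ioctl. This is only an illustration, not something GParted does; and note that a successful re-read itself emits the very uevents that trigger the automounting discussed above:

    # Ask the kernel to re-read /dev/sda's partition table via BLKRRPART.
    # This fails with 'Device or resource busy' while any partition on the
    # disk is in use.
    blockdev --rereadpt /dev/sda \
        && echo "kernel re-read the partition table" \
        || echo "kernel could not re-read the partition table"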
What harm is there in detecting the error later? And just because you can sync at startup doesn't mean it won't fail later.
I am pretty sure, BTW, that we should provide some built-in functionality in systemd that allows one to temporarily disable auto-activation of mount points as soon as the device shows up. Or maybe we should remove that functionality entirely. In other words: I think this is a systemd problem that should be fixed in systemd, without requiring manual masking. I added this to the TODO list of systemd for now.
Phillip,

> What harm is there in detecting the error later?

The main difference I can see is in preventing the user from moving/resizing a partition and potentially losing data. The earlier we detect problems with the kernel re-reading the partition table, the better we can try to prevent data loss.

> And just because you can sync at startup doesn't mean it won't fail later.

I would agree. There is a chance of losing data whenever the kernel fails to re-read changes to the partition table.

Lennart,

Thank you for raising the issue with systemd. We try to do what we can to prevent users from losing data when they edit partition tables with GParted. Often this involves work-arounds, because gparted is used on many different GNU/Linux distros using various versions of software.
How can you lose data because syncing the partition table fails? That's the part I'm not seeing. If you try to move a partition, and syncing the temporary expanded partition table fails, then the operation fails, and the original partition table should be rolled back, with no harm done. Same when enlarging a partition. If you are shrinking a partition, then the fs will be shrunk, and if syncing the table fails, you just inform the user they have to reboot for the changes to take effect.
> How can you lose data because syncing the partition table fails?

The problem arises when file system actions are performed on the partition after the syncing fails. The sequence of steps that would demonstrate the problem is:

1) Change a partition's boundaries
2) The kernel fails to reread the changes to the partition table
3) Execute file system commands on the changed partition (for example, maximize the file system to fit within a shrunken partition)

The partition tools see the disk's view of the partition table, whereas the file system commands see the kernel's view of the partition table. (A hypothetical sequence illustrating this is sketched after this comment.)

> If you try to move a partition, and syncing the temporary
> expanded partition table fails, then the operation fails, and the original
> partition table should be rolled back, with no harm done.

I agree that if the partition table is rolled back when syncing fails, then there should not be a problem. However, this is not what actually happens -- the partition table is NOT rolled back. I believe that not only GParted, but also parted and fdisk, leave the partition in the changed state when the kernel fails to reread the partition table.
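To make step 3 concrete, here is a purely hypothetical sequence; the device name, partition number, and size are made up, and this must never be run on real data:

    # Hypothetical illustration only -- do NOT run on real data.
    # 1) Shrink partition 6 in the on-disk partition table:
    echo ',4194304' | sfdisk -N 6 /dev/sda    # made-up new size in sectors
    # 2) The kernel fails to re-read the table; its view stays stale:
    cat /sys/block/sda/sda6/size              # still shows the old, larger size
    # 3) A filesystem tool trusting the kernel's stale view overruns the
    #    shrunken partition's new on-disk boundary:
    resize2fs /dev/sda6                       # grows the fs to the stale size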
Well obviously you don't ignore the error and continue, and I'm pretty sure gparted already does halt if the sync fails rather than try to run tools on a partition that is not properly synced. If it doesn't, that would certainly be a bug that should be fixed, rather than trying to mask it by doing a "test run" sync first, since things can change between then and when you go to do the move. It makes sense to leave the table in the new state if that is the final intended state, but not if it is only a temporary state, such as the union partition used during the move. In that case, it makes sense to roll back, and that is what gparted currently does if there is an error during the actual data move, so it should do the same (if it doesn't already) if syncing the table prior to starting the move fails.
> Well obviously you don't ignore the error and continue, and I'm pretty
> sure gparted already does halt if the sync fails rather than try to run
> tools on a partition that is not properly synced.

Yes, I agree. If a user disregards the kernel reread failure error and proceeds, then they must accept the consequences. I also believe that GParted correctly halts when the kernel reread failure occurs.

I believe the intention of first checking is to prevent users from resize/move actions on a device that demonstrates a kernel reread failure. Without this prevention, a user could apply a resize/move operation and would not discover the problem until GParted runs into the error while applying the operation. In this situation no data should be lost, but the user does have to wait while GParted tries the operation and then finally fails.

I guess it boils down to how we want GParted to function. Should gparted:

A) Let the user queue a resize/move operation and wait until it fails while applying, because the kernel cannot reread the partition table? Or,

B) Prevent the user from queuing a resize/move operation, because we already know the kernel cannot reread the partition table?
It's a big hammer, but wrapping some sections with:

    udevadm control --stop-exec-queue
    ...
    udevadm control --start-exec-queue

would work; once the device is locked by an invoked tool or the kernel, you can leave the critical section.
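A sketch of that wrapper, with a trap so the udev exec queue is restarted even if the critical section fails (the partitioning step in the middle is only a placeholder):

    # Sketch: pause udev rule execution around a critical section.
    udevadm control --stop-exec-queue
    trap 'udevadm control --start-exec-queue' EXIT
    # ... critical section: probe or modify the partition table here ...
    # The EXIT trap restarts udev's exec queue when the script exits.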
The patches to address this report have been included in GParted 0.16.2 released on September 18, 2013.
There's now a sane API for this in current udev, and gparted should be adapted to make use of it: simply take a BSD file lock (LOCK_EX) on the main device node of the device. udev will also lock the device, and will skip processing it if it cannot get the lock. Also, udev watches for when a process closes the device node after writing, and then retriggers the device. Putting this together, this is a very simple and natural way to make sure partitioners and udev rule processing don't interfere.

Note that you need to take the lock on the "main" device node ("/dev/sda"), i.e. not the partition device nodes ("/dev/sda5").
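For illustration, the same advisory lock can be taken from a shell with util-linux's flock(1); the wrapped command here is just a harmless placeholder:

    # Hold a BSD LOCK_EX lock on the whole-disk node while touching the disk,
    # so udev skips processing the device until the lock is released.
    flock --exclusive /dev/sda sfdisk --dump /dev/sda
    # The C equivalent: fd = open("/dev/sda", O_RDWR|O_CLOEXEC);
    # flock(fd, LOCK_EX); ...modify partitions...; close(fd);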