Bug 685740 – Refactor to use asynchronous command execution

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 685740 - Refactor to use asynchronous command execution


Summary:	Refactor to use asynchronous command execution


Status:	RESOLVED FIXED

Product:	gparted
Classification:	Other
Component:	application
Version:	GIT HEAD
Hardware:	Other Linux

Importance:	Normal enhancement
Target Milestone:	---
Assigned To:	gparted maintainers alias
QA Contact:	gparted maintainers alias

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2012-10-08 16:57 UTC by Curtis Gedak
Modified:	2013-03-19 16:54 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Patch to prevent hang when scanning disk devices with an ext3 partition (897 bytes, patch) 2012-10-08 17:00 UTC, Curtis Gedak	none	Details \| Review
Delay clearing line output until a non Carriage Return is received (2.38 KB, patch) 2013-02-04 20:51 UTC, Curtis Gedak	none	Details \| Review
Handle old versions of e2fsprogs without ext4 support (v1) (4.43 KB, patch) 2013-02-16 00:56 UTC, Mike Fleetwood	none	Details \| Review
Backport spawn asynchronous code to old GNU/Linux distributions (5.54 KB, patch) 2013-02-17 17:56 UTC, Curtis Gedak	none	Details \| Review
Patch set collection including all previous changes up till now (213.70 KB, patch) 2013-02-17 20:12 UTC, Curtis Gedak	none	Details \| Review
Patch set collection 2 includes all previous changes (210.62 KB, patch) 2013-02-22 17:35 UTC, Curtis Gedak	none	Details \| Review
Fix shrinking reiserfs file systems (2.01 KB, patch) 2013-02-22 23:54 UTC, Mike Fleetwood	none	Details \| Review
Patch set collection 3 includes all previous changes (213.84 KB, patch) 2013-02-25 23:14 UTC, Curtis Gedak	none	Details \| Review
Patch to fix race condition when spawning commands and connecting IO stream handler (3.26 KB, patch) 2013-02-27 17:52 UTC, Curtis Gedak	none	Details \| Review
Patch set collection 4 includes all previous changes (217.11 KB, patch) 2013-02-27 17:56 UTC, Curtis Gedak	none	Details \| Review
Patch to add 'sh', '-c', 'command' argument vector to spawn asynchonous execution (4.12 KB, patch) 2013-02-27 20:03 UTC, Curtis Gedak	none	Details \| Review
Patch set collection 5 includes all previous changes (221.23 KB, patch) 2013-02-27 20:05 UTC, Curtis Gedak	none	Details \| Review
GParted apply operations dialog regression - extra blank entries in details (73.76 KB, image/png) 2013-02-28 18:46 UTC, Curtis Gedak		Details
GParted apply operations dialog regression - Last copy block sometimes missed in details (76.96 KB, image/png) 2013-03-02 18:07 UTC, Curtis Gedak		Details
gparted_details.htm log file of move BTRFS with Cancel, Force Cancel, and Cancel Operation pressed - still rolls back file system. (9.91 KB, text/html) 2013-03-04 18:31 UTC, Curtis Gedak		Details
gparted_details.htm log file of move BTRFS with Cancel, wail then Force Cancel, and Cancel Operation press in copy rollback FS - second cancel works.. (10.00 KB, text/html) 2013-03-04 19:55 UTC, Curtis Gedak		Details
gparted_details.htm log file of move BTRFS with Cancel, Force Cancel, and Cancel Operation pressed still rolls back file system. (13.30 KB, text/html) 2013-03-05 17:09 UTC, Curtis Gedak		Details
Patch to fix mount problem of known mount points - root cause was 80 char wrap in command output. (1.30 KB, patch) 2013-03-06 00:36 UTC, Curtis Gedak	none	Details \| Review
gparted_details.htm log file of move BTRFS with Cancel, Force Cancel, and Cancel Operation pressed - correctly skips file system roll back. (11.01 KB, text/html) 2013-03-06 02:14 UTC, Curtis Gedak		Details
Patch set collection 6 includes all previous changes (227.97 KB, patch) 2013-03-06 16:51 UTC, Curtis Gedak	none	Details \| Review
0001-Consider-logical-partitions-when-selecting-default-f.patch (2.43 KB, patch) 2013-03-17 02:18 UTC, Phillip Susi	none	Details \| Review
Select largest free partition (v3) (4.88 KB, patch) 2013-03-17 13:30 UTC, Mike Fleetwood	none	Details \| Review
0001-Consider-logical-partitions-when-selecting-default-f.patch (2.73 KB, patch) 2013-03-17 21:33 UTC, Phillip Susi	none	Details \| Review

Description Curtis Gedak 2012-10-08 16:57:40 UTC

Currently GParted uses threads to execute external commands.  It would
be beneficial to refactor GParted to use asynchronous command
execution instead of using threads, for the following reasons:

1)  The thread implementation is not optimal

    The thread implementation in Dialog_Progress.cc creates a separate
    thread to invoke the external command.  Once the thread is
    running, GParted basically waits in an busy loop for the thread to
    complete.

2)  The cancelling of external commands is not effective

    The method pthread_cancel is invoked when a user chooses to cancel
    a running operation.  With the current implementation, the external
    command is not guaranteed to be terminated.

3)  Improved progress tracking of command execution is desired

    By changing command execution from using threads to using
    asynchronous spawning, we should be able to improve the ability to
    monitor the progress of the spawned command.

With these improvements in mind, Phillip Susi has put significant work
into refactoring GParted.  Phillip has created a separate branch to
track his work.  The branch is named "psusi/refactor" and can be found
at the following link:
http://git.gnome.org/browse/gparted/log/?h=psusi/refactor

Comment 1 Curtis Gedak 2012-10-08 17:00:54 UTC

Created attachment 226060 [details] [review]
Patch to prevent hang when scanning disk devices with an ext3 partition

The code in the current psusi/refactor branch (as of March 20, 2012), will hang when a disk device containing an ext3 file system is encountered.  The patch in this post addresses this situation.

Comment 2 Curtis Gedak 2013-01-16 17:25:10 UTC

Last night Phillip and I discussed on the parted IRC some changes to GParted to make it easier to refactor the code to use asynchronous command execution.

The three items discussed, reasoning, and decisions made are as follows:

A) Retire the GParted custom method for handling DMRAID. (Yes)

To be able to do this requires increasing the minimum required version of libparted because libparted 1.7.1 does not properly support DMRAID.

We believe that dmraid support goes back to at least libparted 1.9. However, libparted versions prior to the 2.x series do not support devices with sector sizes other than 512 bytes. Also there were many resizing problems with GParted and libparted failing to inform the kernel of partition changes with libparted 2.0 and 2.1. Hence a choice of libparted 2.2 or higher would be preferred.

Starting with GParted 0.8.1 it has been possible to compile GParted using the libparted DMRAID support with the configure option --enable-libparted-dmraid. Phillip built the packages for Ubuntu using this option and so far no problems have been reported specifically due to this difference.

Based on increasing the minimum libparted version (see B), we decided to retire GParted's custom method for handling DMRAID.

B) Increase the minimum required version of libparted. (Yes)

With GParted we try to ensure that the code compiles and runs on the currently supported major GNU/Linux distributions.

At the end of April 2013, the versions of parted/libparted in the currently supported major GNU/Linux distributions will be:

Parted 2.2 in Ubuntu 10.04 LTS
Parted 2.3 in Debian squeeze (stable)
Parted 2.3 in Fedora 16

Based on the above, we decided that the minimum required version of libparted could be moved from 1.7.1 to 2.2.

C) Retire the Reiser4 file system. (No)

The Reiser4 patches have not been incorporated in the the Linux 3.x kernels. However, the patches appear to have been updated to work with the Linux 3.x kernel and are included in the current release of the System Rescue CD (3.2.0).

Based on the fact that it is still viable to use Reiser4 file systems with the Linux 3.x kernels, we decided that Reiser4 support will remain. In other words, the Reiser4 code will be enhanced to take advantage of the asynchronous command execution enhancement.

In conclusion we plan to increase the minimum libparted version to 2.2, retire GParted's custom method for handling DMRAID, and keep Reiser4 support.

Comment 3 Curtis Gedak 2013-02-04 17:32:53 UTC

Following are some updates regarding discussions with Phillip on the psusi/refactor branch:

A)  We will not retire GParted custom method for handling DMRAID at this time.

The reason being is that at least a portion of this method is required to detect DMRAID devices.  At such time that GParted uses libparted's ped_device_probe_all, then this method can be completely retired.

B)  No need to increase minimum libparted version due to (A).

Since we are not removing the GParted custom method for handling DMRAID, we maintain backward compatibility to libparted 1.7.1.

More development work and testing continues on the psusi/refactor branch.

Of note is that command progress output is now displayed live in the apply dialog.

Comment 4 Curtis Gedak 2013-02-04 20:51:28 UTC

Created attachment 235163 [details] [review]
Delay clearing line output until a non Carriage Return is received

This patch applies to the psusi/refactor branch as of Feb 2, 2013.
The last commit in this branch at that time was:

commit 9c374ca3059d0bb7e34983c50e7dde76db8d957e
Author: Phillip Susi <psusi@ubuntu.com>
Date:   Sat Feb 2 17:54:28 2013 -0500

    fixup! Pass Partition instead of just its path to FileSystem::copy()


The previous code for cancelling operations in progress was not robust.  The new code in the branch is much improved.

However, I have discovered one problem:

1)  When creating a large ext2 file system, and press Cancel while the mke2fs command is running.  Do not click on Force cancel when this button displays.  After control is returned to the user, the ext2 file system is not recognized in the partition.
 
This could be considered misleading because the user did not press "Force cancel".

A similar outcome occurs if the user selects "Force Cancel", and then clicks the "Continue operation" on the warning screen.  The expected outcome when clicking "Continue operation" would be that all would be okay.


For testing I created a large 140 GiB ext2 file system so that I had time to press the Cancel button while mke2fs was running.

Comment 5 Phillip Susi 2013-02-05 00:01:04 UTC

Yes, this is because of the commit where I flagged all mkfs operations as safe to cancel.  Before cancelling, you did not have a valid fs, and after cancelling, you still do not have a valid fs, so nothing's lost.  You asked to cancel creating the fs, so that's what it did.  The force cancel is for when stopping now will not only not perform the requested action, but also result in data loss.  The only instance where that really applies is during the revert step after cancelling a move.

Comment 6 Curtis Gedak 2013-02-07 20:21:13 UTC

Hi Phillip,

I have started my testing of the code in the psusi/refactor branch and have encountered a problem when copying between disks with different sector sizes.

In the psusi/branch I used, the last commit is:

commit 8eab0fdd44131d5c49e42b9733555ac476e28134
Author: Phillip Susi <psusi@ubuntu.com>
Date:   Wed Jan 30 21:35:52 2013 -0500

    Pass Partition instead of just its path to FileSystem::copy()


The problem appears to be that the copy operation does not recognize the difference in sector sizes.

The test steps I used are:

1)  Create a RAMDISK using 4096 bytes/sector

    $ sudo modprobe scsi_debug dev_size_mb=64 sector_size=4096

2)  Write an msdos partition table to the RAMDISK

3)  Create a 16 MiB ext2 partition using Cylinder alignment at the start of a 512 bytes/sector device with MBR.

4)  Select this 16 MiB ext2 partition and choose Partition --> Copy.

5)  Select the RAMDISK

6)  Paste the 16 MiB ext2 partition to the RAMDISK using cylinder alignment.

This is where the problem occurs.  The resulting partition should be ~16 MiB, but instead grows to almost fill the disk (~62 MiB).

With GParted 0.14.1, the pasted partition is also ~16 MiB.

Would you be able to investigate this problem?

Comment 7 Curtis Gedak 2013-02-07 20:43:15 UTC

Hi Phillip,

My bad.  Scratch that earlier report in comment #6.

I was using cylinder alignment, so of course the size of the partition had to grow to match the cylinder size using 4096 byte sectors.  This is expected behaviour.

Sincerely,
Curtis

Comment 8 Curtis Gedak 2013-02-08 17:18:34 UTC

I have finished my first round of testing.  Kudos to Phillip on this
patch set as almost all functionality I tested worked as expected.

All the following tests were performed on Kubuntu 12.04.  As can be
seen, mosts tests passed.  There are still a couple of details to iron
out.


STATUS   DESCRIPTION                           RESULTS
======   ===================================   =======================
pass     Ped_exception_handler - grow gpt      Prompted to fix GPT

pass     Pulsebar scanning devices             Bounces back and forth
pass     Pulsebar applying operations          Bounces back and forth

pass     Copy ext2, linux-swap, ntfs, xfs      All FS copies valid

pass     Move btrfs, ext3, linux-swap, ntfs    All FS moves valid

pass     Copy ext2: 512 to 4096 bytes/sector   Size similar - expected
                                               FS unrecognized - expected

pass     Label FAT16                           Set and clear works
pass     Label FAT32                           Set and clear works

pass     Cancel copy ext2, ntfs, xfs           Orig okay, copy bad
pass     Cancel move ext3                      Orig restored okay
pass     Cancel mkfs ext4, ntfs                New FS bad - expected

pass     Copy ntfs progress live updates       Live % complete shown
pass     Copy ext3 progress live updates       Live Mib copied shown

STATUS   DESCRIPTION                           RESULTS
======   ===================================   =======================
pass     Save gparted details                  Steps saved

pass     Unmount partition                     Partition unmounted
pass     Toggle swapon/swapoff                 Swap enabled/disabled
pass     Deactivate LVM2 PV                    LV deactivated

pass     Mkfs and copy safe to cancel          No user data lost on cancel

pass     Unallocated space default selected    True if unnallocated available
pass     INSert is Partition --> New           Works if unnallocated selected
pass     CTRL+Return is Apply all operations   Works

pass     Remove read-test pass on move         True.  Cancel rolls back okay.
pass     Check FS after reverting Part Tbl     True.  Cancel rolls back okay.

pass     Operation time values valid           True.  Values valid.

pass     Combined code works for ext2/3/4      Works.

pass     Dialog progress details grow size     Can see status column

pass     All features still available          Yes, see View --> FS support


The following two tests failed.

STATUS   DESCRIPTION                           RESULTS
======   ===================================   =======================
fail     Compiles/Runs on supported distros?
           fail                 Ubuntu 10.04   No Glib::Thread::Mutex.
                                               Try Glib::RecMutex or
                                               Glib::Mutex.
                                               When deprecated???
           ????                 Fedora 16      Not yet tested

fail     Cancel move/copy                      GUI hangs with Force Cancel (5)
                                               button disabled
                                               IRC discussion with Phillip
                                               shows hang on non-threaded
                                               on ped_device_close in
                                               Copy_Blocks method

Comment 9 Mike Fleetwood 2013-02-16 00:56:35 UTC

Created attachment 236320 [details] [review]
Handle old versions of e2fsprogs without ext4 support (v1)

Hi Curtis and Phillip,

I found that this commit makes some assumptions which aren't true on
RHEL/CentOS 5.9 with e2fsprogs 1.39, which doesn't support ext4.  Patch
uploaded to resolve.

    http://git.gnome.org/browse/gparted/commit/?h=psusi/refactor&id=0520ca495831898e7ec5bcd2f06a5b803a79e253
    Combine duplicate code for ext[234]


I also found that GParted fails to compile on Debian 6 with
Glib::Threads related errors.  (See below).

Thanks,
Mike
--
g++ -DHAVE_CONFIG_H -I. -I.. -pthread -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include   -pthread -I/usr/include/gtkmm-2.4 -I/usr/lib/gtkmm-2.4/include -I/usr/include/giomm-2.4 -I/usr/lib/giomm-2.4/include -I/usr/include/pangomm-1.4 -I/usr/lib/pangomm-1.4/include -I/usr/include/gtk-2.0 -I/usr/include/gtk-unix-print-2.0 -I/usr/include/atkmm-1.6 -I/usr/include/gdkmm-2.4 -I/usr/lib/gdkmm-2.4/include -I/usr/include/glibmm-2.4 -I/usr/lib/glibmm-2.4/include -I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/sigc++-2.0 -I/usr/lib/sigc++-2.0/include -I/usr/include/cairomm-1.0 -I/usr/lib/cairomm-1.0/include -I/usr/include/pango-1.0 -I/usr/include/cairo -I/usr/include/pixman-1 -I/usr/include/freetype2 -I/usr/include/libpng12 -I/usr/lib/gtk-2.0/include -I/usr/include/atk-1.0 -I/usr/include/gio-unix-2.0/   -DGPARTED_DATADIR=\""/usr/local/share"\" -DGNOMELOCALEDIR=\""/usr/local/share/locale"\"   -Wall	 -g -O2 -MT GParted_Core.o -MD -MP -MF .deps/GParted_Core.Tpo -c -o GParted_Core.o GParted_Core.cc
GParted_Core.cc:3366: error: ‘Glib::Threads’ has not been declared
GParted_Core.cc:3366: error: ISO C++ forbids declaration of ‘Mutex’ with no type
GParted_Core.cc:3366: error: expected ‘;’ before ‘mutex’
GParted_Core.cc:3367: error: ‘Glib::Threads’ has not been declared
GParted_Core.cc:3367: error: ISO C++ forbids declaration of ‘Cond’ with no type
GParted_Core.cc:3367: error: expected ‘;’ before ‘cond’
GParted_Core.cc: In function ‘bool GParted::_ped_exception_handler(GParted::ped_exception_ctx*)’:
GParted_Core.cc:3386: error: ‘struct GParted::ped_exception_ctx’ has no member named ‘cond’
GParted_Core.cc:3394: error: ‘struct GParted::ped_exception_ctx’ has no member named ‘mutex’
GParted_Core.cc:3395: error: ‘struct GParted::ped_exception_ctx’ has no member named ‘cond’
GParted_Core.cc:3396: error: ‘struct GParted::ped_exception_ctx’ has no member named ‘mutex’
GParted_Core.cc: In static member function ‘static PedExceptionOption GParted::GParted_Core::ped_exception_handler(PedException*)’:
GParted_Core.cc:3406: error: ‘struct GParted::ped_exception_ctx’ has no member named ‘mutex’
GParted_Core.cc:3408: error: ‘struct GParted::ped_exception_ctx’ has no member named ‘cond’
GParted_Core.cc:3408: error: ‘struct GParted::ped_exception_ctx’ has no member named ‘mutex’
make: *** [GParted_Core.o] Error 1

Comment 10 Phillip Susi 2013-02-16 01:32:46 UTC

Curtis already noticed the mutex thing and was looking into whether referring to it in one of the other, depreciated namespaces would work in all cases.  I suppose this patch has minimal cost so I can't object to it too much, other than to say, support for a 7 year old release of e2fsprogs?  Really?  Isn't that a bit excessive?

Comment 11 Curtis Gedak 2013-02-16 17:33:59 UTC

Hi Phillip and Mike,

Thank you for all your work on GParted.  I do appreciate all of your efforts to improve GParted.  Without your help, many of the recently added features would still be on the planning boards.

Regarding backward compatibility, I have a personal pet peeve with software that requires the latest and greatest versions to even compile.

As such I strongly believe that with GParted we should maintain compatibility with older supported versions of the major GNU/Linux distributions.  If GParted compiles on even older distros, then all the better.  :-)


Phillip,

Regarding the mutex thing, would you like me to investigate further and develop a patch to resolve backward compatibility?

If so, I welcome any insight that you or Mike might have regarding a possible solution.

Comment 12 Phillip Susi 2013-02-16 19:59:58 UTC

7 years is pretty darn far from the latest and greatest, that's all.  2 or 3 years seems to be my sweet spot.

In this case though the effort required is minimal, so I'll go ahead and apply it.

Comment 13 Curtis Gedak 2013-02-17 17:56:25 UTC

Created attachment 236469 [details] [review]
Backport spawn asynchronous code to old GNU/Linux distributions

The patch in this attachment will backport the spawn asynchronous code to compile on older GNU/Linux distributions, and fix a race condition that could cause GParted to crash.

This patch was developed along with Phillip Susi while conversing on IRC.


Phillip and Mike,

Do you anticipate further changes to the psusi/refactor branch to support spawning commands asynchronously?

If not, then I would like to collect the whole patch set together to perform another complete round of tests.  Following successful testing I would like to commit this patch set to the master.

Comment 14 Curtis Gedak 2013-02-17 20:12:20 UTC

Created attachment 236489 [details] [review]
Patch set collection including all previous changes up till now

Attached is the patch set collection including all patches up until now.

This patch set was created from the most recent psusi/refactor branch that includes Mike's patch from comment #9.  The patch from comment #13 has been rolled into / squashed into the original code.

This is the patch set that I intend to use for running my last series of tests before committing these changes to master.

Comment 15 Mike Fleetwood 2013-02-17 21:12:55 UTC

From a quick check of a few things (most important first):


1)  GParted no longer executes shell command pipelines affecting the
    following:
    * display of jfs file system usage fails
    * resize of reiserfs doesn't resize the file system.  It instead just
      successfully echos "y | resize_reiserfs /dev/PTN"
    * deletion of a non-empty lvm2 pv doesn't remove the LVM metadata
      before deleting the partition.  It instead successfully echos
      "y | lvm pvremove --force --force /dev/PTN".


2)  I've got a core dump on Fedora 14 w/ Glibmmm/Gtkmm-2.24 at the point
    when the main page is being refreshed after applying some operations.
    Still looking for the core file though!  Will look into this further
    tomorrow.


3)  In patch:
        Backport spawn asynchronous code to compile on older distros (#685740)
    is this conditional code necessary?

        #ifdef HAVE_GTKMM_2_22_PLUS /* HAVE_REMOVE_ALL_MESSAGES */
                statusbar.remove_all_messages();
        #else
                statusbar.pop();
        #endif /* HAVE_REMOVE_ALL_MESSAGES */

    As there is no extra code for the older statusbar.pop() case, can
    that just always be used regardless of the Gtkmm version?


4)  Can you update the commit id in the message for my patch when the
    code finally gets committed as the branch has been rebased and the
    commit it is trying to refer to has changed.
 
        Subject: [PATCH] Handle old versions of e2fsprogs without ext4 support

        Address issues with:
HERE>>      4fcfc88290874412f144db7070326237567a0ca5
            Combine duplicate code for ext[234]

Comment 16 Mike Fleetwood 2013-02-17 22:42:53 UTC

(In reply to comment #15)
> 
> 1)  GParted no longer executes shell command pipelines affecting the
>     following:
>     * display of jfs file system usage fails
>     * resize of reiserfs doesn't resize the file system.  It instead just
>       successfully echos "y | resize_reiserfs /dev/PTN"
>     * deletion of a non-empty lvm2 pv doesn't remove the LVM metadata
>       before deleting the partition.  It instead successfully echos
>       "y | lvm pvremove --force --force /dev/PTN".
>

These cases don't really need to be able to pipe 2 commands together,
they just need to be able provide some text to stdin of the command.

    $ fgrep ' | ' jfs.cc lvm2_pv.cc reiserfs.cc
    jfs.cc: if ( ! Utils::execute_command( "echo dm | jfs_debugfs " +
                              partition .get_path(), output, error, true ) )
    lvm2_pv.cc: cmd = "echo y | lvm pvremove --force --force " +
                      partition .get_path() ;
    reiserfs.cc: Glib::ustring str_temp = "echo y | resize_reiserfs " +
                                          partition_new .get_path() ;

Comment 17 Phillip Susi 2013-02-18 02:18:30 UTC

I used remove_all_messages() originally just because I noticed the existing code was pushing more messages on and never popping them off, causing a memory leak, and so I figured that would make sure to remove all messages if more than one was pushed.  Curtis added the #ifdef because that method isn't available on 10.04.  It probably could just call pop() always if we are sure we are now matching every push() with a pop().

As for the pipelines, doesn't the --force option make sure it won't prompt for a 'y' input?  And isn't there a similar option to resize_reiserfs?  If so that would be preferable, otherwise, those commands will need to be changed to explicitly execute sh -c and pass the pipeline command to that.

Ideally I'd like to fold the ext4 fix into the original commit but I didn't want to do that without your consent because that would remove crediting you as the author of the fix.  I can fixup the SHA-1 ref if you prefer.  There was also one trailing whitespace warning in the patch that I did fix.

Comment 18 Mike Fleetwood 2013-02-18 14:01:52 UTC

Lvm2 pv: Command can changed to:
  "lvm pvremove --force --force --yes " + partition .get_path()

reiserfs: resize_reiserfs still prompts when shrinking the FS even
with the force flag so still needs "y" on stdin.

Consent granted to fold ext4 fix into ext[234] merge.

Comment 19 Mike Fleetwood 2013-02-18 19:29:14 UTC

Just explaining my view regarding folding fixes:
I usually think even when refactoring or developing a new feature it is
better have separate commits for fixes and workarounds as the messages can
be insightful as to why it is necessary.  Both now for review and later
when looking back.  For me credit is nice but secondary.

Anyway merge consent given.


I have looked into my core dump some more.  It's crashing in
GTK::Statusbar::remove_all_messages() and is because it's dereferencing a
NULL pointer which I assume is meant to be the part of the linked list of
messages being popped.

# gdb ~mike/programming/c/gparted/src/gpartedbin 
...
(gdb) run
...
Program received signal SIGSEGV, Segmentation fault.
IA__gtk_statusbar_remove_all (statusbar=0x81d6cf8 [gtkmm__GtkStatusbar], context_id=0) at gtkstatusbar.c:521
521	          list = prev->next;
(gdb) backtrace 5

+ Trace 231533

#0 IA__gtk_statusbar_remove_all
at gtkstatusbar.c line 521
#1 Gtk::Statusbar::remove_all_messages
at statusbar.cc line 315
#2 GParted::Win_GParted::hide_pulsebar
at Win_GParted.cc line 646
#3 GParted::Win_GParted::menu_gparted_refresh_devices
at Win_GParted.cc line 1225
#4 GParted::Win_GParted::activate_apply
at Win_GParted.cc line 2639

516	          g_slist_free_1 (list);
517	
518	          if (prev == NULL)
519	            prev = statusbar->messages;
520	
521	          list = prev->next;
522	        }
523	      else
524	        {
525	          prev = list;
(gdb) print prev
$2 = 0x0
(gdb) print statusbar->messages
$6 = 0x0

Comment 20 Curtis Gedak 2013-02-18 21:08:38 UTC

Mike,

Phillip and I just found the cause of a crash in fedora 13 with parted 2.2.  With parted 2.1, fedora 13 would seem to hang.

The fix we discovered is to add "ctx.mutex.unlock();" after
  "ctx.cond.wait( ctx.mutex );".

For example the ped_exception handler should look like this:

PedExceptionOption GParted_Core::ped_exception_handler( PedException * e )
{
	struct ped_exception_ctx ctx;
	ctx.ret = PED_EXCEPTION_UNHANDLED;
	ctx.e = e;
	if (Glib::Thread::self() != GParted_Core::mainthread) {
		ctx.mutex.lock();
		g_idle_add( (GSourceFunc)_ped_exception_handler, &ctx );
		ctx.cond.wait( ctx.mutex );
		ctx.mutex.unlock();
	} else _ped_exception_handler( &ctx );
	return ctx.ret;
}

Phillip will roll this change into the psusi/refactor branch.

Comment 21 Curtis Gedak 2013-02-18 21:55:25 UTC

Hi Mike and Phillip,

RE: View regarding not folding fixes.  I tend to agree.  However, there are some times when I find a problem myself and go back and fold the change into a previous commit.  As you correctly point out, some history is lost by doing this that might be useful later on.


Mike,

RE: fedora 14 crash dump

Do you recall any libparted warnings or error messages?
Are you able to reproduce this problem?
If so, what are the steps?

For some reason I don't happen to have my fedora 14 VM anymore, so I'm scouring the Internet to track one down.

On your fedora 14 install, which version of libparted is in use?

When I tested on fedora 13 with libparted 2.3, I did not encounter the crash or the hang problem mentioned in comment #20.

Curtis

Comment 22 Mike Fleetwood 2013-02-18 22:33:29 UTC

No warnings or errors from libparted.

Reproduction steps is just to apply any operation.  After it completes
the operation and starts the rescan to refresh the main window it
crashes.  Even crashes after just labelling a file system.

Fedora 14 has parted/libparted 2.3.

Tomorrow I'll try the with the fix mentioned in comment #20 and then
start debugging the push and pops of messages to the statusbar.

Comment 23 Curtis Gedak 2013-02-19 00:09:11 UTC

Great work Mike finding another crash in GParted.  We certainly want to iron out the bugs before making a public release of these code enhancements.

I finally rebuilt my fedora 14 VM, and I am seeing the same problem.  Specifically GParted crashes while rescanning devices.

I can consistently reproduce the crash by invoking the menu option:
     GParted --> Refresh Devices  Ctrl+R

From gdb I also see that the crash occurs in Win_GParted.cc on the statusbar.remove_all_messages(); line in the hide_pulsebar() method.


If in Win_GParted.cc I change the line from:
     statusbar.remove_all_messages();
to:
     statusbar.pop();

Then the crash problem goes away for me.

Comment 24 Phillip Susi 2013-02-19 01:44:49 UTC

I tend to think of squashing fixes the other way around.  If I hadn't made the mistake in the first place, then there wouldn't be a second commit.  If the fix is important enough to document, then it should have been documented in the first place.  In this case I think the comments in the code are good documentation.  Having it in a second commit just clutters the history and makes bisecting more difficult.

This remove_all_messages() crash sounds like it is a bug in the gtk version on Fedora 14.  If so it would be interesting to document it and file a bug report on it.  At the end of the day, just going back to using pop() always is probably what we should do, but it would be good to understand the bug first.

I have rebased the branch again after removing the mkfs_cmd variable from the ext4 fix and squashing it, since this variable didn't seem to be needed.  Also added the other mutex.unlock() for the older libs that don't seem to like it being left locked.

Comment 25 Curtis Gedak 2013-02-19 17:25:53 UTC

Hi Mike and Phillip,

In an effort to troubleshoot this problem I added some debug prints in
the code each time there is a statusbar push(), pop(), or
remove_all_messages().

Following is the debug output from running gparted and then pressing
Ctrl+R to refresh devices.  Following the rescan of devices, GParted
crashes.


[fedora@fedora14 gparted-0.14.1-git]$ sudo /usr/local/sbin/gparted

======================
libparted : 2.3
======================
statusbar .push( status_message) ; in show_pulsebar()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.remove_all_messages(); in hide_pulsebar()        **
statusbar .pop();                in Refresh_Visual()       ** empty pop() 
statusbar .push( String::ucomp); in Refresh_Visual()       ==
statusbar .push( status_message) ; in show_pulsebar()      == double push()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.pop();           in pulsebar_pulse()
statusbar.push( tmp_msg ); in pulsebar_pulse()
statusbar.remove_all_messages(); in hide_pulsebar()        -- Not empty! Bug?
[fedora@fedora14 gparted-0.14.1-git]$ 


From the output, it appears that the pushes and pops are matched in
the show_pulsebar(), pulsebar_pulse(), and hide_pulsebar() methods.
However there is a mismatch when the Refresh_Visual() method is tossed
into the mix.

The fact that the crash in remove_all_messages() does not appear in
newer GNU/Linux distributions suggests to me that newer versions of
the gtkmm libraries have improved edge case handling.

If we replace the remove_all_messages() method with the pop() method
then GParted should run without crashing.  However there will be a
small memory leak.  This memory leak would have been present before
this patch set.


Mike,

Have you discovered anything more on this problem?

What are your thoughts about replacing the remove_all_message() method
with the pop() method?

Comment 26 Phillip Susi 2013-02-19 20:39:18 UTC

Can you post the implementation of remove_all_messages() in the crashing version for comparison?

Comment 27 Mike Fleetwood 2013-02-19 21:31:53 UTC

It's gnome bug 640487 "crash on gtk_statusbar_remove_all()."
    https://bugzilla.gnome.org/show_bug.cgi?id=640487

The broken gtk_statusbar_remove_all() was released in GTK+ 2.22.0 on
23-Sep-2010.  (Development GTK+ release 2.21.1).  It was fixed in
GTK+ 2.24.1 released ~21-Feb-2011.
    http://upstream-tracker.org/changelogs/gtk+/2.24.1/changelog.html

Looking at a few distributions on distrowatch.com, broken versions GTK+
were included in:
    Fedora 14
    OpenSUSE 11.1
    Ubuntu 10.10
but not in any Debian or RHEL/CentOS releases.

Options:
1) Easiest would be to just switch back to using pop() and don't use
   remove_all_messages() at all.
2) Bit harder would be to write a configure test for gtkmm24 >= 2.24.1
   to define HAVE_REMOVE_ALL_MESSAGES.  If selecting this option
   recommend making a separate patch to keep an explanation in the
   history.

Comment 28 Phillip Susi 2013-02-19 21:58:58 UTC

Good find.  I'm also feeling better about just going back to pop() and making sure the push/pops are balanced.  I think the remove_all change also broke the status message about the number of pending operations, so it would be good to get that back too.

Comment 29 Curtis Gedak 2013-02-20 16:26:30 UTC

Great sleuth work Mike!  It's good to know that we are not just losing our minds.  :-)

Regarding the path forward.  My vote is for simpler code and to proceed with  implementing the pop() method.

Phillip,

If/when you fold in the pop() change, please note that I had changed the name of the GTKMM 2.22 test in configure.in from HAVE_GET_MESSAGE_AREA to HAVE_GTKMM_2_22_PLUS.  To keep the history cleaner, we should go back to the HAVE_GET_MESSAGE_AREA code.

Comment 30 Mike Fleetwood 2013-02-20 18:25:15 UTC

Hint for GIT usage:

You'll be able to get previous versions of your code out of your GIT
repository. As you've been rebasing (actually creating new commits
and moving the branch label) your psusi/refactor branch points at your
latest version, but if you can find the commit ids of where that branch
pointed you can copy the old commit over to get to an earlier version of
a patch without HAVE_GET_MESSAGE_AREA changed into HAVE_GTKMM_2_22_PLUS.
If you created other branches or tags before each rebase you can use
those. If not you can create temporary tags by using git reflog to
identify those commits.

$ git reflog show | grep 'checkout: .*refactor$'
acde1f6 HEAD@{7}: checkout: moving from master to psusi/refactor
acde1f6 HEAD@{9}: checkout: moving from erase-devel3 to psusi/refactor
acde1f6 HEAD@{56}: checkout: moving from ext2-merge-fix2 to psusi/refactor
acde1f6 HEAD@{59}: checkout: moving from ext2-merge-fix to psusi/refactor
acde1f6 HEAD@{62}: checkout: moving from master to psusi/refactor
8eab0fd HEAD@{64}: checkout: moving from master to psusi/refactor

So my repo has seen psusi/refactor branch point a 2 different commits:
acde1f6 and 8eab0fd. Crate temporary tags using:

$ git tag t1 8eab0fd
$ git tag t2 acde1f6

Now you can identify the commit ids of commits from all your different
heads that have been labelled branch psusi/refactor. Use gitk or git log.

$ git log --decorate --pretty=oneline t1
8eab0fdd44131d5c49e42b9733555ac476e28134 (tag: t1) Pass Partition instead of just its path to FileSystem::copy()
5a11e910074bf0d2b40c14687c8e139d1a3f81bb Cleanup duplicate fs code
12ae1435b1b166c4d07bdd2ad1ca96334ca04fc1 Fix dialog progress details view size
0520ca495831898e7ec5bcd2f06a5b803a79e253 Combine duplicate code for ext[234]
...

Then I would create a new branch from master and cherry-pick each
individual commit from each different copy of psusi/refactor branch
to get the wanted version of each patch.

$ git checkout master
$ git checkout -b new_refactor
$ git cherry-pick COMMITID
...

There may be more efficient ways to cherry-pick a range of commits that
I don't know of.

Also git keeps at least 2 weeks of non-reachable commits. But after that
it may start to delete them permanently.

Anyway hope this rambling was useful.

Comment 31 Mike Fleetwood 2013-02-21 00:57:36 UTC

So have done the git kung fu described in comment #30 above.

The results are available from:
    https://rockover.homeip.net/cgit/gparted/
    git clone https://rockover.homeip.net/cgit/gparted

Relevant branches:
    refactor-20130207 - How branch psusi/refactor looked on 2013-02-07
    refactor-20130214 - Dito 2013-02-14
    refactor-20130219 - Dito 2013-02-19 (the latest)
    refactor-20130221 - New branch with all the commits from
        refactor-20130219 except for "Reduce threading (#685740)" which
        was plucked from refactor-20130214.

Between "Reduce threading" patch in 0214 and 0219 you changed at least 2
things:
1) fix compile error with related to Glib::Threads reported in comment #9;
2) change configure.in to define HAVE_GTKMM_2_22_PLUS.
Therefore the code doesn't compile for me again.  Will need to apply patch
for (1) again.

(Pretty please stop squashing changes ;-)  It make stuff like this hard.
But at least it is possible with GIT).


For the record the git commands were:

git checkout master
git branch -D psusi/refactor
git pull
git checkout psusi/refactor
git reflog show | egrep 'checkout: .*refactor$'
git tag t1 8eab0fd
git tag t2 acde1f6
git tag t3 bbf09ca
git checkout master
git checkout -b refactor-20130221
git cherry-pick master..9709a523123b2e3ad8fb6cc69636212c81dc1d30
git cherry-pick 4cc9f1acb11dbb21e8cbb689b7593a87c64dd92c
git cherry-pick be446e9e340228158d8d1c19daa4b292db13cb26..t3
git branch refactor-20130207 t1
git branch refactor-20130214 t2
git branch refactor-20130219 t3
git push /PATH/TO/MY/WEB/CGIT/REPO refactor-20130207
git push /PATH/TO/MY/WEB/CGIT/REPO refactor-20130214
git push /PATH/TO/MY/WEB/CGIT/REPO refactor-20130219
git push /PATH/TO/MY/WEB/CGIT/REPO refactor-20130221

Comment 32 Mike Fleetwood 2013-02-21 00:58:59 UTC

So have done the git kung fu described in comment #30 above.

The results are available from:
    https://rockover.homeip.net/cgit/gparted/
    git clone https://rockover.homeip.net/cgit/gparted

Relevant branches:
    refactor-20130207 - How branch psusi/refactor looked on 2013-02-07
    refactor-20130214 - Dito 2013-02-14
    refactor-20130219 - Dito 2013-02-19 (the latest)
    refactor-20130221 - New branch with all the commits from
        refactor-20130219 except for "Reduce threading (#685740)" which
        was plucked from refactor-20130214.

Between "Reduce threading" patch in 0214 and 0219 you changed at least 2
things:
1) fix compile error with related to Glib::Threads reported in comment #9;
2) change configure.in to define HAVE_GTKMM_2_22_PLUS.
Therefore the code doesn't compile for me again.  Will need to apply patch
for (1) again.

(Pretty please stop squashing changes ;-)  It make stuff like this hard.
But at least it is possible with GIT).


For the record the git commands were:

git checkout master
git branch -D psusi/refactor
git pull
git checkout psusi/refactor
git reflog show | egrep 'checkout: .*refactor$'
git tag t1 8eab0fd
git tag t2 acde1f6
git tag t3 bbf09ca
git checkout master
git checkout -b refactor-20130221
git cherry-pick master..9709a523123b2e3ad8fb6cc69636212c81dc1d30
git cherry-pick 4cc9f1acb11dbb21e8cbb689b7593a87c64dd92c
git cherry-pick be446e9e340228158d8d1c19daa4b292db13cb26..t3
git branch refactor-20130207 t1
git branch refactor-20130214 t2
git branch refactor-20130219 t3
git push /PATH/TO/MY/WEB/CGIT/REPO refactor-20130207
git push /PATH/TO/MY/WEB/CGIT/REPO refactor-20130214
git push /PATH/TO/MY/WEB/CGIT/REPO refactor-20130219
git push /PATH/TO/MY/WEB/CGIT/REPO refactor-20130221

Comment 33 Phillip Susi 2013-02-21 01:46:19 UTC

I find it much simpler to just do another interactive rebase, flag the commit for editing, review the diff in git gui, find the hunks we no longer want, and unstage them, then commit, then discard the remaining uncommitted changes and continue the rebase.

Comment 34 Curtis Gedak 2013-02-21 16:56:03 UTC

MY PERSPECTIVE ON FOLDING CHANGES

I have to say that I am learning more and more about the power of git as we've been working thought this enhancement. Thank you both for sharing your expertise. :-)

Having said that, the power to rebase and fold/squash changes makes it more difficult for me to follow the progress.

I think that when it is only one person working alone, then that person is in full control of changes and knows exactly what is going on, and can rebase and fold/squash changes as desired.

However once the code is shared and others are collaborating to test and improve the code, then it becomes increasingly difficult to follow changes that have been rebased and folded/squashed into the repository.

The challenges I have faced personally are:

1) Not being sure that I have the correct up-to-date branch locally, since a simple 'git pull' does not work for me.

2) More difficulty seeing the difference between code that I previously tested, and the new code containing folded changes (and any other potential changes). This forces me to repeat more tests than I believe would be otherwise necessary.

Some of these difficulties can be attributed to my inexperience with all of the features of git, but not all of these challenges.

I believe it is in our collective best interest to make it easier to work together and I think that means that we should avoid folding/squashing changes when we collaborate on enhancements.

WHERE DOES THAT LEAVE US ON THIS ENHANCEMENT?

For some reason I have been unable to clone the git repository posted by Mike in comment #30. The error message I see is:

$ git clone http://rockover.homeip.net/cgit/gparted
Cloning into 'gparted'...
error: Failed connect to rockover.homeip.net:80; Connection timed out while accessing http://rockover.homeip.net/cgit/gparted/info/refs
fatal: HTTP request failed
$

Since multiple code branches can also make it more difficult to follow changes (at least for me ;-), and since Phillip created the initial code for this enhancement, I propose the following:

1) Phillip to update the psusi/refactor branch to a state that includes all of the improvements identified to date.

2) When Phillip pushes these updates, all future updates are tracked in individual commits and we stop folding/squashing changes when we are collaborating on code.

These are my thoughts, and I am certainly not an expert, but I do want to make it easier for us to work together.

Does this sound like a reasonable approach?
If not, please explain your position.

Comment 35 Mike Fleetwood 2013-02-21 18:52:34 UTC

Curtis,

My repo is only accessible via HTTPS not HTTP.  You may also need to set
this too:
    git config --global http.sslverify false

Comment 36 Curtis Gedak 2013-02-21 19:23:53 UTC

Hi Mike,

Thanks for the git config tip.  :-)

I had tried both the HTTPS the HTTP.  I wasn't sure how to get around the SSL invalid certificate.  I see now that I pasted the wrong git clone text into the report.  'Sorry about that.


The important issue here is to make it easy for all of us to work together.  I think that it works better when we do not fold and squash changes.  Also I found that it worked well when patch sets are posted to the bug report.

My thoughts are that having more than one place containing the patch set code tends to complicate matters.

Having said all that, I think it is important that we come to agreement on how to handle patch sets so that we can work together easily.  :-)

Comment 37 Mike Fleetwood 2013-02-21 19:42:54 UTC

As we've seen co-operative development and reviewing using branches
which are rebased has issues.  Rebasing does have its place though.

Perhaps we only need to:

1) Push/publish static branches for testing and reviewing using a
   naming convention like this:
      feature-v1
      feature-v2
   Keeping a private branch which can be rebased when needed which
   is never pushed.
      feature-dev

   Then with one extra git command create a new branch and push/publish
   that instead.
      git branch feature-v1 feature-dev
      git push GITREPO feature-v1

2) Post a short summary in the bug report listing the new URL to get
   the latest feature-v# branch from and what patches have been added,
   changed and removed.

I think that we shouldn't use the GParted GIT repo on GNOME for this.
Anyone who wants to develop this way should probably get their own copy
of the GPared GIT repo hosted at their favourite GIT hosting.
(For me this is my home machine over a broadband link).

Comment 38 Phillip Susi 2013-02-21 20:40:35 UTC

You can put off the rebase by committing the fix with the name of the original commit, prefixed with "fixup! ". That way others can still pull and review the changes. Then you can go through a few rounds of that, and only rebase at the end, and git rebase -i --autosquash will automatically move the flagged commits to be squashed. The longer you do this though, the more likely it will be that you will run into conflicts that require manual resolution during the rebase.

Adding a version number to the new branch does prevent the pull error, though it also means you have to notice the new branch and check it out instead, so I'm not sure it provides any advantage. This is closer to how most projects manage patch sets via email. Each new revision of the patch set is emailed with a cover letter explaining what revision it is, and what was fixed since last time.

It sounds like you guys mostly just got thrown by the unexpected error when you tried to pull. Once you get over the initial shock, it's just as simple as adding --rebase to the git pull command. Actually, it's good to do that regardless when your intent is to update your local branch from a remote master, rather than merge someone else's changes into yours.

In other words, if you are used to subversion, before you commit to the central server you update first to pull in any changes others have made, then you commit yours on top. This is basically what git pull --rebase does.

Why do you think we shouldn't use the gnome repository for experimental branches?

Comment 39 Phillip Susi 2013-02-22 02:10:56 UTC

Ok, I have removed the remove_all_messages changes, as well as fixed up the external commands using pipelines, and rebased on top of the new master which has a few new commits.  Hopefully this round should pass muster.  I also double checked that git pull --rebase does integrate the new rebased branch into a previously checked out one without incident.

Comment 40 Curtis Gedak 2013-02-22 17:35:08 UTC

Created attachment 237202 [details] [review]
Patch set collection 2 includes all previous changes

Thank you Phillip for updating the psusi/refactor branch will all the changes to date.


Also thank you for the reminder of the "git pull --rebase" command.  That makes it easier to pull down the folded changes.

However the problem with testing and reviewing is that folded changes are not obvious to the tester or reviewer, as opposed to adding a new commit with the changes.

This means more work for the tester or reviewer to discover the changes and to re-test or review the affected areas.


With this in mind I would like to use the current psusi/refactor base for testing and reviewing this enhancement.  If problems are found, I would like us to add a new commit that addresses the problems instead of folding the change into the existing branch.


I plan to re-test all of the functionality again using this psusi/refactor branch as encapsulated in the attached patch set collection 2.

Comment 41 Curtis Gedak 2013-02-22 21:58:52 UTC

Hi Phillip and Mike,

While testing on Ubuntu 12.04 I have run into a problem when changing
a label on a fat32 file system.  Specifically the label operation
hangs while writing the new label, and I cannot cancel the operation
using the cancel buttons.

The steps to recreate the problem are:

1)  Create a FAT32 partition with the label "Fat32-Label"

2)  Apply this operation

3)  Select the FAT32 partition and choose the menu option:
      Partition --> Label

4)  Delete the trailing "abel" from the fat32 label.
      The label will look like "FAT32-L"

5)  Click on OK

6)  Apply this operation

7)  The operation will hang at on the
    'Set Partition label to "Fat32-L" on /path-to-device' while running
    the command:
      mlabel ::Fat32-L -i /path-to-device

8)  Click Cancel, wait for the Force Cancel button, then click Force Cancel.

9)  At the dialog box, click "Cancel Operation".
    NOTE that the operation does not cancel!

Based on this testing it appears that there are two problems.

A)  The operation does not cancel using the cancel/force cancel buttons

B)  The mlabel operation hangs


I am looking further into problem (B).

Comment 42 Mike Fleetwood 2013-02-22 23:54:38 UTC

Created attachment 237220 [details] [review]
Fix shrinking reiserfs file systems

With the attached patch applied all the issues I previously reported in
comment #15 are resolved.  Specifically 1) pipe commands not working; and
2) core dump during refresh on Fedora 14.

Comment 43 Phillip Susi 2013-02-23 00:36:11 UTC

Oops, I put the quote in the wrong place.  Feel free to push that patch yourself.  Would you mind it being squashed before the branch is merged?

The label thing doesn't happen for me on Ubuntu 13.04.  It sounds like a bug in mtools.

Comment 44 Mike Fleetwood 2013-02-23 09:02:04 UTC

Fix to shrinking reiserfs is definately a squashable correction as far as
I'm concerned.  Took the oppertunity to reorder the fs size option in the
generated resize_reiserfs command too.

Comment 45 Mike Fleetwood 2013-02-23 13:54:17 UTC

On the non-cancel issue, (A) from comment #41, I can reproduce it on
my Fedora 14 box independantly of FAT32.  I changed the ext2 write
label command to make it take a very long time by adding a sleep 1000,
changing ext2::write_label() to this:
    return ! execute_command( "sh -c 'e2label " + partition .get_path() + " \"" + partition .get_label() + "\"; sleep 1000'", operationdetail ) ;

When applying I pressed [Cancel], after the 5 second count down I pressed
[Force Cancel].  The [Force Cancel] button became grayed out but the
"Applying pending operations" continued with the pulse bar bouncing back
and forth.

Comment 46 Curtis Gedak 2013-02-23 18:03:19 UTC

REISERFS SHRINK PROBLEM

From testing on fedora 14, I can confirm the problem Mike noted when shrinking reiserfs file systems.  After applying the patch from comment #42, I can also confirm that the patch fixes the problem.


FAT32 LABEL HANG PROBLEM
From further testing, the fat32 label hanging problem (B) does not occur on my physical kubuntu 12.04 install, nor on my fedora 14 VM.

Since my ubuntu 12.04 VM, where the problem arises, has begun to act strangely I will rebuild the VM and retest.

Comment 47 Phillip Susi 2013-02-23 18:24:29 UTC

Ok, I pushed a patch to fix that.  The shell ignores SIGINT, so I gave the child its own group ID and send the SIGINT to the whole group.

Comment 48 Curtis Gedak 2013-02-23 21:23:04 UTC

FAT32 LABEL HANG PROBLEM
Since rebuilding my ubuntu 12.04 VM, I have been unable to reproduce the hang that occurred when labelling a fat32 file system.  I have since tested using the following:
   kubuntu 12.04 physical installation
   ubuntu 12.04 VM
   fedora 14 VM
   Debian 6 VM
Since the hang does not occur in these above distros, I am going to asumme that this problem was dueo to some peculiar set-up or possibly a corrupt VM.  As such I do not plan to pursue this problem further.


OPERATION FAILS TO CANCEL PROBLEM
Using the sample ext2 label test noted in comment #45, and by using the updated code by Phillip mentioned in comment #47 I can confirm that the cancel problem is fixed.

The code I am referring to specifically is the commit entitled:
   fixup! Add proper cancel support (#601239) 
in the psusi/refactor branch with commit ID:
   7d6c5f4c9835760f7e3272f5d2b90ffacf4407b4

Comment 49 Curtis Gedak 2013-02-24 21:42:29 UTC

FYI:  Testing on openSUSE

While testing with virtual machines containing scratch installs of the KDE desktop version of openSUSE 11.2 and 12.2, I discovered that gparted would hang when scanning devices.

After applying all distro updates, gparted successfully ran on both openSUSE 11.2 and 12.2.

NOTE:  With openSUSE, the sudo command does not permit gparted to access
       the display.

       You can invoke gparted on openSUSE with one of the following commands:

          su -c /usr/local/sbin/gparted

       or for a graphical password prompt,

          xdg-su -c /usr/local/sbin/gparted

Comment 50 Curtis Gedak 2013-02-25 23:14:52 UTC

Created attachment 237399 [details] [review]
Patch set collection 3 includes all previous changes

Hi Phillip and Mike,

Attached is the patch set I have been using for testing.  It includes the code updates from comment #42 and comment #47.  I have encountered two issues in testing:

A)  GParted hang in scanning devices problem

On rare occasions GParted hangs in scanning devices with the pulse bar moving back and forth.  On the terminal screen from which GParted was invoked, the following message is seen:

$ sudo gparted
======================
libparted : 2.3
======================

(gpartedbin:26390): GLib-CRITICAL **: g_source_unref: assertion `source != NULL' failed

(gpartedbin:26390): GLib-CRITICAL **: g_source_unref: assertion `source != NULL' failed


When I use emacs in gud-gdb mode to attach to the process, the code is stuck waiting in Utils::execute_command on the line:

		status.cond.wait( status.mutex );


Unfortunately I have not been able to reliably reproduce the problem.  I have encountered the problem when testing in kubuntu 12.04, and openSUSE 11.2.


B)  Missing "Operation cancelled" text in gparted details.

When I cancel a copy operation, I do not see any indication in the gparted_details.htm log file that the operation was cancelled.  I thought I had seen this in my earlier testing, but with all the refactoring I can't really be sure.


I will continue testing and also try to isolate a set of steps to reproduce the GParted hang in scanning devices problem.

Comment 51 Phillip Susi 2013-02-27 01:57:29 UTC

I have identified and reported the bug in gtkmm that leads to that error message.  I'm still not sure the changes we discussed on IRC didn't fix the gparted hang though, as the gtkmm bug should just result in a leaked object reference.

You had mentioned on IRC that execute_command_eof() and execute_command_store_exit_status were being called, and so if that is the case, then pipecount should be decremented to 0 and running set to false, thus triggering the cond.signal() call to wake the backround thread in execute_command() so it can return.

Wait... maybe cond does not quite work the way I expected.  I thought that if you signal it before the other thread is waiting, the signal state would be retained and when the other thread waited, would immediately be satisfied.  IIRC, this is how the win32 equivalent worked.  Maybe it does not do this though, so the signal is lost if sent before the other thread has waited.  In that case, moving the mutex.lock() call ( only if status.foreground is false ) up to above where the PipeCapture is connected should fix it.

Comment 52 Curtis Gedak 2013-02-27 17:52:38 UTC

Created attachment 237546 [details] [review]
Patch to fix race condition when spawning commands and connecting IO stream handler

The GParted GUI would occasionally hang in scanning devices, or when
applying operations.  When this occurred, it appears that the spawned
command had completed before gparted could connect to the input/output
signal.
    
With modern GNU/Linux distributions this problem occurs very
rarely.  Fortunately the problem would regularly occur in my openSUSE
11.2 virtual machine.  This permitted Phillip and me to debug and
address the problem while chatting in IRC.

Many thanks to Phillip for co-developing this patch.

Phillip,

This patch is a candidate for folding into git.  However, before doing this I would like to complete my testing to see if any more bugs await discovery.  :-)

Comment 53 Curtis Gedak 2013-02-27 17:56:32 UTC

Created attachment 237549 [details] [review]
Patch set collection 4 includes all previous changes

I plan to use the above patch set collection 4 for my next round of testing.

Comment 54 Curtis Gedak 2013-02-27 20:03:01 UTC

Created attachment 237560 [details] [review]
Patch to add 'sh', '-c', 'command' argument vector to spawn asynchonous execution

The entries 'sh', '-c', and 'command' have been added to an argument
vector that is passed to the spawn asynchronous execute command.  By
doing this the exact same command used in an interactive shell can
also be used in the gparted source code.  Another advantage is that we
do not need to concern ourselves about additional quoting required for
wrapping a command with "sh -c".

The original reason for wrapping commands with "sh -c" was to enable shell directives like the pipe symbol '|' to work.

Comment 55 Curtis Gedak 2013-02-27 20:05:48 UTC

Created attachment 237561 [details] [review]
Patch set collection 5 includes all previous changes

Another complete patch set collection for use in testing.

Comment 56 Mike Fleetwood 2013-02-27 22:28:56 UTC

Hi Curtis,

Re: patch: Add 'sh', '-c', 'command' argument vector to spawn asynch (#685740)

Are there any other reasons this patch is needed besides what is mentioned
in the comment?

It is usually considered better if programs execute sub-programs directly,
rather than via a shell when they don't need to.  It's faster and doesn't
open up problems with how the shell interprets special characters.  So I
though it a plus when Phillip's refactor code stopped using it for almost
everything.

Thanks,
Mike

Comment 57 Curtis Gedak 2013-02-27 23:57:08 UTC

(In reply to comment #56)
> Are there any other reasons this patch is needed besides what is mentioned
> in the comment?

None that I can think of.

The resources needed by GParted will be less _without_ the patch in comment #54.  For example one less process spawned.

My original thoughts were to try to minimize the changes to the original code, but I could certainly be swayed towards GParted using less resources.

From my IRC conversation with Phillip:

<gedakc> psusi:  What are you thoughts on adding the "sh", "-c", "command"
argument vector to spawn?  I can create a patch and reverse the individual
"sh -c commands".
<psusi> if it works, sure

From this I don't think Phillip feels too strongly one way or the other.


Phillip,

Do you think we should include or exclude the patch from comment #54?


Based on Mike's statements, I plan to exclude the patch from comment #54 in my testing.

Comment 58 Phillip Susi 2013-02-28 00:16:03 UTC

Yea, I don't feel very strongly one way or the other, but I do lean slightly towards not running sh when it isn't needed simply because I prefer to not use any more resources than required, though practically speaking it isn't anything that is going to be noticed.

Comment 59 Phillip Susi 2013-02-28 01:06:00 UTC

You know, maybe we should go ahead and do the output progress parsing for this next release too?  It should only take two or three more lines of code for each of the few places that have the output.

In C I would use strrchr() to find the last newline, then sscanf to extract the information from it.  What's the right way to do that in C++ with the Glib ustring holding the output?

What utilities do we want to cover?  ntfsresize?  Are there any others?

Comment 60 Curtis Gedak 2013-02-28 02:17:51 UTC

Hi Phillip,

Thanks for the quick response regarding the patch in comment #54.  We seem to be in agreement so I will proceed with testing without that patch.


Regarding output progress parsing for this next release too:

My preference is to finish testing what we have, and then package this into a GParted release.

This patch set is a huge leap forward in terms of being able to track live command output.  Previously users had to wait until a command completed before they could see any output.  Now they can view the output as it progresses, which better permits determining if a command has failed or hung, or how much longer it might take to complete.

Part of my reasoning is that this patch set changes core command execution so I would prefer to put it out in it's own release.

Also I am currently the hold-up in other bug reports on some patches awaiting review and testing.  I have informed those developers that I would wrap up this patch set, make a release, and then I can work with them on testing and code review.

I certainly encourage improving command feedback further, but let's do that in a separate bug report and probably a different release.  :-)

Comment 61 Phillip Susi 2013-02-28 03:33:01 UTC

It just seems to me there are two possible reasons for not doing the last step this release:

1)  It will take some more time, and you want to release now
2)  It is may cause problems, so better to release what works

I think all of the problems have been in the patches so far and hopefully we have managed to iron them all out now.  The likelihood of the last step causing trouble I think is rather low.  If you can suggest how to best parse the strings I should be able to have a working patch for it tomorrow.

Whether it makes it into this release or a future one, I'd still like to go ahead and get the patch worked up.  I also have just completed one to have gparted use e2image for copy/move of ext[234].  That certainly won't be in the next release, but I'll file a separate bug report with it in case anyone wants to start testing it, but I'd like to get it parsing the output properly first since it offers not only the fraction complete, but also estimated time until done, and average throughput.

Comment 62 Curtis Gedak 2013-02-28 16:48:55 UTC

Hi Phillip,

Both of the reasons you listed apply to why I would like to release this patch set earlier rather than later. The time factor is me trying to juggle lots of other upcoming priorities.

Please note that I am not saying that we can't include improved progress output parsing in the next release. However for manageability reasons it would help to have improved output progress parsing and a corresponding patch in it's own bug report. That way if it isn't ready soon, we can easily bump it to a subsequent release.

Background Reasoning for Separate Bug Reports
---------------------------------------------
With each enhancement we find it easier to review, test, and incorporate smaller changes. Or if smaller changes are not possible, then at least changes that are all related and must be done at the same time.

This patch set includes much more than just asynchronous spawning of commands. For instance it includes changes such as:

Refactor to execute commands asynchronously (the main reason for the patch)
Improved cancel support (likely needed with this patch)
Re-work of mtools commands to no longer need mtoolsrc file (extra)
New key bindings for Insert and Ctrl+Enter (extra)
Combine duplicate code for ext2/3/4 (extra)
Remove read test simulation pass (extra)
Select unallocated partition by default (extra)
Fix dialog progress details view size (extra)

These are all good changes. However, the changes marked "(extra)" are not directly related to asynchronous spawning of commands. By having all these extra enhancements in one bug report means that it is an all or nothing approach to handling this bug report. Adding more changes will serve to further delay testing and acceptance of this bug report.

Regarding C++ String Parsing
----------------------------
Though I am certainly not an expert in C++, I have seen the following two methods for parsing string output in C++:

A) Use the Glib::Ustring methods

Glib::Ustring contains methods similar to std::string. Often I find more or better documentation for std::string.
http://www.cplusplus.com/reference/string/string/

B) Use streams to handle the string character flow

An example of this is in the old Utils::cleanup_cursor method.
http://www.cplusplus.com/reference/sstream/stringstream/

Comment 63 Curtis Gedak 2013-02-28 18:46:58 UTC

Created attachment 237644 [details]
GParted apply operations dialog regression - extra blank entries in details

This is just a heads up that I am investigating a regression wherein extra blank entries occur in the output contained in the details log.

The attached picture shows this occurring in an xfs copy operation.

Comment 64 Curtis Gedak 2013-02-28 20:00:55 UTC

Regarding the extra blank entries in details:

From looking at the code, it appears that this was likely a design decision made due to constraints imposed when updating progress details on-the-fly.

With the old code would an entry would be added if and only if the output stream or the error stream was non-empty.

With the new code, place holders are required for the output stream and the error stream, so that any new data on these streams can be immediately appended to an already existing entry for output and error log details respectively.


Based on this constraint, I can see a few possible options:

1)  Leave the code as is and live with the extra blank entries

2)  Consider combining the error and output stream
    - By doing this there would be at most one extra blank entry.
    - A disadvantage would be that the content of output and error
      streams could become entangled making the resulting combined
      text hard to read

3)  Consider removing extra blank entries after the command completes
    - An advantage is that the saved gparted_details.htm log would be
      cleaner.
    - One would need to be careful not to orphan the "Apply pending
      operations" dialog when removing an entry that is currently in
      the details view display.

4)  Some other option?


Mike and Phillip,

What are your thoughts on this regression?

Comment 65 Phillip Susi 2013-02-28 20:56:33 UTC

I don't think that's it Curtis.  I ran into this at one point earlier on and it was because the code was adding the temporary operation detail, and then later a second one was added, then the output updated the second one, and the first one was left blank.

Notice that there is no time or status either, so it's not just a lack of output from the external command.

Comment 66 Mike Fleetwood 2013-02-28 21:06:11 UTC

Re: extra blank entries

May not be relevant after Phillip's comment #65, but how about ...

4)  Lazy add entries for command output
    - Only add entries for stdout and stderr into the operation details
      tree when the first data is read from the relevant stream.  If
      there was no output there would be no entry.  Just like now.
    - Without looking at any code; just save a reference to the node in
      the operation detail tree for added when data is first read.

Comment 67 Curtis Gedak 2013-02-28 22:43:16 UTC

Thank you Phillip and Mike for your insight and suggestions.

Based on comment #65 it would seem that my first thoughts on what causes the problem might be wrong.  It looks like further investigation is required to pinpoint the exact root cause.

Comment 68 Phillip Susi 2013-03-01 02:53:57 UTC

Ok, yea, I didn't solve the problem correctly that I ran into before.  I ran into trouble because OperationDetail::get_last_child() actually returns a copy instead of the original, so to modify the original I switched to get_childs(), and mucked up the way the items are added to the tree.

I have fixed the ordering of the details being added to the tree and fixed get_last_child() to not return a temporary copy to facilitate its use for this and pushed the new patch, along with the previous fixup patches.

Comment 69 Curtis Gedak 2013-03-01 22:33:52 UTC

Hi Phillip.  I just tried out your latest patch in the psusi/refactor branch.  So far I see the same extra blank entries in the output.

From playing around with the code and not creating the OperationDetail child for the error stream, I can confirm that this removes one of the entries after executed commands.

The two entries after each command execution are simply the contents of the output and error streams respectively.  There is no time or status beside these entries in the GParted 0.14.1 release either, as was alluded to in comment #65.


If one looks at the detail log tree for a command execution, there are two entries for each command.  One to hold the output stream and one to hold the error stream.  Often times the error stream is empty.  This is where the change in behaviour occurs.


Specifically the change from (older FileSystem::execute_command code):

	if ( ! output .empty() )
		operationdetail .get_last_child() .add_child( OperationDetail( output, STATUS_NONE, FONT_ITALIC ) ) ;

	if ( ! error .empty() )
		operationdetail .get_last_child() .add_child( OperationDetail( error, STATUS_NONE, FONT_ITALIC ) ) ;

to (newer code):

	operationdetail.get_last_child().add_child(
		OperationDetail( output, STATUS_NONE, FONT_ITALIC ) );
	operationdetail.get_last_child().add_child(
		OperationDetail( error, STATUS_NONE, FONT_ITALIC ) );


The older code has the advantage that the command has completed, so the content of the output stream and error stream are known before creating the OperationDetail child(ren).

With the newer refactored code, we do not have the luxury of knowing beforehand if either the output stream or the error stream will contain data.  The code creates these children, but each child might not contain any data after the command completes.

I like Mike's suggestion in comment #66 to perform a lazy add of entries for output and error streams.  However I am unable to see my way through to implementing this idea in code.

The challenge I see is that we must create objects so that we can attach signal handlers to the object to capture potential data from the output and error streams.  This precludes only creating the objects after we have received some data on the output or error streams.

I am open to ideas on how we might tackle this challenge.

Comment 70 Phillip Susi 2013-03-02 02:13:27 UTC

Oh, I didn't even realize the lines weren't there before, I just thought I never noticed them because you have to click the expander on the command to see them, and before that last patch, instead of being children of the command, they were siblings, so no expander.

If you really think we should get rid of them, we probably could delete them after completion if they are blank, though there doesn't currently seem to be a method to remove them.  I'm not really sure there's any harm in having them there.

Comment 71 Curtis Gedak 2013-03-02 17:51:47 UTC

I agree Phillip that there is no real harm in having the extra blank entries in the details log.  My concern is more that this is a change from previous behaviour.  From an aesthetic viewpoint it would be nicer to not have these empty entries.

Having said that I highly value the live feedback, so I don't think we would hold back this enhancement due to the extra blank entries.

My testing continues....   I will post the results when completed.

Comment 72 Curtis Gedak 2013-03-02 18:07:23 UTC

Created attachment 237826 [details]
GParted apply operations dialog regression - Last copy block sometimes missed in details

Hi Phillip,

Often when I am testing a copy that uses the updated internal copy algorighm (such as copying ext2), the copy of the last block is missed in the details log.

For example in the above screen shot, the log says:

   424.00 MiB of 432.00

Comment 73 Phillip Susi 2013-03-03 02:45:59 UTC

Fixed it and pushed patch.

We're getting quite a few fixup patches now, are you sure you don't want to rebase for what will hopefully be the final round of testing?

Comment 74 Curtis Gedak 2013-03-03 16:34:19 UTC

Thank you Phillip for the patch.  I will test this as I continue through my current round of testing.

Since it takes me a few days to get through all these tests, I would prefer if you hold off on rebasing until the tests are completed.

My plan is to create a distribution tarball of the code first.  After testing is completed and successful, I will post the results here.  At that point you can perform the rebase.

Then I will create a distribution tarball of the rebased code, and then perform a directory compare of all the code.  If there are no changes, or the changes are easily understood, then I will not do another complete round of testing.

Comment 75 Curtis Gedak 2013-03-04 18:31:53 UTC

Created attachment 238029 [details]
gparted_details.htm log file of move BTRFS with Cancel, Force Cancel, and Cancel Operation pressed - still rolls back file system.

Comment 76 Curtis Gedak 2013-03-04 19:55:12 UTC

Created attachment 238042 [details]
gparted_details.htm log file of move BTRFS with Cancel, wail then Force Cancel, and Cancel Operation press in copy rollback FS - second cancel works..

Some more updates on this problem of force cancelling a move operation.

In comment #75, I pressed Cancel, then immediately pressed Force Cancel and Cancel Operation as soon as these buttons showed up.  In that time, my details log had not yet started to rollback the file system.  (My system is slow and seems to take 5 seconds to commit changes to this old 160 GB IDE drive).


With this latest attachment, I pressed Cancel, then waited for the details log to be in the copy section of rolling back the file system.  At this point I pressed Cancel and Cancel Operation, and this successfully cancelled the rollback file system transaction.

There seems to be some inconsistency in the timing of pressing Force Cancel and Cancel Operation, and what happens as a consequence.

Comment 77 Curtis Gedak 2013-03-04 21:18:59 UTC

Hi Phillip,

There appears to be another regression in this refactored code.

In old GParted, if the mount point for a partition is in /etc/fstab then GParted would permit the user to both unmount and mount the partition using the "Partition --> Unmount" and "Partition --> Mount on --> /mount-point-path" menu options.  Being in /etc/fstab essentially means that the mount point is known.

With the refactored code I can successfully unmount such a partition.  However, the option to mount the partition is greyed out.

Comment 78 Phillip Susi 2013-03-04 23:54:00 UTC

I can't reproduce the unmount problem, but I pushed another fixup to make sure that the cancel is propagated to sub operations that do not yet exist.

Comment 79 Curtis Gedak 2013-03-05 17:09:11 UTC

Created attachment 238146 [details]
gparted_details.htm log file of move BTRFS with Cancel, Force Cancel, and Cancel Operation pressed still rolls back file system.

Hi Phillip,

Thank you for the latest patch.

In this patch I noticed that a "cancelflag" is declared as "char", but later in the code there are integers stored in this variable.

Was this intentional?


From my testing of this latest patch, the problem remains of the file system still proceeding to roll back, even though Cancel is pressed, then Force Cancel, and Cancel Operation are pressed immediately when available.

In the attached log file, it is interesting to note that the text "Operation Canceled" appears in both the output and error entries of the details log.  Prior to this patch the "Operation Canceled" text appeared only in the output entry of the details log (see attachment in comment #75).


RE-MOUNT PROBLEM:

With the latest psusi/refactor branch code I am able to unmount a file system in /etc/fstab, but unable to mount the file system again using GParted (the mount point is not displayed in the GUI).

To help trouble shoot this problem I checked out the HEAD of the master branch and tested.  In this case I can both unmount and mount the file system.  This leads me to believe that there is some change in the psusi/refactor branch that is affecting this behaviour.

I will continue to investigate this problem further.

Comment 80 Phillip Susi 2013-03-05 19:02:28 UTC

Yes, it was intentional.  Old habbit -- I always use the smallest type I can to hold the values needed.

I noticed those duplicate canceled lines you have are children of "---" which indicates an invalid reference was made to an OperationDetail and that one was spawned to fill the void.

There shouldn't be anything at all even remotely connected to the mount point in these changes.  When I tested it I used my fat32 efi system partition.  I don't suppose it could be related to the fs type you are using?

Comment 81 Curtis Gedak 2013-03-06 00:36:02 UTC

Created attachment 238175 [details] [review]
Patch to fix mount problem of known mount points - root cause was 80 char wrap in command output.

The root of the problem was that the output from executed commands was
being wrapped at 80 characters.  Since we are not in control of the
output of commands or the contents of files, this can cause grief when
the output is expected to be all on one line.  Specifically, each line
of output from blkid that was greater than 80 characters was being
split into another line.  This resulted in parsing problems in the
FSInfo class.
    
A similar situation could occur when using single line comment
characters and having these split into multiple lines.

Comment 82 Curtis Gedak 2013-03-06 02:14:12 UTC

Created attachment 238178 [details]
gparted_details.htm log file of move BTRFS with Cancel, Force Cancel, and Cancel Operation pressed - correctly skips file system roll back.

Hi Phillip,

Using the patch you recently committed to the psusi/refactor branch I have successfully tested the Cancel - Force Cancel - Cancel Operation situation.

As can be seen in the attached details log file, GParted now correctly skips rolling back the file system if Force Cancel + Cancel Operation have been pressed.

I just have a little more testing to do before I will post a summary of my results.


The relevant commit used in this recent successful test is the following one in the psusi/refactor branch:

commit 70190d03a032e7665bce6ba2bb303bfe7975a403
Author: Phillip Susi <psusi@ubuntu.com>
Date:   Tue Mar 5 19:52:05 2013 -0500

    fixup! Thread the internal copy algorithm (#685740)
    
    Fix the error message to show up in the right place in the tree, and only
    once, and fix the cancel signal propagating to future details.

Comment 83 Curtis Gedak 2013-03-06 16:51:13 UTC

Created attachment 238213 [details] [review]
Patch set collection 6 includes all previous changes

I have finished my second round of testing, and I have good news.  All
of my tests now succeed.  :-)

This round of testing has been fairly thorough, but then this
enhancement changed the fundamental command execution code, so I think
we needed to be fairly thorough with our testing.

For testing I used patch set collection 6 listed above.  This contains
the psusi/refactor branch as of March 5, 2013 plus the patch in
comment #81.


Phillip,

Please feel free to include the patch from comment #6, and rebase the
psusi/refactor branch.

Also I noticed that the "Force Cancel" text does not appear to be
translatable.  It needs to be enclosed in _("xxx"), and strings with
parameters should have a note to translators on what the string will
look like when filled in.

Would you please update this string in the appropriate places so that
other language translations will be supported?

When you are finished rebasing, please let me know and then I can see
about a few quick tests and then assuming no problems I will commit to
the master branch.

Following that I will prepare an announcement of the next release of
GParted in about a weeks time to give the translators time to do their
work.


Mike,

If you think of anything else we need to check or do before creating a
new release of GParted, then please let us know.

Regards,
Curtis


TESTING RESULTS:

The majority of the following tests were performed on Kubuntu 12.04.


STATUS   DESCRIPTION                           RESULTS
======   ===================================   =======================
pass     Ped_exception_handler - grow gpt      Prompted to fix GPT

pass     Pulsebar scanning devices             Bounces back and forth
pass     Pulsebar applying operations          Bounces back and forth

pass     Copy ext2, linux-swap, ntfs, xfs      All FS copies valid

pass     Move btrfs, ext3, linux-swap, ntfs    All FS moves valid

pass     Copy ext2: 512 to 4096 bytes/sector   FS valid

pass     Copy ext2 with grow on destination    FS copy grew and valid

pass     Label FAT16                           Set and clear works
pass     Label FAT32                           Set and clear works

pass     Cancel copy ext2, ntfs                Orig okay, copy bad
pass     Cancel move ext3                      Orig restored okay
pass     Cancel mkfs ext4                      New FS bad - expected
pass     Cancel with multiple actions          Future actions cancelled
pass     Force Cancel copy                     Copy halts, FS bad - expected
pass     Force Cancel move                     Rollback stop - FS bad - expected

pass     Copy ntfs progress live updates       Live % complete shown
pass     Copy ext3 progress live updates       Live Mib copied shown

STATUS   DESCRIPTION                           RESULTS
======   ===================================   =======================
pass     Save gparted details                  Steps saved

pass     Unmount/mount partition               Partition unmounted, re-mounted
pass     Toggle swapon/swapoff                 Swap enabled/disabled
pass     De/activate LVM2 PV                   LV de/activated

pass     Mkfs and copy safe to cancel          No user data lost on cancel

pass     Unallocated space default selected    True if unallocated available
pass     INSert is Partition --> New           Works if unallocated selected
pass     CTRL+Return is Apply all operations   Works

pass     Remove read-test on move              True.  Cancel rolls back okay.
pass     Check FS after reverting Part Tbl     True.  Cancel rolls back okay.

pass     Operation time values valid           True.  Values valid.

pass     Combined code works for ext2/3/4      Works.

pass     Dialog progress details grow size     Can see status column

pass     All features still available          Yes, see View --> FS support

STATUS   DESCRIPTION                           RESULTS
======   ===================================   =======================
pass     Grow reiserfs, ext4, fat32            Grow works, FS okay
pass     Shrink reiserfs, ext4, fat32          Shrink works, FS okay

pass     Label reiserfs, fat16, ext3, btrfs    Label changed
pass     New UUID reiserfs, fat16, ext3, ntfs  New UUID set

pass     Recover lost file systems             NTFS file system found

pass     Compiles/Runs on supported distros?
           pass                 Debian 6       can grow/shrink ntfs
           pass                 Ubuntu 10.04   Can de/activate lvm2 vg
           pass                Kubuntu 12.04   Main testing platform (above)
           pass                 Ubuntu 12.10   Can label btrfs
           pass                 Fedora 16      Can grow/shrink reiserfs
           pass                 Fedora 18      Can grow/shrink fat32
           pass               openSUSE 11.2    Can grow/shrink ext4
           pass               openSUSE 12.2    Can grow/shrink btrfs

Comment 84 Curtis Gedak 2013-03-06 20:20:33 UTC

Hi Phillip,

When refactoring would you please reword the commit message on the following commit:

    commit dd35ae9cca64ff01851f9ce3a71be6da19607cef
    Author: Phillip Susi <psusi@ubuntu.com>
    Date:   Wed Jan 30 16:23:48 2013 -0500

        Fix dialog progress details view size

so that it appends the bug number to the title (#602635), and the bug description in the message.

    Bug 662722 - Increase default width of "applying..." dialog to include
                 the "Details" status icons

Comment 85 Curtis Gedak 2013-03-06 20:27:29 UTC

Hi Phillip,

It looks like there are two bug reports related to the dialog progress details view size:

Bug 602635 - list of tasks in apply dialog does not expand to the available
             vertical space

Bug 662722 - Increase default width of "applying..." dialog to include the
             "Details" status icons


This enhancement now fills available vertical space and increases the width.

Comment 86 Phillip Susi 2013-03-08 02:42:05 UTC

Squashed, reworded, and rebased onto current master.  If it looks good, you can cleanly pull it into master.

Comment 87 Mike Fleetwood 2013-03-09 17:24:56 UTC

I noticed that before this patch set when cancelling an operation you
get a dialog like this:
    (?) Are you sure you want to cancel the current operation?
    Cancelling an operation may cause SEVERE file system damage.
                      [ Continue Operation ] [ Cancel Operation ]

The Manual also has a section "Applying All Operations", 2. Click Apply,
which mentions this dialog.

This patch set removes this dialog.  Was that intentional?  Does the 
Manual need updating to reflect the removal of the dialog and the 
changing of the [ Cancel ]  button to [ Force Cancel (5) ] and what it
means?

Comment 88 Phillip Susi 2013-03-09 17:41:39 UTC

The dialog wasn't removed; it just now only appears when you click Force Cancel rather than the initial cancel, which is safe.

The manual probably should be updated to mention this.

Comment 89 Curtis Gedak 2013-03-11 15:04:04 UTC

Hi Mike and Phillip,

I agree that we should update the manual to reflect this change in behaviour.  However, I was thinking of making this change after we release this enhancement.  My reasoning is as follows:

1)  Many manual changes will be required _after_ this release.

    The following bug report contains many enhancements and updates to
    the manual:

    Bug #691681 - Improve Partition Info dialog to be Partition
                  properties dialog

    I was planning on adding the "Force Cancel" updates to the manual
    along with all of the updates needed for this bug report.

    The following bug report also includes some minor manual updates:

    Bug #688882 - Improve clearing of file system signatures

2)  Save translation team time by grouping manual updates together

    I have noticed that the translation teams often take longer to
    update the manual.  As such I thought it would be better to group
    the many manual changes together.  This should save the translation
    teams some time by requiring only one pass of updates, rather than
    multiple passes.

I know that technically this "Force Cancel" update should be in this release of the GParted Manual, so if you disagree with my reasoning then please let me know.

In the meantime I will review the changes with an eye towards committing this enhancement to the master branch.

Comment 90 Curtis Gedak 2013-03-11 15:48:48 UTC

Hi Phillip,

It looks like we missed including the patch to remove the 80 char line wrapping from the psusi/refactor branch.

See comment #81.

Ah, I see that I marked the patch as obsolete, so this is my fault that it was missed.

Would you like to fold the patch from comment #81 into the psusi/refactor branch?

Otherwise I can apply the patch after all the psusi/refactor patches.

Comment 91 Mike Fleetwood 2013-03-11 20:51:28 UTC

Re: Consolidating GParted Manual updates

Reasoning is good with me.

Comment 92 Phillip Susi 2013-03-11 23:14:11 UTC

Pushed new branch with the 80 character line wrap removed.

Comment 93 Curtis Gedak 2013-03-12 00:55:03 UTC

The patch set contained in the psusi/refactor branch has been committed to the master branch for inclusion in the next release of GParted (0.15.0).

The relevant git commits can be viewed at the following links:

Remove gdk_threads_enter/exit (#685740)
https://git.gnome.org/browse/gparted/commit/?id=f5a5c9cdb9d24c930871060bceade9cfe0e69053

Use a full fledged nested main loop while waiting and pulsing progress bars (#685740)
https://git.gnome.org/browse/gparted/commit/?id=124342e9791f995a6cd6f7bc09405a859054d6dd

Switch Dialog_Progress to use Glib thread instead of pthread (#601239)
https://git.gnome.org/browse/gparted/commit/?id=ddd92cf86a12611c58c5c911d301dabcac686e05

Remove mtoolsrc file
https://git.gnome.org/browse/gparted/commit/?id=2706f0174aeca1fdad08669c67fab5ac75ebfd5f

Reduce threading (#685740)
https://git.gnome.org/browse/gparted/commit/?id=52a2a9b00a32996921ace055e71d0e09fb33c5fe

Thread the internal copy algorithm (#685740)
https://git.gnome.org/browse/gparted/commit/?id=bd9e16f22fe3dfae73064369aadeda5de10b61be

Add proper cancel support (#601239)
https://git.gnome.org/browse/gparted/commit/?id=89de9a50269c2f92208a549f0e5feb6127ca7d30

Flag mkfs and copy operations as safe to cancel (#601239)
https://git.gnome.org/browse/gparted/commit/?id=723209e59f28ad6bd6cc1b8126180cc758d1d62d

Bind Insert key to new partition
https://git.gnome.org/browse/gparted/commit/?id=9ad53d94a01a58b6b68b3960a1b8a26fc2e8bb4f

Select unallocated partition by default (#667365)
https://git.gnome.org/browse/gparted/commit/?id=3dd769d955764455ade6ddee746a37657bff8f4a

Bind ctrl-enter to apply-operations
https://git.gnome.org/browse/gparted/commit/?id=2a2d21c231bc4344cada71f30a387e07b35e019d

Remove simulation pass ( read test ) on move
https://git.gnome.org/browse/gparted/commit/?id=b9b4b2e55d1dadd5a5c0fef552dbbd2499561182

Check fs after reverting partition table
https://git.gnome.org/browse/gparted/commit/?id=d62bf0a0f324d206339a7bae3c7736446109ac32

Don't revert more than needed
https://git.gnome.org/browse/gparted/commit/?id=b1ede01506b4c057d7d408ce150ccc1b0437678b

Clean up OperationDetail timer initialization
https://git.gnome.org/browse/gparted/commit/?id=7af47136ffe59d56abf0ee4979c1672fe3f10033

Combine duplicate code for ext[234]
https://git.gnome.org/browse/gparted/commit/?id=38dc55d49c062cb8e93c797f9bf166d3abc3456e

Fix dialog progress details view size (#602635) (#662722)
https://git.gnome.org/browse/gparted/commit/?id=4c249b4d65bf35cc03e9897a2bf66ab0f33bfc39

Cleanup duplicate fs code
https://git.gnome.org/browse/gparted/commit/?id=e4210ba08d42a5074033aedd6a91e9967a16fc8f

Pass Partition instead of just its path to FileSystem::copy()
https://git.gnome.org/browse/gparted/commit/?id=a92380b5034f16627522e8cc399dceda608249be

Comment 94 Curtis Gedak 2013-03-12 01:42:21 UTC

Two more commits that are relevant to this enhancement are:

Update translation file list in POTFILES.in
https://git.gnome.org/browse/gparted/commit/?id=53ad363a4ff3f2ada71c57648be3e6782adb7890

Mark string "Force Cancel" for translation
https://git.gnome.org/browse/gparted/commit/?id=6ea69b0ff8e12dad826940c0822f0786a8be9159

Comment 95 Mike Fleetwood 2013-03-15 22:14:31 UTC

Hi,

Select unallocated partition by default (#667365)
https://git.gnome.org/browse/gparted/commit/?id=3dd769d955764455ade6ddee746a37657bff8f4a

I've noticed that this patch caused the main window to flash as it's
redrawn twice when Device Refresh completes.

Comment 96 Mike Fleetwood 2013-03-16 09:16:41 UTC

Re: Free space selection and refresh flicker

Further testing:

T1) Empty disk, only free space.
    [PASS ] Free space selected.
    [ISSUE] Double refresh flicker.

T2) 1 primary partition filling the disk.
    [PASS ] No partition selected as no free space.
    [PASS ] No double refresh flicker.

T3) 1 primary partition at the start of the disk with free space at the
    end.
    [PASS ] Free space selected.
    [ISSUE] Double refresh flicker.

T4) Only 1 extended partition filling the disk.
    [FAIL ] Free space in extended partition not selected.
    [PASS ] No double refresh flicker.

T5) Only 1 extended partition at the start of the disk with more free
    [FAIL ] Selected smaller free space outside extended partition.
    [ISSUE] Double refresh flicker.


It looks like that:

1) It only considers the largest primary partitions when looking for
   free space to select.

2) Whenever free space is selected there is a double refresh flicker.

Comment 97 Mike Fleetwood 2013-03-16 09:20:05 UTC

Complete (T5) test case without truncating the description.

T5) Only 1 extended partition at the start of the disk with more free
    inside the extended partition than outside.
    [FAIL ] Selected smaller free space outside extended partition.
    [ISSUE] Double refresh flicker.

Comment 98 Curtis Gedak 2013-03-16 15:36:29 UTC

Thank you Mike for discovering and reporting the problem with "select unallocated partition by default".  I can confirm problem (T5) as noted in comment #97.  Unfortunately my eyes haven't quite detected the flicker problem, but that does not mean that the problem does not exist.


Mike and Phillip,

If we are unable to resolve (including testing) this problem before our planned release on March 19th then I think we should revert this change.

My reasoning is that this enhancement makes a change to the default behaviour of GParted, but has some rough edges to be ironed out before being included in a stable release.  Also this enhancement is not a requirement for the other enhancements in the refactor patch set so I do not think we should hold up the other improvements planned for the release.

What are your thoughts?

Comment 99 Mike Fleetwood 2013-03-16 16:31:03 UTC

I agree with your logic.
(If it was only the flicker I would probably release GParted as it is,
but with it also not selecting free space in extended partitions I think
it needs fixing or reverting).

For the flicker from double refresh, I observed this on Fedora 14, 17
and Debian 6.  You might be blessed with a machine that is too fast ;-)
On my fast desktop it is almost unnoticeable too.  Try running GParted on
a remote machine, thus displaying X over the network.  It slows down the
refresh making it easier to see.

Comment 100 Phillip Susi 2013-03-17 02:18:57 UTC

Created attachment 239050 [details] [review]
0001-Consider-logical-partitions-when-selecting-default-f.patch

This should fix it the logical part.  I can't duplicate the flicker even when single stepping through the code in the debugger.

Comment 101 Mike Fleetwood 2013-03-17 13:28:45 UTC

The lastest patch (from comment #100) fails multiple tests.

Using tests from comment #96 above:
T1) [PASS]
T2) [PASS]
T3) [FAIL] No free space selected.
T4) [FAIL]
T5) [FAIL] Selects extended partition not the free space itself.

Additional test:

T6) 8G drive [1: Prim 2G, 2: Ext 4G [5: Log 3G, 1G free], 2G free]
    [FAIL] Sometimes selects partition 5, the largest extended.
           Sometimes selects none.
           Possibly related to whether: drive change, apply operation,
           or [Ctrl-R] is causing the refresh.

Comment 102 Mike Fleetwood 2013-03-17 13:30:10 UTC

Created attachment 239063 [details] [review]
Select largest free partition (v3)

Here's a patch (set) which works for me.  Passes all six tests.
Reverted the currently applied patch first so there's a clean sheet to
review my change against.
NB: The added trailing white space is from the revert.

This still gives me double refresh flicker when it selects the largest
free partition though.  I don't know I have free time to look into this
before GParted's release on Tuesday.

Thanks,
Mike

Comment 103 Curtis Gedak 2013-03-17 17:04:36 UTC

PATCH FROM comment #100
-----------------------

My testing of the patch in comment #100 reveals some problems.

a)  On my completely full disk with many partitions, the patch incorrectly selects the largest logical partition (not unallocated space).

b)  On a disk with one 40 GiB primary partition and 109 GiB unallocated space, the patch does not select the unallocated space.

As such I do not recommend that we apply this patch.


PATCH FROM comment #102
-----------------------

My testing of the patch in comment #102 shows that it resolves the problem of selecting the largest unallocated space regardless of whether inside or outside of an extended partition.

In the case where two unallocated partitions are identical in size, then this patch properly selects one.

In the situation where there is no unallocated space, no partition is selected.

Based on my testing, I am ready to commit the patch set in comment #102 to master to be included in the GParted release on Tuesday.


Phillip,

Have you had a chance to test the patch in comment #102?

Comment 104 Phillip Susi 2013-03-17 21:33:15 UTC

Created attachment 239076 [details] [review]
0001-Consider-logical-partitions-when-selecting-default-f.patch

It was working on startup, or if you clicked one partition, then manually refreshed, but if you began with an empty disk, then created the partitions, it remembered the whole disk empty partition and tried to use that, even though it wasn't there anymore.  I just needed a selected_partition.Reset().

Comment 105 Mike Fleetwood 2013-03-18 14:36:56 UTC

Tested patch from comment #104.

(T1) to (T5) [PASS]
(T6) [FAIL] Selects largest extended partition even when it's not free.

Comment 106 Phillip Susi 2013-03-18 15:07:24 UTC

Woops... Mike's patch looks pretty similar only without forgetting to check the extended is free, so if you've already tested it and it looks good, then go ahead and apply it.

Comment 107 Curtis Gedak 2013-03-18 16:24:39 UTC

Thank you Mike and Phillip for working to resolve issues with the enhancement to select the largest unallocated space by default.

I have committed the patch from comment #102.  This leaves the minor problem of double refresh/flicker to be resolved in a later release of GParted.

The relevant git commits can be viewed at the following links:

Revert "Select unallocated partition by default (#667365)"
https://git.gnome.org/browse/gparted/commit/?id=92f4947618ebaa72ea9345d86859ca069efd4eee

Select largest unallocated partition by default (#667365)
https://git.gnome.org/browse/gparted/commit/?id=5b53c12f6ee12312a9bacb44b8b05b12a540d3d6

Comment 108 Curtis Gedak 2013-03-19 16:54:33 UTC

The code change to address this report has been included in GParted 0.15.0 released on March 19, 2013.