Bug 777973 – Segmentation fault on bad disk

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 777973 - Segmentation fault on bad disk


Summary:	Segmentation fault on bad disk


Status:	RESOLVED FIXED

Product:	gparted
Classification:	Other
Component:	application
Version:	0.27.0
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Mike Fleetwood
QA Contact:	gparted maintainers alias

URL:
Whiteboard:

Duplicates:	780104 780354 (view as bug list)
Depends on:
Blocks:

Reported:	2017-01-31 09:15 UTC by mark.vanrossum
Modified:	2017-08-07 17:20 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
strace (43.72 KB, text/plain) 2017-01-31 09:15 UTC, mark.vanrossum		Details
backtrace wo. debuginfo (4.52 KB, text/plain) 2017-02-01 09:29 UTC, mark.vanrossum		Details
backtrace w debuginfo (6.52 KB, text/x-log) 2017-02-01 09:29 UTC, mark.vanrossum		Details
fsck.out (62.40 KB, application/x-compressed-tar) 2017-02-01 11:31 UTC, mark.vanrossum		Details
Prevent core dump reading non UTF-8 data from external command (draft 1) (5.68 KB, patch) 2017-03-05 11:45 UTC, Mike Fleetwood	none	Details \| Review
Prevent core dump reading non UTF-8 data from external command (draft 2) (33.03 KB, patch) 2017-04-20 17:12 UTC, Mike Fleetwood	none	Details \| Review
Prevent core dump reading non UTF-8 data from external command (v1) (68.24 KB, patch) 2017-05-26 20:34 UTC, Mike Fleetwood	none	Details \| Review

Description mark.vanrossum 2017-01-31 09:15:24 UTC

Created attachment 344634 [details]
strace

Hi 
I'm using gparted-0.27.0-1.fc25.x86_64
on a  mounted fat16 partition via USB (phone microsd)

This leads to a crash:
/usr/sbin/gparted: line 193: 19962 Segmentation fault      (core dumped) $BASE_CMD

I put the strace below.
Both gparted and gparted /dev/sde crash 
gparted /dev/sda is fine.

parted /dev/sde works fine and reports: 

Model: Nokia Nokia 301 (scsi)
Disk /dev/sde: 7961MB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type     File system  Flags
 1      1560MB  4140MB  2579MB  primary  fat16


gparted output:

Created symlink /run/systemd/system/-.mount → /dev/null.
Created symlink /run/systemd/system/boot.mount → /dev/null.
Created symlink /run/systemd/system/home.mount → /dev/null.
Created symlink /run/systemd/system/proc-fs-nfsd.mount → /dev/null.
Created symlink /run/systemd/system/run-user-1000.mount → /dev/null.
Created symlink /run/systemd/system/tmp.mount → /dev/null.
Created symlink /run/systemd/system/var-lib-machines.mount → /dev/null.
Created symlink /run/systemd/system/var-lib-nfs-rpc_pipefs.mount → /dev/null.
======================
libparted : 3.2
======================
/usr/sbin/gparted: line 193: 19962 Segmentation fault      (core dumped) $BASE_CMD
Removed /run/systemd/system/-.mount.
Removed /run/systemd/system/boot.mount.
Removed /run/systemd/system/home.mount.
Removed /run/systemd/system/proc-fs-nfsd.mount.
Removed /run/systemd/system/run-user-1000.mount.
Removed /run/systemd/system/tmp.mount.
Removed /run/systemd/system/var-lib-machines.mount.
Removed /run/systemd/system/var-lib-nfs-rpc_pipefs.mount.

Comment 1 Mike Fleetwood 2017-01-31 12:37:41 UTC

/usr/sbin/gparted is actually a shell script which runs
/usr/sbin/gpartedbin, the binary.  I assume that it is gpartedbin which
is core dumping.

Please follow the instructions below to produce a core dump and back
trace, running 'gparted /dev/sde'.

Thanks,
Mike


How to capture a backtrace from a core dump
-------------------------------------------

1)  Turn off any OS core dump capturing, ensuring:
        cat /proc/sys/kernel/core_pattern
    reports just "core"

    Some methods to do this, depending on distro version, are:
    *   service abrt-ccpp stop      (RedHat/Fedora pre systemd)
    *   systemctl stop abrtd        (RedHat/Fedora with systemd)
    *   sudo service apport stop    (Ubuntu)

2)  Increase core dump limit and run GParted as root
    Either:
        ulimit -c unlimited
        sudo gparted
    or:
        su - root
        ulimit -c unlimited
        gparted

3)  Perform crashing action

4)  Capture backtrace

        ls -lrt core*
        which gpartedbin
        gdb `which gpartedbin` {COREFILE} --batch --quiet \
            -ex 'info threads' -ex 'thread apply all backtrace' \
            -ex quit > backtrace.log

Please paste the terminal output when running gparted and the contents
of the backtrace.log file.

Comment 2 mark.vanrossum 2017-02-01 09:28:33 UTC

Here are the backtraces both before and after
debuginfo-install gparted

Comment 3 mark.vanrossum 2017-02-01 09:29:29 UTC

Created attachment 344692 [details]
backtrace wo. debuginfo

Comment 4 mark.vanrossum 2017-02-01 09:29:54 UTC

Created attachment 344693 [details]
backtrace w debuginfo

Comment 5 mark.vanrossum 2017-02-01 09:33:22 UTC

PS Small addition to your otherwise excellent instructions.

In Fedora core dumps are saved by systemd in /var/lib/systemd/coredump/
even if abrtd is disabled.
See

man  systemd-coredump

Comment 6 Mike Fleetwood 2017-02-01 10:35:15 UTC

GParted is core dumping below
GParted::PipeCapture::OnReadable(Glib::IOCondition) when it is running
the command:
    fsck.fat -n -v /dev/sde1
to read the space usage of the file system.

+ Trace 237103

Thread 1 (Thread 0x7f6abfd59f40 (LWP 26715))

#0 Glib::ustring_Iterator<__gnu_cxx::__normal_iterator<char*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::operator++()
at /usr/include/glibmm-2.4/glibmm/ustring.h line 957
#1 Glib::ustring_Iterator<__gnu_cxx::__normal_iterator<char*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::operator++(int)
at /usr/include/glibmm-2.4/glibmm/ustring.h line 965
#2 GParted::PipeCapture::OnReadable(Glib::IOCondition)
at PipeCapture.cc line 63
#3 GParted::PipeCapture::_OnReadable(_GIOChannel*, GIOCondition, void*)
at PipeCapture.cc line 45

Comment 7 Mike Fleetwood 2017-02-01 10:47:43 UTC

Hi,

You say the disk is bad.  Do you just mean GParted crashes when trying
to read it, or is there some way you know it is bad?  If so what is
wrong with it?

Please run the following commands as root and copy the terminal output:
    blkid /dev/sde1
    fsck.fat -n -v /dev/sde1

Thanks,
Mike

Comment 8 mark.vanrossum 2017-02-01 10:52:48 UTC

blkid /dev/sde1
/dev/sde1: SEC_TYPE="msdos" UUID="0000-8E65" TYPE="vfat" PARTUUID="7f7c1da5-01"

    

 fsck.fat -n -v /dev/sde1
fsck.fat 4.1 (2017-01-24)
Checking we can access the last sector of the filesystem
0x25: Dirty bit is set. Fs was not properly unmounted and some data may be corrupt.
 Automatically removing dirty bit.
Boot sector contents:
System ID "mkfs.fat"
Media byte 0xf8 (hard disk)
       512 bytes per logical sector
     65536 bytes per cluster
         1 reserved sector
First FAT starts at byte 512 (sector 1)
         2 FATs, 16 bit entries
    131072 bytes per FAT (= 256 sectors)
Root directory starts at byte 262656 (sector 513)
      2048 root directory entries
Data area starts at byte 328192 (sector 641)
     39354 data clusters (2579103744 bytes)
16 sectors/track, 4 heads
   3047424 hidden sectors
   5038080 sectors total
FATs differ but appear to be intact. Using first FAT.
/LOST.DIR/D3"
  Bad short file name (D3").
  Auto-renaming it.
  Renamed to FSCK0000.000

which goes on and on, printing many non ASCII characters in the bad names.

Comment 9 Mike Fleetwood 2017-02-01 11:22:11 UTC

My theory is that GParted is choking on the output from fsck.fat.
Please provide that output so I can inject it into GParted to try to
reproduce the fault.

    fsck.fat -n -v /dev/sde1 1> fsck.out 2> fsck.err
    echo $?
    tar czf fsck.tgz fsck.out fsck.err

Please report exit status of fsck.fat and upload fsck.tgz.

Thanks,
Mike

Comment 10 mark.vanrossum 2017-02-01 11:30:57 UTC

Seems like a reasonable hyypothesis.

See attached. Note I had to interrupt fsck as it keeps going forever.
Therefor fsck.err is empty.

Comment 11 mark.vanrossum 2017-02-01 11:31:38 UTC

Created attachment 344696 [details]
fsck.out

Comment 12 Mike Fleetwood 2017-02-02 14:56:07 UTC

I have reproduced the core dump in GParted by making fsck.fat output the
same captured output.

@mvanross
The file names of all the files appears to be random binary data and
fsck.fat is reporting every file as corrupt.  Do you know why that might
be?

Comment 13 Mike Fleetwood 2017-02-07 19:11:04 UTC

@manross
1) Please provide complete output from fsck, and the exist status, as
   described in comment #9 above.  This is so that I can be confident
   that any fix I produce will work.  It may take a while to run as it
   is reporting errors for every file on the file system.  However it
   will get to the end.
2) The directory portion of all file names are ASCII text while the file
   name portion appears to be binary data.  Do you know why that might
   be?  Known that might help guide the direction of the fix.

Thanks,
Mike

Comment 14 mark.vanrossum 2017-02-08 10:36:35 UTC

Hi

I now accessed this microSD card via my card reader. 
I ran 
fsck.fat -n -v /dev/mmcblk0p1 1> fsck.out 2> fsck.err

It took about 5hrs, but now gparted does not crash on it anymore.

echo $?
130

zip log files attached.

I don't know why the file names are binary. I would not be surprised if the card is basically completely rubish.

Note, I'm not necessarily looking into retrieving the data of card, 
but my bug report is that gparted should not dump core....

Comment 15 mark.vanrossum 2017-02-08 10:39:49 UTC

Oops, too large to attach.

It is at

homepages.inf.ed.ac.uk/mvanross/fsck.tgz

Comment 16 Mike Fleetwood 2017-03-05 11:45:58 UTC

Created attachment 347262 [details] [review]
Prevent core dump reading non UTF-8 data from external command (draft 1)

Hi Curtis,

Here's a draft patchset for this.  The patch stops GParted crashing and
is good enough for committing; however it has revealed another issue.
To read all 122 MiB of the above fsck output via PipeCapture takes > 2
days of 100% CPU time with the GParted UI completely unresponsive.

Time to read above fsck output (comment 15) from corrupted FAT file
system:
* First 1 MiB    6.2 sec
* First 10 MiB   278 sec
* All 122 MiB    > 2 days

The problem is that as the capture buffer gets bigger
Glib::ustring::replace() gets exponentially slower.  I haven't succeeded
in writing a working replacement yet.

Mike

Comment 17 dhekir 2017-03-16 13:23:59 UTC

*** Bug 780104 has been marked as a duplicate of this bug. ***

Comment 18 Mike Fleetwood 2017-03-22 13:23:25 UTC

*** Bug 780354 has been marked as a duplicate of this bug. ***

Comment 19 Mike Fleetwood 2017-04-20 17:12:43 UTC

Created attachment 350154 [details] [review]
Prevent core dump reading non UTF-8 data from external command (draft 2)

Hi Curtis,

Here's draft patchset 2 for this.  Additionally to patchset draft 1 from
comment 16, it makes reading the fsck output really fast.  For all
122 MiB of fsck output:
* Before: > 2 days of 100% CPU time
* After:  6.5 seeconds
What I learned was that Glib::ustring is not capable of storing large
amounts of data and being edited in reasonable amounts of time.

Anyway as Mark pointed out in comment 14 above, fsck.fat took 5 hours to
produce all the output so GParted would be reading and waiting that long
during the initial startup.  Except now with the UI being drawn and the
status bar pulsing normally.  However 5 hours is too long for a user to
be expected to wait.  This can't be solved without replacing the use of
fsck.fat with something else.  I'm NOT intending to do this.

Given that PipeCapture class and OnReadable() method have become more
complicated I am looking into writing some unit tests for the class.
This is why this patchset is considered draft.  I want to add a unit
test framework first, then add unit tests driving these changes.

Thanks,
Mike

Comment 20 Curtis Gedak 2017-04-20 19:23:55 UTC

Hi Mike,

My initial peek at patch set (draft 2) in comment #19 looks good.

If I understand the changes correctly, there should be no change in the captured output that is shown in the GUI.  However, the new code should not cause a segmentation fault when it encounters invalid UTF-8 characters.  Further the new code should be much faster in operation, especially for long drawn out operations.

I plan to test the patch set with a number of different operations to ensure that the captured output remains correct.

Thanks,
Curtis

Comment 21 Mike Fleetwood 2017-04-21 09:59:49 UTC

Hi Curtis,

Yes, PipeCapture still has exactly the same interface it had before,
including updating the Glib::ustring passed to the constructor and
firing all callbacks registered against the signal_update slot.

As an aside if you want to see how slow the Gtk widgets are at
displaying a lot of data, try this:
1) Create a FAT partition.
2) Replace /sbin/fsck.fat with a shell script which outputs 1 MiB of
   ascii text.
      mv /sbin/fsck.fat /sbin/fsck.fat.real
      dd if=/dev/urandom bs=1M of=/tmp/random-1M.bin count=1
      base64 < /tmp/random-1M.bin > /tmp/random-1M.base64
      truncate -s 1M /tmp/random-1M.base64
      cat << EOF > /sbin/fsck.fat
      #!/bin/sh
      cat /tmp/random-1M.base64
      EOF
      chmod a+x /sbin/fsck.fat
3) Load GParted.  No additional waiting as the 1 MiB of output is read
   to get the file system usage.  (Ignore the warning of no actual usage
   figures).
4) Run a file system check on the FAT partition.
   Expand the operation details to see the output of the fsck command.
   Try resizing the window and moving the scroll bars about.
For me the UI is taking about 1 or 2 seconds to update display for this
"small" 1 MiB amount of output.

Thanks,
Mike

Comment 22 Mike Fleetwood 2017-05-26 20:34:36 UTC

Created attachment 352673 [details] [review]
Prevent core dump reading non UTF-8 data from external command (v1)

Hi Curtis,

Here's patchset v1 ready for reviewing and committing.  Compared to
draft 2 it adds 11 new patches developing unit tests for PipeCapture.

Patches are:
 NEW      [01/18] Add initial unit tests for PipeCapture
 NEW      [02/18] Initialise Glib threading system in test_PipeCapture
 NEW      [03/18] Add 1 MiB ASCII PipeCapture test
 NEW      [04/18] Add binary data string difference reporting to Pipe...
 NEW      [05/18] Add check of PipeCapture update callback
 NEW      [06/18] Add test for bug with PipeCapture crashing reading ...
 EXISTING [07/18] Prevent core dump reading none UTF-8 data from exte...
 EXISTING [08/18] Move initial clearing of output capture buffers int...
 NEW      [09/18] Test that PipeCapture clears capture buffer before ...
 EXISTING [10/18] Improve the performance of PipeCapture for large ou...
 NEW      [11/18] Update PipeCaptureTest.MinimalBinaryCrash777973 exp...
 NEW      [12/18] Add test that PipeCapture can read embedded NUL cha...
 EXISTING [13/18] Workaround g_utf8_find_next_char() not incrementing...
 NEW      [14/18] Add test that PipeCapture can read NUL byte in midd...
 EXISTING [15/18] Workaround g_utf8_get_char_validate() bug with embe...
 EXISTING [16/18] Increase PipeCapture maximum read size to 64K
 EXISTING [17/18] Further improve speed of PipeCapture for non-watched
 NEW      [18/18] Add PipeCapture line discipline tests

Generally I'm adding the tests first to highlight the fault and then
fixing the code to make the test past.

The only changes to the existing patches is a couple of minor comment
updates, otherwise they are the same as before.
 
Thanks,
Mike

Comment 23 Curtis Gedak 2017-06-02 19:40:24 UTC

Hi Mike,

Thank you for your in-depth work to resolve this segmentation fault,
and further to significantly improve performance and also add testing
of the PipeCapture class.

From a visual review, patch set v1 from comment #22 looks good to me.

Using the patch set in a distribution tarball, I have successfully
run:

    ./configure --disable-scrollkeeper
    make
    make check
    make distcheck

on the following distros:

    debian    8
    fedora   24
    kubuntu  16.04
    opensuse 42.2

My regression testing on kubuntu 16.04 of actions such as: create,
grow, shrink, move, copy, check, label, set UUID, and delete, has also
gone well.

If there are no objections then I plan to commit this patch set in the
next day or so.

Thanks,
Curtis

Comment 24 Mike Fleetwood 2017-06-03 08:07:04 UTC

Commit when ready.

Mike

Comment 25 Curtis Gedak 2017-06-03 15:48:06 UTC

Hi Mike,

Thank you again for this patch set.

Before committing I noticed and fixed one minor typo in the last
commit message:

Changed FROM:
  Test that backspace and carriage return are processes correctly and
                                                      ^
TO:
  Test that backspace and carriage return are processed correctly and

The patch set v1 from comment #22 has been committed to the git
repository master branch.

The relevant git commits can be viewed at the following links:

Add initial unit tests for PipeCapture (#777973)
https://git.gnome.org/browse/gparted/commit/?id=b21ee06230adf8b9f4090eef7b46c179f8e26d42

Initialise Glib threading system in test_PipeCapture (#777973)
https://git.gnome.org/browse/gparted/commit/?id=d90702d52608d84ef5c68d66d0c92a3438253ab7

Add 1 MiB ASCII PipeCapture test (#777973)
https://git.gnome.org/browse/gparted/commit/?id=eb2d571a6c5c8d5108e5dd2a7e7bf3f1c913cb35

Add binary data string difference reporting to PipeCapture tests (#777973)
https://git.gnome.org/browse/gparted/commit/?id=dc1c49ba6257045beab18400d6f1c841918614cc

Add check of PipeCapture update callback (#777973)
https://git.gnome.org/browse/gparted/commit/?id=8b47de88728419daae4f249ec0f1e2f1c516d2ba

Add test for bug with PipeCapture crashing reading binary data (#777973)
https://git.gnome.org/browse/gparted/commit/?id=1e3324e16b5e63d81289c8d44ce9f166faf9ed47

Prevent core dump reading none UTF-8 data from external command (#777973)
https://git.gnome.org/browse/gparted/commit/?id=a03735ac6f5e7e4d7282f7d0b06a97f337830858

Move initial clearing of output capture buffers into PipeCapture class (#777973)
https://git.gnome.org/browse/gparted/commit/?id=a233e30efeb80c72b7e9943ee7f615aa006c012f

Test that PipeCapture clears capture buffer before it starts (#777973)
https://git.gnome.org/browse/gparted/commit/?id=03c2be9b90e996efaab9663a1e6aa0c7df30be42

Improve the performance of PipeCapture for large output (#777973)
https://git.gnome.org/browse/gparted/commit/?id=b5b3d199f8e82b7185ec2112f46149c397965117

Update PipeCaptureTest.MinimalBinaryCrash777973 expected string (#777973)
https://git.gnome.org/browse/gparted/commit/?id=0a3e8487a07243528219dc8b79be561cc7f845c7

Add test that PipeCapture can read embedded NUL character (#777973)
https://git.gnome.org/browse/gparted/commit/?id=22573b4eede9dab13b6a26d39ebc0f363f567e0e

Workaround g_utf8_find_next_char() not incrementing past NUL char (#777973)
https://git.gnome.org/browse/gparted/commit/?id=3a6a304c64b669f71c530a07d64caa93dc2f23ca

Add test that PipeCapture can read NUL byte in middle of UTF-8 char (#777973)
https://git.gnome.org/browse/gparted/commit/?id=6b82616d2e5dfddea8b73cc65d64bfc7e08d8388

Workaround g_utf8_get_char_validate() bug with embedded NUL bytes (#777973)
https://git.gnome.org/browse/gparted/commit/?id=8dbbb47ce2db0ee733ff909c1ead2f4de9475596

Increase PipeCapture maximum read size to 64K (#777973)
https://git.gnome.org/browse/gparted/commit/?id=6f49f3049dc8fb72ef771074db73b60a79925962

Further improve speed of PipeCapture for non-watched output (#777973)
https://git.gnome.org/browse/gparted/commit/?id=25780c611bf99ce3f819d00c58616fa517a38b0f

Add PipeCapture line discipline tests
https://git.gnome.org/browse/gparted/commit/?id=f8783d69bc848866265910bf1a2de814a4cba821

Curtis

Comment 26 Curtis Gedak 2017-08-07 17:20:22 UTC

This enhancement was included in the GParted 0.29.0 release on August 7, 2017.