After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 786502 - Support reading Unicode labels when file system specific tools aren't available
Support reading Unicode labels when file system specific tools aren't available
Status: RESOLVED FIXED
Product: gparted
Classification: Other
Component: application
GIT HEAD
Other Linux
: Normal normal
: ---
Assigned To: Mike Fleetwood
gparted maintainers alias
Depends on:
Blocks:
 
 
Reported: 2017-08-19 07:30 UTC by Pali
Modified: 2017-10-10 17:24 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
^ and M- notation decoder (547 bytes, text/x-c)
2017-08-19 08:06 UTC, Pali
  Details
[PATCH] Fix parsing Unicode labels (4.76 KB, patch)
2017-08-19 09:21 UTC, Pali
none Details | Review
Support blkid reading Unicode labels (v1) (15.00 KB, patch)
2017-08-29 11:39 UTC, Mike Fleetwood
none Details | Review
Pics of Two GParted Versions with Unicode and Mischievous Labels (95.02 KB, image/png)
2017-09-02 19:15 UTC, Curtis Gedak
  Details

Description Pali 2017-08-19 07:30:20 UTC
As stated in https://bugzilla.gnome.org/show_bug.cgi?id=662537#c22 it is needed to call blkid with -d parameter to tell blkid to output text in UTF-8. Because by default output is in ^ and M- encoding.

So GParted should either call blkid with -d parameter or implements ^ and M- decoding to support unicode labels.
Comment 1 Pali 2017-08-19 08:06:01 UTC
Created attachment 357961 [details]
^ and M- notation decoder

Here is blkid ^ and M- encoder: https://github.com/karelzak/util-linux/blob/master/misc-utils/blkid.c#L117

I wrote simple decoder for it, see attachment. It could be useful for GParted support.
Comment 2 Pali 2017-08-19 09:21:32 UTC
Created attachment 357965 [details] [review]
[PATCH] Fix parsing Unicode labels

I tried to include ^ and M- notation decoder into GParted. See attached patch.
Comment 3 Mike Fleetwood 2017-08-20 09:18:20 UTC
Note that the scope of this bug is actually limited to not being able to
read unicode labels for file systems which don't provide their own label
reading tool.  Namely: exfat, f2fs, hfs, hfsplus, udf, ufs.

Reading unicode labels from btrfs, ext2/3/4, jfs, linux_swap, nilfs2,
ntfs, reiser4, reiserfs and xfs works correctly.  (fat16/32 doesn't
support unicode characters in the label anyway).


I have found 3 problems with blkid and it's implemented encoding scheme
which means it's not reliable in all cases:
1) Blkid on current but older distributions don't encode double quotes,
   where as latest distributions do.
2) The blkid encoding doesn't encode "M-x" in the input stream so when
   decoding it is not possible to distinguish between encoding and a
   genuine "M-x" from the original input.
3) The blkid encoding doesn't encode "^" in the input stream so when
   decoding it is not possible to distinguish between encoding and a
   genuine "^X" from the original input.

[1] CentOS 7
    # blkid -V
    blkid from util-linux 2.23.2 (libblkid 2.23.0, 25-Apr-2013)
    # mkfs.btrfs -L 'my " drive' /dev/sdb1
    # blkid /dev/sdb1
    /dev/sdb1: LABEL="my " drive" UUID=... UUID_SUB=... TYPE="btrfs"

[2] Fedora 26
    # blkid -V
    blkid from util-linux 2.30.1 (libblkid 2.30.1, 20-Jul-2017)
    # mkfs.btrfs -L 'M-P д' /dev/sdb1
    # blkid /dev/sdb1
    /dev/sdb1: LABEL="M-P M-PM-4" UUID=... UUID_SUB=... TYPE="btrfs"

[3] Fedora 26
    # label="^B"
    # echo -n "$label" | hexdump -C
    00000000  5e 42 02                    "^B."
    00000003
    # cat /dev/zero > /dev/sdb1
    # mkfs.btrfs -L "$label" /dev/sdb1
    # blkid /dev/sdb1
    /dev/sdb1: LABEL="^B^B" UUID=... UUID_SUB=... TYPE="btrfs"
Comment 4 Pali 2017-08-20 09:35:45 UTC
(In reply to Mike Fleetwood from comment #3)
> Note that the scope of this bug is actually limited to not being able to
> read unicode labels for file systems which don't provide their own label
> reading tool.  Namely: exfat, f2fs, hfs, hfsplus, udf, ufs.

I saw this problem when tested UDF labels.

> Reading unicode labels from btrfs, ext2/3/4, jfs, linux_swap, nilfs2,
> ntfs, reiser4, reiserfs and xfs works correctly.  (fat16/32 doesn't
> support unicode characters in the label anyway).

Or in case those reading tools are not installed.

> I have found 3 problems with blkid and it's implemented encoding scheme
> which means it's not reliable in all cases:
> 1) Blkid on current but older distributions don't encode double quotes,
>    where as latest distributions do.

Patch in this bug does not address this problem, nor it does not change behaviour how GParted parse output from old blkid.

> 2) The blkid encoding doesn't encode "M-x" in the input stream so when
>    decoding it is not possible to distinguish between encoding and a
>    genuine "M-x" from the original input.
> 3) The blkid encoding doesn't encode "^" in the input stream so when
>    decoding it is not possible to distinguish between encoding and a
>    genuine "^X" from the original input.

But this is a problem...

Currently GParted does not show Unicode label correctly and I see there 3 options:

1) Stay this bug as is, without changing code

2) Call blkid with -d option which outputs in 8bit UTF-8

3) Try to decide ^ and M- like in my patch

Or probably 2) with fallback to 1) or 3) when -d is not supported.
Comment 5 Mike Fleetwood 2017-08-20 19:03:08 UTC
Option 4:  Use 'blkid -o value -s LABEL /dev/$partition'.

In detail, when querying the label in the FS_Info module run above blkid
command to get the label for just the partition of interest.  Output
format value only (-o value) prints only the queried value (-s LABEL)
without any encoding, only adding a trailing new line character.  This
works at least as far back as blkid 1.0.0 (12-Feb-2003).

Testing on Fedora 26 ...

No label cases:
    # cat /dev/zero > /dev/sdb1
    # mkfs.btrfs /dev/sdb1
    # blkid -o value -s LABEL /dev/sdb1 | hexdump -C
    (No output, not even a new line, because no label)

    # cat /dev/zero > /dev/sdb1
    # mkfs.btrfs -L '' /dev/sdb1
    # blkid -o value -s LABEL /dev/sdb1 | hexdump -C
    (Again no output becuase no label)

Label cases:
    # cat /dev/zero > /dev/sdb1
    # mkfs.btrfs -L 'M-P д' /dev/sdb1
    # blkid -o value -S LABEL /dev/sdb1
    M-P д
    # blkid -o value -S LABEL /dev/sdb1 | hexdump -C
    00000000  4d 2d 50 20 d0 b4 0a            |M-P ...|
    00000007

    # cat /dev/zero > /dev/sdb1
    # label="^B"
    # echo -n "$label" | hexdump -C
    00000000  5e 42 02                    "^B."
    00000003
    # mkfs.btrfs -L "$label" /dev/sdb1
    # blkid -o value -S LABEL /dev/sdb1
    ^B
    # blkid -o value -S LABEL /dev/sdb1 | hexdump -C
    00000000  5e 42 02 0a                     |^B..|
    00000004

I have also tested this works correctly of CentOS 5 mkfs.ext4 and
blkid 1.0.0.  This way appears to get the label correctly for all
versions of blkid.  Just remember to cut off the final character if it
is a new line character.
Comment 6 Pali 2017-08-20 20:12:25 UTC
(In reply to Mike Fleetwood from comment #5)
> Option 4:  Use 'blkid -o value -s LABEL /dev/$partition'.

Seems like a good idea. And call it only when file system specific tool is not available...
Comment 7 Mike Fleetwood 2017-08-21 07:25:17 UTC
(In reply to Pali from comment #6)
> And call it only when file system specific tool is not available...

GParted already only calls FS_Info::get_label() after any file system
specific tool has been tried.

https://git.gnome.org/browse/gparted/tree/src/GParted_Core.cc?id=GPARTED_0_29_0#n1287
Comment 8 Mike Fleetwood 2017-08-29 11:38:23 UTC
Update bug summary to qualify the scope of the issue.
Comment 9 Mike Fleetwood 2017-08-29 11:39:43 UTC
Created attachment 358674 [details] [review]
Support blkid reading Unicode labels (v1)

Hi Curtis,

Here's patchset v1 to fix this.  It uses the method outlined in
comment #5 above.  It supports Unicode labels when read via blkid so
in cases when the FS specific tool isn't install or doesn't exist.  It
then extends that to support GParted recognising the use of Unicode
LABEL=<label> in /etc/fstab to specify block devices by FS label.

Successfully tested on Fedora 26 and CentOS 6 with various FSs such as
hfs+ (and udf on Fedora 26) which don't provide their own tool for
reading Unicode labels.

Thanks,
Mike
Comment 10 Curtis Gedak 2017-09-01 18:03:16 UTC
Hi Mike & Pali,

Thank you for your work on Unicode labels.

My review of patch set v1 from comment #9 has gone well.

I encountered only one minor typo in the P3/3 commit message:

   get_labe() returns the cached label, loading it into the cache first
          ^^
   get_label() returns the cached label, loading it into the cache first

I have made this change in my copy of the code in anticipation of
pushing these commit to our git master.

Regression testing has been successful on the following distros:

  Fedora   25
  Debian    9
  Kubuntu  16.04
  openSUSE 42.3

If there are no objections then I will commit patch set v1 in the next
day or so.

Curtis
Comment 11 Pali 2017-09-02 10:11:53 UTC
Looks good, thanks Mike! If it is working fine for reading Unicode UDF labels, I'm happy with it.

Anyway, in this patch (but also in whole GParted code) I see one incorrect pattern in usage of Utils::execute_command. Caller does not escape correctly user input argument which could lead to execution of arbitrary code.

I can imagine that somebody could try to set filesystem label to:
"; sh /tmp/my_script & true "

It could lead to execution of the my_script.
Comment 12 Mike Fleetwood 2017-09-02 11:25:57 UTC
(In reply to Pali from comment #11)
> Anyway, in this patch (but also in whole GParted code) I see one incorrect
> pattern in usage of Utils::execute_command. Caller does not escape correctly
> user input argument which could lead to execution of arbitrary code.
> 
> I can imagine that somebody could try to set filesystem label to:
> "; sh /tmp/my_script & true "
> 
> It could lead to execution of the my_script.

Was there a *specific* scenario you are concerned about?
Remember that because of the very nature of what the GParted application
is, it allows the user destroy their entire hard drive by formatting a
new file system over the top or copying a second hard drive over the
first.
Comment 13 Pali 2017-09-02 11:32:35 UTC
I know that GParted under root can do anything with disk. But still it is a good idea to start external process correctly. Probably somebody could have good reason to include either " or ; or \ character into filesystem label. And currently it is a problem. Some filesystems (e.g. ntfs or udf) support storing arbitrary Unicode sequence so such shell special characters are also fully valid for label.
Comment 14 Curtis Gedak 2017-09-02 15:42:55 UTC
Hi Pali,

As a proof of concept, can you test to demonstrate the label handling issue you raised in comment #11?

I think the label handling issue is prevented by the call to Glib::shell_parse_argv() [1] method within Utils::execute_command() [2] method.

[1] https://people.gnome.org/~gcampagna/docs/GLib-2.0/GLib.shell_parse_argv.html

[2] https://git.gnome.org/browse/gparted/tree/src/Utils.cc?h=GPARTED_0_29_0#n595

Curtis
Comment 15 Curtis Gedak 2017-09-02 16:13:47 UTC
Hi Pali & Mike,

As further clarification, when a user has root access, they have the complete ability to destroy the disk.  If they do this to themselves by creating a label such as in comment #11, then they deserve what they get.

An issue only arises if the the person didn't create the label but suffered disk alteration as a consequence of some other action.  A sample scenario would be if the person inserted a USB Flash Drive with a specially crafted label made by someone else.  If the label was executed, then that would be an issue.

Curtis
Comment 16 Pali 2017-09-02 16:32:45 UTC
I did quick check and problem happen when user try to change label to:
" -A "

For FAT32 system it throws error message in GParted log window that mlabel was called with invalid option -A. Thanks to that shell escaping code execution of script seems not to be possible. But it is possible to inject custom parameters to mkfs/label tools.
Comment 17 Curtis Gedak 2017-09-02 16:41:20 UTC
Pali,

Would you like to develop a patch for the test issue you identified?

Curtis
Comment 18 Curtis Gedak 2017-09-02 19:15:31 UTC
Created attachment 359003 [details]
Pics of Two GParted Versions with Unicode and Mischievous Labels

Hi Mike & Pali,

I've done some more testing with Debian 9 with Unicode characters and
the label issue.

For the label issue I created a file and mischievous label on an ext4
file system in partition /dev/sdb1 as follows:


# Create 1 letter shell script at top of root file system with:

$ cat << _EOF_ > z
> #!/bin/sh
> touch /myfile.txt
> _EOF_
$ chmod 755 z
$ sudo mv z /
$ ls -l /z
-rwxr-xr-x 1 user user 28 Sep  2 12:58 /z
$


# Create mischievous label

$ sudo e2label /dev/sdb1 '"; /z & true "'


Run both GParted 0.28.1 and latest git HEAD with patch set v1 from
comment #9.  Both versions only display the mischievous label, and do
not execute the label.  See attached picture.

After running both versions, there is no /myfile.txt file at the top
of the root file system.

$ ls -l /my*
ls: cannot access '/my*': No such file or directory

Curtis
Comment 19 Pali 2017-09-02 19:31:12 UTC
Now I'm looking at that Utils::execute_command. It converts ustring command to array of argv via Glib::shell_parse_argv. Glib::shell_parse_argv is just wrapper around g_shell_parse_argv and that uses tokenize_command_line for splitting string into argv array, which is later unescaped (via g_shell_unquote). And tokenize_command_line is really funny function. Looking at whole implementation I think that any user input should be first passed into g_shell_quote function (or via Glib:: equivalent if exists).

https://fossies.org/dox/glibmm-2.55.1/shell_8cc_source.html#l00035
https://sourcecodebrowser.com/glib2.0/2.24.0/gshell_8c.html#a31cb4a74ef238d5488b7d14c96cdbee9
https://sourcecodebrowser.com/glib2.0/2.24.0/gshell_8c.html#a105ba01fb62d0415457eaa58f5f5021a
https://sourcecodebrowser.com/glib2.0/2.24.0/gshell_8c.html#aab6d2157aba1c972155f21a3136bb8ed
Comment 20 Curtis Gedak 2017-09-02 20:35:16 UTC
Thank you Mike and Pali for your work to improve Unicode support for
volume labels.

Because patch set v1 from comment #9 addresses the issue raised in
this bug report, I am committing the code.

The issue regarding creating labels such as:
"; sh /tmp/my_script & true "
is a separate issue and should be tracked in a separate bug report.


Patch set v1 has been committed to the git repository for inclusion in
the next release of GParted.

The relevant git commits can be viewed at the following links:

Support reading Unicode character labels via blkid (#786502)
https://git.gnome.org/browse/gparted/commit/?id=1c984ae06d9fd1f4d28ae960c1cdb0a3442ccf3f

Update FS_Info cache with Unicode safe labels read from blkid (#786502)
https://git.gnome.org/browse/gparted/commit/?id=b022c1e3a949640214d8b6e4a1aa395a53858bb4

Support /etc/fstab using Unicode labels as mount points (#786502)
https://git.gnome.org/browse/gparted/commit/?id=73fe1dbf5cc95eff057bcd7f56ff2b1a710169ec

Curtis
Comment 21 Pali 2017-09-03 08:20:59 UTC
I opened https://bugzilla.gnome.org/show_bug.cgi?id=787203 bug.
Comment 22 Curtis Gedak 2017-10-10 17:24:10 UTC
This enhancement was included in the GParted 0.30.0 release on October 10, 2017.