After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 769453 - Figure out a workaround for some versions of glibc having a broken O_TMPFILE which breaks ostree
Figure out a workaround for some versions of glibc having a broken O_TMPFILE ...
Status: RESOLVED WONTFIX
Product: ostree
Classification: Infrastructure
Component: general
unspecified
Other Linux
: Normal normal
: ---
Assigned To: OSTree maintainer(s)
OSTree maintainer(s)
Depends on:
Blocks:
 
 
Reported: 2016-08-02 21:12 UTC by Bastien Nocera
Modified: 2018-08-17 18:59 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
tmpfile.c - test program for O_TMPFILE (653 bytes, text/plain)
2016-08-03 16:03 UTC, Colin Walters
Details

Description Bastien Nocera 2016-08-02 21:12:22 UTC
More than a 100 failures in the test suite:

$ grep O_TMPFILE test-suite.log  | sort | uniq
OSTree:ERROR:tests/test-libarchive-import.c:139:test_data_init: assertion failed (error == NULL): open(O_TMPFILE): Invalid argument (g-io-error-quark, 13)
error: open(O_TMPFILE): Invalid argument
$ grep O_TMPFILE test-suite.log  | sort | wc -l
51

$ mount
ubi0:rootfs on / type ubifs (rw,relatime)
<snip>

I guess that ubifs doesn't implement ->tmpfile. Still, we shouldn't be failing this hard when an optional filesystem feature isn't implemented...
Comment 1 Bastien Nocera 2016-08-02 22:20:51 UTC
In tap-test it says:
# Run a test in tap mode, ensuring we have a temporary directory.  We
# always use /var/tmp because we might want to use user xattrs, which
# aren't available on tmpfs.

Well, if the kernel is compiled with CONFIG_TMPFS_XATTR, then tmpfs does have xattrs support *and* O_TMPFILE support. Which would at least help for the test suite. It doesn't really help for running ostree if O_TMPFILE support isn't going to be optional...
Comment 2 Bastien Nocera 2016-08-02 23:56:29 UTC
This is caused by:
https://github.com/ostreedev/ostree/pull/369

Using libglnx 4f83b70f690f437b983333a7c43def3ed6ca2d46 and ostree 1db0c8dcd642e7803a6d90b502664107ce11ec55 cuts down on the suite failures, though still not great.
Comment 3 Colin Walters 2016-08-03 00:02:43 UTC
That's strange; according to `man open` we should get one of:

```
       EISDIR pathname refers to an existing directory, O_TMPFILE and one of O_WRONLY or O_RDWR were specified in flags, but this ker‐
              nel version does not provide the O_TMPFILE functionality.
       ENOENT pathname  refers  to  a nonexistent directory, O_TMPFILE and one of O_WRONLY or O_RDWR were specified in flags, but this
              kernel version does not provide the O_TMPFILE functionality.

       EOPNOTSUPP
              The filesystem containing pathname does not support O_TMPFILE.

```

We're getting EINVAL though, which I guess it's not too surprising some random kernel versions throw back at us.  Does this patch help?

diff --git a/glnx-fdio.c b/glnx-fdio.c
index f4648ed..01e28cc 100644
--- a/glnx-fdio.c
+++ b/glnx-fdio.c
@@ -124,7 +124,8 @@ glnx_open_tmpfile_linkable_at (int dfd,
    * in full. */
 #ifdef O_TMPFILE
   fd = openat (dfd, subpath, O_TMPFILE|flags, 0600);
-  if (fd == -1 && !(errno == ENOSYS || errno == EISDIR || errno == EOPNOTSUPP))
+  /* See https://bugzilla.gnome.org/show_bug.cgi?id=769453 for the rationale behind EINVAL */
+  if (fd == -1 && !(errno == ENOSYS || errno == EISDIR || errno == EOPNOTSUPP || errno == EINVAL))
     {
       glnx_set_prefix_error_from_errno (error, "%s", "open(O_TMPFILE)");
       return FALSE;
Comment 4 Bastien Nocera 2016-08-03 01:09:03 UTC
We're not getting EINVAL with this test program:
----8<----
#include <stdio.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>

#ifndef O_TMPFILE
#define O_TMPFILE (__O_TMPFILE | O_DIRECTORY)
#define O_TMPFILE_MASK (__O_TMPFILE | O_DIRECTORY | O_CREAT)
#endif

int main (int argc, char **argv)
{
        int fd, dfd;
        const char *subpath = ".";

        dfd = open("/var/tmp", O_RDONLY | O_NONBLOCK | O_DIRECTORY | O_CLOEXEC | O_NOCTTY);
        if (dfd < 0) {
                printf ("dfd: %d\n", dfd);
                return 1;
        }
        fd = openat (dfd, subpath, O_TMPFILE, 0600);
        printf ("fd: %d (errno: %d)\n", fd, errno);

        return 0;
}
----8<----

$ ./test 
fd: -1 (errno: 22)
(EINVAL)

Change the openat flags to O_TMPFILE | 0x1 and then:
$ ./test 
fd: -1 (errno: 95)
(ENOTSUP)

But the test suite would print out, when openat failed:
flags: 0x80001
subpath: "."
dfd: 3

So I'm a bit lost there. I guess some flags might be getting lost?!
Comment 5 Bastien Nocera 2016-08-03 01:11:10 UTC
FWIW, might be easier if you create a ubifs filesystem, and mount that to test.
Comment 6 Bastien Nocera 2016-08-03 14:34:14 UTC
(In reply to Colin Walters from comment #3)
<snip>

> diff --git a/glnx-fdio.c b/glnx-fdio.c
> index f4648ed..01e28cc 100644
> --- a/glnx-fdio.c
> +++ b/glnx-fdio.c
> @@ -124,7 +124,8 @@ glnx_open_tmpfile_linkable_at (int dfd,
>     * in full. */
>  #ifdef O_TMPFILE
>    fd = openat (dfd, subpath, O_TMPFILE|flags, 0600);
> -  if (fd == -1 && !(errno == ENOSYS || errno == EISDIR || errno ==
> EOPNOTSUPP))
> +  /* See https://bugzilla.gnome.org/show_bug.cgi?id=769453 for the
> rationale behind EINVAL */
> +  if (fd == -1 && !(errno == ENOSYS || errno == EISDIR || errno ==
> EOPNOTSUPP || errno == EINVAL))
>      {
>        glnx_set_prefix_error_from_errno (error, "%s", "open(O_TMPFILE)");
>        return FALSE;

That works well enough for the test suite to come back to pre-O_TMPFILE levels of success.
Comment 7 Colin Walters 2016-08-03 14:48:25 UTC
You need O_WRONLY | O_TMPFILE (and normally O_WRONLY | O_CLOEXEC | O_TMPFILE)
Comment 8 Bastien Nocera 2016-08-03 14:59:03 UTC
This is the bug:
#ifndef O_TMPFILE
#define O_TMPFILE (__O_TMPFILE | O_DIRECTORY)
#endif

"O_WRONLY | O_CLOEXEC | O_TMPFILE" will expand to "O_WRONLY | O_CLOEXEC | __O_TMPFILE | O_DIRECTORY" which openat won't like.
Comment 9 Bastien Nocera 2016-08-03 15:07:42 UTC
(In reply to Bastien Nocera from comment #8)
> This is the bug:
> #ifndef O_TMPFILE
> #define O_TMPFILE (__O_TMPFILE | O_DIRECTORY)
> #endif
> 
> "O_WRONLY | O_CLOEXEC | O_TMPFILE" will expand to "O_WRONLY | O_CLOEXEC |
> __O_TMPFILE | O_DIRECTORY" which openat won't like.

Right, that's obviously not it, as this is only for x86, and this was supposed to make calls fail.
Comment 10 Colin Walters 2016-08-03 16:00:58 UTC
Regarding ubifs in particular, this looks a bit annoying to test:
http://www.linux-mtd.infradead.org/faq/ubifs.html#L_loop_mount
Comment 11 Colin Walters 2016-08-03 16:03:31 UTC
Created attachment 332633 [details]
tmpfile.c - test program for O_TMPFILE

Fixed version of O_TMPFILE test program.
Comment 12 Bastien Nocera 2016-08-04 14:32:29 UTC
EINVAL is a possible return value when flags don't match what the kernel expects:
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/fs/open.c#n883

About the kernel definition of the flags:
/* The precise definition of __O_TMPFILE is arch specific, so let's
 * just define this on x86 where we know the value. */

$ git grep O_TMPFILE | grep arch | grep fcntl
arch/alpha/include/uapi/asm/fcntl.h:#define __O_TMPFILE	0100000000
arch/parisc/include/uapi/asm/fcntl.h:#define __O_TMPFILE	040000000
arch/sparc/include/uapi/asm/fcntl.h:#define __O_TMPFILE	0x2000000

__O_TMPFILE is the same on all platforms *except* Sparc, PA-RISC and Alpha. So it would be best to do #ifndef ... rather than #ifdef for x86/x86-64.

*But* in /usr/include/arm-linux-gnueabihf/bits/fcntl-linux.h on this Debian system:
#ifndef __O_TMPFILE
# define __O_TMPFILE   020200000
#endif

and:

#ifdef __USE_GNU
<snip>
# define O_TMPFILE      __O_TMPFILE     /* Atomically create nameless file.  */  
#endif

So the user-space O_TMPFILE is __O_TMPFILE | 02000000 or __O_TMPFILE | __O_CLOEXEC.

And your nice guard:
  /* Don't allow O_EXCL, as that has a special meaning for O_TMPFILE */
  g_return_val_if_fail ((flags & O_EXCL) == 0, FALSE);

Doesn't do anything. Would be great to get somebody from the glibc to come in and comment...
Comment 13 Colin Walters 2016-08-04 15:26:14 UTC
https://sourceware.org/bugzilla/show_bug.cgi?id=17912 ?
Comment 14 Bastien Nocera 2016-08-04 15:32:00 UTC
I used a loopback mounted ext4 in /mnt and changed all the references about /var/tmp to /mnt (where I mounted my loop back image), and got flooded with EINVAL again.

I think that the constants problem might be the right track.
Comment 15 Bastien Nocera 2016-08-04 17:50:32 UTC
Upgrading from libc6-dev 2.19-18+deb8u4 to libc6-dev 2.23-4, reverting any changes to libglnx, and forcing my ext4 tmp dir, stopped the EINVAL problems.

After that, I tested back on the ubifs mount (reverted all my /var/tmp changes), and it seems to fallback correctly to the non-O_TMPFILE codepath.

So we need a way to detect O_TMPFILE with the wrong values, and error out, as it's not like the glibc has a pkg-config file.

Maybe a new test case would be enough?
Comment 16 Colin Walters 2016-08-04 19:29:44 UTC
I think we could scrape the libc headers in autoconf, but we'd likely need to carry the architecture list where the values differ, and we'd have to test the autoconf.

My inclination here is to basically tell people to update their libcs, and if they can't --disable-otmpfile.
Comment 17 Bastien Nocera 2016-08-04 19:55:08 UTC
(In reply to Colin Walters from comment #16)
> I think we could scrape the libc headers in autoconf, but we'd likely need
> to carry the architecture list where the values differ, and we'd have to
> test the autoconf.

I think we'd just need to check that __O_TMPFILE & O_DIRECTORY == O_DIRECTORY, no? That's doable in configure, I'd hope.

> My inclination here is to basically tell people to update their libcs, and
> if they can't --disable-otmpfile.

That's a good advice to put in the error message.
Comment 18 Bastien Nocera 2016-08-06 12:59:16 UTC
Comment 17 didn't quite get that right.

Ok, I got something a bit better.

We do want O_TMPFILE in the libc headers to be the O_TMPFILE (the kernel value), or'ed with O_DIRECTORY. In some versions, O_TMPFILE and __O_TMPFILE both will already include O_DIRECTORY, making it hard to compare.

So we need to check whether O_TMPFILE might include one or two bits set (kernel O_TMPFILE value | O_DIRECTORY), which we will "or" with O_DIRECTORY. If O_TMPFILE | O_DIRECTORY is correctly set, we'll end up with 2 bits being set. If it isn't, we'll have 3 bits being set, because the original O_TMPFILE will have a wrong value for O_DIRECTORY.

So, on the ARM with the old libc:
O_TMPFILE = 020200000
O_DIRECTORY = 040000

On x86-64, Fedora 24:
O_TMPFILE = 020200000
O_DIRECTORY = 0200000

I ended up with that test case:
--8<--
#define _GNU_SOURCE 1

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <assert.h>
#include <stdio.h>

#define N_EXPECTED_BITS 2
#define MAX_NUM_BITS    23
#define O_TMPFILE_DIR   (O_TMPFILE | O_DIRECTORY)

int main (int argc, char **argv)
{
        int i, n;

        n = 0;
        for (i = 0; i < MAX_NUM_BITS; i++) {
                if ((O_TMPFILE_DIR >> i) & 0x1)
                        n++;
        }
        if (n != N_EXPECTED_BITS) {
                fprintf (stderr, "Expected %d bits, got %d\n", N_EXPECTED_BITS, n);
                assert (n == N_EXPECTED_BITS);
        }

        return 0;
}
--8<--

On x86-64, there's no output.
On ARM with the broken libc:
$ ./test
Expected 2 bits, got 3
test: test.c:24: main: Assertion `n == 2' failed.
Aborted (core dumped)

Maybe we should run that as the first test in the test suite, and print out the advice mentioned in comment 16.
Comment 19 André Klapper 2018-08-17 18:59:07 UTC
OSTree has moved to Github a while ago.
Furthermore, GNOME Bugzilla will be shut down and replaced by gitlab.gnome.org.

If the problem reported in this Bugzilla ticket is still valid, please report it to https://github.com/ostreedev/ostree/issues instead. Thank you!

Closing this report as WONTFIX as part of Bugzilla Housekeeping.