After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 699079 - Prototype support for installed tests
Prototype support for installed tests
Status: RESOLVED FIXED
Product: glib
Classification: Platform
Component: build
unspecified
Other Linux
: Normal normal
: ---
Assigned To: gtkdev
gtkdev
Depends on:
Blocks:
 
 
Reported: 2013-04-27 21:13 UTC by Colin Walters
Modified: 2013-05-16 12:53 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
0001-WIP-Prototype-support-for-installed-tests.patch (3.37 KB, patch)
2013-04-27 21:14 UTC, Colin Walters
none Details | Review
Updated patch which matched updated specification (3.38 KB, patch)
2013-04-27 23:05 UTC, Colin Walters
none Details | Review
0001-WIP-Prototype-support-for-installed-tests.patch (19.12 KB, patch)
2013-04-28 22:14 UTC, Colin Walters
none Details | Review
0001-WIP-Prototype-support-for-installed-tests.patch (19.15 KB, patch)
2013-05-01 21:10 UTC, Colin Walters
reviewed Details | Review
0001-glib-tests-mappedfile-Copy-test-file-before-writing-.patch (2.12 KB, patch)
2013-05-03 23:00 UTC, Colin Walters
none Details | Review
0001-glib-tests-Use-explicit-file-listings.patch (2.61 KB, patch)
2013-05-04 16:40 UTC, Colin Walters
none Details | Review
0001-glib-tests-Add-last-two-.expected-files-for-markup-t.patch (4.35 KB, patch)
2013-05-04 20:53 UTC, Colin Walters
none Details | Review
0001-glib-tests-Drop-unnecessary-SRCDIR-definitions.patch (2.25 KB, patch)
2013-05-06 13:26 UTC, Colin Walters
none Details | Review
0001-glib-tests-mappedfile-Copy-test-file-before-writing-.patch (3.58 KB, patch)
2013-05-06 15:43 UTC, Colin Walters
accepted-commit_now Details | Review
0002-glib-tests-Use-explicit-file-listings.patch (2.61 KB, patch)
2013-05-06 15:44 UTC, Colin Walters
accepted-commit_now Details | Review
0003-glib-tests-Add-last-two-.expected-files-for-markup-t.patch (4.35 KB, patch)
2013-05-06 15:44 UTC, Colin Walters
accepted-commit_now Details | Review
0004-glib-tests-Drop-unnecessary-SRCDIR-definitions.patch (2.25 KB, patch)
2013-05-06 15:44 UTC, Colin Walters
accepted-commit_now Details | Review
0005-Add-enable-installed-tests-configure-option.patch (11.69 KB, patch)
2013-05-06 15:45 UTC, Colin Walters
accepted-commit_now Details | Review

Description Colin Walters 2013-04-27 21:13:59 UTC
See https://live.gnome.org/GnomeGoals/InstalledTests for more
information.

There is yet more to do for this; First, obviously it only converts
glib/tests, not gio/tests.  Second, some of the tests require data
files that we're not installing yet.

But running $prefix/libexec/glib/uri works, for example.
---
 configure.ac           |    7 ++++++-
 glib/tests/Makefile.am |   41 +++++++++++++++++++++++++++++++++--------
 2 files changed, 39 insertions(+), 9 deletions(-)
Comment 1 Colin Walters 2013-04-27 21:14:49 UTC
Created attachment 242685 [details] [review]
0001-WIP-Prototype-support-for-installed-tests.patch

This is just an initial patch for comments about the general architecture and to figure out details before I move on farther.
Comment 2 Colin Walters 2013-04-27 21:18:47 UTC
Note this patch keeps support for running the tests as "make check".  In fact that's the default.  The gnome-ostree builder however will build glib with:
--disable-modular-tests --enable-installed-tests
Comment 3 Colin Walters 2013-04-27 23:05:40 UTC
Created attachment 242690 [details] [review]
Updated patch which matched updated specification

I made a few tweaks to the spec, this updates the glib WIP patch to match.
Comment 4 Colin Walters 2013-04-28 22:14:09 UTC
Created attachment 242754 [details] [review]
0001-WIP-Prototype-support-for-installed-tests.patch

This patch makes all the glib tests work both installed and uninstalled.

It's bigger than necessary because it was easier to explicitly list installed files rather than doing the dist-hook thing.
Comment 5 Colin Walters 2013-05-01 21:10:51 UTC
Created attachment 243009 [details] [review]
0001-WIP-Prototype-support-for-installed-tests.patch

A minor update to match the latest spec:

* We now install into $(libexecdir)/glib/installed-tests so the gnome-ostree build system can automatically pull out the test binaries too
Comment 6 Dan Winship 2013-05-02 14:14:22 UTC
Comment on attachment 243009 [details] [review]
0001-WIP-Prototype-support-for-installed-tests.patch

>+# These are "white box" tests that reuse parts of the GLib source, so
>+# we can't install them, since they're tied to an exact version of
>+# GLib.
>+TEST_PROGS = \
>+	1bit-mutex	\
>+	1bit-emufutex	\
>+	gwakeup		\
>+	$(NULL)

This would only be an issue if you installed a new version of glib on your test box, but *didn't* also install a new version of the test programs. Would you really do that?

And even if you did, is there any advantage to not installing the tests anyway? If you install 1bit-emufutex, and we break emufutex support, and then you install the new broken glib but don't install the new tests, then the installed 1bit-emufutex will still pass and we won't notice that emufutex support is broken. But so what? If you *don't* install 1bit-emufutex, then we won't notice that emufutex support was broken either. So it seems like installing them is strictly better than not installing them.

If we are keeping the black-box/white-box split:

  - 1bit-mutex is actually black-box; the #include <glib/gbitlock.c> only happens when
    building 1bit-emufutex.

  - You're missing gwakeup-fallback. However, bug 680102 has a patch that (along with
    some other things that we don't want) gets rid of the need to recompile gwakeup.c
    with different defines, and using glib-private would get rid of the need to include
    gwakeup.c at all. (glib-private also implies a bit of version-specificness, but
    only with respect to the layout of GLibPrivateVTable, so if we got meaningfully
    out-of-sync, it would crash rather than giving incorrect test results...)

>+  testhome = g_getenv ("G_TEST_HOME");
>+  if (!testhome)
>+    testhome = INSTTESTDIR;
>+  datapath = g_build_filename (testhome, "bookmarks", NULL);

we might want a helper in gtestutils for this?

also, you leak datapath

>   file = g_mapped_file_new ("/dev/null", FALSE, &error);
>-  g_assert_error (error, G_FILE_ERROR, G_FILE_ERROR_INVAL);
>+  g_assert (error != NULL);

unrelated?

>-  if (access (SRCDIR "/4096-random-bytes", W_OK) != 0)
>+  path = g_build_filename (datapath, "4096-random-bytes", NULL);
>+  if (access (path, W_OK) != 0)
>     {
>       g_test_message ("Skipping writable mapping test");

We will probably want to rewrite this to copy the file into $XDG_RUNTIME_DIR or something?

(Also, to use g_file_test())

(Also, we should eventually have a more explicit way for tests to indicate that they are skipping part of themselves, so the test environment can be aware of this and note it explicitly in its logs so we don't think we're testing more than we really are.)

>-      g_assert_cmpstr (string->str, ==, expected);
>+      if (strcmp (expected, "[none]\n") != 0)
>+        g_assert_cmpstr (string->str, ==, expected);

?

Ah, this is because your Makefile rule requires there to be a .expected file in every case, but in this case, the contents of the .expected file don't match what's actually output? Blah. (What is string->str in this case, and why can't you put that into the .expected?)
Comment 7 Allison Karlitskaya (desrt) 2013-05-02 14:28:25 UTC
Installing glib's tests makes no sense at all except from the standpoint that they can catch regressions in things below GLib.  If the interest was in catching problems in GLib itself, that's what "make check" is for.

The futex and wakeup tests are particularly likely to shake out issues in the eventfd and futex interfaces in the kernel, so I think it makes particular sense to install these ones (vs. other tests that are testing glib "more purely" like a hashtable test, for example).
Comment 8 Colin Walters 2013-05-02 14:50:41 UTC
(In reply to comment #7)
> Installing glib's tests makes no sense at all except from the standpoint that
> they can catch regressions in things below GLib.  If the interest was in
> catching problems in GLib itself, that's what "make check" is for.

The problem with "make check" as traditionally used are several; the particularly problematic one from the perspective of automation is: "what is the expected environment"?  For example IIRC we at least recently had a test that expected there to be .desktop files available in the install root.  That's always true on developer workstations; on builders, not so much.

Clutter for example really wants a full GLX capable environment.  It's not something where you can just toss up Xvfb inside a pbuilder root and go.  The gnome-ostree CI system solves this by running tests in a completely functional VM.

Another issue with "make check" is the lack of external control over what tests are run.  A system integrator can only choose "all" or "none".  But a smart system could do stuff like immediately running tests relevant to the code changes in the complete system *first*.  

> The futex and wakeup tests are particularly likely to shake out issues in the
> eventfd and futex interfaces in the kernel, so I think it makes particular
> sense to install these ones (vs. other tests that are testing glib "more
> purely" like a hashtable test, for example).

I'd really like to install and run all of them myself.
Comment 9 Colin Walters 2013-05-02 15:14:11 UTC
(In reply to comment #6)

> This would only be an issue if you installed a new version of glib on your test
> box, but *didn't* also install a new version of the test programs. Would you
> really do that?

I'd like our model to support running old installed test binaries against newer libraries.  So broadly speaking, anything that #includes part of the source code as opposed to public interfaces is liable to break.

>   - 1bit-mutex is actually black-box; the #include <glib/gbitlock.c> only
> happens when
>     building 1bit-emufutex.

My understanding (http://en.wikipedia.org/wiki/Black-box_testing) is that an #include of the source makes this white box.

I'll work on addressing the other comments.
Comment 10 Dan Winship 2013-05-02 15:30:37 UTC
(In reply to comment #9)
> >   - 1bit-mutex is actually black-box; the #include <glib/gbitlock.c> only
> > happens when
> >     building 1bit-emufutex.
> 
> My understanding (http://en.wikipedia.org/wiki/Black-box_testing) is that an
> #include of the source makes this white box.

But it doesn't include it. The #include is #ifdeffed out when building 1bit-mutex.
Comment 11 Colin Walters 2013-05-03 23:00:25 UTC
Created attachment 243258 [details] [review]
0001-glib-tests-mappedfile-Copy-test-file-before-writing-.patch
Comment 12 Colin Walters 2013-05-04 16:40:29 UTC
Created attachment 243295 [details] [review]
0001-glib-tests-Use-explicit-file-listings.patch
Comment 13 Colin Walters 2013-05-04 20:53:27 UTC
Created attachment 243313 [details] [review]
0001-glib-tests-Add-last-two-.expected-files-for-markup-t.patch
Comment 14 Colin Walters 2013-05-06 13:26:04 UTC
Created attachment 243382 [details] [review]
0001-glib-tests-Drop-unnecessary-SRCDIR-definitions.patch
Comment 15 Colin Walters 2013-05-06 15:43:50 UTC
Created attachment 243390 [details] [review]
0001-glib-tests-mappedfile-Copy-test-file-before-writing-.patch
Comment 16 Colin Walters 2013-05-06 15:44:08 UTC
Created attachment 243391 [details] [review]
0002-glib-tests-Use-explicit-file-listings.patch
Comment 17 Colin Walters 2013-05-06 15:44:26 UTC
Created attachment 243392 [details] [review]
0003-glib-tests-Add-last-two-.expected-files-for-markup-t.patch
Comment 18 Colin Walters 2013-05-06 15:44:45 UTC
Created attachment 243393 [details] [review]
0004-glib-tests-Drop-unnecessary-SRCDIR-definitions.patch
Comment 19 Colin Walters 2013-05-06 15:45:00 UTC
Created attachment 243394 [details] [review]
0005-Add-enable-installed-tests-configure-option.patch
Comment 20 Simon McVittie 2013-05-10 10:22:00 UTC
You might find patchsets from Maemo's "CITA" interesting (albeit rather outdated): that was the reason I implemented --enable-installed-tests in dbus. I believe Jolla are still using something analogous to CITA, possibly CITA itself.

https://gitorious.org/gnome-essentials/glib/blobs/maemo/debian/cita.mk
https://gitorious.org/gnome-essentials/glib/trees/maemo/debian/patches
https://gitorious.org/gnome-essentials/glib/trees/maemo/debian/patches/series

There might be some other installed tests elsewhere in <https://gitorious.org/gnome-essentials>, notably for libsoup. Again, they're for ancient versions by current standards, but might be helpful.
Comment 21 Matthias Clasen 2013-05-14 23:04:13 UTC
Review of attachment 243390 [details] [review]:

looks good
Comment 22 Matthias Clasen 2013-05-14 23:16:56 UTC
Review of attachment 243391 [details] [review]:

Ok. I might slightly prefer to list the markup tests explicitly too, e.g. two on a row:

markup_test_files = \
  markups/test1.gmarkup markups/test1.expected \
 ...

But not a big deal
Comment 23 Matthias Clasen 2013-05-14 23:19:01 UTC
Review of attachment 243392 [details] [review]:

the fixup should just go in the previous commit. The rest looks fine

::: glib/tests/Makefile.am
@@ +36,3 @@
 	fail-41 fail-42 fail-43 fail-44 fail-45 \
 	fail-46 fail-47 fail-48 fail-49 \
 	valid-1 valid-2 valid-3 valid-4 valid-5 \

This is just a fixup for the previous commit ?
Comment 24 Matthias Clasen 2013-05-14 23:19:56 UTC
Review of attachment 243393 [details] [review]:

ok
Comment 25 Matthias Clasen 2013-05-14 23:27:52 UTC
Review of attachment 243394 [details] [review]:

looks ok to me
Comment 26 Colin Walters 2013-05-15 13:06:54 UTC
(In reply to comment #20)
> You might find patchsets from Maemo's "CITA" interesting (albeit rather
> outdated): that was the reason I implemented --enable-installed-tests in dbus.
> I believe Jolla are still using something analogous to CITA, possibly CITA
> itself.
> 
> https://gitorious.org/gnome-essentials/glib/blobs/maemo/debian/cita.mk

Wow, shell script in XML in shell script in Makefile =)

So...from what I can tell, 90% of the code portion of this is basically build goop that goes away when the test build/installation is a proper part of the upstream build process.

Your notes on which GLib tests require data in the source tree will be useful though for sure.

At a very high level I'm definitely curious about your experiences with CITA, both the upsides and the downsides.  For example, in what environment did CITA run tests?  Was it a pbuilder/mock type chroot?  Did tests run as root or not?  Etc.
Comment 27 Colin Walters 2013-05-15 13:28:28 UTC
(In reply to comment #22)
> Review of attachment 243391 [details] [review]:
> 
> Ok. I might slightly prefer to list the markup tests explicitly too, e.g. two
> on a row:
> 
> markup_test_files = \
>   markups/test1.gmarkup markups/test1.expected \
>  ...

The reason I did it this way is that I can then generate the test cases from the markup test names, as well as the two file names.
Comment 28 Simon McVittie 2013-05-15 14:36:28 UTC
(In reply to comment #26)
> So...from what I can tell, 90% of the code portion of this is basically build
> goop that goes away when the test build/installation is a proper part of the
> upstream build process.

Yes. Don't blame me, I didn't design the framework, just had to work with it :-)

(A simple, realistic metadata format, like "the tests are .desktop files in ${datadir}/installed-tests", would help.)

> Your notes on which GLib tests require data in the source tree will be useful
> though for sure.

Yeah, that's what I really wanted to point you to - tests.xml is an entry point.

> At a very high level I'm definitely curious about your experiences with CITA,
> both the upsides and the downsides.  For example, in what environment did CITA
> run tests?  Was it a pbuilder/mock type chroot?  Did tests run as root or not?

It was run on real (embedded) hardware with some sort of network or USB remote-control arrangement, probably as root (it didn't matter for the packages I dealt with, but they were also testing system-level stuff).

Problem areas:

* Ordinary developers didn't have access to a faithful replica of the test environment, which made debugging rather hard. If testing is done as an acceptance test, developers need to be able to run it on their own laptops (maybe with the help of a VM).

* Expecting all tests from version 1.2.3 to pass on version 1.2.4 is not necessarily realistic: the people running CITA tried (because, yes, it's good in principle), but I think they gave up after a while. I think every project I've worked on has at least a couple of tests that rely on implementation details.

* Tests that are run automatically need to record a verbose log for later debugging, in case you can't reproduce it easily. In CITA I think we just made them always-verbose, but telepathy-glib now has a script for "record the log in a temporary file, then spew it all out to stderr on failure" which works quite well in practice.

* Black-box-only testing isn't necessarily going to have the same coverage as "make check". telepathy-mission-control has installable tests (which are also used for its "make installcheck"), and they're less comprehensive than its "make check".
Comment 29 Colin Walters 2013-05-15 17:06:39 UTC
(In reply to comment #28)
> (In reply to comment #26)
> > So...from what I can tell, 90% of the code portion of this is basically build
> > goop that goes away when the test build/installation is a proper part of the
> > upstream build process.
> 
> Yes. Don't blame me, I didn't design the framework, just had to work with it
> :-)

Understood =)

> (A simple, realistic metadata format, like "the tests are .desktop files in
> ${datadir}/installed-tests", would help.)

Yeah, that's what the InstalledTests goal has.

> It was run on real (embedded) hardware with some sort of network or USB
> remote-control arrangement, probably as root (it didn't matter for the packages
> I dealt with, but they were also testing system-level stuff).

Interesting.
 
> * Ordinary developers didn't have access to a faithful replica of the test
> environment, which made debugging rather hard. If testing is done as an
> acceptance test, developers need to be able to run it on their own laptops
> (maybe with the help of a VM).

Right; they were targeting ARM hardware, and developers were on x86 laptops; that's a tough scenario.

In the case of gnome-ostree, we're only doing x86_64 now, and it's designed from the ground up for VMs. (I'm actually considering making the VM case the only "supported" case upstream).  That helps a *lot* over CITA I think.

> * Expecting all tests from version 1.2.3 to pass on version 1.2.4 is not
> necessarily realistic: the people running CITA tried (because, yes, it's good
> in principle), but I think they gave up after a while. I think every project
> I've worked on has at least a couple of tests that rely on implementation
> details.

Interesting.  There are multiple models here; we could assume tests must run on the exact version of the originating target code by default, and add some metadata tag like ForwardCompatible=yes or something.
 
> * Tests that are run automatically need to record a verbose log for later
> debugging, in case you can't reproduce it easily. In CITA I think we just made
> them always-verbose, but telepathy-glib now has a script for "record the log in
> a temporary file, then spew it all out to stderr on failure" which works quite
> well in practice.

So far it's been pretty easy to track down from the current output to test failures, but I don't have much real-world experience with the current GNOME test corpus.

> * Black-box-only testing isn't necessarily going to have the same coverage as
> "make check". telepathy-mission-control has installable tests (which are also
> used for its "make installcheck"), and they're less comprehensive than its
> "make check".

Yes.  This is an important point.  I think practically speaking in the short term, keeping white box tests as "make check" is sensible.

This still has the same problems that "make check" did for black box for builders, but at the moment I'm happy enough to get basic/broad coverage of black box tests under automation.