GNOME Bugzilla – Bug 754387
test-suite failure: ERROR: cve-2015-4491 - missing test plan
Last modified: 2016-04-11 00:21:37 UTC
Version: 2.31.6 Since the update to 2.31.6, I'm getting the following test-suite failure during "make check": ERROR: cve-2015-4491 - missing test plan ERROR: cve-2015-4491 - exited with status 133 (terminated by signal 5?) Complete test-suite.log: ============================================= gdk-pixbuf 2.31.6: tests/test-suite.log ============================================= # TOTAL: 77 # PASS: 75 # SKIP: 0 # XFAIL: 0 # FAIL: 0 # XPASS: 0 # ERROR: 2 .. contents:: :depth: 2 ERROR: cve-2015-4491 ==================== # random seed: R02S6c485ba3929bcdd756dc05737afee2ab # Start of pixbuf tests (/home/michael/debian/build-area/gdk-pixbuf-2.31.6/tests/.libs/lt-cve-2015-4491:22262): GLib-ERROR **: /tmp/buildd/glib2.0-2.44.1/./glib/gmem.c:103: failed to allocate 6039798016 bytes # Start of cve-2015-4491 tests Trace/breakpoint trap # GLib-FATAL-ERROR: /tmp/buildd/glib2.0-2.44.1/./glib/gmem.c:103: failed to allocate 6039798016 bytes ERROR: cve-2015-4491 - missing test plan ERROR: cve-2015-4491 - exited with status 133 (terminated by signal 5?)
try 2.31.7
Same problem with 2.31.7: # random seed: R02Sa65e62755ec50970d21846c60e6a0230 # Start of pixbuf tests (/home/michael/debian/build-area/gdk-pixbuf-2.31.7/tests/.libs/lt-cve-2015-4491:16691): GLib-ERROR **: /tmp/buildd/glib2.0-2.44.1/./glib/gmem.c:103: failed to allocate 6039798016 bytes # Start of cve-2015-4491 tests Trace/breakpoint trap # GLib-FATAL-ERROR: /tmp/buildd/glib2.0-2.44.1/./glib/gmem.c:103: failed to allocate 6039798016 bytes ERROR: cve-2015-4491 - missing test plan ERROR: cve-2015-4491 - exited with status 133 (terminated by signal 5?) Re-opening bug reportg
fwiw, this is blocking the update of gdk-pixbuf in Debian, since the failing test-suite prevents the package from building successfully.
I'm sure you know how to work around a failing test.
The test attempts to create a large pixmap, and fails in your case because it runs out of memory. It seems to need about 6 GB, so I would suggest disabling the test if there is not that much RAM available on the test machine (or upgrading the test machine).
(In reply to David King from comment #5) > The test attempts to create a large pixmap, and fails in your case because > it runs out of memory. It seems to need about 6 GB, so I would suggest > disabling the test if there is not that much RAM available on the test > machine (or upgrading the test machine). No, I don't think that's the case. I do have 8GB of RAM + a few GB of swap. To me it looks like there is a leak somewhere so no matter how much memory I throw at it, the test will always (eventually) run out of memory and fail. [ 292.756619] Out of memory: Kill process 12340 (lt-cve-2015-449) score 966 or sacrifice child [ 292.756916] Killed process 12340 (lt-cve-2015-449) total-vm:11838928kB, anon-rss:7764984kB, file-rss:8kB I also don't think disabling a test which was added for a CVE fix is the right approach here. Something is wrong here. And I'm not sure if the test is faulty or if it's a legitimate bug in gdk-pixbuf (or someplace else).
I feel bad because I uploaded this to Debian without noticing the bug - guess I have enough memory on my machine. I made a memory limited container to reproduce the bug (running the testsuite like this `MALLOC_PERTURB_=$((${RANDOM:-256} % 256)) GDK_PIXBUF_MODULE_FILE=../gdk-pixbuf/loaders.cache ../libtool --mode=execute gdb ./cve-2015-4491') -- not sure why the MALLOC_PETURB_ is necessary; the test *passes* without it. and got the following trace. Is there anything to be done about this?
+ Trace 235459
Maybe MALLOC_PERTURB initialising all of that huge chunk of memory forces it to be in RAM or something? Setting malloc (M_PERTURB, 0) in this test works around the problem. Is that acceptable?
There's a similar case when even memory+swap isn't big enough (I think this is the reason anyway - it's on a Debian arm test machine). ERROR:cve-2015-4491.c:33:test_original: assertion failed (err == NULL): Not enough memory to load bitmap image (gdk-pixbuf-error-quark, 1) Should we catch this and skip the test? If you can't allocate enough memory then the bug can't happen anyway, right?
How much memory is required to run the test-suite? I have 8GB of RAM + 4GB of swap?
commit 19f9685dbff7d1f929c61cf99188df917a18811d Author: Benjamin Otte <otte@redhat.com> Date: Sat Sep 19 21:24:34 2015 +0200 pixops: Fail make_weights functions on OOM The weights could grow very large under certain circumstances, in particular in security-relevant conditions, including the testsuite. By allowing the weight allocation to fail, this can be worked around.
Created attachment 311697 [details] [review] Skip tests when we can't run them due to lack of memory Check if we have failed due to insufficient memory and skip if so. Don't use MALLOC_PERTURB since it seems that this forces <size of file> memory to be available which OOMs on many systems.
Some of the tests still fail because they can't allocate in other places. Can we have a patch like this to check for such failures and skip?
How much memory is a system supposed to have to run the tests successfully? I wouldn't consider 8G of RAM + 4G ofswap a weak meachine.
1. Commit 19f9685dbff7d1f929c61cf99188df917a18811d now uses g_try_malloc() and handles the case where it fails. So the tests should now always succeed. If they don't, could you please point out which ones fail and how? 2. The reason I want to keep the test as-is (even if that messes with swap on various build slaves) is that that test is the original test that was submitted as part of the CVE. And I'd like the testsuite to ensure we're not vulnerable to that. 3. The test allocates at most 6GB at once and a total of 16GB over its whole runtime (the rest being 2GB chunks). However, the test never allocated more than those 6GB at once and properly frees them before doing the other large allocations (I checked that with valgrind).
fwiw, with latest git master, including Benjamin's commit, the test-suite now passes for me. tests/cve-2015-4491.log now says: # random seed: R02S4f81e707d6a26cf424966cb0b2eb3264 # Start of pixbuf tests # Start of cve-2015-4491 tests ok 1 /pixbuf/cve-2015-4491/original PASS: cve-2015-4491 1 /pixbuf/cve-2015-4491/original ok 2 /pixbuf/cve-2015-4491/scale-overflow PASS: cve-2015-4491 2 /pixbuf/cve-2015-4491/scale-overflow ok 3 /pixbuf/cve-2015-4491/scale-x-overflow PASS: cve-2015-4491 3 /pixbuf/cve-2015-4491/scale-x-overflow ok 4 /pixbuf/cve-2015-4491/scale-y-overflow PASS: cve-2015-4491 4 /pixbuf/cve-2015-4491/scale-y-overflow # End of cve-2015-4491 tests # End of pixbuf tests 1..4
eder.debian.org (https://db.debian.org/machines.cgi?host=eder), which is similar to our mips build machines, hits these g_test_skips that I put in (master + my patch). SKIP: cve-2015-4491 1 /pixbuf/cve-2015-4491/original # SKIP Your system has insufficient memory to load the test image. SKIP: pixbuf-scale 9 /pixbuf/scale/png/large # SKIP Your system has insufficient memory to load the test image. SKIP: pixbuf-scale 10 /pixbuf/scale/jpeg/large # SKIP format not supported (okay, not this one) SKIP: pixbuf-scale 11 /pixbuf/add-alpha/large # SKIP Your system has insufficient memory to load the test image. SKIP: pixbuf-scale 12 /pixbuf/rotate/large # SKIP Your system has insufficient memory to load the test image. Does the "original" testcase exactly match the CVE bug? If so, then this machine won't be vulnerable and if not then maybe we can make the testcase require less memory (or add another one which takes less and let this one be skipped)?
Yes, the original test matches it exactly. I added a bunch of similar tests in the same source file that try to exploit the vulnerability while using less memory. Those tests seem to pass on your box.
Yeah, I've not got any code to skip those in that patch (they pass since the referenced commit but did also fail before - I think that if we looked we'd see that they are hitting the bail-out case now). Can I commit this then?
Comment on attachment 311697 [details] [review] Skip tests when we can't run them due to lack of memory HA! I misread the test output for the cve test: The 6GB allocation happens when loading the image, not when scaling it. In any case, I've committed most of the patch as commit e7f940102d40997f2e23a0589247cfb189dfaa98: - I renamed and changed the behavior of the function checking for OOM, so that we can do proper cleanup and release memory. - I did not commit the mallopt() call. For a start it's Linux-specific but more importantly from reading M_PERTURB docs I don't get what M_PERTURB has to do with all of this. And I'd like the testsuite to behave like the real app and setting malloc flags is exactly the wrong thing for that. Plus, I don't think it's needed at all. - Various indentation fixage.
Thanks! I'll try an upload to Debian with this patch. I did add M_PERTURB for a reason, even if I didn't fully understand it (comment #8). Real apps wouldn't usually set this - for us it's set in glib-tap,mk for the testsuite only. So I considered it okay to turn this off. Anyway, hopefully the other fix has made this part obsolete but we'll see. I'll close this bug if it builds for all of our architecture.
https://buildd.debian.org/status/package.php?p=gdk-pixbuf&version=2.32.0-1 Seems fine for the architectures we care about. I think the failure for "kfreebsd-amd64" there is an OOM. If it starts to happen on more architectures then we need to look at this more, but it doesn't matter so much for now.
Hi, Isn't it kind of sub-optimal to do a 16x sub-sampled bilinear scale, even when the scale factor is smaller than 1/16 anyway? Doing so here requires an extra 6 GiB memory allocation for make_weights. I'm saying, why not set a threshold in _pixops_scale_real to just use PIXOPS_INTERP_NEAREST if either of the scale factors is really tiny? It is more robust than failing due to memory exhaustion, ought to be a lot faster, and likely would have prevented CVE-2015-4491. Thanks.
Created attachment 317575 [details] [review] why subsample if scaling by such a small factor?
attaching patches 5o a resolved bug is not effective; please open a new bug for this