Bug 793682 - tile.xml test failure regression on armhf
Status: RESOLVED FIXED
Product: GEGL
Classification: Other
Component: build
Version: git master
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Default Gegl Component Owner
QA Contact: Default Gegl Component Owner
Depends on:
Blocks:
 
 
Reported: 2018-02-21 01:04 UTC by Jeremy Bicha
Modified: 2018-03-14 10:41 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Jeremy Bicha 2018-02-21 01:04:54 UTC
Ubuntu has been having trouble getting gegl's build tests to pass on armhf with 0.3.28. Earlier in January, the tests passed fine with 0.3.26.

This is preventing gegl 0.3.28 from being promoted to Ubuntu 18.04 (and holding back gnome-photos 3.28 too).

Full build log at
https://launchpad.net/ubuntu/+source/gegl/0.3.28-1/+build/14279042

Build log excerpt
-----------------

PASS pnm-ascii-load.xml
missing reference, assuming SUCCESS
/<<PKGBUILDDIR>>/bin/gegl /<<PKGBUILDDIR>>/tests/compositions/jpg-load-datauri.xml -o /<<PKGBUILDDIR>>/tests/compositions/output/jpg-load-datauri.png
/<<PKGBUILDDIR>>/tools/gegl-imgcmp /<<PKGBUILDDIR>>/tests/compositions/reference/jpg-load-datauri.png /<<PKGBUILDDIR>>/tests/compositions/output/jpg-load-datauri.png
PASS jpg-load-datauri.xml
Missing fast-path babl conversion detected, Implementing missing babl fast paths
accelerates GEGL, GIMP and other software using babl, warnings are printed on
first occurance of formats used where a conversion has to be synthesized
programmatically by babl based on format description

*WARNING* missing babl fast path(s): "R'aG'aB'aA half" to "RaGaBaA float"
missing reference, assuming SUCCESS
/<<PKGBUILDDIR>>/bin/gegl /<<PKGBUILDDIR>>/tests/compositions/tiff-load.xml -o /<<PKGBUILDDIR>>/tests/compositions/output/tiff-load.png
/<<PKGBUILDDIR>>/tools/gegl-imgcmp /<<PKGBUILDDIR>>/tests/compositions/reference/tiff-load.png /<<PKGBUILDDIR>>/tests/compositions/output/tiff-load.png
PASS tiff-load.xml
/<<PKGBUILDDIR>>/bin/gegl /<<PKGBUILDDIR>>/tests/compositions/tile.xml -o /<<PKGBUILDDIR>>/tests/compositions/output/tile.png
Command '['/<<PKGBUILDDIR>>/bin/gegl', '/<<PKGBUILDDIR>>/tests/compositions/tile.xml', '-o', '/<<PKGBUILDDIR>>/tests/compositions/output/tile.png']' returned non-zero exit status -7
FAIL tile.xml
=== Test Results ===
 tests passed:  9
 tests skipped: 0
 tests failed:  1
======  FAIL  ======
Comment 1 Øyvind Kolås (pippin) 2018-02-21 01:14:18 UTC
The gegl binary itself doesn't return -7 as an error - I wonder if this is some form of segfault or bus-error.

Are you able to run tests on a live armhf system with gdb available?

Perhaps running this in the tests/compositions folder, with an installed gegl, would be enlightening?

$ gdb --args gegl tile.xml -o /tmp/tile.png # ?
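
Editorial note: the "returned non-zero exit status -7" wording in the log looks like the message Python's subprocess module produces, where a negative value -N means the child was terminated by signal N. On Linux for ARM and x86, signal 7 is SIGBUS, which would be consistent with the bus-error guess above. The following is a minimal sketch, assuming a POSIX system, of how a test runner could distinguish a normal exit from death by signal. Both helper names are hypothetical and are not part of GEGL or its test scripts.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: turn a waitpid() status into a readable line.
 * Returns the terminating signal number, or 0 if the child was not
 * killed by a signal. */
static int describe_wait_status(int status, char *out, size_t out_len)
{
    if (WIFSIGNALED(status)) {
        int sig = WTERMSIG(status);
        snprintf(out, out_len, "killed by signal %d", sig);
        return sig;
    }
    if (WIFEXITED(status)) {
        snprintf(out, out_len, "exited with status %d", WEXITSTATUS(status));
        return 0;
    }
    snprintf(out, out_len, "stopped or continued");
    return 0;
}

/* Hypothetical helper: fork a child that raises `sig` on itself, or
 * exits with status 3 when sig == 0. Returns the raw wait status,
 * or -1 if fork() failed. */
static int run_child_raising(int sig)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (sig != 0) {
            signal(sig, SIG_DFL);   /* make sure the default action applies */
            raise(sig);
        }
        _exit(3);
    }
    int status = 0;
    waitpid(pid, &status, 0);
    return status;
}
```

Calling describe_wait_status() on the status of a child that died from SIGBUS reports the signal rather than an exit code, which is the distinction the -7 in the log is hinting at.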
Comment 2 Sebastien Bacher 2018-03-08 11:04:37 UTC
backtrace of the segfault on an armhf builder

  • #0 gegl_buffer_iterate_write
    at gegl-buffer-access.c line 613
  • #1 gegl_buffer_set_internal
    at gegl-buffer-access.c line 788
  • #2 _gegl_buffer_set_with_flags
    at gegl-buffer-access.c line 823
  • #3 gegl_buffer_set
    at gegl-buffer-access.c line 1844
  • #4 gegl_buffer_set_pattern
    at gegl-buffer-access.c line 2545
  • #5 0xf5ecf74e in

Comment 3 Sebastien Bacher 2018-03-08 11:07:29 UTC
gdb says that
        lskip_offset = 0

the segfault line is
                          ((uint64_t*)(&tp[lskip_offset]))[0] =
Comment 4 Ell 2018-03-13 10:40:50 UTC
This is probably fixed in master, by:

commit 7bab2a7433889758649eccd77a03305ee652bc27
Author: Ell <ell_se@yahoo.com>
Date:   Tue Mar 13 06:01:10 2018 -0400

    buffer: fix single-column case of gegl_buffer_{get,set}()
    
    In the single-column case of gegl_buffer_get(), compare the output
    rowstride to the output format's bpp, rather than the buffer
    format's bpp, when deciding if to use the optimized path.
    
    In the single-column case of gegl_buffer_set(), only take the
    optimized path when the input rowstride equals the input format's
    bpp, fix the temp-buffer's rowstride in the call to
    _gegl_buffer_set_with_flags(), and return after it to avoid re-
    writing to the buffer.

 gegl/buffer/gegl-buffer-access.c | 30 ++++++++++++++----------------
 1 file changed, 14 insertions(+), 16 deletions(-)

Could you test again?
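
Editorial note: the rowstride condition in the commit message can be read as follows. A one-pixel-wide column is contiguous in memory only when consecutive rows are exactly one pixel (bpp bytes) apart. The sketch below illustrates that idea in isolation; gather_column is a hypothetical helper written for this note, not the code in gegl-buffer-access.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not GEGL code: copy a one-pixel-wide column of
 * `height` pixels, each `bpp` bytes, from `src` (whose rows start
 * `rowstride` bytes apart) into the tightly packed buffer `dst`. */
static void gather_column(unsigned char *dst, const unsigned char *src,
                          size_t height, size_t bpp, size_t rowstride)
{
    if (rowstride == bpp) {
        /* Rows are back to back, so the column is one contiguous block. */
        memcpy(dst, src, height * bpp);
        return;
    }
    /* Otherwise other bytes sit between rows: copy pixel by pixel. */
    for (size_t y = 0; y < height; y++)
        memcpy(dst + y * bpp, src + y * rowstride, bpp);
}
```

Taking the single-block shortcut when rowstride differs from bpp would read bytes that belong to neighbouring pixels or padding, which is the kind of mistake the commit message describes guarding against.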
Comment 5 Jeremy Bicha 2018-03-13 12:33:35 UTC
Ell, the tile.xml test still failed for me on armhf with that patch cherry-picked.

https://launchpad.net/~jbicha/+archive/ubuntu/arch/+sourcepub/8852521/+listing-archive-extra
Comment 6 Sebastien Bacher 2018-03-14 09:19:51 UTC
Confirmed, the backtrace is the same with the patch
Comment 7 Ell 2018-03-14 10:00:43 UTC
Ok, round 2.  This must do the trick:

commit 54519440f74e60380b93c744ec1a0fe73ad58e50
Author: Ell <ell_se@yahoo.com>
Date:   Wed Mar 14 05:19:35 2018 -0400

    buffer: verify alignment in optimized buffer_set/get paths
    
    In the optimized cases of gegl_buffer_iterate_write() and
    gegl_buffer_iterate_read_simple(), make sure all read/write
    accesses to the buffer and the tile are properly aligned for the
    used type, and fall back to the generic version otherwise.
    
    On some architectures, unaligned access can lead to a crash; see
    bug #793682.

 gegl/buffer/gegl-buffer-access.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 84 insertions(+)
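
Editorial note: the pattern this commit message describes (check alignment before a wide load or store, otherwise fall back to a generic copy) can be sketched as below. This is an illustration under stated assumptions, not the actual patch: is_aligned_u64 and copy_pixel8 are hypothetical names, and the 8-byte pixel size is chosen only to match the uint64_t store quoted in comment 3.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not GEGL code: true when `p` is suitably
 * aligned for a uint64_t access on this platform. */
static int is_aligned_u64(const void *p)
{
    return ((uintptr_t) p % _Alignof(uint64_t)) == 0;
}

/* Copy one 8-byte pixel from `src` to `dst`.
 * Returns 1 if the single 64-bit store was used, 0 for the fallback. */
static int copy_pixel8(void *dst, const void *src)
{
    if (is_aligned_u64(dst) && is_aligned_u64(src)) {
        /* Fast path: one 64-bit load and store. */
        *(uint64_t *) dst = *(const uint64_t *) src;
        return 1;
    }
    /* Generic path: memcpy makes no alignment assumptions. */
    memcpy(dst, src, 8);
    return 0;
}
```

Note that an offset of 0, as gdb reported for lskip_offset, does not by itself guarantee alignment: the base pointer can already be misaligned, which is why the check in this sketch is done on the final address.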
Comment 8 Sebastien Bacher 2018-03-14 10:26:19 UTC
That patch resolves the issue indeed!