Bug 779893 – gegl:gray and gegl:saturation produces slightly different results with cpu and opencl






Status:	RESOLVED WONTFIX

Product:	GEGL
Classification:	Other
Component:	opencl
Version:	git master
Hardware:	Other Linux

Importance:	Normal normal
Target Milestone:	---
Assigned To:	Default Gegl Component Owner
QA Contact:	Default Gegl Component Owner

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2017-03-11 10:28 UTC by Øyvind Kolås (pippin)
Modified:	2017-10-09 14:32 UTC

See Also:
GNOME target:	---
GNOME version:	---

Description Øyvind Kolås (pippin) 2017-03-11 10:28:25 UTC

When introducing bit-exact regression tests for GEGL operations, gegl:gray ends up with an off-by-one error in one pixel of the test image.

In this resulting image: http://gegl.org/images/examples/gegl-gray.png

The pixel at coordinates 37, 9 is 115,115,115,255 for the OpenCL rendering code path using beignet and 114,114,114,255 for the non-OpenCL code path, leading to mismatched md5 checksums of the test images.
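A mismatch like this can be located by comparing the two renderings pixel by pixel. The following is only a minimal sketch in Python: `diff_pixels` is a hypothetical helper, and the buffers are synthetic stand-ins for the two renderings, not the real test image.

```python
import hashlib


def diff_pixels(a: bytes, b: bytes, width: int, channels: int = 4):
    """Return (x, y, pixel_a, pixel_b) for every pixel that differs."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same size")
    out = []
    for i in range(0, len(a), channels):
        pa, pb = a[i:i + channels], b[i:i + channels]
        if pa != pb:
            idx = i // channels
            out.append((idx % width, idx // width, tuple(pa), tuple(pb)))
    return out


# Synthetic 64x16 RGBA buffers (assumed size, NOT the real gegl-gray.png).
WIDTH, HEIGHT = 64, 16
cpu = bytearray([114, 114, 114, 255] * (WIDTH * HEIGHT))
opencl = bytearray(cpu)

# Bump one pixel by one step, mimicking the reported off-by-one at (37, 9).
offset = (9 * WIDTH + 37) * 4
opencl[offset:offset + 3] = bytes([115, 115, 115])

mismatches = diff_pixels(bytes(cpu), bytes(opencl), WIDTH)
same_md5 = hashlib.md5(cpu).hexdigest() == hashlib.md5(opencl).hexdigest()
print(mismatches)
print("md5 equal:", same_md5)
```

A single differing channel value is enough to change the md5 digest, which is why a bit-exact test flags a difference that is invisible to the eye.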

Comment 1 Øyvind Kolås (pippin) 2017-03-11 14:10:17 UTC

gegl:saturation has similar off-by-one errors, but in 3 different pixels.

Comment 2 Thomas Manni 2017-03-17 23:50:31 UTC

Concerning the gegl:gray operation, the issue has nothing to do with the CPU/OpenCL operation code itself, since both paths only move pixels from one place to another without altering them.

CPU uses memmove https://git.gnome.org/browse/gegl/tree/operations/common/grey.c#n54
OPENCL uses clEnqueueCopyBuffer https://git.gnome.org/browse/gegl/tree/operations/common/grey.c#n70

The observed CPU/OpenCL divergence (gegl-tester shows 2 different hashes) is related to the babl conversion and the BABL_TOLERANCE value.

gegl-tester loads docs/example/standard-input.png (via gegl:png-load) to produce an R'G'B'A u8 buffer, which needs to be converted to YA float (the format requested by the gegl:gray prepare function), so the conversion is R'G'B'A u8 to YA float.

The OpenCL code uses the rgba_gamma_u8_to_yaf kernel https://git.gnome.org/browse/gegl/tree/opencl/colors-8bit-lut.cl#n362 to perform the conversion and produces the hash 6f1ee8b1802e1f5bf4225884800b55a2

Running gegl-tester with GEGL_USE_OPENCL=no BABL_TOLERANCE=0.0, the CPU conversion uses the reference fish and produces a different hash, 43ddd80572ab34095298ac7c36368b0c

Running gegl-tester with GEGL_USE_OPENCL=no (with the default BABL_TOLERANCE), the CPU conversion uses a fish path (R'G'B'A u8 to RGBA float; RGBA float to YA float) and produces the same hash as the OpenCL code, 6f1ee8b1802e1f5bf4225884800b55a2
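To illustrate how two conversion paths can disagree by one 8-bit step, here is a rough Python sketch that runs the same R'G'B' to gray computation once in double precision and once with every intermediate rounded to single precision. It does not reproduce babl or the OpenCL kernel: the luminance weights are illustrative Rec. 709 values, the single-precision rounding is emulated with `struct`, and whether any sampled colour actually differs depends on the inputs.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def srgb_to_linear(v: float) -> float:
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4


def linear_to_srgb(v: float) -> float:
    return v * 12.92 if v <= 0.0031308 else 1.055 * v ** (1 / 2.4) - 0.055


# Illustrative weights only; not the constants babl uses.
WEIGHTS = (0.2126, 0.7152, 0.0722)


def gray_u8(rgb, rnd=lambda x: x):
    """Gamma-encoded u8 RGB -> linear Y -> gamma-encoded u8 gray.

    `rnd` is applied to every intermediate; pass `f32` to emulate a
    single-precision pipeline, or leave the default for double precision.
    """
    lin = [rnd(srgb_to_linear(rnd(c / 255.0))) for c in rgb]
    y = 0.0
    for w, c in zip(WEIGHTS, lin):
        y = rnd(y + rnd(rnd(w) * c))
    return int(rnd(linear_to_srgb(y)) * 255.0 + 0.5)


samples = [(r, g, b)
           for r in range(0, 256, 17)
           for g in range(0, 256, 17)
           for b in range(0, 256, 17)]
deltas = [abs(gray_u8(p) - gray_u8(p, f32)) for p in samples]
max_delta = max(deltas)
print("sampled colours:", len(samples))
print("colours that differ:", sum(1 for d in deltas if d))
print("largest difference:", max_delta)
```

The two pipelines agree to roughly seven significant digits before quantization, so any disagreement after rounding to u8 is at most one step, and it only occurs when the value lands very close to a rounding boundary.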

Comment 3 Øyvind Kolås (pippin) 2017-03-26 02:10:04 UTC

This makes me wonder if perhaps the reference of babl should be single-precision float; this might even speed babl up a little bit.

Comment 4 Øyvind Kolås (pippin) 2017-05-04 15:55:52 UTC

gegl:gray is now producing the 43ddd80572ab34095298ac7c36368b0c hash for both CPU and OpenCL implementations on my system.

Comment 5 Øyvind Kolås (pippin) 2017-10-09 14:32:09 UTC

The constants used for R'G'B' -> Y conversion are now computed from first principles and rounded to ICC s15f16 precision with a neutral gray axis, so the hashes will have changed again.
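As a rough illustration of what rounding to s15f16 (signed 15.16 fixed point, i.e. multiples of 1/65536) can look like, here is a Python sketch. The weights are illustrative Rec. 709 values rather than the constants babl computes, and `neutral_weights` is a hypothetical helper that encodes one possible reading of "neutral gray axis": the quantized weights sum to exactly 1.0, so R = G = B maps to the same Y.

```python
S15F16_ONE = 1 << 16  # 1.0 expressed in s15Fixed16 units


def to_s15f16(x: float) -> int:
    """Quantize x to the nearest multiple of 1/65536, as an integer count."""
    return round(x * S15F16_ONE)


def neutral_weights(weights):
    """Quantize weights to s15f16 and force them to sum to exactly 1.0.

    Any rounding residue is absorbed by the largest weight (an assumption
    made for this sketch, not necessarily what babl does).
    """
    fixed = [to_s15f16(w) for w in weights]
    largest = fixed.index(max(fixed))
    fixed[largest] += S15F16_ONE - sum(fixed)
    return [f / S15F16_ONE for f in fixed]


weights = neutral_weights((0.2126, 0.7152, 0.0722))
print(weights, sum(weights))
```

Because every quantized weight is an integer divided by a power of two, the values and their sum are exactly representable in binary floating point.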

That said, I have also observed clang and gcc yielding slightly different results, and I expect slightly different results for different internal floating point representations of intermediate results.

Slightly different results are to be expected, even if it would be really nice if we could have consistent hashes.
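A small Python sketch of why this is hard to avoid: floating point addition is not associative, so a compiler that reorders or fuses operations can change the last bits of a result, and a value sitting near a rounding boundary then quantizes to a different 8-bit level. The `quantize` helper and the sample values are purely illustrative.

```python
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # one evaluation order
right = a + (b + c)  # mathematically identical, different order
print(left == right)  # False
print(left, right)


def quantize(v: float) -> int:
    """Round a value on a 0..255 scale to an integer level."""
    return int(v + 0.5)


# Two values that differ only slightly, on either side of a boundary.
low, high = quantize(114.49999), quantize(114.50001)
print(low, high)  # 114 115
```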