GNOME Bugzilla – Bug 746247
div255 implementation is incorrect
Last modified: 2018-11-03 10:47:44 UTC
Created attachment 299452: test div255 app

div255 is implemented everywhere with this algorithm:

    d = (s + 128 + ((s + 128) >> 8)) >> 8

which produces a result that is off by one for roughly half of the 0..65535 input range. A correct implementation is:

    d = (s + 1 + (s >> 8)) >> 8

A Python test app and a fix that implements the new algorithm are attached.
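The two formulas can be compared against plain integer division over the whole 16-bit range. The following is a minimal Python sketch, not the attached test app. The function names are hypothetical, and it assumes that "correct" means truncating division (`s // 255`), which the report does not state explicitly.

```python
def div255_old(s):
    """Formula currently in use, as quoted in the report."""
    t = s + 128
    return (t + (t >> 8)) >> 8


def div255_new(s):
    """Formula proposed in the report."""
    return (s + 1 + (s >> 8)) >> 8


def count_mismatches(func, reference=lambda s: s // 255, limit=0x10000):
    """Count inputs in 0..limit-1 where func differs from the reference."""
    return sum(1 for s in range(limit) if func(s) != reference(s))


if __name__ == "__main__":
    total = 0x10000
    for name, func in (("old", div255_old), ("new", div255_new)):
        bad = count_mismatches(func)
        print(f"{name}: {bad} of {total} inputs differ from s // 255")
```

With truncating division as the reference, the mismatches of the old formula fall where the remainder `s % 255` is 128 or more, that is, where rounding to nearest and truncation disagree.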
Created attachment 299453: Fix div255w implementation

The current implementation of div255w produces incorrect off-by-one values for a large number of values in the input range. Use a better bit-shifting implementation that gets it right, and document that it doesn't work for 0xffff.
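The 0xffff caveat in the patch description can be checked directly. This is a small illustrative sketch in Python; the helper name `div255_new` is hypothetical, and the remark about products of two 8-bit values is an assumption about how such a division is typically used, not something stated in this report.

```python
def div255_new(s):
    """Proposed formula from this report: (s + 1 + (s >> 8)) >> 8."""
    return (s + 1 + (s >> 8)) >> 8


# Inputs in 0..0xffff where the proposed formula differs from s // 255.
failures = [s for s in range(0x10000) if div255_new(s) != s // 255]

# Assumption: callers pass a product of two 8-bit values, so s <= 255 * 255.
max_product = 255 * 255

if __name__ == "__main__":
    print("failing inputs:", [hex(s) for s in failures])
    print("0xffff ->", div255_new(0xFFFF), "expected", 0xFFFF // 255)
    print("all failures above 255 * 255:", all(s > max_product for s in failures))
```

If that assumption holds, the single failing input lies outside the range a caller would ever pass.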
Created attachment 299454: Update tables in docs
I modified most of the SIMD implementations, but I can only verify SSE, and I don't know how to modify the NEON implementation.
I think NEON needs to do:

    tmpc = orc_compiler_get_constant (1)
    vshr.u16 tmp, src, 8
    vadd.i16 tmp, src
    vaddhn.i16 dest, tmp, tmpc

but I'm not sure how to encode that.
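Before worrying about the encoding, the arithmetic of the proposed sequence can be emulated on a single lane. This is a rough Python sketch, not ORC code. The function name is hypothetical, and it assumes the usual NEON semantics in which `vadd.i16` wraps modulo 2^16 and `vaddhn.i16` adds two 16-bit lanes and keeps the high 8 bits of the 16-bit sum.

```python
MASK16 = 0xFFFF


def neon_div255_lane(src):
    """Emulate the proposed sequence on one unsigned 16-bit lane."""
    tmpc = 1                             # tmpc = orc_compiler_get_constant (1)
    tmp = (src >> 8) & MASK16            # vshr.u16   tmp, src, 8
    tmp = (tmp + src) & MASK16           # vadd.i16   tmp, src (wraps at 16 bits)
    dest = ((tmp + tmpc) & MASK16) >> 8  # vaddhn.i16 dest, tmp, tmpc
    return dest & 0xFF                   # narrowed to an 8-bit result


if __name__ == "__main__":
    # Compare with truncating division for products of two 8-bit values.
    limit = 255 * 255
    ok = all(neon_div255_lane(s) == s // 255 for s in range(limit + 1))
    print("matches s // 255 for 0..255*255:", ok)
```

Because the intermediate sum is held in 16 bits, inputs of 0xff00 and above wrap in this emulation, so it only says something about the smaller range checked here.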
See also https://bugzilla.gnome.org/show_bug.cgi?id=796846
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/gstreamer/orc/issues/7.