GNOME Bugzilla – Bug 86290
gradients render incorrectly when using some modes (MMX problem)
Last modified: 2004-12-22 21:47:04 UTC
Multiply, Screen, Overlay, Difference, Addition, Subtract, Darken Only, Lighten Only modes of the gradient tool render badly. I think this bug relates to #85203, judging from the noise looks created only.
Created attachment 9415 [details] black>white RGB gradient using the screen mode renders like this
There appears to be somekind of problem with the MMX code that is causing this and since i'm not particularly competent at assembly, I can't say exactly what the problem is. If the gimp is run with --no-mmx argument, this problem goes away. - Toby
When rendering a gradient with mode switched to multiply, the code eventually reaches the following line (indicated by -->) in paint-funcs-mmx.h: #define MMX_PIXEL_OP(x) \ void \ x( \ const guchar *src1, \ const guchar *src2, \ guint count, \ guchar *dst) __attribute((regparm(3))); #define MMX_PIXEL_OP_3A_1A(op) \ MMX_PIXEL_OP(op##_pixels_3a_3a) \ MMX_PIXEL_OP(op##_pixels_1a_1a) #define USE_MMX_PIXEL_OP_3A_1A(op) \ if (HAS_ALPHA (alms->bytes1) && HAS_ALPHA (alms->bytes2)) \ { \ if (alms->bytes1==2 && alms->bytes2==2) \ return op##_pixels_1a_1a(alms->src1, alms->src2, alms->length, *(alms->dest)); \ if (alms->bytes1==4 && alms->bytes2==4) \ return op##_pixels_3a_3a(alms->src1, alms->src2, alms->length, *(alms->dest)); \ } MMX_PIXEL_OP_3A_1A(multiply); static void layer_multiply_mode_mmx (struct apply_layer_mode_struct *alms) { --> USE_MMX_PIXEL_OP_3A_1A(multiply); } It would appear that unless the initial image and the rendered gradient both contain an alpha channel, the appropriate mmx function will not be reached. Do the mmx assembly function really require both images to possess an alpha channel? or can USE_MMX_PIXEL_OP_3A_1A(op) be defined as follows: #define USE_MMX_PIXEL_OP_3A_1A(op) \ if ( (alms->bytes1 == 1) || (alms->bytes1 == 2) \ return op##_pixels_1a_1a(alms->src1, alms->src2, alms->length, *(alms->dest)); \ if ( (alms->bytes1 == 3) || (alms->bytes1 == 4) \ return op##_pixels_3a_3a(alms->src1, alms->src2, alms->length, *(alms->dest));
On seconds thoughts, the definition I gave for USE_MMX_PIXEL_OP_3A_1A won't work. The mmx assembly functions have to been changed so that they accept an image without an alpha channel or the image has to temporarily be promoted to an image with an alpha channel.
Bug #90969 might be a duplicate of this report.
*** Bug 90969 has been marked as a duplicate of this bug. ***
I'm seeing this too. This may be a stupid question, but I'm on an AMD (Athlon XP 1600+ Palomino): /proc/cpuinfo does say the mmx flag is set, but is that right for an AMD? Could this be a kernel bug, that the kernel is reporting mmx when it shouldn't be? This is with Redhat 7.3's kernel-2.4.18-5 AMD kernel with a few minor tweaks; I haven't tried to repro this bug on other CPUs or other kernels, but could try if it would help.
Athlons support MMX just fine. AMD even has added some extensions to MMX (and other parts of the instruction set) which I believe GIMP could benefit from, but (as this bug makes apparent) now is not the time for microoptimization.
The optimizations in paint_func.[ch], (giving rise to paint_func_generic.h, paint_func_mmx.h) culminating in ChangeLog entries for 2001-11-19, do not adequately account for the fact that Mr Monniaux's MMX code (app/arch/i386/mmx/paint_funcs_mmx.S) optimizes only operations between index-with-alpha and/or rgb-with-alpha (1a and 3a in his naming convention). Even with HAVE_ASM_MMX defined, wrapper macros ensured that the assembly code would not be invoked in cases such as those giving rise to this bug, where an RGBA brush canvas is being combined with an RGB drawable. (Toby Smith's second thoughts about his changes to those wrapper macros were astute.) In the original settings furnished by functions removed from paint_funcs.c by this 11-19 check-in, follow-on code generally handled the 'unbalanced' cases that the assembly code did not. In the analogous functions that have been designed to be invoked indirectly (indirect references reside in layer_mode_funcs table and are invoked in combine_sub_region () [paint_funcs.c] no such follow on code seems to exist (see the various implementations in paint_funcs_mmx.h; these are inserted into layer_mode_funcs table when it turns out that HAVE_ASM_MMX is defined). Because this follow-on code was inadvertently optimized away, and the wrapper macros prevent the MMX assembly code from running, in the unbalanced cases, NOTHING operates on the temporary destination buffer inserted in the 'dest' field of the apply_layer_mode_struct, and it conveys garbage to follow-on function combine_inten_and_inten_a_pixels(), where it serves as one of the source regions. Me thinks this 'optimization' was undertaken before what was being optimized was fully understood. I am left considering whether it is best to fully revert this check-in or bring it to completion. As a general rule, I do wish colleagues would do more testing, employing a wider variety of cases, before commiting to Wilbur's guts what was honestly believed to be a 'neat hack'. Did anybody think to test this implementation with something other than a 'Normal' brush?
We should disable the code till the paint core is feature frozen, then revisit the whole issue, possibly reimplmenting things. As I mentioned before, there are more optimizations that can be done by taking advantage of AMD specific features, and probably P3 stuff too. And there is also Altivec (PPC) and VIS (Sparc) instructions on those platforms too. But the first the pure C codebase should be stable.
I should correct myself toward the end of the first (analysis) paragraph: the code was not 'optimized away' -- separate (and disjoint) versions of 'generic C' and 'MMX' layer functions are placed in *_generic.h and *_mmx.h, and either one or the other from each set is inserted into the function lookup table, layer_mode_funcs, in combine_sub_region ( ), as dictated by the state of HAVE_ASM_MMX. This design stems from a refactoring performed for 11-18-2001 and 11-19-2001 check-ins, but what the design of the refactoring fails to account is that the macro USE_MMX_PIXEL_OP_3A_1A(ops) implements a kind of if-then-else logic: if ("The MMX code can handle the depths of the given regions") then "execute MMX code and RETURN to CALLER" else "execute generic C code." By (1) separating the generic and the mmx versions, (2) populating a table that has only one version or the other, and (3) leaving the macro unchanged, then this critical logic changes: if ("The MMX code can handle the depths of the given regions") then "execute MMX code and RETURN to CALLER" else RETURN to CALLER -- leaving out the initialization of one of the region buffers, so that random bit noise, instead of the results of a pixel operation, gets injected into the pipeline. There is nothing particularly wrong with the MMX code -- it's just incomplete. The old setup logic used to accommodate this incompleteness; the refactored logic does not. This correction noted, I think Josh's policy is correct: in time it can be more than C v. MMX, and we should think carefully how to do that. First off, get the design correct, then optimize a correct design in as many flavors of assembly as people can find time to write -- and only if need be. I too am inclined to disable the MMX code near term.
What we need now is a patch that makes the necessary changes to fix the behaviour. This patch should probably add some helpful comments and should keep the MMX code in place so we can reenable it later.
Committed a change to disable MMX specific code, essentially bypassing the MMX wrapper code giving rise to the miscreant behaviour. This bypassed code is still broken, and should not be re-enabled without careful study first. Added "helpful comments" as a study aid -- or at least verbose ones, in paint-funcs-mmx.h.
IMHO, the MMX/SSE/3DNow/Altivec/VIS optimizations should not even be considered for 1.4. A lot of stuff will change when the 16 bits and float channels are introduced on the way to 2.0 (and we know that 16 bits support is high on the "most wanted" list). Once the GEGL stuff is fully integrated in the GIMP, we can start thinking again about optimizing some operations.
Since the core won't survive the switch to GEGL anyway, we can as well use the MMX code in 1.4. I don't want to discourage anyone to fix the code so that the MMX functions are properly called.