Bug 86290 – gradients render incorrectly when using some modes (MMX problem)

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 86290 - gradients render incorrectly when using some modes (MMX problem)


Summary:	gradients render incorrectly when using some modes (MMX problem)


Status:	RESOLVED FIXED

Product:	GIMP
Classification:	Other
Component:	General
Version:	git master
Hardware:	Other Linux

Importance:	Normal major
Target Milestone:	2.0
Assigned To:	GIMP Bugs
QA Contact:	Daniel Egger

URL:
Whiteboard:

Duplicates:	90969 (view as bug list)
Depends on:
Blocks:

Reported:	2002-06-23 22:09 UTC by Jakub Steiner
Modified:	2004-12-22 21:47 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
black>white RGB gradient using the screen mode renders like this (374 bytes, image/png) 2002-06-23 22:10 UTC, Jakub Steiner	Details

Description Jakub Steiner 2002-06-23 22:09:06 UTC

Multiply, Screen, Overlay, Difference, Addition, Subtract, Darken Only,
Lighten Only modes of the gradient tool render badly.

I think this bug relates to #85203, judging from the noise looks created only.

Comment 1 Jakub Steiner 2002-06-23 22:10:59 UTC

Created attachment 9415 [details]
black>white RGB gradient using the screen mode renders like this

Comment 2 Toby Smith 2002-07-13 09:05:36 UTC

There appears to be somekind of problem with the MMX code that is
causing this and since i'm not particularly competent at assembly, I
can't say exactly what the problem is. 
If the gimp is run with --no-mmx argument, this problem goes away.

- Toby

Comment 3 Toby Smith 2002-07-14 03:24:47 UTC

When rendering a gradient with mode switched to multiply, the code
eventually reaches the following line (indicated by -->) in
paint-funcs-mmx.h:

#define MMX_PIXEL_OP(x) \
void \
x( \
  const guchar *src1, \
  const guchar *src2, \
  guint count, \
  guchar *dst) __attribute((regparm(3)));

#define MMX_PIXEL_OP_3A_1A(op) \
  MMX_PIXEL_OP(op##_pixels_3a_3a) \
  MMX_PIXEL_OP(op##_pixels_1a_1a)

#define USE_MMX_PIXEL_OP_3A_1A(op) \
  if (HAS_ALPHA (alms->bytes1) && HAS_ALPHA (alms->bytes2)) \
    { \
      if (alms->bytes1==2 && alms->bytes2==2) \
	return op##_pixels_1a_1a(alms->src1, alms->src2, alms->length,
*(alms->dest)); \
      if (alms->bytes1==4 && alms->bytes2==4) \
	return op##_pixels_3a_3a(alms->src1, alms->src2, alms->length,
*(alms->dest)); \
    } 

MMX_PIXEL_OP_3A_1A(multiply);
static void
layer_multiply_mode_mmx (struct apply_layer_mode_struct *alms)
{
-->  USE_MMX_PIXEL_OP_3A_1A(multiply);
}

It would appear that unless the initial image and the rendered
gradient both contain an alpha channel, the appropriate mmx function
will not be reached. Do the mmx assembly function really require both
images to possess an alpha channel? or can USE_MMX_PIXEL_OP_3A_1A(op)
be defined as follows:

#define USE_MMX_PIXEL_OP_3A_1A(op) \
  if ( (alms->bytes1 == 1) || (alms->bytes1 == 2) \
    return op##_pixels_1a_1a(alms->src1, alms->src2, alms->length,   
  *(alms->dest)); \
  if ( (alms->bytes1 == 3) || (alms->bytes1 == 4) \
    return op##_pixels_3a_3a(alms->src1, alms->src2, alms->length,
*(alms->dest));

Comment 4 Toby Smith 2002-07-14 03:51:39 UTC

On seconds thoughts, the definition I gave for USE_MMX_PIXEL_OP_3A_1A
won't work. The mmx assembly functions have to been changed so that
they accept an image without an alpha channel or the image has to
temporarily be promoted to an image with an alpha channel.

Comment 5 Sven Neumann 2002-08-27 15:28:39 UTC

Bug #90969 might be a duplicate of this report.

Comment 6 Garry R. Osgood 2002-11-02 23:39:21 UTC

*** Bug 90969 has been marked as a duplicate of this bug. ***

Comment 7 Akkana Peck 2002-11-03 01:29:36 UTC

I'm seeing this too.  This may be a stupid question, but I'm on an AMD
(Athlon XP 1600+ Palomino): /proc/cpuinfo does say the mmx flag is
set, but is that right for an AMD?  Could this be a kernel bug, that
the kernel is reporting mmx when it shouldn't be?  This is with Redhat
7.3's kernel-2.4.18-5 AMD kernel with a few minor tweaks; I haven't
tried to repro this bug on other CPUs or other kernels, but could try
if it would help.

Comment 8 Manish Singh 2002-11-03 01:39:03 UTC

Athlons support MMX just fine. AMD even has added some extensions to
MMX (and other parts of the instruction set) which I believe GIMP
could benefit from, but (as this bug makes apparent) now is not the
time for microoptimization.

Comment 9 Garry R. Osgood 2002-11-03 04:55:36 UTC

The optimizations in paint_func.[ch], (giving rise to
paint_func_generic.h, paint_func_mmx.h) culminating in ChangeLog
entries for 2001-11-19, do not adequately account for the fact that Mr
Monniaux's MMX code (app/arch/i386/mmx/paint_funcs_mmx.S) optimizes
only operations between index-with-alpha and/or rgb-with-alpha (1a and
3a in his naming convention). Even with HAVE_ASM_MMX defined, wrapper
macros ensured that the assembly code would not be invoked in cases
such as those giving rise to this bug, where an RGBA brush canvas is
being combined with an RGB drawable. (Toby Smith's second thoughts
about his changes to those wrapper macros were astute.) In the
original settings furnished by functions removed from paint_funcs.c by
this 11-19 check-in, follow-on code generally handled the 'unbalanced'
cases that the assembly code did not. In the analogous functions that
have been designed to be invoked indirectly (indirect references
reside in layer_mode_funcs table and are invoked in combine_sub_region
() [paint_funcs.c] no such follow on code seems to exist (see the
various implementations in paint_funcs_mmx.h; these are inserted into
layer_mode_funcs table when it turns out that HAVE_ASM_MMX is
defined). Because this follow-on code was inadvertently optimized
away, and the wrapper macros prevent the MMX assembly code from
running, in the unbalanced cases, NOTHING operates on the temporary
destination buffer inserted in the 'dest' field of the
apply_layer_mode_struct, and it conveys garbage to follow-on function
combine_inten_and_inten_a_pixels(), where it serves as one of the
source regions.

Me thinks this 'optimization' was undertaken before what was being
optimized was fully understood. I am left considering whether it is
best to fully revert this check-in or bring it to completion. As a
general rule, I do wish colleagues would do more testing, employing a
wider variety of cases, before commiting to Wilbur's guts what was
honestly believed to be a 'neat hack'. Did anybody think to test this
implementation with something other than a 'Normal' brush?

Comment 10 Manish Singh 2002-11-03 10:02:46 UTC

We should disable the code till the paint core is feature frozen, then
revisit the whole issue, possibly reimplmenting things. As I mentioned
before, there are more optimizations that can be done by taking
advantage of AMD specific features, and probably P3 stuff too. And
there is also Altivec (PPC) and VIS (Sparc) instructions on those
platforms too. But the first the pure C codebase should be stable.

Comment 11 Garry R. Osgood 2002-11-03 10:52:45 UTC

I should correct myself toward the end of the first (analysis)
paragraph: the code was not 'optimized away' -- separate (and
disjoint) versions of 'generic C' and 'MMX' layer functions are placed
in *_generic.h and *_mmx.h, and either one or the other from each set
is inserted into the function lookup table, layer_mode_funcs, in
combine_sub_region ( ), as dictated by the state of HAVE_ASM_MMX. This
design stems from a refactoring performed for 11-18-2001 and
11-19-2001 check-ins, but what the design of the refactoring fails to
account is that the macro USE_MMX_PIXEL_OP_3A_1A(ops) implements a
kind of if-then-else logic: if ("The MMX code can handle the depths of
the given regions") then "execute MMX code and RETURN to CALLER" else
"execute generic C code." By (1) separating the generic and the mmx
versions, (2) populating a table that has only one version or the
other, and  (3) leaving the macro unchanged, then this critical logic
changes: if ("The MMX code can handle the depths of the given
regions") then "execute MMX code and RETURN to CALLER" else RETURN to
CALLER -- leaving out the initialization of one of  the region
buffers, so that random bit noise, instead of the results of a pixel
operation, gets injected into the pipeline. There is nothing
particularly wrong with the MMX code -- it's just incomplete. The old
setup logic used to accommodate this incompleteness; the refactored
logic does not.

This correction noted, I think Josh's policy is correct: in time it
can be more than C v. MMX, and we should think carefully how to do
that. First off, get the design correct, then optimize a correct
design in as many flavors of assembly as people can find time to write
-- and only if need be. I too am inclined to disable the MMX code near
term.

Comment 12 Sven Neumann 2002-11-03 15:51:07 UTC

What we need now is a patch that makes the necessary changes to fix
the behaviour. This patch should probably add some helpful comments
and should keep the MMX code in place so we can reenable it later.

Comment 13 Garry R. Osgood 2002-11-04 03:03:05 UTC

Committed a change to disable MMX specific code, essentially bypassing
the MMX wrapper code giving rise to the miscreant behaviour. This
bypassed code is still broken, and should not be re-enabled without
careful study first. Added "helpful comments" as a study aid -- or at
least verbose ones, in paint-funcs-mmx.h.

Comment 14 Raphaël Quinet 2002-11-04 10:19:24 UTC

IMHO, the MMX/SSE/3DNow/Altivec/VIS optimizations should not even be
considered for 1.4.  A lot of stuff will change when the 16 bits and
float channels are introduced on the way to 2.0 (and we know that 16
bits support is high on the "most wanted" list).  Once the GEGL stuff
is fully integrated in the GIMP, we can start thinking again about
optimizing some operations.

Comment 15 Sven Neumann 2002-11-04 11:10:37 UTC

Since the core won't survive the switch to GEGL anyway, we can as well
use the MMX code in 1.4. I don't want to discourage anyone to fix the
code so that the MMX functions are properly called.