Bug 63621 – Add GAtomicInt for atomic integer operations

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 63621 - Add GAtomicInt for atomic integer operations


Summary:	Add GAtomicInt for atomic integer operations


Status:	RESOLVED FIXED

Product:	glib
Classification:	Platform
Component:	general
Version:	1.3.x
Hardware:	Other other

Importance:	Normal normal
Target Milestone:	---
Assigned To:	gtkdev
QA Contact:	gtkdev

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2001-11-02 16:44 UTC by Sebastian Wilhelmi
Modified:	2011-02-18 16:07 UTC

See Also:
GNOME target:	---
GNOME version:	---

Attachments
Patch to implement GAtomic for GLib (11.41 KB, patch) 2002-04-03 13:34 UTC, Sebastian Wilhelmi	none	Details \| Review
gatomic-speed-test.c: Test Program for speed of GAtomic (1.22 KB, text/plain) 2002-04-03 13:50 UTC, Sebastian Wilhelmi		Details
new patch for GAtomic (11.95 KB, patch) 2003-03-13 16:05 UTC, Sebastian Wilhelmi	none	Details \| Review
new speed test program (1.34 KB, text/plain) 2003-03-14 11:33 UTC, Sebastian Wilhelmi		Details
yet another patch for GAtomic (15.18 KB, patch) 2003-03-14 14:43 UTC, Sebastian Wilhelmi	none	Details \| Review
yet another patch for atomic ints (12.49 KB, patch) 2003-10-09 13:29 UTC, Sebastian Wilhelmi	none	Details \| Review
new speed test program (1.36 KB, text/plain) 2003-10-09 13:30 UTC, Sebastian Wilhelmi		Details
yet another patch for atomic ints (20.94 KB, patch) 2004-02-04 15:59 UTC, Sebastian Wilhelmi	none	Details \| Review
yet another patch for atomic ints (25.71 KB, patch) 2004-02-05 12:15 UTC, Sebastian Wilhelmi	none	Details \| Review
Speed test with function wrapper (1.94 KB, patch) 2004-02-05 17:21 UTC, Owen Taylor	none	Details \| Review
newest version of g_atomic patch (31.95 KB, patch) 2004-02-18 16:25 UTC, Sebastian Wilhelmi	none	Details \| Review
newest version of g_atomic patch (35.14 KB, patch) 2004-02-20 16:42 UTC, Sebastian Wilhelmi	none	Details \| Review
patch \infty (34.59 KB, patch) 2004-02-23 15:23 UTC, Sebastian Wilhelmi	none	Details \| Review
patch \infty + 1 (41.33 KB, patch) 2004-02-29 10:36 UTC, Sebastian Wilhelmi	none	Details \| Review
patch \infty + 2 (41.89 KB, patch) 2004-02-29 14:01 UTC, Sebastian Wilhelmi	none	Details \| Review

Description Sebastian Wilhelmi 2001-11-02 16:44:05 UTC

See
http://moniker.gnome.org/archives/gtk-devel-list/2001-November/msg00044.html

Comment 1 Sebastian Wilhelmi 2002-04-03 13:33:18 UTC

The following patch implements atomic ints. It has fast
implementations for the following architectures:
i386/powerpc/sparc/alpha and is tested on all of them (mostly with
linux) on compilefarm.sf.net.

Q: So why doesn't the patch use inline functions in headers to 
   implement atomic int instead of the extern functions? 
A: Only gcc understands the used asm syntax and thus we would risk 
   having different libs linked together, where some use the asm 
   versions and other the fallback implementation (because some are 
   compiled with gcc and others aren't) and mixing them of course is
   unsafe.

Q: Why is there no special version for sparc v9, which can handle 
   atomic operation much better than older sparcs?
A: Most environments doesn't seem to be able to assemble v9 code on 
   solaris and linux, even if the processor itself is v9.

Q: Why are all implementations in one file instead of being scattered 
   in arch-specific dirs or files as for glibc or other projects?
A: It just fits better into the GLib struture. While we have 
   arch-specific files (threads, exec), I think this setup is better 
   than having 5 nearlty empty files, but that's just taste.

Q: How fast are asm atomic int compared to the generic implementation.
A: I'll also attach a speed test program.

Comment 2 Sebastian Wilhelmi 2002-04-03 13:34:33 UTC

Created attachment 7529 [details] [review]
Patch to implement GAtomic for GLib

Comment 3 Sebastian Wilhelmi 2002-04-03 13:50:34 UTC

Created attachment 7530 [details]
gatomic-speed-test.c: Test Program for speed of GAtomic

Comment 4 Owen Taylor 2002-11-20 23:20:46 UTC

Looking this over, my main concern would be over the
return values; all of the functions currently return
TRUE if the result of the operation over 0.

 * Is having a return value at all for the operations
   too general? When we want to do an atomic_inc, we
   don't need a return value at all, typically.
   Is there a performance penalty for having the 
   return value on some architectures?

 * Is having the return value be whether the result
   is 0 or not too specific? Are there cases where
   something else is needed?

Linux-2.5 has:

 atomic_add [ no return, 63 ]
 atomic_sub [ no return, 46 ]
 atomic_sub_and_test [ TRUE if result is 0, 1 ]
 atomic_inc [ no return, 693 ]
 atomic_dec [ no return, 263 ]
 atomic_dec_and_test [ TRUE if result is 0, 143 ]
 atomic_inc_and_test [ TRUE if result is 0, 0 ]
 atomic_add_negative [ TRUE if result is < 0, 18 ]

Where the numbers are the total number of uses in the 
kernel source tree. (all 18 uses of add_negative are
in the kernel's semaphore implementation; this is 
the only use of a test for something other than 0.)

I think having the return value be the new value (or
possibly the old value) is more natural, but if 
dec-and-test semantics are significantly more expensive,
then they probably make sense.

Aside from the question of "what is needed", probably
the other interesting question is: what is the performance
difference between

 - dec, no test
 - dec, test for 0
 - dec, return old value

On various platforms.

Comment 5 Havoc Pennington 2002-11-20 23:36:08 UTC

If sub/inc/dec are just macros, I'm not sure I'd bother with them. 
When I saw them in the sample code I assumed they existed because they 
could be optimized in some way vs. just adding.

Comment 6 alan 2002-11-20 23:49:33 UTC

Returning the new/old value is expensive on some platforms but the
dec_and_test_zero stuff is rather important to fast refcounting.

The code needs work however. The spins are not fair and dont work in
the presence of real time threads. They should probably be buzz locks.
Also you'd want to check the rules on page faults for various
platforms, not all get faulted atomic ops atomic. Doesn't matter to a
kernel does matter to a user app.

Comment 7 Owen Taylor 2002-11-21 00:29:35 UTC

OK, expanding out Alan's comments some, based on IRC conversation:

 - It makes sense to have atomic_inc()/atomic_dec_and_test() 
   operations with those semantics since they can be a lot
   faster than the generic operations, and are needed for
   refcounting.

 - Buzz locks are apparently locks which spin for a while,
   then fall back to a heavy weight lock, to avoid 
   a) deadlock issues with priorities b) spinning for long
   periods of time if the process holding the lock gets
   swapped out. Supposedly they are used within Mozilla,
   but I don't see that in the code, which just seems
   to fall back to heavyweight locks in the absence of
   atomic operations. Maybe it's how the non-native
   thread stuff in NSPR works?

 - For the page fault problem, basically, what is being
   said is that copying the way the kernel does things
   won't always work. On some platforms the kernel may
   have an atomic operation, but it is necessary to use
   a heavyweight lock in user-space.

Comment 8 Sebastian Wilhelmi 2002-11-23 10:41:19 UTC

The problem here is to find a good balance between a useful and
consistent interface and the speed (which of course here is most
important, because its the only reason to implement this interface in
the first place).

There is the possibility to only implement, what we really need, i.e.
'inc' and 'dec_and_test' (and not call the thing GAtomicInt then, but
GRefCount or something). 

Or we implement a super-generic GAtomicInt, which would suck
speed-wise on at least i386.

I chose a way in between originally, but in hindsight the first
solution would have been best.

So what about

We implement GRefCount with the operations g_ref_count_inc and
g_ref_count_dec_and_test?

--------------------------------------------------------------------

Now for the implementation side:

I checked glibc. They are 1. userspace and 2. LGPL and thus should be
safe to use.

It contains sysdeps/.../atomicity.h files, which could be used as
starting point. They all define 'exchange_and_add' and 'atomic_add'
(for many platforms, alas except i386).

So we could have:

struct _GRefCount
{
  volatile guint32 value; 
};

void
g_ref_count_inc (GRefCount *refcount)
{
  atomic_add (refcount->value, 1); 
}

gboolean
g_ref_count_dec_and_test (GRefCount *refcount)
{
  return exchange_and_add (refcount->value, -1) == 1;
}

For i386 we would have to see, whether the implementation 

  __asm__ __volatile__ ("lock; addl %2,%0; sete %1"
                        :"=m" (atomic->value), "=qm" (retval)
                        :"ir" (add), "m" (atomic->value) : "memory");
  
is save to use. Alan?

[ There also is the function compare_and_swap declared in atomicity.h,
  which is _very_ useful for all kind of multithreading stuff. We 
  might as well export this...... Why doesn't libc? ]

[ Downside of all this is of course, that only gcc users get the 
  better performance (or is this a plus? ;-) ]

Comment 9 Gustavo Carneiro 2003-01-06 18:28:15 UTC

  I've been seeing comments about dumping the generic functions in
favor of the refcounting-oriented functions. I don't completely
disagree. However, I would be most interested in the following two
functions:

  gboolean g_test_and_set   (gboolean *flag);
  gboolean g_test_and_clear (gboolean *flag);

  g_test_and_set would atomically get the value of flag and set it to
TRUE. The same for g_test_and_clear, except that it would set the
value to FALSE.
  These two functions are very simple but also very useful.

Comment 10 Sebastian Wilhelmi 2003-03-13 16:04:34 UTC

New patch attached:

The interface now looks that way:

gint32   g_atomic_exchange_and_add (GAtomic *atomic, 
                                    gint32   val);
void     g_atomic_add_fallback     (GAtomic *atomic, 
                                    gint32   val);
gboolean g_atomic_compare_and_swap (GAtomic *atomic, 
                                    gint32   oldval, 
                                    gint32   newval);
void     g_atomic_set              (GAtomic *atomic, 
                                    gint32   val);
gint32   g_atomic_get              (GAtomic *atomic);
void     g_atomic_inc              (GAtomic *atomic);
gboolean g_atomic_dec_and_test     (GAtomic *atomic);

I took the 'exchange_and_add' and the 'add' functions from libc. They
really form a good, though not very convenient interface.
'compare_and_swap' follows without cost from libc as well. It isn't
strictly necessary, but nice to have. 'inc' and 'dec_and_test' are
convenient wrappers to be used for reference counting.

For the implementation I used code snippets from the libc this time.
It's LGPL and userspace and thus save to use. Until now I only
implemented the fast assembler routines for i486 and sparc64, but we
can without problems add other archs later. [Actually it seems there
is no save way to implement this without locking on i386, but on
anything better than i486 it works.]

One very important rule is, that all part of a program must use either
the fast assembler or the slow locking functions, a mixture is unsafe.
To ensure this, I determine at configure-time, whether to use the fast
assembler versions. If they are found to be available, they are also
used as inline functions in the gatomic.h header for programs using GLib.

Should e.g. GLib be compiled with gcc on i486, then the fast version
will be used. If now another compiler is used to compile a program
using GLib, then this program will not get the inlined versions, but
it will use the fallback implementations, which in turn will use the
assembler version, so we are on the save side.

When glib isn't compiled with gcc, then the slow locking versions are
always used, even if later programs using GLib are compiled with gcc.

Of course other compilers can be added to that list, maybe the sun
people will add assembler code for their platform/compiler.

Regarding the timing: Using inlined assembler versions of g_atomic_inc
and g_atomic_dec_test is _ten times_ as fast as using the locking
versions on a pentium, so it really pays using them.

Comment 11 Sebastian Wilhelmi 2003-03-13 16:05:57 UTC

Created attachment 14993 [details] [review]
new patch for GAtomic

Comment 12 Sebastian Wilhelmi 2003-03-14 11:32:44 UTC

Note, that to see the effect on i386. You have to call 

    ./configure --build=i486-linux

I'll also attach a new version of the speed test program.

The results so far

i486: (my athlon 1.4 GHz)
glib     : 2.618763 sec
fallback : 33.455265 sec

alpha: (sourceforge compilefarm, COMPAQ AlphaServer DS20E 666 MHz)
glib     : 35.722361 sec
fallback : 153.221434 sec

[ alpha is not yet in my patch above, but I implemented it since then 
  locally ]

Comment 13 Sebastian Wilhelmi 2003-03-14 11:33:33 UTC

Created attachment 15014 [details]
new speed test program

Comment 14 Sebastian Wilhelmi 2003-03-14 14:42:58 UTC

A new version of the patch. Now includes specialcase implementation
for undefined G_THREADS_ENABLED. Also alpha code is includes now.
sparc64 now test in configure, whether the v9 instructions really
work. Some compilers, most notably gcc <= 3.0 can't compile that. For
sparc64 the speed values are as follows (test on TI UltraSparc II 
(BlackBird) on compilefarm.soureforge.net)

glib     : 32.938069 sec
fallback : 234.396596 sec

Comment 15 Sebastian Wilhelmi 2003-03-14 14:43:24 UTC

Created attachment 15017 [details] [review]
yet another patch for GAtomic

Comment 16 Benjamin Otte (Company) 2003-05-21 15:51:34 UTC

Adding myself to keep in touch for GStreamer and make our atomic 
stuff be easily transferable once GAtmoic hits the streets.

The currently used general API is a much better choice for GStreamer 
than the refcount only API as it should use atomic ints quite a bit 
more general.

Comment 17 Matthias Clasen 2003-07-30 00:24:57 UTC

Benjamin, can we expect more feedback regarding the atomic needs and
concerns of GStreamer ? Or are you fine with Sebastians latest patch ?

Comment 18 Benjamin Otte (Company) 2003-07-30 10:34:06 UTC

We (mainly Owen and GStreamer's Wim Taymans) had a discussion about 
this at GUADEC.
The only atomic stuff that GStreamer would need and can't easily do 
with this API are atomic list operations. But they are outside the 
scope of an atomic int I think.

So I would say the current implementation is fine for most of what 
GStreamer does and the rest is so special that it's probably better 
to implement in GStreamer (and maybe add it to glib later if the need 
arises).

Comment 19 Owen Taylor 2003-09-12 14:46:28 UTC

Hmm, it looks like you've based the assembly implementations
on obsolete parts of glibc. glibc now has 
atomic_exchange_and_add() defined for almost every platform,
in terms of a 'atomic_compare_and_exchange_bool_acq' primitive
when it's not defined natively.

include/atomic.h seems to be the current glibc internal
header file for atomic operations. It has a large number
of different operations.... inside glibc, ignoring NPTL,
the operations used seem to be:

atomic_exchange_and_add
atomic_increment
atomic_decrement
atomic_compare_and_exchange_bool_acq
atomic_decrement_and_test

One difference from what's implemented here is that most
of the macros use GCC-specific magic to work on all
widths - the compare-and-exchange operation is most useful
when done on pointer types.

I suspect the public interfaces in the last patch are
OK for a start. I'd really like to see broader coverage
for architecture coverage before we put this in GLib ...
compare-and-exchange-based fallback implementations might
reduce the pain of that.

One thing we have to be careful about is the "small functions
and macros of 10 lines or less" restriction in the LGPL.
Who knows *exactly* what it means, but I think copying 
more than 10 line macros from glibc and putting them in 
our public headers probably should be avoided.

Comment 20 Sebastian Wilhelmi 2003-09-13 10:08:44 UTC

> One difference from what's implemented here is that most
> of the macros use GCC-specific magic to work on all
> widths - the compare-and-exchange operation is most useful
> when done on pointer types.

So would you like to have the full coverage for all bit-sizes or just
extra functions for gint32 and gpointer? I don't think the glibc magic
of determining the operand size at compiler time is in the spirit of GLib

> I suspect the public interfaces in the last patch are
> OK for a start. I'd really like to see broader coverage
> for architecture coverage before we put this in GLib ...
> compare-and-exchange-based fallback implementations might
> reduce the pain of that.

I'll have a look.

> One thing we have to be careful about is the "small functions
> and macros of 10 lines or less" restriction in the LGPL.
> Who knows *exactly* what it means, but I think copying 
> more than 10 line macros from glibc and putting them in 
> our public headers probably should be avoided.

Where is the problem? glibc and GLib both are LGPL. So copying should
be legally OK. IANAL though.

Comment 21 Owen Taylor 2003-09-14 16:39:46 UTC

- I don't think the glibc magic compile time width handling
  is "not in the spirit of GLib", but it certainly isn't 
  *possible* with GLib, since we are compiler-neutral.

  My comment was basically that a "GAtomicInt" specific
  compare-and-exchange wouldn't be that useful.

- The point is not using more-than-10-lines macros inside
  of GLib, but putting them in the GLib public headers;
  if we write such a macro ourselves, we can say it's
  allowed to use it as a special exception to the LGPL,
  but we can't do that if we borrow code from glibc.

Comment 22 Owen Taylor 2003-09-30 21:48:23 UTC

Tim mentions wanting to change gobject refcounting to be atomic.
If we do that, then 

typedef struct _GAtomic GAtomic;
struct _GAtomic
{
  volatile gint32 value; 
};

Is a problem, because I bet there are at least a few 
g_assert (object->ref_count > 0) bits of code out there.

In order to preserve source and binary compatibilty,
we'd really need a 32-bit or sizeof(int) (either is
compatible enough) integral type.

Comment 23 Sebastian Wilhelmi 2003-10-09 13:27:19 UTC

New patch. This time with a changed interface. There's no longer an
extra type GAtomic, we work directly on gint32 or gpointer resp. That
of course makes it a bit easier to abuse by just setting atomic vars,
which is not safe in general, but all in all that is outweight by
better backward compatibility.

The interface:

gint32   g_atomic_int_exchange_and_add     (gint32   *atomic, 
					    gint32    val);
void     g_atomic_int_add                  (gint32   *atomic, 
				            gint32    val);
gboolean g_atomic_int_compare_and_swap     (gint32   *atomic, 
					    gint32    oldval, 
					    gint32    newval);
gboolean g_atomic_pointer_compare_and_swap (gpointer *atomic, 
					    gpointer  oldval, 
					    gpointer  newval);
void     g_atomic_int_set                  (gint32   *atomic, 
				            gint32    val);
void     g_atomic_pointer_set              (gpointer *atomic, 
					    gpointer  val);

exchange_and_add and add are not implemented for pointers, as that
would be pointless in my eyes. What do you think?

There's only one inline implementation (i486) right now, until we
decide on the interface. Then it should be quite straightforward to
add other architectures.

Do we need g_atomic_int_get and g_atomic_pointer_get? I don't think
so, as reading an 32-bit-integer and a pointer should be an atomic
operation on all platforms, GLib runs on.

The function names are a bit long now, we might better use
g_atomic_int_cmp_and_xchg instead of g_atomic_int_compare_and_swap

A new speed test program is also attached.

As for the copying from libc: I have to idea, whether it is safe to
copy those 3-liners. Do you think, I should simply ask for permission
on the libc list?

Comment 24 Sebastian Wilhelmi 2003-10-09 13:29:21 UTC

Created attachment 20596 [details] [review]
yet another patch for atomic ints

Comment 25 Sebastian Wilhelmi 2003-10-09 13:30:17 UTC

Created attachment 20597 [details]
new speed test program

Comment 26 Sebastian Wilhelmi 2004-01-17 13:32:41 UTC

PING? Any chance of this getting in. What is blocking here?
Implementaions for other archs than i386? That should however not
prevent from getting the API is before freeze.

Comment 27 Owen Taylor 2004-02-03 00:10:14 UTC

I think the blocking issue for the freeze is convincing ourselves
that it is *possible* to implement this API in reasonably fast
manner on the important platforms, then it's OK.

Important platforms, in my opinion:

 i386
 Sparc (because it's slow to begin with...)
 ia64
 x86_64 
 ppc32/64 (because of OS X more than the IBM stuff)

Everything else is pretty much marginal.

Comment 28 Sebastian Wilhelmi 2004-02-04 15:58:44 UTC

Regarding the feasibility of compare_and_swap for 32 (and 64 bit for
platforms with 64bit pointers). (All other proposed functions could be
implemented by means of compare_and_swap):

i386             - no way without locking
i486 and upwards - no problem for 32 bit
sparc            - no way without locking
sparcv9          - no problem for 32 bit (I don't think, there are 
                   still pre-v9 sparcs out there, sparc-v9 shipped 
                   beginning 1996)
sparc64          - no problem for 32 bit and 64 bit
ia64             - no problem for 32 bit and 64 bit. (seems to be a 
                   compiler builtin for gcc: 
                   __sync_bool_compare_and_swap_di
                   and __sync_bool_compare_and_swap_si) I have no 
                   access to a test platform though.
x86_64           - no problem for 32 bit (cmpxchgl) and 64 bit
                  (cmpxchgq)
ppc32/64         - no problem for 32 bit (stwcx) and 64 bit (stdcx)

So far native support for the following platforms is implemented:

sparc-v9 (sparc-solaris1.sourceforge.net):
------------------------------------------
 
glib     : 8.969010 sec
fallback : 48.226671 sec
 
sparc64:
-----------
 
No access to test platform, which can compile and link sparc64
stuff. It should work however, its quite similar to sparc32. We might
ask the sun people to try it.
 
alpha (usf-cf-alpha-linux-1.sourceforge.net):
---------------------------------------------
 
glib     : 22.489575 sec
fallback : 76.471542 sec
 
i486 (athlon; my computer; 1 GHz):
----------------------------------
 
glib     : 3.938015 sec
fallback : 23.864003 sec
 
ia64:
-----
 
Not done yet. No access to test platform.
 
x86_64 (amd64-linux1.sourceforge.net):
--------------------------------------
 
glib     : 2.207783 sec
fallback : 12.074824 sec
 
ppc32/64
---------
 
Not done yet. No access to test platform.


The new patch is attached.

Food for thought: exchange_and_add does not exchange at all. It reads,
in GLib's terms: it gets.  So get_and_add would be a more appropriate
name, while of course if might confuse people which are used to the
widely adopted exchange_and_add terminology. 

Also compare_and_swap would be a better name than compare_and_exchange
(shorter and according to google more widely used)

Comment 29 Sebastian Wilhelmi 2004-02-04 15:59:29 UTC

Created attachment 24061 [details] [review]
yet another patch for atomic ints

Comment 30 Manish Singh 2004-02-04 19:32:51 UTC

I can test on the missing platforms, probably today or tomorrow.

Personally, I'd like to see inc, dec, and dec_and_test in the API,
since those make it a lot clearer for refcounting, as well as a
possible slight efficiency gain.

Comment 31 Sebastian Wilhelmi 2004-02-04 20:09:51 UTC

It would be nice if you could test sparc64. Run tests/atomic-test and
post the numbers of gatomic-speed-test, as attached here.

ppc32/64 and ia64 aren't ready yet. But I'll try to update that tomorrow.

Comment 32 Sebastian Wilhelmi 2004-02-05 12:15:10 UTC

The new status (BTW, the time values are output by gatomic-speed-test,
as attached above, and show how much faster the new glib refcounting
primitives are compared to the fallback with locking)

The g_atomic functions are now implemented for all major platform,
they are however untested on the last three (ia64, sparc64, powerpc64)
I would lile anyone with access to one of those platforms to give it a
try. First run ./configure and look, whether the right platform is
detected (grep for 'inline assembler') and then compile GLib.
Then compile and run tests/atomic-test and finally compile and run the
gatomic-speed-test program attched above and post the results.

Thanks.

sparc-v9 (sparc-solaris1.sourceforge.net):
------------------------------------------

glib     : 8.969010 sec 
fallback : 48.226671 sec

alpha (usf-cf-alpha-linux-1.sourceforge.net):
---------------------------------------------

glib     : 22.489575 sec
fallback : 76.471542 sec

i486 (Athlon GHz; my computer):
-------------------------------

glib     : 3.938015 sec
fallback : 23.864003 sec

x86_64 (amd64-linux1.sourceforge.net):
--------------------------------------

glib     : 2.207783 sec
fallback : 12.074824 sec

powerpc32 (ppc-osx2.cf.sourceforge.net):
----------------------------------------

glib     : 4.271049 sec
fallback : 41.643611 sec

ia64:
-----

Implemented. No access to test platform.

sparc64:
--------

Implemented. No access to test platform.

powerpc64:
----------

Implemented. No access to test platform.

Comment 33 Sebastian Wilhelmi 2004-02-05 12:15:37 UTC

Created attachment 24091 [details] [review]
yet another patch for atomic ints

Comment 34 Owen Taylor 2004-02-05 17:19:05 UTC

Thanks for doing this work Sebastian, this is looking
very good. I can test on ppc64 and and ia64.

One question that is still in my mind is "how much
good is inlining doing here" ... as someone pointed out
earlier, putting all the assembly inline does have
some disadvantages:

 * We can't bug-fix / improve the assembly and
   have apps take advantage of that.
 * We run into the > 10-line inline function licensing
   issue.

I've tried modifying the speed-test program to also
time trivial wrapper functions around the inline
versions, I get:

glib     : 4.376484 sec
function : 7.325000 sec
fallback : 23.984717 sec

Which doesn't really answer the question definitively... 
it's slower but not a lot slower.

Comment 35 Owen Taylor 2004-02-05 17:21:30 UTC

Created attachment 24108 [details] [review]
Speed test with function wrapper

Comment 36 Owen Taylor 2004-02-05 19:21:28 UTC

Tested OK on ppc64:

ppc64: (RS64-IV, 668mhz)
glib     : 1.961480 sec
function : 3.328593 sec
fallback : 31.004492 sec

On ia64, compilation gives the warnings:

In file included from glib.h:33,
                 from ghash.c:33:
../glib/gatomic.h: In function `g_atomic_int_exchange_and_add':
../glib/gatomic.h:453: warning: passing arg 1 of
`__sync_fetch_and_add_di' from incompatible pointer type
../glib/gatomic.h: In function `g_atomic_int_add':
../glib/gatomic.h:460: warning: passing arg 1 of
`__sync_fetch_and_add_di' from incompatible pointer type
../glib/gatomic.h: In function `g_atomic_pointer_compare_and_swap':
../glib/gatomic.h:476: warning: passing arg 1 of
`__sync_bool_compare_and_swap_di' from incompatible pointer type
../glib/gatomic.h:476: warning: passing arg 2 of
`__sync_bool_compare_and_swap_di' makes integer from pointer without a
cast
../glib/gatomic.h:476: warning: passing arg 3 of
`__sync_bool_compare_and_swap_di' makes integer from pointer without a
cast

The functions have the signatures:

extern long __sync_fetch_and_add_di (long *, long);
extern int __sync_bool_compare_and_swap_di (long *, long, long);

Using __sync_fetch_and_add_si() instead and inserting casts:

  return __sync_bool_compare_and_swap_di ((long *)atomic,
(long)oldval, (long)newval);

fixes the warnings. The tests then pass, and the timings
are (Itanium 2, 1100mhz)

glib     : 5.493671 sec
function : 5.859069 sec
fallback : 25.820156 sec

Comment 37 Manish Singh 2004-02-08 23:51:34 UTC

Sparc64, Debian unstable:

glib     : 13.522815 sec
function : 12.172104 sec
fallback : 111.402083 sec

With debian's gcc __sparcv8 is *never* defined, but __sparc_v9__ is,
so the check in gatomic.h needs to be changed. Which compiler does the
sourceforge machine have?

Comment 38 Sebastian Wilhelmi 2004-02-09 09:26:49 UTC

wilhelmi@sparc-solaris2:~$ gcc -v
Reading specs from /usr/local/lib/gcc-lib/sparc-sun-solaris2.9/3.3.2/specs
Configured with: ../configure --with-as=/usr/ccs/bin/as
--with-ld=/usr/ccs/bin/ld --disable-nls
Thread model: posix
gcc version 3.3.2
wilhelmi@sparc-solaris2:~$ echo "" >test.c
wilhelmi@sparc-solaris2:~$ gcc -dM -E test.c >flags
wilhelmi@sparc-solaris2:~$ gcc -mcpu=v9 -dM -E test.c >flags.v9
wilhelmi@sparc-solaris2:~$ diff -u flags flags.v9
--- flags       Mon Feb  9 01:18:52 2004
+++ flags.v9    Mon Feb  9 01:18:55 2004
@@ -63,5 +63,6 @@
 #define __REGISTER_PREFIX__
 #define __LDBL_DIG__ 33
 #define __NO_INLINE__ 1
+#define __sparcv8 1
 #define __FLT_MANT_DIG__ 24
 #define __VERSION__ "3.3.2"

So it seems, only the gcc on sparc does it wrong. Maybe Debian patched
their gcc up. 

In any case we should test for and accept both.

Comment 39 Manish Singh 2004-02-09 10:19:18 UTC

Debian didn't patch their gcc. Looks like gcc has one set of
definitions for solaris, and another set for all other sparc. My
interpretation of gcc's definitions:

Solaris 32-bit, -mcpu=v9: -D__sparcv8
Solaris 64-bit: -D__sparcv9
Non-Solaris, 32 & 64-bit: -D__sparc_v9__

Comment 40 Manish Singh 2004-02-09 10:27:49 UTC

So defined(__sparcv8) || defined(__sparcv9) || defined(__sparc_v9__)
should be our test. __sparcv8 is only defined in the Solaris
definition, so non-Solaris platforms should be safe from that screwup.

My guess from looking at the gcc code is that the Sun toolchain
expects __sparcv9 to imply 64-bit as well, so they couldn't safely
define __sparcv9 in 32-bit mode.

Comment 41 Sebastian Wilhelmi 2004-02-09 12:52:42 UTC

Thanks for checking that. I've updated my patch accordingly.

Comment 42 Sebastian Wilhelmi 2004-02-18 16:20:48 UTC

I've attached a new patch, this time complete with docs.

Changes:

I've removed the _set functions. They are not needed. While it isn't
save to set an int/pointer, when it is concurrently accesses by other
threads, it is rarely needed and can be achieved by using
g_atomic_(int|pointer)_compare_and_swap. 

I've renamed g_atomic_int_exchange_and_add to
g_atomic_int_get_and_add. It's shorter and adheres more to the GLib
naming scheme and is more correct (Nothing is exchanged here).

The changes for both ia64 and sparc64 as proposed by Manish and Owen
are incorporated.

As for the function <-> inline function argument, I think, we should
leave it as inline functions. Performance is the only reason to add
these functions. So I would use the faster version.

Of course I see the license implication. But I'm pretty sure, we would
get permission to use the snippets as LGPL header stuff.

Comment 43 Sebastian Wilhelmi 2004-02-18 16:25:43 UTC

Created attachment 24520 [details] [review]
newest version of g_atomic patch

Comment 44 Sebastian Wilhelmi 2004-02-20 16:42:11 UTC

New version.

Added 

g_atomic_int_get
g_atomic_pointer_get

which also act as a memory barrier.

This makes possible

gpointer list;

do
{
  head = g_atomic_pointer_get (&list);
  new_head = head->next;
} while (!g_atomic_pointer_compare_and_swap (&list, head, new_head));

to quickly pop the first element off a list.

Comment 45 Sebastian Wilhelmi 2004-02-20 16:42:49 UTC

Created attachment 24611 [details] [review]
newest version of g_atomic patch

Comment 46 Markku Vire 2004-02-23 08:18:41 UTC

Because of GPE (gpe.handhelds.org), ARM could be an important platform
too.

Comment 47 Sebastian Wilhelmi 2004-02-23 15:23:15 UTC

Created attachment 24693 [details] [review]
patch \infty

Comment 48 Sebastian Wilhelmi 2004-02-23 15:24:58 UTC

This last patch renames the functions as Owen wanted. The 
interface now looks like that:

gint32      g_atomic_int_get                (gint32 *atomic);
void        g_atomic_int_add                (gint32 *atomic,
                                             gint32 val);
gint32      g_atomic_int_exchange_and_add   (gint32 *atomic,
                                             gint32 val);
gboolean    g_atomic_int_compare_and_exchange
                                            (gint32 *atomic,
                                             gint32 oldval,
                                             gint32 newval);
gpointer    g_atomic_pointer_get            (gpointer *atomic);
gboolean    g_atomic_pointer_compare_and_exchange
                                            (gpointer *atomic,
                                             gpointer oldval,
                                             gpointer newval);
void        g_atomic_int_inc                (gint32 *atomic);
gboolean    g_atomic_int_dec_and_test       (gint32 *atomic);

Comment 49 Sebastian Wilhelmi 2004-02-26 14:31:03 UTC

2004-02-26  Sebastian Wilhelmi  <seppi@seppi.de>

	* glib/gatomic.c, glib/gatomic.h: New files to implement atomic
	operations for different platforms. Fixes bug #63621.

	* glib/glib.h: Include gatomic.h.

	* configure.in: Add test for assembler routines for atomic operations.

	* glib/Makefile.am: Add gatomic.c, gatomic.h.

	* tests/Makefile.am, tests/atomic-test.c: Unit test for atomic
	operations.

	* glib/glib-overrides.txt, glib/glib-sections.txt,
	glib/glib-docs.sgml, glib/tmpl/atomic_operations.sgml: Add docs
	for atomic operations.

Comment 50 Sebastian Wilhelmi 2004-02-29 09:51:31 UTC

According to Ulrich Drepper we should better replace the inline
assembler functions by normal functions in the .c-files.

Also it turned out, that we better use gint instead of gint32 (they
are most often the same anyway) to be able to replace reference
counting for   ABI-fixed structures as well.

I will shortly attache a fist version of that.

Comment 51 Sebastian Wilhelmi 2004-02-29 10:36:36 UTC

Created attachment 24929 [details] [review]
patch \infty + 1

Comment 52 Sebastian Wilhelmi 2004-02-29 13:59:05 UTC

Some bugs fixed and tested on the following platforms:

sparc-v9 (sparc-solaris1.sourceforge.net):
------------------------------------------

glib     : 31.482255 sec
fallback : 44.756862 sec

alpha (usf-cf-alpha-linux-1.sourceforge.net):
---------------------------------------------

glib     : 26.235578 sec
fallback : 76.845411 sec

i486 (Athlon GHz; my computer):
-------------------------------

glib     : 6.249439 sec
fallback : 34.432714 sec

x86_64 (amd64-linux1.sourceforge.net):
--------------------------------------

glib     : 2.695916 sec
fallback : 12.063377 sec


Other platforms not tested yet.

sparc-v9 has really become slower. The rest is OK.

As we are not using inline functions any more I had to do some macro
magic, when the _add_ functions are defined in terms of cmp_xchg.
Otherwise on sparc glib would have been slower than fallback ;-)

Comment 53 Sebastian Wilhelmi 2004-02-29 14:01:56 UTC

Created attachment 24934 [details] [review]
patch \infty + 2

Comment 54 Sebastian Wilhelmi 2004-02-29 16:50:29 UTC

2004-02-29  Sebastian Wilhelmi  <seppi@seppi.de>
 
        * configure.in, glib/gatomic.c, glib/gatomic.h: Moved the
        assembler functions from gatomic.h to gatomic.c, which makes for
        better maintainability. Also use gint instead of gint32 to be able
        to use reference counting for ABI-fixed structures with
        gint/guint.
 
        * glib/gthread.h: Adapted accordingly.
 
        * tests/atomic-test.c: Updated to test for G_MAXINT and G_MININT.

Comment 55 Mark McLoughlin 2004-03-02 15:32:56 UTC

Sebastien asked my to try this on ppc64:

ppc64: (RS64-IV, 668mhz)
glib     : 5.707164 sec
function : 8.714315 sec
fallback : 34.817229 sec