GNOME Bugzilla – Bug 644437
GHashTable is memory hungry
Last modified: 2011-05-07 18:43:01 UTC
When the 7895161st element is added to a GHashTable, the memory usage goes up to ~400M. Temporarily, during the resize that happens then, it goes up to ~600M, all of which is accessed. That's about 75 bytes per element. (Someone threw an 8M-cell spreadsheet at Gnumeric. I hadn't quite expected that much memory to be used for the cell hash.)
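(Rough arithmetic, assuming the 24-byte buckets described below and table sizes around 8M and 16M buckets: the new ~16.8M-bucket array is 16.8M x 24 bytes ≈ 400M, and during the resize the old ~8.4M-bucket array (~200M) is still live alongside it, giving ~600M total, or ~75 bytes per element.)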
Currently, GHashTable uses 12 bytes per bucket on 32-bit platforms, or 24 bytes on 64-bit platforms. Open hash tables also maintain a variable number of unused buckets in addition to the actual elements.

I can think of one change that would reduce the memory footprint to 9 bytes per bucket (or 17 bytes on 64-bit platforms) without changing the API or scaling properties of GHashTable. This also applies to empty buckets and tombstones, so in effect the 600M figure would be reduced to 425M.

The bucket structure currently looks like this:

struct _GHashNode
{
  gpointer   key;
  gpointer   value;

  /* If key_hash == 0, node is not in use
   * If key_hash == 1, node is a tombstone
   * If key_hash >= 2, node contains data */
  guint      key_hash;
};

key_hash is 4 bytes, with another 4 bytes of padding on 64-bit platforms. We could take it out of the structure and maintain key_hash in a separate array. Additionally, we could store it as a single byte by xor'ing together the component bytes of the integer generated by the hash function. Subtracting the values for unused buckets and tombstones, there's still only a 1/254 chance that different values produce the same hash.

The only downside would be that we'd need to access at least two cache lines for each lookup. I don't know if that'd matter much in practice - it's worth benchmarking.

So:

struct _GHashNode
{
  gpointer   key;
  gpointer   value;
};

struct _GHashTable
{
  gint             size;
  gint             mod;
  guint            mask;
  gint             nnodes;
  gint             noccupied;  /* nnodes + tombstones */
  GHashNode       *nodes;

  /* If key_hash == 0, node is not in use
   * If key_hash == 1, node is a tombstone
   * If key_hash >= 2, node contains data */
  guchar          *key_hashes;

  GHashFunc        hash_func;
  GEqualFunc       key_equal_func;
  volatile gint    ref_count;

#ifndef G_DISABLE_ASSERT
  /*
   * Tracks the structure of the hash table, not its contents: is only
   * incremented when a node is added or removed (is not incremented
   * when the key or data of a node is modified).
   */
  int              version;
#endif

  GDestroyNotify   key_destroy_func;
  GDestroyNotify   value_destroy_func;
};

An added benefit is that the hash table's memory would be allocated in two chunks, reducing the need for contiguous memory slightly.

If we were to provide a different API, we could do even better. Maybe it's time for a GHashDict where key == value?
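To make the byte-folding concrete, a minimal sketch (fold_hash_to_byte is just an illustrative name, not proposed API):

#include <glib.h>

/* Fold a full 32-bit hash into one byte by xor'ing its component
 * bytes, then remap the two reserved marker values (0 = unused,
 * 1 = tombstone) so stored hashes are always >= 2.  254 usable
 * values remain, hence the 1/254 collision figure above. */
static guchar
fold_hash_to_byte (guint full_hash)
{
  guchar h = (full_hash ^ (full_hash >> 8) ^
              (full_hash >> 16) ^ (full_hash >> 24)) & 0xff;

  return h >= 2 ? h : h + 2;
}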
> If we were to provide a different API, we could do even better. Maybe it's
> time for a GHashDict where key == value?

I don't think a new API is needed for that. One could call

  g_hash_table_set_flags (h, ..._KEY_ONLY /* value */, ..._KEY_ONLY /* mask */);

at a time before the first element is inserted. That would require that the keys and values be stored in separate arrays, and only one of them would be allocated if the above flag was set. GHashNode would be down to single elements, i.e., pointless.

In my prototype I also got rid of the need to store the hash at all, see bug 644197. That's pretty brutal and is tricky unless the user can supply two magic key values.

I think I like your 1-byte version.
Having flags that can only be set prior to the first insertion feels a bit ad hoc to me, though admittedly it might be better than a whole new data structure. We could also silently convert between key-value and key-only fairly easily, avoiding the requirement that the flags be set prior to first insertion, though that would slow us down a little with extra checks on every operation.

The 1-byte trick would likely slow us down more, since the full hash would need to be recalculated for every key when the table is resized - potentially a couple of million calls to a user-provided function that might reference extra memory and cause a lot of page faults/cache misses.

It'd be nice to have benchmarks for this so we can see what the tradeoff is like, preferably from an app that uses GHashTable heavily (like Gnumeric). I don't trust synthetic benchmarks that much, since it's easy to over-emphasize a use case or get misleading results from a too-"clean" environment that way.
I don't think optimizing for millions of elements is really what GHashTable is for... If you can come up with non-intrusive ways to curb memory consumption, sure. But complicated restructuring and new flags don't sound attractive to me.
"I don't think optimizing for millions of elements is really what GHashTable is for..." That seems to mean "we shouldn't make GHashTable optimal for the case of millions of elements disregarding any effects of other cases". I don't think anyone will disagree with that. (And no-one suggested it.) I think the following is feasible without major changes: 1. Split the GHashNode array into three separate arrays: keys/values/hashes. [Saves 1/6] 2. Make the values array optional, in practice by pointing it the same place as the keys array so most code wouldn't notice. [Saves 1/3] Step 2 requires some kind of flag setting, but if you don't like making that explicit, a separate constructor, say g_hash_table_new_set (...), could do it. I think the old code that needs to know about step 2 is constructor, destructor, and resize. And, perhaps, a key==value check for insert.
would be interesting to see patches for 1 and 2, indeed.
Created attachment 186747 [details] [review] 0001-GHash-use-macros-to-check-for-magic-hash-values.patch First of 4 patches.
Created attachment 186748 [details] [review] 0002-GHash-split-nodes-array-into-seperate-arrays.patch
Created attachment 186749 [details] [review] 0003-GHash-eliminate-one-of-the-lookup_nodes-functions.patch
Created attachment 186750 [details] [review] 0004-GHash-introduce-set-mode.patch
Patch 2 is roughly step 1. Patch 4 is roughly step 2. Simple test case included. Patches 1 and 3 are for simplifying this to make the other two easier.
Unrelatedly, someone ought to decide whether to use "int" or "guint" for node indices.
Thanks! Now we need some performance tests...
Review of attachment 186750 [details] [review]:

::: glib/ghash.c
@@ +604,3 @@
+ *
+ * Creates a new #GHashTable like g_hash_table_new_full(), but for use as a
+ * set where key and value are always the same. Only the key will be stores

s/stores/stored/

@@ +955,2 @@
   g_return_if_fail (hash_table != NULL);
   g_return_if_fail (hash_table->ref_count > 0);

We should probably mention in the docs that it is an error to call g_hash_table_insert (h, k, v) with k != v on a hash set. And maybe have a g_warning?
We do? HP, do you have something from when you changed the algorithm? That would be less suspect than anything I can cook up.
> We should probably mention in the docs that it is an error to call
> g_hash_table_insert (h, k, v) with k != v on a hash set. And maybe have a
> g_warning ?

I was a bit more brutal, see g_hash_table_insert_internal:

  if (hash_table->is_set)
    value = key;

In other words, we ignore the value. It might be better to document that.
I'm very much in favor of adding a

  g_return_if_fail (!is_set || key == value)

to the insertion functions. The current approach of silently doing something else is not good: it makes people not find their bugs.

Also, I'd be interested in whether there are cache issues for large hashes when using 3 separate arrays. I don't think it's worth optimizing for that case, but if someone does perf measurements, it'd be nice to know.

Last but not least, do we need to think about bindability for this, so that bindings can allow set.insert (value) instead of needing set.insert (key, value)?
Bindability can be solved like this:

void
g_hash_table_set_insert (GHashTable *hash_table, gpointer key)
{
  g_hash_table_insert (hash_table, key, key);
}

(Explicitly a valid and meaningful call even for non-sets.)

I'm holding off on the little stuff until we figure out if HP has some performance tests around.
I added the following test function to tests/hash.c and timed the resulting code.

static void
performance_hash_test (void)
{
  GHashTable *hash_table = g_hash_table_new (NULL, NULL);
  int i;

  for (i = 0; i < (1 << 22); i++)
    {
      int ri = g_test_rand_int_range (0, 1 << 24);
      g_hash_table_insert (hash_table, GINT_TO_POINTER (ri), NULL);
    }

  for (i = 0; i < (1 << 22); i++)
    {
      int ri = g_test_rand_int_range (0, 1 << 24);
      if (ri & 16)
        g_hash_table_insert (hash_table, GINT_TO_POINTER (ri), NULL);
      else
        g_hash_table_remove (hash_table, GINT_TO_POINTER (ri));
    }

  g_hash_table_destroy (hash_table);
}

On this machine and with this test sample it appears that the new and old code are about the same speed. If anything, the new code appears to beat master. (That makes a bit of sense: the cpu's addressing unit can multiply by 4 and 8; multiplying by 24 takes two extra instructions.)

new:
real 0m3.977s  user 0m3.440s  sys 0m0.348s
real 0m3.883s  user 0m3.448s  sys 0m0.340s
real 0m3.956s  user 0m3.436s  sys 0m0.408s
real 0m3.947s  user 0m3.384s  sys 0m0.380s
real 0m3.990s  user 0m3.468s  sys 0m0.376s
real 0m3.968s  user 0m3.388s  sys 0m0.364s

old:
real 0m3.957s  user 0m3.500s  sys 0m0.416s
real 0m3.986s  user 0m3.412s  sys 0m0.440s
real 0m4.108s  user 0m3.556s  sys 0m0.416s
real 0m4.056s  user 0m3.560s  sys 0m0.352s
real 0m4.037s  user 0m3.452s  sys 0m0.460s
real 0m4.112s  user 0m3.528s  sys 0m0.404s
(In reply to comment #18)
> Bindability can be solved like this:
>
> void
> g_hash_table_set_insert (GHashTable *hash_table, gpointer key)
> {
>   g_hash_table_insert (hash_table, key, key);
> }
>
> (Explicitly a valid and meaningful call even for non-sets.)

If we go down that route, we probably also want

gboolean
g_hash_table_set_contains (GHashTable *ht, gpointer key)

My quick-and-dirty performance comparisons also point at the new code being slightly faster.
I don't like going that route. If we were to add set-specific APIs, we should have a GHashSet and give it a completely custom API, kinda like we do with all the different array types. But having hash table API that works on just some hash tables seems like a bad idea to me.
Both will work fine:

g_hash_table_set_insert (ht, key) will work just fine on non-set hash tables (assuming that g_hash_table_insert (ht, key, key) does), and

g_hash_table_set_contains (ht, key) will work just fine on any hash table, set or not...
Created attachment 186880 [details] [review] 0005-GHash-typo.patch
Created attachment 186881 [details] [review] 0006-GHash-don-t-ignore-values-for-sets.-Test-that-they-m.patch
Created attachment 186882 [details] [review] 0007-GHash-introduce-g_hash_table_exists-convenience-API.patch
Created attachment 186883 [details] [review] 0008-GHash-introduce-g_hash_table_set_insert.patch
Review of attachment 186880 [details] [review]: Just commit stuff like this IMO.
There are still things that can be improved memory-wise.

For example, if g_direct_hash is used -- explicitly or via NULL -- then there is no point in storing the hash value. We would still need the unused/tombstone/used marker, but a much smaller data item would suffice. I do *not* think that would be worth the complications of the code.

Unrelatedly, if the sorting of names in glib.symbols is important, then g_hash_table_unref and g_hash_table_ref need to be moved.
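To illustrate the g_direct_hash point (a sketch only -- per the above this is not actually proposed for inclusion, and the names are made up):

#include <glib.h>

/* With g_direct_hash the full hash is just the pointer bits, so
 * recomputing it on lookup/resize is free and only a 1-byte
 * unused/tombstone/used marker would need to be stored. */
enum { MARKER_UNUSED = 0, MARKER_TOMBSTONE = 1, MARKER_USED = 2 };

static guint
bucket_hash (GHashFunc hash_func, gconstpointer key)
{
  /* NULL means g_direct_hash, as in g_hash_table_new (NULL, NULL). */
  if (hash_func == NULL || hash_func == g_direct_hash)
    return GPOINTER_TO_UINT (key);   /* recomputed, never stored */

  return hash_func (key);
}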
Colin: attachment 186880 [details] [review] fixes a typo introduced in attachment 186750 [details] [review], so I can't commit it now.
(In reply to comment #28)
> There are still things that can be improved memory-wise.
>
> For example, if g_direct_hash is used -- explicitly or via NULL -- then
> there is no point in storing the hash value. We would still need the
> unused/tombstone/used marker, but a much smaller data item would suffice.
> I do *not* think that would be worth the complications of the code.

Yeah, let's hold off on that.

> Unrelatedly, if the sorting of names in glib.symbols is important, then
> g_hash_table_unref and g_hash_table_ref need to be moved.

It's not important, except for the fact that it makes stuff easier to find.

From my perspective, things are looking good; just waiting for a quick nod from hpj.
(In reply to comment #25)
> Created an attachment (id=186882) [details] [review]
> 0007-GHash-introduce-g_hash_table_exists-convenience-API.patch

I strongly prefer g_hash_table_contains() as a name for that function.
2 cents: Questions of bindability aside, I think it would make sense to have this happen 'automatically' for every hash table instead of introducing a new constructor. First, we decide a convention -- either 'value' is always (void*)1 or equal to key. Let's say key == value, since that's easier. Then we just need to create all hash tables in 'set' mode and add a check like this on inserts:

if G_UNLIKELY (key != value && key_array == value_array)
  value_array = memdup (value_array);

(G_UNLIKELY since it would only happen once per table, and sometimes never).

I agree with 'contains' being a better name than 'exists'. 'has' or 'has_key' would also be OK in my opinion.

I mildly prefer storing (void*)1 instead of key == value for a couple of reasons:

- it's less bizarre from a language-binding standpoint, since you just have HashTable<thing, bool>. Then you could have a function for insert_true or something like that.

- it lets you store NULL in your set in a way that's easy to distinguish (ie: without using the 'extended' lookup API)

but with the drawback:

- implementing lookup becomes very slightly more complicated (and therefore slightly slower). I wouldn't care if it wasn't such a common operation.
Ah. Of course, the contains() function would address my point about NULL.
At some point we need to stop talking and start doing.

(void*)1 is not a good option. It requires a lot of extra changes (== run-time tests) that are currently avoided by pointing the two arrays to the same data.
(In reply to comment #32)
> if G_UNLIKELY (key != value && key_array == value_array)
>   value_array = memdup (value_array);

I guess that last line was supposed to read "memdup (key_array)" instead? (Apart from the actual memdup implementation, which would be a malloc+memcpy I presume.)
Read again. key_array == value_array. :) But ya... obviously it was pseudo-code, since we'd use g_memdup and last I checked, memdup takes more arguments. :)
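For completeness, a self-contained sketch of that copy-on-divergence check (using g_memdup; the parameter names are illustrative, and as noted, memdup'ing either array is equivalent since they alias):

#include <glib.h>

/* On the first insert where key != value, split the aliased values
 * array off from the keys array; until then a 'set mode' table
 * carries no separate values allocation. */
static void
maybe_split_values (gpointer **value_array_p, gpointer *key_array,
                    gint size, gpointer key, gpointer value)
{
  if (G_UNLIKELY (key != value && *value_array_p == key_array))
    *value_array_p = g_memdup (key_array, size * sizeof (gpointer));
}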
(In reply to comment #30)
> From my perspective, things are looking good; just waiting for a quick nod
> from hpj.

Performance-wise, it looks good to me. I trust you on the API changes. I like Ryan's "automatic conversion" approach, but I worry about hard-to-notice memory inefficiencies where buggy app code accidentally turns a set into a map.
(In reply to comment #34)
> At some point we need to stop talking and start doing.

Indeed. So, since I quite like the 'automatic conversion' idea myself, here is what I have done now:

- I have taken the first 3 patches, which do the splitting of the nodes array into three separate arrays.

- The patch that introduces 'set mode' has been tweaked to do so transparently, without an explicit is_set boolean.

- The documentation now contains an example that shows how to use a hash table for storing a set (something along the lines sketched below).

- No new API has been added; we can always do that later if we decide that we need bindable API or a completely separate GHashSet.
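Presumably something like this (my sketch, not necessarily the exact example that landed in the docs):

#include <glib.h>

int
main (void)
{
  /* Use a GHashTable as a set: insert each key as its own value.
   * With transparent 'set mode', no separate values array is
   * allocated until some insert passes a value != its key. */
  GHashTable *set = g_hash_table_new (g_str_hash, g_str_equal);

  g_hash_table_insert (set, "red", "red");
  g_hash_table_insert (set, "green", "green");

  if (g_hash_table_lookup (set, "red"))
    g_print ("red is in the set\n");

  g_hash_table_destroy (set);
  return 0;
}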
Thinking about it, it should work, but are we sure that g_hash_table_new_full () with 2 DestroyNotifys works properly? A test with a refcounted structure would be nice.
Commit 6e45153 broke "make check":

** ERROR **: failed to reallocate slice from magazine (after g_thread_init): size=81
/bin/sh: line 5: 14743 Trace/breakpoint trap srcdir=. LIBCHARSET_ALIAS_DIR=../glib/libcharset MALLOC_CHECK_=2 MALLOC_PERTURB_=$((${RANDOM:-256} % 256)) ${dir}$tst
FAIL: slice-threadinit
(In reply to comment #39)
> Thinking about it, it should work, but are we sure that g_hash_table_new_full
> () with 2 DestroyNotifys works properly? A test with a refcounted structure
> would be nice.

Thinking about it, it doesn't make any sense to call

  g_hash_table_insert (h, key, key)

on a hash table with 2 destroy notifys
(In reply to comment #41)
> (In reply to comment #39)
> > Thinking about it, it should work, but are we sure that
> > g_hash_table_new_full () with 2 DestroyNotifys works properly? A test
> > with a refcounted structure would be nice.
>
> Thinking about it, it doesn't make any sense to call
>
>   g_hash_table_insert (h, key, key)
>
> on a hash table with 2 destroy notifys

But g_hash_table_insert (h, key, ref(key)) might make sense. I've added a test and fix for that now.
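The pattern in question, sketched with a simple refcounted structure (my example; the actual test added by the commit may differ):

#include <glib.h>

typedef struct { gint refs; } Thing;

static Thing *
thing_ref (Thing *t) { t->refs++; return t; }

static void
thing_unref (gpointer data)
{
  Thing *t = data;
  if (--t->refs == 0)
    g_free (t);
}

int
main (void)
{
  GHashTable *h = g_hash_table_new_full (g_direct_hash, g_direct_equal,
                                         thing_unref, thing_unref);
  Thing *t = g_new0 (Thing, 1);
  t->refs = 1;

  /* key and value are the same object; each destroy notify drops
   * one ref, so the object dies exactly when its entry does. */
  g_hash_table_insert (h, t, thing_ref (t));

  g_hash_table_destroy (h);   /* fires both notifies: refs 2 -> 0 */
  return 0;
}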
(In reply to comment #40)
> Commit 6e45153 broke "make check":
>
> ** ERROR **: failed to reallocate slice from magazine (after g_thread_init):
> size=81
> FAIL: slice-threadinit

Should be fixed now.
(In reply to comment #42)
> (In reply to comment #41)
> > Thinking about it, it doesn't make any sense to call
> >
> >   g_hash_table_insert (h, key, key)
> >
> > on a hash table with 2 destroy notifys
>
> But g_hash_table_insert (h, key, ref(key)) might make sense. I've added a
> test and fix for that now.

How do you propose that this might work from a g_hash_table_add_to_set() type API, should we want to add one? We don't know the ref function...

Also: even if we did know it, what if it is something like g_strdup() (not too unlikely, since I think that strings are the most commonly hashed items)?
One approach would be to only allow the use of add_to_set if value_destroy is NULL
i'm mildly surprised that i don't entirely hate that idea. :)
> One approach would be to only allow the use of add_to_set if value_destroy
> is NULL

There's no point. Using g_free as key and value destructor and inserting with key == value has always been a Bad[tm] idea. That has nothing to do with set functionality. Just don't do it.
I thought about this a bit more and figured out a way that we could dodge the issue somewhat.

I don't think it could be seen as a substantial API break if we failed to call the value destroy notify on NULL pointers. Probably there are two situations here anyway:

- we fail to call a function that would do nothing anyway, or

- we fail to call a function that would have crashed due to NULL

Both are fine, and it's hard to imagine a case where destroy(NULL) really needs to be called.

Then we could store NULL as the value for sets (and adjust the code accordingly). This would never cause problems of any kind, since we wouldn't call the destroy notify.

My original reason for favouring (void*)1 instead of NULL was so that you could use g_hash_table_lookup() to query the existence of an item in the set, but this can clearly be solved with the contains() function. The *only* problem that can be imagined here is that the user assumes that the return value of g_hash_table_lookup() is non-NULL when they call it on a key that they added with add_to_set(). That's really impossible to protect against (particularly when you consider that NULL is also used to simply signal that the key is not in the hash table in the first place).

I'll write a patch implementing my suggestion (sketched below). You can take it or leave it.
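A sketch of the removal path under that proposal (hypothetical internals; the real function names differ):

#include <glib.h>

/* Hypothetical removal helper: the value destroy notify is skipped
 * for NULL values, so set entries (stored with value == NULL) never
 * trigger the value destroy function. */
static void
remove_entry (gpointer key, gpointer value,
              GDestroyNotify key_destroy, GDestroyNotify value_destroy)
{
  if (key_destroy)
    key_destroy (key);

  /* The proposed tweak: only notify for non-NULL values. */
  if (value_destroy && value != NULL)
    value_destroy (value);
}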
Apart from the fact that I don't think we need separate set API on GHashTable, I do agree with Morten: if you create a set hash table with two destroy notifies, you deserve what you get. In particular, adding magic NULL special cases to the general code to break a little less when people do stupid things is not a good idea IMO.
Created attachment 187301 [details] [review] GHashTable: tweak sets
(In reply to comment #49)
> If you create a set hash table with two destroy notifies, you deserve what
> you get.

Your language binding may do this for you.
If anything, this is more a reason for separating hash tables and sets than it is for messing up destroy notifies.
> Your language binding may do this for you.

I don't get it. They did not do so yesterday, so why would they suddenly start doing it now?

Note that item->item mappings are far more useful than just item->1 mappings. With the former you can still search for an entry by supplying just enough information to satisfy equal+hash; the lookup function will then give you the full entry.

For example, in my case -- a set of cells in a spreadsheet -- I simply supply a cell structure with only the coordinates filled in to the lookup function. That gives me the cell that is actually in the hash.
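Illustrating that lookup idiom (my sketch; the Cell fields and hash/equal functions are made up to show the pattern, not Gnumeric's actual code):

#include <glib.h>

/* Hypothetical spreadsheet cell: only row/col participate in hashing
 * and equality, so a stack-allocated "probe" cell with just the
 * coordinates filled in is enough to find the real, stored cell. */
typedef struct
{
  int row, col;
  char *contents;   /* ignored by hash/equal */
} Cell;

static guint
cell_hash (gconstpointer p)
{
  const Cell *c = p;
  return c->row * 257u + c->col;
}

static gboolean
cell_equal (gconstpointer a, gconstpointer b)
{
  const Cell *ca = a, *cb = b;
  return ca->row == cb->row && ca->col == cb->col;
}

static Cell *
lookup_cell (GHashTable *cells, int row, int col)
{
  Cell probe = { row, col, NULL };

  /* Returns the full cell stored in the hash, contents and all. */
  return g_hash_table_lookup (cells, &probe);
}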