After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 687161 - IndividualAggregator performance issue
IndividualAggregator performance issue
Status: RESOLVED FIXED
Product: folks
Classification: Platform
Component: libfolks
git master
Other Linux
: Normal normal
: Unset
Assigned To: Simon McVittie
folks-maint
Depends on:
Blocks: 671995
 
 
Reported: 2012-10-30 00:26 UTC by Jeremy Whiting
Modified: 2017-08-08 18:34 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Improve performance of IndividualAggregator (4.24 KB, patch)
2012-10-30 00:26 UTC, Jeremy Whiting
rejected Details | Review
simple test case (3.63 KB, text/plain)
2012-10-30 09:42 UTC, Xavier Claessens
  Details
WiP: IndividualAggregator: use a GHashTable<string,GPtrArray> for the link map (10.42 KB, patch)
2013-03-25 20:57 UTC, Simon McVittie
committed Details | Review
fixup "IndividualAggregator: use a GHashTable<string,GPtrArray> for the link map" (2.75 KB, patch)
2013-03-28 12:10 UTC, Simon McVittie
committed Details | Review
IndividualAggregator: assert and assume that _link_map[key][i] != null (3.90 KB, patch)
2013-03-28 12:26 UTC, Simon McVittie
committed Details | Review
IndividualAggregator: be a bit more careful about nullability (2.51 KB, patch)
2013-03-28 12:28 UTC, Simon McVittie
committed Details | Review
By default, do debug output iff at least one debug domain is enabled (1.23 KB, patch)
2013-04-02 16:54 UTC, Simon McVittie
committed Details | Review
backends: when adding a persona by details, do not assume HashSet (2.66 KB, patch)
2013-04-02 18:34 UTC, Simon McVittie
committed Details | Review
Do not explicitly link libfolks-internal.la into things (4.59 KB, patch)
2013-04-02 18:34 UTC, Simon McVittie
committed Details | Review
Folks.SmallSet: add and test (46.81 KB, patch)
2013-04-02 18:38 UTC, Simon McVittie
committed Details | Review
folks, backends: use "small sets" instead of hash sets most of the time (63.67 KB, patch)
2013-04-02 18:41 UTC, Simon McVittie
committed Details | Review
Add fast-path iteration for SmallSet (3.94 KB, patch)
2013-04-02 18:47 UTC, Simon McVittie
committed Details | Review
FolksSmallSet: add a doc-comment (1.57 KB, patch)
2013-04-03 17:12 UTC, Simon McVittie
committed Details | Review
FolksSmallSet: simplify copy() by removing generic typing (3.28 KB, patch)
2013-04-03 17:13 UTC, Simon McVittie
committed Details | Review
FolksSmallSet: coding style (666 bytes, patch)
2013-04-03 17:13 UTC, Simon McVittie
committed Details | Review
FolksSmallSet: cope with copying a non-set iterable with no free-function (958 bytes, patch)
2013-04-03 17:14 UTC, Simon McVittie
committed Details | Review
FolksSmallSet: Iterator.valid is boolean, not int (766 bytes, patch)
2013-04-03 17:14 UTC, Simon McVittie
committed Details | Review
SmallSet: amend comment (836 bytes, patch)
2013-04-03 17:14 UTC, Simon McVittie
committed Details | Review
Trf.PersonaStore: amend comment about not using SmallSet (1.14 KB, patch)
2013-04-03 17:15 UTC, Simon McVittie
committed Details | Review
Use SmallSet.copy() where it makes sense (4.67 KB, patch)
2013-04-03 17:24 UTC, Simon McVittie
committed Details | Review
Declare folks_small_set_copy properly (857 bytes, patch)
2013-04-04 17:34 UTC, Simon McVittie
committed Details | Review

Description Jeremy Whiting 2012-10-30 00:26:28 UTC
Created attachment 227604 [details] [review]
Improve performance of IndividualAggregator

Xavier Classens told me today a performance issue in IndividualAggregator comes from _remove_individuals_from_link_map.  Upon investigating I found we are calling that method which itself iterates over a HashMultiMap from within loops in _add_personas and _personas_changed_cb.  

In an effort to make this faster, I thought we could pass a Collection of Individuals into the method, and while iterating over the HashMultiMap checking against the whole list of Individuals we are interested in.  Note this only improves performance if the slow down was caused by iterating over the HashMultiMap multiple times.  I.E. instead of iterating over Collection A and iterating over HashMultiMap B inside we iterate over HashMultiMap B and iterate over Collection A inside.

I've attached a patch that builds and runs but haven't ran any performance checks on it.  Xavier, if you could do that, or show me how to that would be great.
Comment 1 Xavier Claessens 2012-10-30 09:42:09 UTC
sadly, this patch actually makes things worse... sorry :/
Comment 2 Xavier Claessens 2012-10-30 09:42:36 UTC
Created attachment 227620 [details]
simple test case
Comment 3 Xavier Claessens 2012-10-30 09:48:51 UTC
Without the patch:
real	0m1.555s
user	0m1.716s
sys	0m0.048s

With your patch:
real	0m5.759s
user	0m5.916s
sys	0m0.052s
Comment 4 Philip Withnall 2012-11-03 16:29:02 UTC
Suggestion: change IndividualAggregator._link_map’s type from MultiMap<string, Individual> to MultiMap<string, Persona> to indirect the Individuals.

The problem with _remove_individual_from_link_map() is that it has to iterate over the entire link map to find the values which match the individual to be removed. In the case you highlighted, it’s because that Individual pointer is being replaced by another one which will use exactly the same keys in the link_map (plus potentially some extras). Making the link_map map keys to Personas (rather than to Individuals) gives a layer of indirection over the Individuals.

Individuals can now be accessed using persona.individual, and afaik those pointers should be valid and correct at that point during linking. (This should be checked though.)

I hope that makes sense.
Comment 5 Philip Withnall 2012-12-30 18:26:09 UTC
Comment on attachment 227604 [details] [review]
Improve performance of IndividualAggregator

Rejecting as per comment #1.
Comment 6 Simon McVittie 2013-03-25 20:49:45 UTC
(In reply to comment #4)
> The problem with _remove_individual_from_link_map() is that it has to iterate
> over the entire link map to find the values which match the individual to be
> removed.

It turns out this can be made an order of magnitude faster by just not using Gee for the _link_map, so it might not be necessary to do this (but I'll try it anyway.)
Comment 7 Simon McVittie 2013-03-25 20:57:59 UTC
Created attachment 239831 [details] [review]
 WiP: IndividualAggregator: use a  GHashTable<string,GPtrArray> for the link map

The link map is a hot path, particularly when matching multiple copies
of the same contact. Speed this up.

---

This takes the "clones" and "clones-one-individual" test cases in tests/eds/perf (3x500 triples contacts where each triple is identical apart from the Evolution UID) from about 10 seconds down to about 1.7 seconds, and the other test cases (500 different contacts) from about 0.4 to 0.37 seconds.

This is work-in-progress. I haven't yet tried your suggestion from this bug, profiled the 500-contact cases properly, tried with real-world data, or tried with G(S)List instead of GPtrArray (which might also be a win).

remove_individual_from_link_map() was previously accounting for over 70% of samples in callgrind in this synthetic test (which failed due to timing out), and tens of percent in sysprof when Arun tried it in jhbuild with real Telepathy accounts.
Comment 8 Simon McVittie 2013-03-26 20:05:49 UTC
(In reply to comment #7)
> This is work-in-progress. I haven't yet tried your suggestion from this bug,
> profiled the 500-contact cases properly, tried with real-world data, or tried
> with G(S)List instead of GPtrArray (which might also be a win).

It's a pretty clear win with synthetic data, at least (10 -> 1.7 seconds for the "bad case" of 3x500 matched contacts; ~4.2 -> ~3.7 seconds for the rest), so I'm inclined to say it's ready for review.

I haven't tried with G(S)List, but I'm inclined to avoid that anyway, because Vala.

Testing with real data isn't working very well for me - the test '_test_preprepared' usually either crashes with "Couldn't cache avatar" or stalls while preparing the e-d-s source - but on the tests that do run, there's a similar performance gain. So that's encouraging.

Some other bits and pieces that have potential for improvement:

* edsf_persona_real_linkable_property_to_links() is spending a lot of time playing with Gee iterators

* edsf_persona_constructor() spends a lot of time in _edsf_persona_update()

* folks_individual_set_personas() appears to be connecting 271,500 signals across the 8000 individuals tested
Comment 9 Simon McVittie 2013-03-26 20:10:02 UTC
(In reply to comment #4)
> Suggestion: change IndividualAggregator._link_map’s type from MultiMap<string,
> Individual> to MultiMap<string, Persona> to indirect the Individuals.
> 
> The problem with _remove_individual_from_link_map() is that it has to iterate
> over the entire link map to find the values which match the individual to be
> removed.

_remove_individual_from_link_map() is down from around 70% to 2.3% in callgrind, so it's no longer the bottleneck that it used to be. edsf_persona_real_linkable_property_to_links() is now lower-hanging fruit.
Comment 10 Philip Withnall 2013-03-28 07:27:01 UTC
Review of attachment 239831 [details] [review]:

Looking good. I haven’t had a chance to compile/test this. I assume the test suites all pass with it applied? This is great work; let’s get it merged soon so that it gets plenty of testing throughout the 3.10 cycle.

(In reply to comment #9)
> edsf_persona_real_linkable_property_to_links() is now lower-hanging fruit.

There are *_linkable_property_to_links() implementations in every subclass of Persona. I think this design was required by Vala limitations at the time. It would make sense to consolidate them all somewhere before starting to optimise, so the optimisations help every backend, not just EDS. Consolidation might also allow you to remove a virtual method call, which would be a performance win. Inlining the method (e.g. in IndividualAggregator) would also mean it’s not called inside a loop over each Persona’s (linkable) properties, which would also be a performance win.

::: folks/individual-aggregator.vala
@@ +121,3 @@
   private unowned PersonaStore? _primary_store = null;
   private HashSet<Backend> _backends;
+  private HashTable<string, GenericArray<Individual>> _link_map;

It should be documented that this needs to have MultiMap semantics; i.e. duplicate entries in a given GenericArray instance aren’t allowed, and each GenericArray must be non-empty.

It should probably also be noted that strong references are held to Individuals in the link map. I tried to remove this requirement a while ago, but things broke, so it’s better explicitly documented.

@@ +519,3 @@
       debug.indent ();
 
+      var iter = HashTableIter<string,GenericArray<Individual>> (

Missing space before ‘GenericArray’.

@@ +1122,2 @@
                     {
+                      /* FIXME: can this really be null? */

Doesn’t look like it could be NULL. Is Vala giving you a warning about it? Might be a Vala bug. You could try “unowned GenericArray<Individual> _candidates = (!) candidates” and see if indexing into _candidates gives the same problem.

@@ +1356,3 @@
     }
 
+  private void _link_map_set (string key, Individual individual)

Might want to briefly document this as adding an individual to the link map, just as if “MultiMap.set()” was being used.

@@ +1642,1 @@
       /* Validate the link map. */

It might be worthwhile adding some code below to validate that _link_map does actually have MultiMap semantics (e.g. every GenericArray is non-empty and contains no duplicates). Doesn’t matter about efficiency here, since this is only debug code.
Comment 11 Simon McVittie 2013-03-28 12:02:09 UTC
(In reply to comment #10)
> Looking good. I haven’t had a chance to compile/test this. I assume the test
> suites all pass with it applied?

Yes.

> (In reply to comment #9)
> > edsf_persona_real_linkable_property_to_links() is now lower-hanging fruit.
> 
> There are *_linkable_property_to_links() implementations in every subclass of
> Persona. I think this design was required by Vala limitations at the time. It
> would make sense to consolidate them all somewhere before starting to optimise,
> so the optimisations help every backend, not just EDS.

Perhaps - I'll check what they do.

> +  private HashTable<string, GenericArray<Individual>> _link_map;
> 
> It should be documented that this needs to have MultiMap semantics; i.e.
> duplicate entries in a given GenericArray instance aren’t allowed, and each
> GenericArray must be non-empty.

Sure.

> Missing space before ‘GenericArray’.

Fixed.

> +                      /* FIXME: can this really be null? */
> 
> Doesn’t look like it could be NULL. Is Vala giving you a warning about it?

No, but the existing code is explicitly allowing for null even though, based on the syntax/nullability, I don't think _link_map[key] can ever contain a null.

I didn't want to make the unrelated change of "assume non-null" in the same commit as this, so I just left a comment. I'd have just said "var" but I wasn't sure that that interacted nicely with an "unowned" accessor.

> Might want to briefly document this as adding an individual to the link map,
> just as if “MultiMap.set()” was being used.

Fixed.

> It might be worthwhile adding some code below to validate that _link_map does
> actually have MultiMap semantics

OK.

> Doesn’t matter about efficiency here, since this is
> only debug code.

You say that, but debug_output_enabled defaults to true and I can't find anything that makes it false, other than a library user explicitly saying Folks.Debug.dup ().debug_output_enabled = false... perhaps tests/eds/perf should explicitly turn it off?

On the other hand, what I really want to optimize is real-life uses of Folks (GNOME Contacts, Empathy, etc.) so tests/eds/perf should reflect the "most natural" thing for such applications to do, even if it's not as fast as it could be.

tests/eds/perf does not vary its runtime significantly when these checks are enabled or disabled. However, it is a relatively "good" case for these checks, because it adds all individuals simultaneously. The thing that would be made expensive by these checks is adding individuals one at a time, which would be O(individuals**2): perhaps this should be an additional test? (Add 50 clones of existing personas and 50 new ones, or something?)
Comment 12 Simon McVittie 2013-03-28 12:10:46 UTC
Created attachment 240034 [details] [review]
fixup "IndividualAggregator: use a  GHashTable<string,GPtrArray> for the link map"

In response to review - I'll squash it for commit but it's probably
easier to review like this.

From your review comments, I didn't (yet) fix the stuff about whether _link_map[i] can contain null.
Comment 13 Simon McVittie 2013-03-28 12:26:11 UTC
Created attachment 240036 [details] [review]
IndividualAggregator: assert and assume that  _link_map[key][i] != null

The link map's type is non-nullable, and when we add individuals
to the link map we do so through a function that does not allow
a null argument, so we don't need to sprinkle (!) casts around.
Comment 14 Simon McVittie 2013-03-28 12:28:02 UTC
Created attachment 240037 [details] [review]
IndividualAggregator: be a bit more careful about  nullability

This change is not sufficient for Folks or even individual-aggregator.vala
to compile successfully with --enable-experimental-non-null, but it
at least makes my recent patch not a regression in that department...

Unfortunately, valac --enable-experimental-non-null won't allow us to
assert at runtime that an allegedly non-null thing is, in fact, non-null.
Trust the compiler, the compiler is your friend?

---

Is there any way to make valac --enable-experimental-non-null allow comparison of non-nullable types with null - at least when it appears in an assertion - or perhaps downgrade nullability violations to a warning?
Comment 15 Philip Withnall 2013-03-28 19:39:36 UTC
(In reply to comment #11)
> > +                      /* FIXME: can this really be null? */
> > 
> > Doesn’t look like it could be NULL. Is Vala giving you a warning about it?
> 
> No, but the existing code is explicitly allowing for null even though, based on
> the syntax/nullability, I don't think _link_map[key] can ever contain a null.
> 
> I didn't want to make the unrelated change of "assume non-null" in the same
> commit as this, so I just left a comment. I'd have just said "var" but I wasn't
> sure that that interacted nicely with an "unowned" accessor.

Oh yes, so it does. I guess that’s left from an earlier refactoring.

> > Doesn’t matter about efficiency here, since this is
> > only debug code.
> 
> You say that, but debug_output_enabled defaults to true and I can't find
> anything that makes it false, other than a library user explicitly saying
> Folks.Debug.dup ().debug_output_enabled = false... perhaps tests/eds/perf
> should explicitly turn it off?

Eek. debug_output_enabled should default to false, and only be set to true in Debug.dup_with_flags() if a non-empty set of debug domains is provided (via G_MESSAGES_DEBUG). Would you mind fixing this? It shouldn’t change behaviour (apart from disabling redundant code in the IndividualAggregator), as the output is filtered out in debug() anyway if no debug domains are active.

> On the other hand, what I really want to optimize is real-life uses of Folks
> (GNOME Contacts, Empathy, etc.) so tests/eds/perf should reflect the "most
> natural" thing for such applications to do, even if it's not as fast as it
> could be.

Applications should definitely be executing with (debug_output_enabled == false). It’s a bug.

> tests/eds/perf does not vary its runtime significantly when these checks are
> enabled or disabled. However, it is a relatively "good" case for these checks,
> because it adds all individuals simultaneously. The thing that would be made
> expensive by these checks is adding individuals one at a time, which would be
> O(individuals**2): perhaps this should be an additional test? (Add 50 clones of
> existing personas and 50 new ones, or something?)

I’m not sure how realistic it is to add a lot of new Personas individually and in quick succession; that should only happen when a PersonaStore appears (or disappears), and it’s a bug if a store isn’t batching up changes like that.

If it’s not too much work, it can’t hurt to have such a test case, though.
Comment 16 Philip Withnall 2013-03-28 19:40:42 UTC
Review of attachment 240034 [details] [review]:

++
Comment 17 Philip Withnall 2013-03-28 19:58:40 UTC
Review of attachment 240036 [details] [review]:

++, subject to merging with the next patch.
Comment 18 Philip Withnall 2013-03-28 20:02:48 UTC
Review of attachment 240037 [details] [review]:

::: folks/individual-aggregator.vala
@@ +1131,2 @@
                     {
+                      var candidate_ind = ((!) candidates)[i];

I guess it would be a bit neater to say:
    var _candidates = this._link_map.get (…);
    if (_candidates != null)
      {
        var candidates = (!) _candidates;

        for (uint i = 0; i < candidates.length; i++)
          …

but it’s not worth iterating the patch again. Please commit! :-)
Comment 19 Philip Withnall 2013-03-28 20:08:02 UTC
(In reply to comment #14)
> Is there any way to make valac --enable-experimental-non-null allow comparison
> of non-nullable types with null - at least when it appears in an assertion - or
> perhaps downgrade nullability violations to a warning?

You could try static casting to a nullable type:
    Individual my_non_nullable_thing = …;
    assert ((Individual?) my_non_nullable_thing != null);

It might be better to use a dynamic cast:
    Individual my_non_nullable_thing = …;
    assert ((my_non_nullable_thing as Individual?) != null);

but I have no idea if this will work syntax-wise. Dynamic casts differ from static casts in that they insert a GObject casting macro (e.g. the above should, IIRC, expand to FOLKS_INDIVIDUAL(my_non_nullable_thing)). Static casts don’t insert the macro, and just assume everything will be fine.

If casting doesn’t work, the following really should work (but ewwww):
    Individual my_non_nullable_thing = …;
    Individual? _my_non_nullable_thing = my_non_nullable_thing;
    assert (_my_non_nullable_thing != null);
Comment 20 Simon McVittie 2013-04-02 16:46:08 UTC
Comment on attachment 239831 [details] [review]
 WiP: IndividualAggregator: use a  GHashTable<string,GPtrArray> for the link map

Part of 2ec38a5d15
Comment 21 Simon McVittie 2013-04-02 16:46:35 UTC
Comment on attachment 240034 [details] [review]
fixup "IndividualAggregator: use a  GHashTable<string,GPtrArray> for the link map"

Part of 2ec38a5d15
Comment 22 Simon McVittie 2013-04-02 16:47:01 UTC
Comment on attachment 240036 [details] [review]
IndividualAggregator: assert and assume that  _link_map[key][i] != null

Committed as part of c438d6d00a
Comment 23 Simon McVittie 2013-04-02 16:47:04 UTC
Comment on attachment 240037 [details] [review]
IndividualAggregator: be a bit more careful about  nullability

Committed as part of c438d6d00a
Comment 24 Simon McVittie 2013-04-02 16:54:35 UTC
Created attachment 240412 [details] [review]
By default, do debug output iff at least one debug domain is  enabled

Previously, debug output (and expensive checks that are enabled alongside
debug output) was always enabled unless explicitly disabled, but
Philip noted that this was not intentional.

---

In response to Comment #15.
Comment 25 Simon McVittie 2013-04-02 18:34:21 UTC
Created attachment 240420 [details] [review]
backends: when adding a persona by details, do not  assume HashSet

All we need is a Set, so we shouldn't need to cast this strictly.
Comment 26 Simon McVittie 2013-04-02 18:34:42 UTC
Created attachment 240421 [details] [review]
Do not explicitly link libfolks-internal.la into things

libfolks.la embeds a copy of libfolks-internal.la, so every public
(or internal) symbol in libfolks-internal.la is available in
libfolks.la, even if linked with -Wl,--no-copy-dt-needed-entries.

If libfolks-internal.la defines a GObject type, then linking two copies
of it is actively harmful, because they can't both be registered with the
type system.
Comment 27 Simon McVittie 2013-04-02 18:38:05 UTC
Created attachment 240422 [details] [review]
Folks.SmallSet: add and test

---

_folks_small_set_new_take_array() is not bound into Vala. It's for FolksSmallMap and FolksSmallMultiMap, which I think are likely to be a bigger performance win than FolksSmallSet, but aren't finished yet.

The regression test might not have full coverage, but is thorough enough that it found Bug #696710 when I used HashSet to help reverse-engineer what API SmallSet should present :-)
Comment 28 Simon McVittie 2013-04-02 18:41:22 UTC
Created attachment 240423 [details] [review]
folks, backends: use "small sets" instead of hash sets  most of the time

Notable exceptions are:

* sets of personas that are, or might be, the entire set from a backend
  (possible future refinement: use a SmallSet if there aren't many)

* sets of potentially many Individuals (likewise)

* the object cache (potentially pretty large)

* debug stuff (not relevant for performance)

* the set of IM addresses being matched (we want to keep this O(1),
  but it should be a GHashTable)

This speeds up tests/eds/perf by around 5%.

---

I hope SmallMultiMap will be a bigger performance win in practice.
Comment 29 Simon McVittie 2013-04-02 18:47:49 UTC
Created attachment 240424 [details] [review]
Add fast-path iteration for SmallSet

---

The e-d-s part of this accounts for about a 10% time decrease.
Comment 30 Philip Withnall 2013-04-03 12:51:23 UTC
Review of attachment 240412 [details] [review]:

++
Comment 31 Philip Withnall 2013-04-03 12:52:15 UTC
Review of attachment 240420 [details] [review]:

++
Comment 32 Philip Withnall 2013-04-03 12:53:51 UTC
Review of attachment 240421 [details] [review]:

Looks good. How did you notice this was a problem?
Comment 33 Philip Withnall 2013-04-03 13:52:53 UTC
Review of attachment 240422 [details] [review]:

Looking good so far. A few minor comments.

::: folks/folks-generics.vapi
@@ +24,3 @@
+
+/* Unfortunately, we have to do this as a .vapi because going from C to
+ * GIR to Vala loses the generic types. */

Please add a FIXME referencing bug #639908 (about GIR support for generics).

::: folks/small-set.c
@@ +24,3 @@
+
+#include "folks/small-set.h"
+#include "folks/small-set-internal.h"

Would be useful to have a block comment at the top of here briefly explaining why SmallSet is needed, what it’s optimised (and *not* optimised) for, and its overall design (i.e. wrap a GPtrArray and present the same API as Gee.HashSet but without making use of the hashing).

@@ +190,3 @@
+      NULL);
+  other->items = g_ptr_array_ref (self->items);
+  other->flags = FOLKS_SMALL_SET_FLAG_READ_ONLY;

Might it be more future-proof to use:
    other->flags = self->flags | FOLKS_SMALL_SET_FLAG_READ_ONLY;
here, in case other flags are added in future?

@@ +211,3 @@
+
+  /* FIXME: benchmark whether giving self a weak ref to other is a
+   * performance win or not */

FIXME here.

@@ +283,3 @@
+{
+  FolksSmallSet *self = g_object_new (FOLKS_TYPE_SMALL_SET,
+      NULL);

There are a few places in the IndividualAggregator where we deliberately pass around empty Sets of Personas (such as the personas-changed signal IIRC). At the moment, we don’t make use of Set.empty(), but it might make sense to do so in order to save allocation time (especially if the immutable empty set was a singleton). Then again, personas-changed is only emitted a few tens of times before quiescence, so it would probably make very little difference.

@@ +330,3 @@
+{
+  FolksSmallSet *self;
+  GeeIterator *iter;

Would it be worth asserting that item_type subsumes the item type of the existing set?

@@ +347,3 @@
+      self->items = g_ptr_array_new_full (other->items->len,
+          item_free);
+      self->flags = 0;

For future-proofness, would this be better as:
    self->flags = (other->flags & ~FOLKS_SMALL_SET_FLAG_READ_ONLY);

@@ +352,3 @@
+        {
+          g_ptr_array_add (self->items,
+              item_dup (g_ptr_array_index (other->items, i)));

This path uses the new item_dup() function, whereas the two slow paths below (implicitly) use the item_dup() function from the old set. For consistency, we probably want to use the old set’s function in this path too. (I would hope they’re the same function, but who knows.)

@@ +376,3 @@
+      /* Do it the hard way: there might be duplicates. */
+
+      while (gee_iterator_next (iter))

Errant new line.

@@ +381,3 @@
+
+          if (_find (self, item, NULL))
+            item_free (item);

Again, this is using the item_free() function from the new set rather than the old one.

@@ +516,3 @@
+    copy = self->item_dup ((gpointer) item);
+
+  g_ptr_array_add (self->items, copy);

Are there many cases in folks where we add a Set of elements to a SmallSet (i.e. construct the set union)? If so, it might make sense to implement Gee.Collection.add_all() and add a fast path which is O(n) rather than O(n^2). Using add_all() is currently O(n^2) (for n elements in the set being added) because it unnecessarily checks each of the new elements against each of the new elements which have already been appended to the array. Though for small sets this may not be a problem.

@@ +662,3 @@
+    {
+      case ITER_PROP_VALID:
+        g_value_set_int (value, _iterator_is_valid (self));

Shouldn’t this be g_value_set_boolean()? Gee says the property is has bool type: http://www.valadoc.org/#!api=libgee-0.8/Gee.Iterator.valid

::: tests/folks/small-set.vala
@@ +101,3 @@
+        });
+
+      /* GNOME #696710: this is wrong in HashSet */

Please use the phrase ‘FIXME’ so things like this are greppable.
Comment 34 Philip Withnall 2013-04-03 14:10:54 UTC
Review of attachment 240423 [details] [review]:

Looking good. A few minor comments and thoughts. This is probably ready to commit after a quick check through once the SmallSet patch is ready to commit.

::: backends/tracker/lib/trf-persona-store.vala
@@ +871,3 @@
     {
+      /* FIXME: would changing this to SmallSet be an API break?
+       * It's documented that we use HashSet, and Trf.Persona assumes it */

Yeah, it would be an API break which we should avoid. Either leave it using HashSet, or create a private version of unserialize_local_ids() which uses SmallSet, and refactor the public version to be a wrapper around it which converts the SmallSet into a HashSet for any client program unfortunate enough to call unserialize_local_ids().

::: backends/tracker/lib/trf-persona.vala
@@ +560,3 @@
           AbstractFieldDetails<string>.hash_static,
           AbstractFieldDetails<string>.equal_static);
       this._phone_numbers_ro = this._phone_numbers.read_only_view;

Another idea for performance improvements (though completely unrelated to this bug): lazily allocate the read-only copies of sets and other data structures. Only allocate them when the relevant getter method is called. That could up to halve the number of GObject allocations we do if clients only access a few of the properties in a Persona or Individual.

::: folks/individual-aggregator.vala
@@ +1946,3 @@
+      var personas = new SmallSet<Persona> ();
+      /* FIXME: this is O(n**2) but if we know that individual.personas
+       * is a SmallSet, we can do it in O(n) */

See my earlier comment about Gee.Collection.add_all(). Even if we don’t implement it in SmallSet, this code should be changed to:
    personas.add_all (individual.personas);
I’m sure there are many other places throughout folks where this pattern of replacement could be used. Probably wherever there’s a comment saying that we need to copy a set before iteration. It would be useful to survey the code for those patterns and factor them out into some utility methods, since there are probably easy performance gains to be had there.
Comment 35 Philip Withnall 2013-04-03 14:14:06 UTC
Review of attachment 240424 [details] [review]:

++
Comment 36 Philip Withnall 2013-04-03 14:15:56 UTC
(In reply to comment #33)
> @@ +283,3 @@
> +{
> +  FolksSmallSet *self = g_object_new (FOLKS_TYPE_SMALL_SET,
> +      NULL);
> 
> There are a few places in the IndividualAggregator where we deliberately pass
> around empty Sets of Personas (such as the personas-changed signal IIRC). At
> the moment, we don’t make use of Set.empty(), but it might make sense to do so
> in order to save allocation time (especially if the immutable empty set was a
> singleton). Then again, personas-changed is only emitted a few tens of times
> before quiescence, so it would probably make very little difference.

…and attachment 240423 [details] [review] makes use of SmallSet.empty(). Great! Still might potentially improve things to make the empty set a singleton.
Comment 37 Simon McVittie 2013-04-03 15:41:56 UTC
(In reply to comment #32)
> Review of attachment 240421 [details] [review]:
> 
> Looks good. How did you notice this was a problem?

When I added FolksSmallSet, there were two copies of folks_small_set_get_type() in one of the tests (one in the shared libfolks and one in the executable) and badness ensued.

(In reply to comment #33)
> Please add a FIXME referencing bug #639908 (about GIR support for generics).

Will do.

> Would be useful to have a block comment at the top of here briefly explaining
> why SmallSet is needed, what it’s optimised (and *not* optimised) for, and its
> overall design (i.e. wrap a GPtrArray and present the same API as Gee.HashSet
> but without making use of the hashing).

OK.

> +  other->flags = FOLKS_SMALL_SET_FLAG_READ_ONLY;
> 
> Might it be more future-proof to use:
>     other->flags = self->flags | FOLKS_SMALL_SET_FLAG_READ_ONLY;
> here, in case other flags are added in future?

I don't think I can say authoritatively whether future flags will be appropriate to copy or not until they're actually added.

> +  /* FIXME: benchmark whether giving self a weak ref to other is a
> +   * performance win or not */
> 
> FIXME here.

Noted. Without fixing this it's still a performance win, though.

> There are a few places in the IndividualAggregator where we deliberately pass
> around empty Sets of Personas (such as the personas-changed signal IIRC). At
> the moment, we don’t make use of Set.empty(), but it might make sense to do so
> in order to save allocation time (especially if the immutable empty set was a
> singleton). Then again, personas-changed is only emitted a few tens of times
> before quiescence, so it would probably make very little difference.

It's worth a go, I suppose. Do you mind it being a one-per-process "leak"?

> 
> @@ +330,3 @@
> +{
> +  FolksSmallSet *self;
> +  GeeIterator *iter;
> 
> Would it be worth asserting that item_type subsumes the item type of the
> existing set?

I suppose so. I haven't made much use of copy() yet: replacing some add_all() with copy() is another potential speedup.

> For future-proofness, would this be better as:
>     self->flags = (other->flags & ~FOLKS_SMALL_SET_FLAG_READ_ONLY);

Without knowing the semantics of hypothetical future flags I can't answer that. I'm inclined to leave it as-is until we find that we need another flag.

> @@ +352,3 @@
> +        {
> +          g_ptr_array_add (self->items,
> +              item_dup (g_ptr_array_index (other->items, i)));
> 
> This path uses the new item_dup() function, whereas the two slow paths below
> (implicitly) use the item_dup() function from the old set.

Hmm, that's true... we could just use _dup (other, i).

> Errant new line.

Noted.

> Again, this is using the item_free() function from the new set rather than the
> old one.

If that's a problem, then we've already lost, because we're dup'ing an item from the old set and inserting it into a GPtrArray whose free-function is the new free-function.

I don't want to check for equality, because stuff like -Bsymbolic can result in having two different pointers to what is fundamentally the same free-function.

I suppose we could make copy() not be generic, and retrieve the type and copy- and free-functions from the old container? That would remove any potential problem by only having one set. That would also change its Vala API from copy<K,V>(iterable, hash, equals) to copy(iterable, hash, equals), which is somewhat nicer anyway.

> @@ +516,3 @@
> +    copy = self->item_dup ((gpointer) item);
> +
> +  g_ptr_array_add (self->items, copy);
> 
> Are there many cases in folks where we add a Set of elements to a SmallSet
> (i.e. construct the set union)?

All the ones I've spotted so far are add_all() (or, worse, iterative adding) into an empty set, which makes copy() a better fast-path.

The major advantage of copy() over add_all() (which is, as you note, a union operation) is that we know the "left hand side of the union" is an empty set, so if the "right-hand side" is a Set, we don't need to test for uniqueness at all.

> +        g_value_set_int (value, _iterator_is_valid (self));
> 
> Shouldn’t this be g_value_set_boolean()?

Yes, it should. Looks like I didn't test that: writing the test in Vala means it compiles to gee_iterator_get_valid() or something, rather than falling all the way back to g_object_get().

> +      /* GNOME #696710: this is wrong in HashSet */
> 
> Please use the phrase ‘FIXME’ so things like this are greppable.

Will do. Bug #696710 has in fact been fixed now, but there doesn't seem any point in adding a versioned dependency just for this.
Comment 38 Simon McVittie 2013-04-03 15:45:21 UTC
(In reply to comment #34)
> Yeah, it would be an API break which we should avoid. Either leave it using
> HashSet, or create a private version of unserialize_local_ids() which uses
> SmallSet

I don't think unserialize_local_ids() shows up in my callgrind results, so I'll leave it (until/unless I find that it's significant).

> Another idea for performance improvements (though completely unrelated to this
> bug): lazily allocate the read-only copies of sets and other data structures.
> Only allocate them when the relevant getter method is called.

Yes, that's on my list.

> I’m sure there are many other places throughout folks where this pattern of
> replacement could be used. Probably wherever there’s a comment saying that we
> need to copy a set before iteration.

I think copy() is somewhat better than add_all() where it's usable, but yes, using add_all() is the next best thing.
Comment 39 Simon McVittie 2013-04-03 16:08:49 UTC
Comment on attachment 240412 [details] [review]
By default, do debug output iff at least one debug domain is  enabled

5226ce939
Comment 40 Simon McVittie 2013-04-03 16:09:07 UTC
Comment on attachment 240420 [details] [review]
backends: when adding a persona by details, do not  assume HashSet

e5b07508
Comment 41 Simon McVittie 2013-04-03 16:09:20 UTC
Comment on attachment 240421 [details] [review]
Do not explicitly link libfolks-internal.la into things

fcfd16cbc
Comment 42 Simon McVittie 2013-04-03 17:12:49 UTC
Created attachment 240508 [details] [review]
FolksSmallSet: add a doc-comment
Comment 43 Simon McVittie 2013-04-03 17:13:05 UTC
Created attachment 240509 [details] [review]
FolksSmallSet: simplify copy() by removing generic  typing

If we pick up the generic type, copy and free functions from the
iterable, then we don't need to worry about whether ours and theirs
are consistent: by definition, they are.
Comment 44 Simon McVittie 2013-04-03 17:13:18 UTC
Created attachment 240510 [details] [review]
FolksSmallSet: coding style
Comment 45 Simon McVittie 2013-04-03 17:14:09 UTC
Created attachment 240511 [details] [review]
FolksSmallSet: cope with copying a non-set iterable  with no free-function

---

This would previously have crashed. This, and the rest of this round of patches, could be squashed into Attachment #240422 [details] if you want.
Comment 46 Simon McVittie 2013-04-03 17:14:23 UTC
Created attachment 240512 [details] [review]
FolksSmallSet: Iterator.valid is boolean, not int
Comment 47 Simon McVittie 2013-04-03 17:14:43 UTC
Created attachment 240513 [details] [review]
SmallSet: amend comment
Comment 48 Simon McVittie 2013-04-03 17:15:01 UTC
Created attachment 240514 [details] [review]
Trf.PersonaStore: amend comment about not using  SmallSet
Comment 49 Simon McVittie 2013-04-03 17:24:31 UTC
Created attachment 240516 [details] [review]
Use SmallSet.copy() where it makes sense

It isn't more than fractionally faster in tests/eds/perf, but might help.
Comment 50 Philip Withnall 2013-04-03 20:06:11 UTC
(In reply to comment #37)
> > +  other->flags = FOLKS_SMALL_SET_FLAG_READ_ONLY;
> > 
> > Might it be more future-proof to use:
> >     other->flags = self->flags | FOLKS_SMALL_SET_FLAG_READ_ONLY;
> > here, in case other flags are added in future?
> 
> I don't think I can say authoritatively whether future flags will be
> appropriate to copy or not until they're actually added.

Fair enough. :-)

> > There are a few places in the IndividualAggregator where we deliberately pass
> > around empty Sets of Personas (such as the personas-changed signal IIRC). At
> > the moment, we don’t make use of Set.empty(), but it might make sense to do so
> > in order to save allocation time (especially if the immutable empty set was a
> > singleton). Then again, personas-changed is only emitted a few tens of times
> > before quiescence, so it would probably make very little difference.
> 
> It's worth a go, I suppose. Do you mind it being a one-per-process "leak"?

I’d rather avoid it being a one-per-process leak, since various libraries have been trying to eliminate those recently and it complicates valgrinding (slightly). I don’t think there’s any other way to do it, though. It seems a bit too much to ask you to add a library-wide cleanup function just for this, so don’t bother with any of it. If empty() starts appearing on profiling traces, then we can do something about it.

> > For future-proofness, would this be better as:
> >     self->flags = (other->flags & ~FOLKS_SMALL_SET_FLAG_READ_ONLY);
> 
> Without knowing the semantics of hypothetical future flags I can't answer that.
> I'm inclined to leave it as-is until we find that we need another flag.

Righto.

> > Again, this is using the item_free() function from the new set rather than the
> > old one.
> 
> If that's a problem, then we've already lost, because we're dup'ing an item
> from the old set and inserting it into a GPtrArray whose free-function is the
> new free-function.
> 
> I don't want to check for equality, because stuff like -Bsymbolic can result in
> having two different pointers to what is fundamentally the same free-function.
> 
> I suppose we could make copy() not be generic, and retrieve the type and copy-
> and free-functions from the old container? That would remove any potential
> problem by only having one set. That would also change its Vala API from
> copy<K,V>(iterable, hash, equals) to copy(iterable, hash, equals), which is
> somewhat nicer anyway.

Sounds good. Please do so!

> > +        g_value_set_int (value, _iterator_is_valid (self));
> > 
> > Shouldn’t this be g_value_set_boolean()?
> 
> Yes, it should. Looks like I didn't test that: writing the test in Vala means
> it compiles to gee_iterator_get_valid() or something, rather than falling all
> the way back to g_object_get().

I think you can use my_object.get_property() in Vala to force it to use g_object_get(). It does mean you end up using GValues though, so pain.

> > +      /* GNOME #696710: this is wrong in HashSet */
> > 
> > Please use the phrase ‘FIXME’ so things like this are greppable.
> 
> Will do. Bug #696710 has in fact been fixed now, but there doesn't seem any
> point in adding a versioned dependency just for this.

No, no point in bumping our libgee dependency for this. Leave the FIXME in and we can come back to it.

(In reply to comment #38)
> (In reply to comment #34)
> > Yeah, it would be an API break which we should avoid. Either leave it using
> > HashSet, or create a private version of unserialize_local_ids() which uses
> > SmallSet
> 
> I don't think unserialize_local_ids() shows up in my callgrind results, so I'll
> leave it (until/unless I find that it's significant).

Righto.

> > I’m sure there are many other places throughout folks where this pattern of
> > replacement could be used. Probably wherever there’s a comment saying that we
> > need to copy a set before iteration.
> 
> I think copy() is somewhat better than add_all() where it's usable, but yes,
> using add_all() is the next best thing.

Agreed wholeheartedly.
Comment 51 Philip Withnall 2013-04-03 20:07:12 UTC
Review of attachment 240508 [details] [review]:

Great, thanks.
Comment 52 Philip Withnall 2013-04-03 20:08:21 UTC
Review of attachment 240509 [details] [review]:

++
Comment 53 Philip Withnall 2013-04-03 20:08:38 UTC
Review of attachment 240510 [details] [review]:

++
Comment 54 Philip Withnall 2013-04-03 20:09:11 UTC
Review of attachment 240511 [details] [review]:

++
Comment 55 Philip Withnall 2013-04-03 20:09:35 UTC
Review of attachment 240512 [details] [review]:

++
Comment 56 Philip Withnall 2013-04-03 20:09:56 UTC
Review of attachment 240513 [details] [review]:

++
Comment 57 Philip Withnall 2013-04-03 20:10:22 UTC
Review of attachment 240514 [details] [review]:

++
Comment 58 Philip Withnall 2013-04-03 20:12:21 UTC
Review of attachment 240516 [details] [review]:

Nice.
Comment 59 Philip Withnall 2013-04-03 20:16:09 UTC
(In reply to comment #33)
> Review of attachment 240422 [details] [review]:
> 
> ::: folks/folks-generics.vapi
> @@ +24,3 @@
> +
> +/* Unfortunately, we have to do this as a .vapi because going from C to
> + * GIR to Vala loses the generic types. */
> 
> Please add a FIXME referencing bug #639908 (about GIR support for generics).

I haven’t noticed a fix for this; with a patch for that squashed in, please commit everything. Thanks!
Comment 60 Simon McVittie 2013-04-04 17:34:31 UTC
Created attachment 240646 [details] [review]
Declare folks_small_set_copy properly

---

Unfortunately, Folks won't build with -Werror=implicit, because plugins use "internal" things like folks_persona_store_set_is_user_set_default() which don't turn up in the header file.
Comment 61 Philip Withnall 2013-04-04 19:02:08 UTC
Review of attachment 240646 [details] [review]:

++
Comment 62 Simon McVittie 2013-04-05 13:43:49 UTC
Comment on attachment 240646 [details] [review]
Declare folks_small_set_copy properly

f18d26283
Comment 63 Travis Reitter 2013-04-24 17:32:31 UTC
Jeremy, Xavier: can you try out the latest git master version of Folks to see if Simon's patches make the performance here acceptable?
Comment 64 Alexandre Franke 2017-07-17 18:30:18 UTC
(In reply to Travis Reitter from comment #63)
> Jeremy, Xavier: can you try out the latest git master version of Folks to
> see if Simon's patches make the performance here acceptable?

Jeremy, Xavier, can we close this?
Comment 65 Philip Withnall 2017-08-08 18:34:56 UTC
Yes, let’s.