GNOME Bugzilla – Bug 687161
IndividualAggregator performance issue
Last modified: 2017-08-08 18:34:56 UTC
Created attachment 227604 [details] [review] Improve performance of IndividualAggregator Xavier Classens told me today a performance issue in IndividualAggregator comes from _remove_individuals_from_link_map. Upon investigating I found we are calling that method which itself iterates over a HashMultiMap from within loops in _add_personas and _personas_changed_cb. In an effort to make this faster, I thought we could pass a Collection of Individuals into the method, and while iterating over the HashMultiMap checking against the whole list of Individuals we are interested in. Note this only improves performance if the slow down was caused by iterating over the HashMultiMap multiple times. I.E. instead of iterating over Collection A and iterating over HashMultiMap B inside we iterate over HashMultiMap B and iterate over Collection A inside. I've attached a patch that builds and runs but haven't ran any performance checks on it. Xavier, if you could do that, or show me how to that would be great.
sadly, this patch actually makes things worse... sorry :/
Created attachment 227620 [details] simple test case
Without the patch: real 0m1.555s user 0m1.716s sys 0m0.048s With your patch: real 0m5.759s user 0m5.916s sys 0m0.052s
Suggestion: change IndividualAggregator._link_map’s type from MultiMap<string, Individual> to MultiMap<string, Persona> to indirect the Individuals. The problem with _remove_individual_from_link_map() is that it has to iterate over the entire link map to find the values which match the individual to be removed. In the case you highlighted, it’s because that Individual pointer is being replaced by another one which will use exactly the same keys in the link_map (plus potentially some extras). Making the link_map map keys to Personas (rather than to Individuals) gives a layer of indirection over the Individuals. Individuals can now be accessed using persona.individual, and afaik those pointers should be valid and correct at that point during linking. (This should be checked though.) I hope that makes sense.
Comment on attachment 227604 [details] [review] Improve performance of IndividualAggregator Rejecting as per comment #1.
(In reply to comment #4) > The problem with _remove_individual_from_link_map() is that it has to iterate > over the entire link map to find the values which match the individual to be > removed. It turns out this can be made an order of magnitude faster by just not using Gee for the _link_map, so it might not be necessary to do this (but I'll try it anyway.)
Created attachment 239831 [details] [review] WiP: IndividualAggregator: use a GHashTable<string,GPtrArray> for the link map The link map is a hot path, particularly when matching multiple copies of the same contact. Speed this up. --- This takes the "clones" and "clones-one-individual" test cases in tests/eds/perf (3x500 triples contacts where each triple is identical apart from the Evolution UID) from about 10 seconds down to about 1.7 seconds, and the other test cases (500 different contacts) from about 0.4 to 0.37 seconds. This is work-in-progress. I haven't yet tried your suggestion from this bug, profiled the 500-contact cases properly, tried with real-world data, or tried with G(S)List instead of GPtrArray (which might also be a win). remove_individual_from_link_map() was previously accounting for over 70% of samples in callgrind in this synthetic test (which failed due to timing out), and tens of percent in sysprof when Arun tried it in jhbuild with real Telepathy accounts.
(In reply to comment #7) > This is work-in-progress. I haven't yet tried your suggestion from this bug, > profiled the 500-contact cases properly, tried with real-world data, or tried > with G(S)List instead of GPtrArray (which might also be a win). It's a pretty clear win with synthetic data, at least (10 -> 1.7 seconds for the "bad case" of 3x500 matched contacts; ~4.2 -> ~3.7 seconds for the rest), so I'm inclined to say it's ready for review. I haven't tried with G(S)List, but I'm inclined to avoid that anyway, because Vala. Testing with real data isn't working very well for me - the test '_test_preprepared' usually either crashes with "Couldn't cache avatar" or stalls while preparing the e-d-s source - but on the tests that do run, there's a similar performance gain. So that's encouraging. Some other bits and pieces that have potential for improvement: * edsf_persona_real_linkable_property_to_links() is spending a lot of time playing with Gee iterators * edsf_persona_constructor() spends a lot of time in _edsf_persona_update() * folks_individual_set_personas() appears to be connecting 271,500 signals across the 8000 individuals tested
(In reply to comment #4) > Suggestion: change IndividualAggregator._link_map’s type from MultiMap<string, > Individual> to MultiMap<string, Persona> to indirect the Individuals. > > The problem with _remove_individual_from_link_map() is that it has to iterate > over the entire link map to find the values which match the individual to be > removed. _remove_individual_from_link_map() is down from around 70% to 2.3% in callgrind, so it's no longer the bottleneck that it used to be. edsf_persona_real_linkable_property_to_links() is now lower-hanging fruit.
Review of attachment 239831 [details] [review]: Looking good. I haven’t had a chance to compile/test this. I assume the test suites all pass with it applied? This is great work; let’s get it merged soon so that it gets plenty of testing throughout the 3.10 cycle. (In reply to comment #9) > edsf_persona_real_linkable_property_to_links() is now lower-hanging fruit. There are *_linkable_property_to_links() implementations in every subclass of Persona. I think this design was required by Vala limitations at the time. It would make sense to consolidate them all somewhere before starting to optimise, so the optimisations help every backend, not just EDS. Consolidation might also allow you to remove a virtual method call, which would be a performance win. Inlining the method (e.g. in IndividualAggregator) would also mean it’s not called inside a loop over each Persona’s (linkable) properties, which would also be a performance win. ::: folks/individual-aggregator.vala @@ +121,3 @@ private unowned PersonaStore? _primary_store = null; private HashSet<Backend> _backends; + private HashTable<string, GenericArray<Individual>> _link_map; It should be documented that this needs to have MultiMap semantics; i.e. duplicate entries in a given GenericArray instance aren’t allowed, and each GenericArray must be non-empty. It should probably also be noted that strong references are held to Individuals in the link map. I tried to remove this requirement a while ago, but things broke, so it’s better explicitly documented. @@ +519,3 @@ debug.indent (); + var iter = HashTableIter<string,GenericArray<Individual>> ( Missing space before ‘GenericArray’. @@ +1122,2 @@ { + /* FIXME: can this really be null? */ Doesn’t look like it could be NULL. Is Vala giving you a warning about it? Might be a Vala bug. You could try “unowned GenericArray<Individual> _candidates = (!) candidates” and see if indexing into _candidates gives the same problem. @@ +1356,3 @@ } + private void _link_map_set (string key, Individual individual) Might want to briefly document this as adding an individual to the link map, just as if “MultiMap.set()” was being used. @@ +1642,1 @@ /* Validate the link map. */ It might be worthwhile adding some code below to validate that _link_map does actually have MultiMap semantics (e.g. every GenericArray is non-empty and contains no duplicates). Doesn’t matter about efficiency here, since this is only debug code.
(In reply to comment #10) > Looking good. I haven’t had a chance to compile/test this. I assume the test > suites all pass with it applied? Yes. > (In reply to comment #9) > > edsf_persona_real_linkable_property_to_links() is now lower-hanging fruit. > > There are *_linkable_property_to_links() implementations in every subclass of > Persona. I think this design was required by Vala limitations at the time. It > would make sense to consolidate them all somewhere before starting to optimise, > so the optimisations help every backend, not just EDS. Perhaps - I'll check what they do. > + private HashTable<string, GenericArray<Individual>> _link_map; > > It should be documented that this needs to have MultiMap semantics; i.e. > duplicate entries in a given GenericArray instance aren’t allowed, and each > GenericArray must be non-empty. Sure. > Missing space before ‘GenericArray’. Fixed. > + /* FIXME: can this really be null? */ > > Doesn’t look like it could be NULL. Is Vala giving you a warning about it? No, but the existing code is explicitly allowing for null even though, based on the syntax/nullability, I don't think _link_map[key] can ever contain a null. I didn't want to make the unrelated change of "assume non-null" in the same commit as this, so I just left a comment. I'd have just said "var" but I wasn't sure that that interacted nicely with an "unowned" accessor. > Might want to briefly document this as adding an individual to the link map, > just as if “MultiMap.set()” was being used. Fixed. > It might be worthwhile adding some code below to validate that _link_map does > actually have MultiMap semantics OK. > Doesn’t matter about efficiency here, since this is > only debug code. You say that, but debug_output_enabled defaults to true and I can't find anything that makes it false, other than a library user explicitly saying Folks.Debug.dup ().debug_output_enabled = false... perhaps tests/eds/perf should explicitly turn it off? On the other hand, what I really want to optimize is real-life uses of Folks (GNOME Contacts, Empathy, etc.) so tests/eds/perf should reflect the "most natural" thing for such applications to do, even if it's not as fast as it could be. tests/eds/perf does not vary its runtime significantly when these checks are enabled or disabled. However, it is a relatively "good" case for these checks, because it adds all individuals simultaneously. The thing that would be made expensive by these checks is adding individuals one at a time, which would be O(individuals**2): perhaps this should be an additional test? (Add 50 clones of existing personas and 50 new ones, or something?)
Created attachment 240034 [details] [review] fixup "IndividualAggregator: use a GHashTable<string,GPtrArray> for the link map" In response to review - I'll squash it for commit but it's probably easier to review like this. From your review comments, I didn't (yet) fix the stuff about whether _link_map[i] can contain null.
Created attachment 240036 [details] [review] IndividualAggregator: assert and assume that _link_map[key][i] != null The link map's type is non-nullable, and when we add individuals to the link map we do so through a function that does not allow a null argument, so we don't need to sprinkle (!) casts around.
Created attachment 240037 [details] [review] IndividualAggregator: be a bit more careful about nullability This change is not sufficient for Folks or even individual-aggregator.vala to compile successfully with --enable-experimental-non-null, but it at least makes my recent patch not a regression in that department... Unfortunately, valac --enable-experimental-non-null won't allow us to assert at runtime that an allegedly non-null thing is, in fact, non-null. Trust the compiler, the compiler is your friend? --- Is there any way to make valac --enable-experimental-non-null allow comparison of non-nullable types with null - at least when it appears in an assertion - or perhaps downgrade nullability violations to a warning?
(In reply to comment #11) > > + /* FIXME: can this really be null? */ > > > > Doesn’t look like it could be NULL. Is Vala giving you a warning about it? > > No, but the existing code is explicitly allowing for null even though, based on > the syntax/nullability, I don't think _link_map[key] can ever contain a null. > > I didn't want to make the unrelated change of "assume non-null" in the same > commit as this, so I just left a comment. I'd have just said "var" but I wasn't > sure that that interacted nicely with an "unowned" accessor. Oh yes, so it does. I guess that’s left from an earlier refactoring. > > Doesn’t matter about efficiency here, since this is > > only debug code. > > You say that, but debug_output_enabled defaults to true and I can't find > anything that makes it false, other than a library user explicitly saying > Folks.Debug.dup ().debug_output_enabled = false... perhaps tests/eds/perf > should explicitly turn it off? Eek. debug_output_enabled should default to false, and only be set to true in Debug.dup_with_flags() if a non-empty set of debug domains is provided (via G_MESSAGES_DEBUG). Would you mind fixing this? It shouldn’t change behaviour (apart from disabling redundant code in the IndividualAggregator), as the output is filtered out in debug() anyway if no debug domains are active. > On the other hand, what I really want to optimize is real-life uses of Folks > (GNOME Contacts, Empathy, etc.) so tests/eds/perf should reflect the "most > natural" thing for such applications to do, even if it's not as fast as it > could be. Applications should definitely be executing with (debug_output_enabled == false). It’s a bug. > tests/eds/perf does not vary its runtime significantly when these checks are > enabled or disabled. However, it is a relatively "good" case for these checks, > because it adds all individuals simultaneously. The thing that would be made > expensive by these checks is adding individuals one at a time, which would be > O(individuals**2): perhaps this should be an additional test? (Add 50 clones of > existing personas and 50 new ones, or something?) I’m not sure how realistic it is to add a lot of new Personas individually and in quick succession; that should only happen when a PersonaStore appears (or disappears), and it’s a bug if a store isn’t batching up changes like that. If it’s not too much work, it can’t hurt to have such a test case, though.
Review of attachment 240034 [details] [review]: ++
Review of attachment 240036 [details] [review]: ++, subject to merging with the next patch.
Review of attachment 240037 [details] [review]: ::: folks/individual-aggregator.vala @@ +1131,2 @@ { + var candidate_ind = ((!) candidates)[i]; I guess it would be a bit neater to say: var _candidates = this._link_map.get (…); if (_candidates != null) { var candidates = (!) _candidates; for (uint i = 0; i < candidates.length; i++) … but it’s not worth iterating the patch again. Please commit! :-)
(In reply to comment #14) > Is there any way to make valac --enable-experimental-non-null allow comparison > of non-nullable types with null - at least when it appears in an assertion - or > perhaps downgrade nullability violations to a warning? You could try static casting to a nullable type: Individual my_non_nullable_thing = …; assert ((Individual?) my_non_nullable_thing != null); It might be better to use a dynamic cast: Individual my_non_nullable_thing = …; assert ((my_non_nullable_thing as Individual?) != null); but I have no idea if this will work syntax-wise. Dynamic casts differ from static casts in that they insert a GObject casting macro (e.g. the above should, IIRC, expand to FOLKS_INDIVIDUAL(my_non_nullable_thing)). Static casts don’t insert the macro, and just assume everything will be fine. If casting doesn’t work, the following really should work (but ewwww): Individual my_non_nullable_thing = …; Individual? _my_non_nullable_thing = my_non_nullable_thing; assert (_my_non_nullable_thing != null);
Comment on attachment 239831 [details] [review] WiP: IndividualAggregator: use a GHashTable<string,GPtrArray> for the link map Part of 2ec38a5d15
Comment on attachment 240034 [details] [review] fixup "IndividualAggregator: use a GHashTable<string,GPtrArray> for the link map" Part of 2ec38a5d15
Comment on attachment 240036 [details] [review] IndividualAggregator: assert and assume that _link_map[key][i] != null Committed as part of c438d6d00a
Comment on attachment 240037 [details] [review] IndividualAggregator: be a bit more careful about nullability Committed as part of c438d6d00a
Created attachment 240412 [details] [review] By default, do debug output iff at least one debug domain is enabled Previously, debug output (and expensive checks that are enabled alongside debug output) was always enabled unless explicitly disabled, but Philip noted that this was not intentional. --- In response to Comment #15.
Created attachment 240420 [details] [review] backends: when adding a persona by details, do not assume HashSet All we need is a Set, so we shouldn't need to cast this strictly.
Created attachment 240421 [details] [review] Do not explicitly link libfolks-internal.la into things libfolks.la embeds a copy of libfolks-internal.la, so every public (or internal) symbol in libfolks-internal.la is available in libfolks.la, even if linked with -Wl,--no-copy-dt-needed-entries. If libfolks-internal.la defines a GObject type, then linking two copies of it is actively harmful, because they can't both be registered with the type system.
Created attachment 240422 [details] [review] Folks.SmallSet: add and test --- _folks_small_set_new_take_array() is not bound into Vala. It's for FolksSmallMap and FolksSmallMultiMap, which I think are likely to be a bigger performance win than FolksSmallSet, but aren't finished yet. The regression test might not have full coverage, but is thorough enough that it found Bug #696710 when I used HashSet to help reverse-engineer what API SmallSet should present :-)
Created attachment 240423 [details] [review] folks, backends: use "small sets" instead of hash sets most of the time Notable exceptions are: * sets of personas that are, or might be, the entire set from a backend (possible future refinement: use a SmallSet if there aren't many) * sets of potentially many Individuals (likewise) * the object cache (potentially pretty large) * debug stuff (not relevant for performance) * the set of IM addresses being matched (we want to keep this O(1), but it should be a GHashTable) This speeds up tests/eds/perf by around 5%. --- I hope SmallMultiMap will be a bigger performance win in practice.
Created attachment 240424 [details] [review] Add fast-path iteration for SmallSet --- The e-d-s part of this accounts for about a 10% time decrease.
Review of attachment 240412 [details] [review]: ++
Review of attachment 240420 [details] [review]: ++
Review of attachment 240421 [details] [review]: Looks good. How did you notice this was a problem?
Review of attachment 240422 [details] [review]: Looking good so far. A few minor comments. ::: folks/folks-generics.vapi @@ +24,3 @@ + +/* Unfortunately, we have to do this as a .vapi because going from C to + * GIR to Vala loses the generic types. */ Please add a FIXME referencing bug #639908 (about GIR support for generics). ::: folks/small-set.c @@ +24,3 @@ + +#include "folks/small-set.h" +#include "folks/small-set-internal.h" Would be useful to have a block comment at the top of here briefly explaining why SmallSet is needed, what it’s optimised (and *not* optimised) for, and its overall design (i.e. wrap a GPtrArray and present the same API as Gee.HashSet but without making use of the hashing). @@ +190,3 @@ + NULL); + other->items = g_ptr_array_ref (self->items); + other->flags = FOLKS_SMALL_SET_FLAG_READ_ONLY; Might it be more future-proof to use: other->flags = self->flags | FOLKS_SMALL_SET_FLAG_READ_ONLY; here, in case other flags are added in future? @@ +211,3 @@ + + /* FIXME: benchmark whether giving self a weak ref to other is a + * performance win or not */ FIXME here. @@ +283,3 @@ +{ + FolksSmallSet *self = g_object_new (FOLKS_TYPE_SMALL_SET, + NULL); There are a few places in the IndividualAggregator where we deliberately pass around empty Sets of Personas (such as the personas-changed signal IIRC). At the moment, we don’t make use of Set.empty(), but it might make sense to do so in order to save allocation time (especially if the immutable empty set was a singleton). Then again, personas-changed is only emitted a few tens of times before quiescence, so it would probably make very little difference. @@ +330,3 @@ +{ + FolksSmallSet *self; + GeeIterator *iter; Would it be worth asserting that item_type subsumes the item type of the existing set? @@ +347,3 @@ + self->items = g_ptr_array_new_full (other->items->len, + item_free); + self->flags = 0; For future-proofness, would this be better as: self->flags = (other->flags & ~FOLKS_SMALL_SET_FLAG_READ_ONLY); @@ +352,3 @@ + { + g_ptr_array_add (self->items, + item_dup (g_ptr_array_index (other->items, i))); This path uses the new item_dup() function, whereas the two slow paths below (implicitly) use the item_dup() function from the old set. For consistency, we probably want to use the old set’s function in this path too. (I would hope they’re the same function, but who knows.) @@ +376,3 @@ + /* Do it the hard way: there might be duplicates. */ + + while (gee_iterator_next (iter)) Errant new line. @@ +381,3 @@ + + if (_find (self, item, NULL)) + item_free (item); Again, this is using the item_free() function from the new set rather than the old one. @@ +516,3 @@ + copy = self->item_dup ((gpointer) item); + + g_ptr_array_add (self->items, copy); Are there many cases in folks where we add a Set of elements to a SmallSet (i.e. construct the set union)? If so, it might make sense to implement Gee.Collection.add_all() and add a fast path which is O(n) rather than O(n^2). Using add_all() is currently O(n^2) (for n elements in the set being added) because it unnecessarily checks each of the new elements against each of the new elements which have already been appended to the array. Though for small sets this may not be a problem. @@ +662,3 @@ + { + case ITER_PROP_VALID: + g_value_set_int (value, _iterator_is_valid (self)); Shouldn’t this be g_value_set_boolean()? Gee says the property is has bool type: http://www.valadoc.org/#!api=libgee-0.8/Gee.Iterator.valid ::: tests/folks/small-set.vala @@ +101,3 @@ + }); + + /* GNOME #696710: this is wrong in HashSet */ Please use the phrase ‘FIXME’ so things like this are greppable.
Review of attachment 240423 [details] [review]: Looking good. A few minor comments and thoughts. This is probably ready to commit after a quick check through once the SmallSet patch is ready to commit. ::: backends/tracker/lib/trf-persona-store.vala @@ +871,3 @@ { + /* FIXME: would changing this to SmallSet be an API break? + * It's documented that we use HashSet, and Trf.Persona assumes it */ Yeah, it would be an API break which we should avoid. Either leave it using HashSet, or create a private version of unserialize_local_ids() which uses SmallSet, and refactor the public version to be a wrapper around it which converts the SmallSet into a HashSet for any client program unfortunate enough to call unserialize_local_ids(). ::: backends/tracker/lib/trf-persona.vala @@ +560,3 @@ AbstractFieldDetails<string>.hash_static, AbstractFieldDetails<string>.equal_static); this._phone_numbers_ro = this._phone_numbers.read_only_view; Another idea for performance improvements (though completely unrelated to this bug): lazily allocate the read-only copies of sets and other data structures. Only allocate them when the relevant getter method is called. That could up to halve the number of GObject allocations we do if clients only access a few of the properties in a Persona or Individual. ::: folks/individual-aggregator.vala @@ +1946,3 @@ + var personas = new SmallSet<Persona> (); + /* FIXME: this is O(n**2) but if we know that individual.personas + * is a SmallSet, we can do it in O(n) */ See my earlier comment about Gee.Collection.add_all(). Even if we don’t implement it in SmallSet, this code should be changed to: personas.add_all (individual.personas); I’m sure there are many other places throughout folks where this pattern of replacement could be used. Probably wherever there’s a comment saying that we need to copy a set before iteration. It would be useful to survey the code for those patterns and factor them out into some utility methods, since there are probably easy performance gains to be had there.
Review of attachment 240424 [details] [review]: ++
(In reply to comment #33) > @@ +283,3 @@ > +{ > + FolksSmallSet *self = g_object_new (FOLKS_TYPE_SMALL_SET, > + NULL); > > There are a few places in the IndividualAggregator where we deliberately pass > around empty Sets of Personas (such as the personas-changed signal IIRC). At > the moment, we don’t make use of Set.empty(), but it might make sense to do so > in order to save allocation time (especially if the immutable empty set was a > singleton). Then again, personas-changed is only emitted a few tens of times > before quiescence, so it would probably make very little difference. …and attachment 240423 [details] [review] makes use of SmallSet.empty(). Great! Still might potentially improve things to make the empty set a singleton.
(In reply to comment #32) > Review of attachment 240421 [details] [review]: > > Looks good. How did you notice this was a problem? When I added FolksSmallSet, there were two copies of folks_small_set_get_type() in one of the tests (one in the shared libfolks and one in the executable) and badness ensued. (In reply to comment #33) > Please add a FIXME referencing bug #639908 (about GIR support for generics). Will do. > Would be useful to have a block comment at the top of here briefly explaining > why SmallSet is needed, what it’s optimised (and *not* optimised) for, and its > overall design (i.e. wrap a GPtrArray and present the same API as Gee.HashSet > but without making use of the hashing). OK. > + other->flags = FOLKS_SMALL_SET_FLAG_READ_ONLY; > > Might it be more future-proof to use: > other->flags = self->flags | FOLKS_SMALL_SET_FLAG_READ_ONLY; > here, in case other flags are added in future? I don't think I can say authoritatively whether future flags will be appropriate to copy or not until they're actually added. > + /* FIXME: benchmark whether giving self a weak ref to other is a > + * performance win or not */ > > FIXME here. Noted. Without fixing this it's still a performance win, though. > There are a few places in the IndividualAggregator where we deliberately pass > around empty Sets of Personas (such as the personas-changed signal IIRC). At > the moment, we don’t make use of Set.empty(), but it might make sense to do so > in order to save allocation time (especially if the immutable empty set was a > singleton). Then again, personas-changed is only emitted a few tens of times > before quiescence, so it would probably make very little difference. It's worth a go, I suppose. Do you mind it being a one-per-process "leak"? > > @@ +330,3 @@ > +{ > + FolksSmallSet *self; > + GeeIterator *iter; > > Would it be worth asserting that item_type subsumes the item type of the > existing set? I suppose so. I haven't made much use of copy() yet: replacing some add_all() with copy() is another potential speedup. > For future-proofness, would this be better as: > self->flags = (other->flags & ~FOLKS_SMALL_SET_FLAG_READ_ONLY); Without knowing the semantics of hypothetical future flags I can't answer that. I'm inclined to leave it as-is until we find that we need another flag. > @@ +352,3 @@ > + { > + g_ptr_array_add (self->items, > + item_dup (g_ptr_array_index (other->items, i))); > > This path uses the new item_dup() function, whereas the two slow paths below > (implicitly) use the item_dup() function from the old set. Hmm, that's true... we could just use _dup (other, i). > Errant new line. Noted. > Again, this is using the item_free() function from the new set rather than the > old one. If that's a problem, then we've already lost, because we're dup'ing an item from the old set and inserting it into a GPtrArray whose free-function is the new free-function. I don't want to check for equality, because stuff like -Bsymbolic can result in having two different pointers to what is fundamentally the same free-function. I suppose we could make copy() not be generic, and retrieve the type and copy- and free-functions from the old container? That would remove any potential problem by only having one set. That would also change its Vala API from copy<K,V>(iterable, hash, equals) to copy(iterable, hash, equals), which is somewhat nicer anyway. > @@ +516,3 @@ > + copy = self->item_dup ((gpointer) item); > + > + g_ptr_array_add (self->items, copy); > > Are there many cases in folks where we add a Set of elements to a SmallSet > (i.e. construct the set union)? All the ones I've spotted so far are add_all() (or, worse, iterative adding) into an empty set, which makes copy() a better fast-path. The major advantage of copy() over add_all() (which is, as you note, a union operation) is that we know the "left hand side of the union" is an empty set, so if the "right-hand side" is a Set, we don't need to test for uniqueness at all. > + g_value_set_int (value, _iterator_is_valid (self)); > > Shouldn’t this be g_value_set_boolean()? Yes, it should. Looks like I didn't test that: writing the test in Vala means it compiles to gee_iterator_get_valid() or something, rather than falling all the way back to g_object_get(). > + /* GNOME #696710: this is wrong in HashSet */ > > Please use the phrase ‘FIXME’ so things like this are greppable. Will do. Bug #696710 has in fact been fixed now, but there doesn't seem any point in adding a versioned dependency just for this.
(In reply to comment #34) > Yeah, it would be an API break which we should avoid. Either leave it using > HashSet, or create a private version of unserialize_local_ids() which uses > SmallSet I don't think unserialize_local_ids() shows up in my callgrind results, so I'll leave it (until/unless I find that it's significant). > Another idea for performance improvements (though completely unrelated to this > bug): lazily allocate the read-only copies of sets and other data structures. > Only allocate them when the relevant getter method is called. Yes, that's on my list. > I’m sure there are many other places throughout folks where this pattern of > replacement could be used. Probably wherever there’s a comment saying that we > need to copy a set before iteration. I think copy() is somewhat better than add_all() where it's usable, but yes, using add_all() is the next best thing.
Comment on attachment 240412 [details] [review] By default, do debug output iff at least one debug domain is enabled 5226ce939
Comment on attachment 240420 [details] [review] backends: when adding a persona by details, do not assume HashSet e5b07508
Comment on attachment 240421 [details] [review] Do not explicitly link libfolks-internal.la into things fcfd16cbc
Created attachment 240508 [details] [review] FolksSmallSet: add a doc-comment
Created attachment 240509 [details] [review] FolksSmallSet: simplify copy() by removing generic typing If we pick up the generic type, copy and free functions from the iterable, then we don't need to worry about whether ours and theirs are consistent: by definition, they are.
Created attachment 240510 [details] [review] FolksSmallSet: coding style
Created attachment 240511 [details] [review] FolksSmallSet: cope with copying a non-set iterable with no free-function --- This would previously have crashed. This, and the rest of this round of patches, could be squashed into Attachment #240422 [details] if you want.
Created attachment 240512 [details] [review] FolksSmallSet: Iterator.valid is boolean, not int
Created attachment 240513 [details] [review] SmallSet: amend comment
Created attachment 240514 [details] [review] Trf.PersonaStore: amend comment about not using SmallSet
Created attachment 240516 [details] [review] Use SmallSet.copy() where it makes sense It isn't more than fractionally faster in tests/eds/perf, but might help.
(In reply to comment #37) > > + other->flags = FOLKS_SMALL_SET_FLAG_READ_ONLY; > > > > Might it be more future-proof to use: > > other->flags = self->flags | FOLKS_SMALL_SET_FLAG_READ_ONLY; > > here, in case other flags are added in future? > > I don't think I can say authoritatively whether future flags will be > appropriate to copy or not until they're actually added. Fair enough. :-) > > There are a few places in the IndividualAggregator where we deliberately pass > > around empty Sets of Personas (such as the personas-changed signal IIRC). At > > the moment, we don’t make use of Set.empty(), but it might make sense to do so > > in order to save allocation time (especially if the immutable empty set was a > > singleton). Then again, personas-changed is only emitted a few tens of times > > before quiescence, so it would probably make very little difference. > > It's worth a go, I suppose. Do you mind it being a one-per-process "leak"? I’d rather avoid it being a one-per-process leak, since various libraries have been trying to eliminate those recently and it complicates valgrinding (slightly). I don’t think there’s any other way to do it, though. It seems a bit too much to ask you to add a library-wide cleanup function just for this, so don’t bother with any of it. If empty() starts appearing on profiling traces, then we can do something about it. > > For future-proofness, would this be better as: > > self->flags = (other->flags & ~FOLKS_SMALL_SET_FLAG_READ_ONLY); > > Without knowing the semantics of hypothetical future flags I can't answer that. > I'm inclined to leave it as-is until we find that we need another flag. Righto. > > Again, this is using the item_free() function from the new set rather than the > > old one. > > If that's a problem, then we've already lost, because we're dup'ing an item > from the old set and inserting it into a GPtrArray whose free-function is the > new free-function. > > I don't want to check for equality, because stuff like -Bsymbolic can result in > having two different pointers to what is fundamentally the same free-function. > > I suppose we could make copy() not be generic, and retrieve the type and copy- > and free-functions from the old container? That would remove any potential > problem by only having one set. That would also change its Vala API from > copy<K,V>(iterable, hash, equals) to copy(iterable, hash, equals), which is > somewhat nicer anyway. Sounds good. Please do so! > > + g_value_set_int (value, _iterator_is_valid (self)); > > > > Shouldn’t this be g_value_set_boolean()? > > Yes, it should. Looks like I didn't test that: writing the test in Vala means > it compiles to gee_iterator_get_valid() or something, rather than falling all > the way back to g_object_get(). I think you can use my_object.get_property() in Vala to force it to use g_object_get(). It does mean you end up using GValues though, so pain. > > + /* GNOME #696710: this is wrong in HashSet */ > > > > Please use the phrase ‘FIXME’ so things like this are greppable. > > Will do. Bug #696710 has in fact been fixed now, but there doesn't seem any > point in adding a versioned dependency just for this. No, no point in bumping our libgee dependency for this. Leave the FIXME in and we can come back to it. (In reply to comment #38) > (In reply to comment #34) > > Yeah, it would be an API break which we should avoid. Either leave it using > > HashSet, or create a private version of unserialize_local_ids() which uses > > SmallSet > > I don't think unserialize_local_ids() shows up in my callgrind results, so I'll > leave it (until/unless I find that it's significant). Righto. > > I’m sure there are many other places throughout folks where this pattern of > > replacement could be used. Probably wherever there’s a comment saying that we > > need to copy a set before iteration. > > I think copy() is somewhat better than add_all() where it's usable, but yes, > using add_all() is the next best thing. Agreed wholeheartedly.
Review of attachment 240508 [details] [review]: Great, thanks.
Review of attachment 240509 [details] [review]: ++
Review of attachment 240510 [details] [review]: ++
Review of attachment 240511 [details] [review]: ++
Review of attachment 240512 [details] [review]: ++
Review of attachment 240513 [details] [review]: ++
Review of attachment 240514 [details] [review]: ++
Review of attachment 240516 [details] [review]: Nice.
(In reply to comment #33) > Review of attachment 240422 [details] [review]: > > ::: folks/folks-generics.vapi > @@ +24,3 @@ > + > +/* Unfortunately, we have to do this as a .vapi because going from C to > + * GIR to Vala loses the generic types. */ > > Please add a FIXME referencing bug #639908 (about GIR support for generics). I haven’t noticed a fix for this; with a patch for that squashed in, please commit everything. Thanks!
Created attachment 240646 [details] [review] Declare folks_small_set_copy properly --- Unfortunately, Folks won't build with -Werror=implicit, because plugins use "internal" things like folks_persona_store_set_is_user_set_default() which don't turn up in the header file.
Review of attachment 240646 [details] [review]: ++
Comment on attachment 240646 [details] [review] Declare folks_small_set_copy properly f18d26283
Jeremy, Xavier: can you try out the latest git master version of Folks to see if Simon's patches make the performance here acceptable?
(In reply to Travis Reitter from comment #63) > Jeremy, Xavier: can you try out the latest git master version of Folks to > see if Simon's patches make the performance here acceptable? Jeremy, Xavier, can we close this?
Yes, let’s.