GNOME Bugzilla – Bug 647121
Crash in individual_store_contact_sort at empathy-individual-store.c line 1387
Last modified: 2011-04-28 09:00:05 UTC
This seems to happen every day while I'm away. Program received signal SIGABRT, Aborted. 0x00007ff4f2e1e175 in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64 64 return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig); (gdb) bt
+ Trace 226635
Thread 1 (Thread 0x7ff4f7a73920 (LWP 2108))
(gdb)
What is the version of empathy ?
It just happened again Empathy 3.0.0 libfolks 0.4.2
The account pointer inside the EmpathyContactPriv is NULL.. I think it happens when butterfly disconnects and reconnects.
Interesting. Could you please check if tp_contact in EmpathyContactPriv is also NULL? Would be cool if you could run empathy with EMPATHY_DEBUG=all FOLKS_DEBUG=all and attach logs when the crash happens.
Created attachment 185573 [details] log from empathy Attached is the log and tp_contact does exist... I wonder if Empathy or Folks get confused because this contact is a "recursive" contact for the account itself. (gdb) p *contact_a $1 = {parent = {g_type_instance = {g_class = 0xf8d600}, ref_count = 3, qdata = 0x0}, priv = 0xa350c0} (gdb) p *((EmpathyContactPriv*)contact_a->priv) $2 = {tp_contact = 0x174d020, account = 0x0, persona = 0x174cef0, id = 0x0, alias = 0x0, avatar = 0x0, presence = TP_CONNECTION_PRESENCE_TYPE_AVAILABLE, handle = 0, capabilities = EMPATHY_CAPABILITIES_NONE, is_user = 1, hash = 0, location = 0x0, groups = 0x0, client_types = 0x0} (gdb) p *((EmpathyContactPriv*)contact_a->priv)->tp_contact $3 = {parent = {g_type_instance = {g_class = 0xe70930}, ref_count = 3, qdata = 0x0}, priv = 0x174d040} (gdb) p *((EmpathyContactPriv*)contact_a->priv)->tp_contact->priv $4 = {connection = 0x12a5720, handle = 1, identifier = 0xf66fe0 "olivier.crete@ocrete.ca", has_features = 53, alias = 0x16904f0 "olivier.crete@ocrete.ca", avatar_token = 0x0, avatar_file = 0x0, avatar_mime_type = 0x0, presence_type = TP_CONNECTION_PRESENCE_TYPE_AVAILABLE, presence_status = 0x174c7e0 "available", presence_message = 0x124cc70 "", location = 0x0, client_types = 0x0, capabilities = 0xe85af0, contact_info = 0x0, subscribe = TP_SUBSCRIPTION_STATE_UNKNOWN, publish = TP_SUBSCRIPTION_STATE_UNKNOWN, publish_request = 0x0, contact_groups = 0x0} (gdb) p *((EmpathyContactPriv*)contact_a->priv)->tp_contact->priv->connection $5 = {parent = {parent = {g_type_instance = {g_class = 0xae4830}, ref_count = 15, qdata = 0xeff2c0}, dbus_daemon = 0x898170, dbus_connection = 0x0, bus_name = 0x1235670 ":1.1132", object_path = 0xdcc660 "/org/freedesktop/Telepathy/Connection/butterfly/msn/olivier_2ecrete_40ocrete_2eca", invalidated = 0x1677870, priv = 0x12a5770}, priv = 0x12a57a0}
Interesting. So we end up with a TpContact having a TpConnection which is not associated with the connection of the accounts stored in the AM. We see these 2 warnings before the assertion: (empathy:12399): GLib-GObject-CRITICAL **: g_object_ref: assertion `G_IS_OBJECT (object)' failed (empathy:12399): GLib-GObject-CRITICAL **: g_object_ref: assertion `G_IS_OBJECT (object)' failed The second one is certainly because of: priv->account = g_object_ref (empathy_get_account_for_connection (connection)); as empathy_get_account_for_connection() returns NULL, but it could be interesting to know when the first one come from. Could you please run with G_DEBUG=fatal-warnings and get a trace? Meanwhile I'll continue digging.
Also, could you please run with EMPATHY_DEBUG=all *and* FOLKS_DEBUG=all, this log only contains folks debug.
The MSN account seems to connect/disconnect, I suspect that confuses Folks. Travis, Philip: would be nice if one of you coud take a look a well as this crash is probably folk related.
Created attachment 185828 [details] [review] Squashed diff of the 647121-user-persona-not-removed branch http://git.collabora.co.uk/?p=user/pwith/folks;a=shortlog;h=refs/heads/647121-user-persona-not-removed Here's a patch to folks which should fix it. Basically, folks was failing to remove self personas under a specific set of circumstances, which caused the corresponding TpContact object to stick around with a stale TpConnection, which was what was causing the crashes (because this connection couldn't be associated with any currently valid ones). This was happening when: 1. The user had two accounts (A and B) on the same protocol, one of which (A) had added the other (B) to its contact list. 2. The two personas representing the user in account B (one in account A's contact list which we'll call B1, and one as the self persona of account B which we'll call B*) were linked to form a single individual, either implicitly by their IIDs or explicitly, together with the self persona of account A which we'll call A*. 3. Personas B1 and A* are seen by the individual aggregator first, and are correspondingly put in different singleton individuals (only one of which is the user individual). 4. The aggregator then sees persona B*, and links it together with B1 and A* to form a single new user individual, _which happens to have the same ID as the singleton individual for persona B1_. It is at this point that things go wrong. The singleton individuals for personas B1 and A* both (correctly) end up in replaced_individuals in IndividualAggregator._personas_changed_cb(), meaning that at the end of that method, their .replace() methods are called. This causes their "removed" signals to be emitted, executing IndividualAggregator._individual_removed_cb() for each of them. For the singleton individual for persona B1 in particular, the following line of code is executed: this.individuals.remove (i.id); Since this singleton individual's ID is the same as the fully-linked individual containing personas A*, B1 and B*, this removes the new fully-linked individual from the IndividualAggregator, causing it to get lost. This causes the TpConnection problems mentioned above. The patch fixes it in the following two ways: 1. Changing Tpf.Persona UIDs to actually be UIDs, so they don't collide any more. This doesn't really fix the problem, though it happens to change things just enough in my case that it actually does. Still, it's quite related, definitely necessary, and a cunning way for me to sneak parts of the fix for bug #629537 into the tree. 2. Only removing individuals from IndividualAggregator.individuals in IndividualAggregator._individual_removed_cb() if they're actually who we think they are. This fixes the root of the problem.
Patch seems to work ... ++ Can we get this into a 0.4 release too btw ?
Review of attachment 185828 [details] [review]: /* Only signal if the individual is still in this.individuals. This allow * us to group removals together in, e.g., _personas_changed_cb(). */ - if (this.individuals.lookup (i.id) == null) + if (this.individuals.lookup (i.id) != i) return; This seems to warrant a warning. Or does this odd state (this.individuals.lookup (i.id) != i) get resolved by the matching one getting removed? Won't we be leaking the surviving individual, since it won't have an entry in this.individuals? Other than that (ie, if I'm making some mistake above), please set this to "accepted-commit_now" and merge.
(In reply to comment #11) > Review of attachment 185828 [details] [review]: > > /* Only signal if the individual is still in this.individuals. This allow > * us to group removals together in, e.g., _personas_changed_cb(). */ > - if (this.individuals.lookup (i.id) == null) > + if (this.individuals.lookup (i.id) != i) > return; > > This seems to warrant a warning. Or does this odd state > (this.individuals.lookup (i.id) != i) get resolved by the matching one getting > removed? Won't we be leaking the surviving individual, since it won't have an > entry in this.individuals? It's not an “odd” state, it's just unusual (er, infrequent). I'm not sure what you mean by “surviving individual”, but I'm fairly sure this doesn't introduce any leaks. Basically, this condition only gets hit (causing an early return) if an individual has been replaced by the aggregator. In that case, the old individual (which is the one emitting the “removed” signal) has been dereffed by this.individuals because it's been replaced in it.
Comment on attachment 185828 [details] [review] Squashed diff of the 647121-user-persona-not-removed branch Pushed to master: commit 9fa57c03a51e4c5b4eaf04642ebab7a52d62003f Author: Philip Withnall <philip.withnall@collabora.co.uk> Date: Tue Apr 12 20:42:06 2011 +0100 Don't remove the wrong Individuals from IndividualAggregator In IndividualAggregator._individual_removed_cb(), don't remove the individual from IndividualAggregator.individuals unless it is actually the individual which is emitting the "removed" signal. Closes: bgo#647121 NEWS | 2 ++ folks/individual-aggregator.vala | 8 ++++---- 2 files changed, 6 insertions(+), 4 deletions(-) commit ce370e34d73fc1fb13781cf59f29efe5553000ca Author: Philip Withnall <philip.withnall@collabora.co.uk> Date: Sat Jan 22 11:34:53 2011 +0000 Fix Tpf.Persona to use its store's ID in persona UIDs It was previously (incorrectly) not using the PersonaStore's ID as the middle component in its UID, which meant that the same IM address as added on two different accounts using the same protocol would have the same UID, undermining the concept of a UID as a globally unique identifier for a Persona. Closes: bgo#647121 backends/telepathy/lib/tpf-persona.vala | 2 +- tests/folks/aggregation.vala | 15 ++++++++++----- tests/telepathy/individual-properties.vala | 3 ++- tests/telepathy/individual-retrieval.vala | 3 ++- 4 files changed, 15 insertions(+), 8 deletions(-)
(If I've misinterpreted your comment or you have more ideas about leaks, please re-open and let me know. :-) )
Can you put it in a 0.4 branch ? This should be pushed to GNOME 3 distributors.
(In reply to comment #15) > Can you put it in a 0.4 branch ? This should be pushed to GNOME 3 distributors. Philip, was this fix intentionally not pushed to the 0.4.x branch?
(In reply to comment #16) > (In reply to comment #15) > > Can you put it in a 0.4 branch ? This should be pushed to GNOME 3 distributors. > > Philip, was this fix intentionally not pushed to the 0.4.x branch? I haven't tested it on 0.4.x. I'll see if I can give it a spin later and merge to 0.4.x as well then.
Cherry-picked to folks-0-4 too: commit 29ab3e525540ee83960f132a9e2a18ea4b2f3eac Author: Philip Withnall <philip.withnall@collabora.co.uk> Date: Tue Apr 12 20:42:06 2011 +0100 Don't remove the wrong Individuals from IndividualAggregator In IndividualAggregator._individual_removed_cb(), don't remove the individual from IndividualAggregator.individuals unless it is actually the individual which is emitting the "removed" signal. Closes: bgo#647121 NEWS | 2 ++ folks/individual-aggregator.vala | 8 ++++---- 2 files changed, 6 insertions(+), 4 deletions(-) commit 8cc9064a2516b82e984e098c48568e6cf4613fe0 Author: Philip Withnall <philip.withnall@collabora.co.uk> Date: Sat Jan 22 11:34:53 2011 +0000 Fix Tpf.Persona to use its store's ID in persona UIDs It was previously (incorrectly) not using the PersonaStore's ID as the middle component in its UID, which meant that the same IM address as added on two different accounts using the same protocol would have the same UID, undermining the concept of a UID as a globally unique identifier for a Persona. Closes: bgo#647121 backends/telepathy/lib/tpf-persona.vala | 2 +- tests/folks/aggregation.vala | 15 ++++++++++----- tests/telepathy/individual-properties.vala | 3 ++- tests/telepathy/individual-retrieval.vala | 3 ++- 4 files changed, 15 insertions(+), 8 deletions(-)
(In reply to comment #15) > Can you put it in a 0.4 branch ? This should be pushed to GNOME 3 distributors. I've just released this in Folks 0.4.3.
*** Bug 648847 has been marked as a duplicate of this bug. ***