After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 321552 - Double entries in trackable list when binding a slot to a slot
Double entries in trackable list when binding a slot to a slot
Status: RESOLVED DUPLICATE of bug 589202
Product: libsigc++
Classification: Bindings
Component: adaptors
2.0.x
Other Linux
: Normal normal
: ---
Assigned To: gtkmm-forge
gtkmm-forge
Depends on:
Blocks:
 
 
Reported: 2005-11-15 22:47 UTC by Philip Langdale
Modified: 2011-02-22 10:09 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Testcase for the binding problem (830 bytes, text/plain)
2005-11-15 22:55 UTC, Philip Langdale
Details
Valgrind log output (5.17 KB, text/plain)
2005-12-02 20:04 UTC, Philip Langdale
Details
tar file with modified trackable.cc and valgrind output (4.28 KB, application/x-compressed-tar)
2010-11-17 09:05 UTC, Kjell Ahlstedt
Details

Description Philip Langdale 2005-11-15 22:47:23 UTC
This is a rather hard bug to describe - it's easier to read the testcase I've
just written - but the basic idea is as follows:

1) You have a method 'foo' on a trackable object that takes a slot as a parameter.
2) You want to use another method 'bar' on that object as that slot.
3) You build a slot out of 'foo' and bind a slot of 'bar' at the same time.
4) You connect this slot to some signal.
5) The object goes away.

Something magically wrong happens at step 3 causing each of the slots to appear
in the objects notify target list twice. When the object goes away, it notifies
each slot once. Then one of the slots (not sure which yet) notifies the signal,
which in turn releases and deletes the slot. Then the object notifies that slot
the second time causing an invalid memory access. Due to the layout of all this
in memory, there is no segfault but valgrind will complain.

Interesting observations:

1) If the bound slot is on another object, there is no problem.
2) If the bound slot is first created and assigned to a named variable and then
this variable is bound, there is no problem. *Only* the inline case exhibits
this problem.

So, the trivial workaround is to assign the bound slot first. I will continue to
investigate but this looks very strange.
Comment 1 Philip Langdale 2005-11-15 22:48:29 UTC
I am clearly not paying attention. This is a libsigc++ bug.
Comment 2 Philip Langdale 2005-11-15 22:55:38 UTC
Created attachment 54808 [details]
Testcase for the binding problem
Comment 3 Murray Cumming 2005-11-16 08:54:49 UTC
Maybe the copying in this patch has caused this somehow, though it's a wild guess:
http://bugzilla.gnome.org/show_bug.cgi?id=166217

You could test a version of libsigc++ before the patch was applied.
Comment 4 Philip Langdale 2005-11-16 18:40:00 UTC
While I could understand why it might be responsible, that's not the case here;
I just tested with 2.0.9 and the problem still existed back then.
Comment 5 Régis Duchesne 2005-11-19 07:05:43 UTC
Philip,

The only difference I can see between the normal and workaround cases is:

o In the normal case, storage for the bound slot is allocated on the stack (it
is pushed) right before the call to bind(), and it is freed (it is poped) right
after the call to bind(). So the storage does not exist anymore at the time
'foo' is deleted.

o In the workaround case, storage for the bound slot ('two') is allocated on the
stack upon entering main(). So the storage still exists at the time 'foo' is
deleted.

Valgrind is not perfect: it only knows about accesses to the address space, it
does not know when objects are constructed or destructed.

I believe that the bug actually happens in _both_ cases, but it is only detected
by valgrind in the normal case (where a memory access occurs beyond the end of
the stack), and it is not detected in the workaround case (where a memory access
occurs inside the stack, so it appears valid to valgrind although it happens
after the object that was stored there has been destructed).

In other words, I don't think the workaround case is an actual workaround. I
wanted to verify this theory by myself, by crafting a more careful workaround
case: allocate 'two' on the _heap_ right before the call to bind(), then delete
it after the call to bind(). I believe this will help valgrind detect the
problem even in the "workaround" case.

Unfortunately, I'm unable to reproduce the problem on my system (which uses gcc
4.0.3) under valgrind, i.e. there is nothing after the "DoFoo" line in
valgrind's output. Can you please verify my theory on your system?
Comment 6 Philip Langdale 2005-11-30 02:24:54 UTC
Régis. I get no errors with two on the heap.
Comment 7 Murray Cumming 2005-12-01 11:56:59 UTC
I see no valgrind errors. I'm using g++ 4.0.2-4ubuntu3 (in Ubuntu dapper -
unstable, which also has valgrind 3.0.1-debian).

I get "Do Foo" as output.
Comment 8 Murray Cumming 2005-12-01 11:58:01 UTC
Actually it looks like I too have g++ 4.0.3 (the other version is the package
version).
Comment 9 Philip Langdale 2005-12-01 17:59:02 UTC
Hmm. The implication then seems to be that it's a g++ error that has since been
fixed. I'm using 3.3.6 here and we use 3.3.3 for the binary that we ship, so we
can't ignore it yet. As we're going to bump our compiler in due time, that will
be how we fix the problem.
Comment 10 Murray Cumming 2005-12-02 08:28:59 UTC
> The implication then seems to be that it's a g++ error that has since been
fixed.

I wouldn't rely on that. Could you upload the valgrind output, even if it isn't
completely useful, please? If we can reproduce the problem then surely we should
be able to discover when/how it is happening.

Slots as bound slot parameters: You really are exercising the API.

Comment 11 Murray Cumming 2005-12-02 08:29:30 UTC
Reopening, just in case.
Comment 12 Philip Langdale 2005-12-02 20:04:24 UTC
Created attachment 55549 [details]
Valgrind log output

As requested.
Comment 13 Kjell Ahlstedt 2010-11-17 09:05:56 UTC
Created attachment 174656 [details]
tar file with modified trackable.cc and valgrind output

The bug reports 321552 (this one) and 589202 "Segfault when destroying a
sigc::trackable referenced multiple times in a sigc::slot" are very similar,
and the solution that I propose in bug 589202, comment 8, works also for the
test case in 321552.

The attached archive file contains:

- trackable.cc: The same modified version as attached to bug 589202. Some extra
  output shows when callback functions are added to and removed from
  trackable::callback_list_. It also shows when the callback functions are
  called.
  By activating a #define directive, trackable::remove_destroy_notify_callback
  can be modified as described in bug 589202.

- output1.txt: Output when the test case in comment 2 is run without the work-
  around, with only the output activated in trackable.cc.
  Error reported even without valgrind. 

- valgrind-output1.txt: Output when the test case in comment 2 is run without
  the workaround, with only the output activated in trackable.cc.
  valgrind reports accesses to freed memory.

- valgrind-output2.txt: With the workaround, with only the output activated in
  trackable.cc.

- valgrind-output3.txt: Without the workaround,
  trackable::remove_destroy_notify_callback modified. It can remove (or
  actually disable) entries in the list of callback functions while
  trackable::notify_callbacks is executing.

- valgrind-output4.txt: With the workaround,
  trackable::remove_destroy_notify_callback modified.

In valgrind-output[234].txt, valgrind reports no errors, neither reading from
freed memory nor memory leaks.

I've run the tests on Ubuntu 10.10, with a version of libsigc++ that I cloned
from the master branch in Git on November 5, 2010.
Comment 14 Murray Cumming 2010-11-17 09:32:03 UTC
Many thanks for being so thorough. I'm at a conference now but I hope to have time to go over this properly in a couple of days.
Comment 15 Murray Cumming 2011-02-22 10:09:26 UTC
We think this is all fixed as part of the fix in bug #589202 .

*** This bug has been marked as a duplicate of bug 589202 ***