After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 561885 - Glib::Source - Want simpler ref counting and deletion logic, at least at ABI break
Glib::Source - Want simpler ref counting and deletion logic, at least at ABI ...
Status: RESOLVED FIXED
Product: glibmm
Classification: Bindings
Component: general
2.18.x
Other All
: Normal normal
: ---
Assigned To: gtkmm-forge
gtkmm-forge
: 539186 694778 (view as bug list)
Depends on: 695631
Blocks:
 
 
Reported: 2008-11-22 01:46 UTC by Cristi Posoiu
Modified: 2016-12-03 09:33 UTC
See Also:
GNOME target: ---
GNOME version: 2.23/2.24


Attachments
fix lifetime mgmt of Glib::Source (3.28 KB, patch)
2009-12-09 16:49 UTC, Paul Davis
none Details | Review
header file patch (1.52 KB, patch)
2009-12-09 16:50 UTC, Paul Davis
none Details | Review
slightly smaller test case (889 bytes, text/plain)
2009-12-09 16:54 UTC, Paul Davis
  Details
0001-Glib-Source-Correct-the-memory-mangement.patch (6.58 KB, patch)
2011-09-13 13:43 UTC, Murray Cumming
needs-work Details | Review
glib patch calling the GSource's finalize vfunc with the additional user_data parameter (1.07 KB, patch)
2013-03-07 07:03 UTC, José Alburquerque
none Details | Review
glibmm patch implementing the finalize vfunc to correctly free the wrapper (4.88 KB, patch)
2013-03-07 07:05 UTC, José Alburquerque
none Details | Review
glib: Allow bindings to correctly free their GSource wrappers. (2.31 KB, patch)
2013-03-11 06:01 UTC, José Alburquerque
none Details | Review
Glib::Source: Partially correct the deletion of the wrappers. (4.93 KB, patch)
2013-03-11 06:02 UTC, José Alburquerque
none Details | Review
patch: Glib::Source: Fix the destruction and deletion. (5.55 KB, patch)
2013-03-11 14:29 UTC, Kjell Ahlstedt
none Details | Review
patch: Glib::Source: Fix the destruction and deletion. (5.58 KB, patch)
2013-03-20 11:59 UTC, Kjell Ahlstedt
none Details | Review
Glib::Source: Preliminary patch to correct the wrapper destruction. (5.61 KB, patch)
2013-04-03 04:18 UTC, José Alburquerque
none Details | Review
gtkmm-documentation: timeout: Modified to test Gllib::Source. (795 bytes, patch)
2013-04-03 04:20 UTC, José Alburquerque
none Details | Review
patch: Glib::Source: Fix the destruction and deletion. (5.53 KB, patch)
2013-04-03 15:40 UTC, Kjell Ahlstedt
none Details | Review
patch: Glib::Source: Fix the destruction and deletion. (5.79 KB, patch)
2013-04-07 16:40 UTC, Kjell Ahlstedt
committed Details | Review

Description Cristi Posoiu 2008-11-22 01:46:44 UTC
Please describe the problem:
I posted first bug 561884, but digging more I got into what seems to me, bigger other problems.
Looking at the code of Glib::RefPtr and Glib::Source::~Source I think I saw something problematic so I ran the same simple C++ program under 'valgrind'. As expected, memory access problems (accessing the same C++ object which was already freed)
Checking into the code of both Glib::Source and C GSource I think there is a misunderstanding over the 'destroy' notification from C GSource.
It seems that the 'destroy' notification has nothing to do w/ reference counting of the GSource itself but with the fact that the 'attachment' to a context is actually destroyed. So, in Glib::Source there shouldn't be a correlation when the notify/destroy callback is called and the destroying of the C++ Glib::Source object itself. I think that Glib::Source object destroying should be done completely different (seems that it might not be a derivation of g_object (why!?) but you still have a 'finalize' callback to use when ref counting goes to 0.


Steps to reproduce:
run under valgrind, the following c++ program (see in it how to compile it), once as is, and once after changing the "#if 0":

#include <glibmm.h>

// g++ -g `pkg-config --cflags glibmm-2.4` `pkg-config --libs glibmm-2.4`  t.cpp
// bug 1: 561884 

static bool on_t1(Glib::RefPtr<Glib::MainLoop> l)
{
	printf("timer 1\n");
	l->quit();
	return FALSE;
}


int main(int argc, char *argv[])
{
	Glib::init();

	Glib::RefPtr<Glib::MainLoop> loop(Glib::MainLoop::create());

	Glib::RefPtr<Glib::TimeoutSource> t1;

	t1 = Glib::TimeoutSource::create(1000);
	t1->connect(sigc::bind(sigc::ptr_fun(on_t1), loop));
	t1->attach();
#if 0
	t1->reference();t1->reference();t1->reference();
#endif
	loop->run();
	printf("ending loop (timer 1=%p)\n", t1->gobj());
	t1.clear();
	return 0;
}


Actual results:
valgrind: memory access to already freed data

Expected results:
no memory access problems. no immediate destroy of the Glib::TimeoutSource in case it has enough of a big ref count.


Does this happen every time?
yes

Other information:
no patch this time :-/
Comment 1 Cristi Posoiu 2008-11-22 01:48:57 UTC
Oops, correction to indroduction part: I came to see these problems after reading source of Glib::RefPtr and Glib::Source::destroy_notify_callback - instead of destructor of Source.
Comment 2 Murray Cumming 2009-01-03 00:32:14 UTC
> Checking into the code of both Glib::Source and C GSource I think there is a
> misunderstanding over the 'destroy' notification from C GSource.

Do you mean the GDestroyNotify notify callback given to the g_source_set_callback() function?
http://library.gnome.org/devel/glib/unstable/glib-The-Main-Event-Loop.html#g-source-set-callback

Comment 3 Cristi Posoiu 2009-01-03 11:55:17 UTC
Yes.

Other comments:
- why the C GSource isn't derived also from GObject? (I'm not waiting an answer to this though :-/ )
- since it looks like GSource doesn't have a QData data member, I'm wondering how/if you can wrap the same C GSource with the same C++ GSource object instance... (there's no such code around the wrapping C++ GSource constructor to do similar things as in Glib::ObjectBase)
   If you can't do that, that you won't be able to use Glib::RefPtr<GSource> comparisons - unless you change Glib::RefPtr to compare the actual C GObject (pCppObject_->gobj()) instead of the C++ instances.
Comment 4 Jonathon Jongsma 2009-04-28 04:08:13 UTC
For what it's worth, I spent most of the evening looking at this bug and I'm not really much closer to a solution.  Ideally, we would simply be able to use the 'finalize' vfunc, but GSource doesnt' allow us to store arbitrary data (i.e. a pointer to the wrapper) in it, so we are not able to determine the proper wrapper object at that point.  So I guess the destroy_notify callback was used as a workaround for these deficiencies, but it does certainly seem to have problems.  

One potential idea I thought about was passing a slightly larger struct size to g_source_new() and then storing a pointer to the wrapper in the larger struct, e.g.:

struct GSourceWithWrapper {
  GSource source;
  Glib::Source* wrapper;
};

however, this only helps when we're creating a new Glib::Source from glibmm.  it doesn't help us at all if we're creating a wrapper object for an existing GSource* object.

So I'm out of ideas for now.
Comment 5 Jonathon Jongsma 2009-04-28 04:44:28 UTC
OK, so I do have one other idea -- keep our own map of GSource*-to-wrapper so that we can use the finalize vfunc and look up a C object's wrapper object when  finalize gets called.  This wouldn't be great for performance since we'd probably have to protect the map with a mutex to make it thread-safe, but I don't imagine that creating and destroying GSource objects is going to be on the critical path for performance in any applications...
Comment 6 Jonathon Jongsma 2009-05-17 03:11:08 UTC
*** Bug 539186 has been marked as a duplicate of this bug. ***
Comment 7 Jonathon Jongsma 2009-05-17 03:15:22 UTC
Add danielk to cc on this bug since we had a discussion about this on IRC a little while back.
Comment 8 Paul Davis 2009-12-09 16:49:29 UTC
Created attachment 149450 [details] [review]
fix lifetime mgmt of Glib::Source

This patch is a proposed fix, based on IRC conversations with Daniel & Owen. It makes it impossible to wrap an existing GSource as a Glib::Source, but the gain is correct lifetime management of Glib::Source at very little cost.

I suspect that it may not be complete, but any missing pieces will hopefully be trivial. I will attach the header file patch and a small test source file immediately.
Comment 9 Paul Davis 2009-12-09 16:50:01 UTC
Created attachment 149451 [details] [review]
header file patch
Comment 10 Paul Davis 2009-12-09 16:54:34 UTC
Created attachment 149458 [details]
slightly smaller test case

the patch was tested on the source code included in the existing bug report, and on this small test case, which has the interesting wrinkle that we quit a MainLoop from within the handler for an IOSource.
Comment 11 Murray Cumming 2010-03-26 09:45:25 UTC
Could you just provide one patch, please. It could add the test case to the tests in tests/. Please use the @deprecated doxygen keyword and write a ChangeLog entry.

It looks like we have missed the opportunity to get this in this glibmm version. Sorry.
Comment 12 Murray Cumming 2010-06-08 15:56:57 UTC
Paul?
Comment 13 Murray Cumming 2010-12-22 22:24:19 UTC
Please?
Comment 14 Paul Davis 2011-02-04 13:00:39 UTC
Murray, for some reason, I haven't seen any emails arising from your requests in this report, so apologies. I'll try to follow through on the request ASAP.
Comment 15 Murray Cumming 2011-02-21 10:06:28 UTC
Paul?
Comment 16 Murray Cumming 2011-09-13 13:43:52 UTC
Created attachment 196372 [details] [review]
0001-Glib-Source-Correct-the-memory-mangement.patch

Here is an actual git patch with Paul's changes, with some cosmetic changes by me. I have attempted to explain the changes in the ChangeLog/commit-message, but I could not explain all of them.

It would be great if someone else reviewed this patch.

It does not add the test case, because I don't know what the test case should do, or not do. It does not stop either with or without this patch, so we can't just add it like the existing tests.
Comment 17 Murray Cumming 2011-09-16 08:34:40 UTC
I noticed that this patch triggers the abort in Glom, because Glib::IOSource's constructor uses that Glib::Source constructor, in glib/glibmm/main.c:

IOSource::IOSource(const Glib::RefPtr<IOChannel>& channel, IOCondition condition)
:
  Source(g_io_create_watch(channel->gobj(), (GIOCondition) condition),
         (GSourceFunc) &glibmm_iosource_callback)
{}


0  0x00110832 in ?? () from /lib/ld-linux.so.2
  • #1 raise
    at ../nptl/sysdeps/unix/sysv/linux/raise.c line 64
  • #2 abort
    at abort.c line 92
  • #3 g_assertion_message
  • #4 Glib::Source::Source
    at main.cc line 793
  • #5 Glib::IOSource::IOSource
    at main.cc line 1067
  • #6 Glib::IOSource::create
    at main.cc line 1049
  • #7 Glib::SignalIO::connect
    at main.cc line 403
  • #8 Glom::Spawn::Impl::SpawnInfo::redirect_to_string
    at glom/libglom/spawn_with_feedback.cc line 156

Comment 18 José Alburquerque 2012-09-25 18:28:57 UTC
I've not looked at the patches in this bug.  I've only been recently considering the possibility of wrapping an existing C object and making sure that the wrapper is freed when the C object is finalized.

One way is to add weak referencing to GSource as GStreamer did recently with their GstMiniObject type:
http://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer/html/gstreamer-GstMiniObject.html#gst-mini-object-weak-ref

I've been looking into this and it would not be possible to add any fields in any of the existing GSource structs to store the weak reference callbacks with the given data because that would break ABI (I think).  GSource already has a _GSourcePrivate struct so what I was thinking (I might be able to provide a patch) is to create a new "_GSourceBiggerPrivate" struct that can store what the original private struct stores and the weak references (either by having a member _GSourcePrivate struct as it first member so it can be cast or by including the two fields of the original private struct as the first two members in the new struct).

The code in GSource that creates, accesses and frees the private struct would have to be changed so that it uses the new private struct.  Also weak referencing code would have to be added.

If this solution is feasible it may be a good idea to consider removing the recently added Pollable[Input|Output]Stream and Converter[Input|Output]Stream interfaces and classes because the interfaces have unwrapped virtual functions that return GSources and the classes derive from the interfaces.  Adding the virtual functions would break ABI later.  Maybe they can be added for 2.36.
Comment 19 Kjell Ahlstedt 2013-03-04 18:14:46 UTC
*** Bug 694778 has been marked as a duplicate of this bug. ***
Comment 20 Phillip Susi 2013-03-04 18:57:07 UTC
How about you just have the C++ object keep it's own reference counter and have the destroy function g_object_unref() the C object that many times before setting the pointer to NULL, then have the C++ object's unreference() method skip calling g_object_unref() when it is NULL?
Comment 21 Phillip Susi 2013-03-04 18:58:50 UTC
Actually, there's no reason to pass the references straight through to the C object in the first place.  The C++ object just needs its own reference count, period, and it can just maintain a single reference to the C object that can be released in the destroy method.
Comment 22 Kjell Ahlstedt 2013-03-05 10:14:36 UTC
(In reply to comments 20 and 21)

Adding a ref counter in Glib::Source would break ABI. That's allowed only in a
new major release (glibmm 3.0).

The present situation: The C++ object, Glib::Source, is deleted in Source::
destroy_notify_callback(). That's too early. The deleted object is accessed in
Source::unreference(). The C object, GSource is not deleted. We have a memory
leak.

Solutions in comments 20 and 21: With some care it would be possible to have
a solution that avoids both memory leaks and accesses to deallocated memory.
But I'm not fond of double reference counters. And as I said, it can't be
implemented until glibmm 3.0, whenever that will be.

Bug 539186 comment 3 summarizes the problem with the destroy functions.
The GSource object is _not_ being destroyed when Source::
destroy_notify_callback() is called, it's being detached from its GMainContext.

(In reply to comment 18)

Has anyone filed a glib bug, requesting some addition to GSource? I can't find
any such bug.

I suppose GSource is a problematic class for all language bindings. There's no
way to store the address of a wrapper object such that it survives until the
GSource object is finalized.

I can see only two ABI/API-preserving solutions.

1. Introduce our own GSource*-to-wrapper* map, as proposed on comment 5.

2. Ask for added functionality in GSource. A weak reference, as proposed in
   comment 18 is one possibility.

The patch in comment 16 avoids breaking ABI, but doesn't it break API?
One of the constructors does not construct sensible objects any more.

There's an alternative to a weak reference in GSource that would be very easy
to implement, if the glib developers accept it:

Add an element, user_data, in struct _GSourcePrivate:

  struct _GSourcePrivate
  {
    ...
    void* user_data;
  };

This struct is declared in gmain.c. It would not be an API/ABI break to change
it. Also add

  void* g_source_get_user_data(const GSource* source)
  {
    return source->priv->user_data;
  }

  void g_source_set_user_data(GSource* source, void* user_data)
  {
    source->priv->user_data = user_data;
  }

With this change of GSource glibmm_source_get_callback_data() can be
simplified.
Comment 23 Phillip Susi 2013-03-05 14:51:22 UTC
What?  It wouldn't break ABI since it already gives the appearance of having a refcount, it just implements it by piggy backing on to of the C object refcount.  All that is needed is to add a member variable to the class and change the reference/unreference implementation and that doesn't break ABI.

Then you change Source::destroy_notify_callback to call unreference() instead of delete self.  That way if the caller is still holding the RefPtr, the delete will be delayed until the RefPtr is released, the way it is supposed to.
Comment 24 Kjell Ahlstedt 2013-03-06 13:42:53 UTC
Have I misunderstood? Don't you want to add a data member to Glib::Source
(int ref_count or something like that)? Adding a data member or a virtual
function breaks the ABI. (ABI = application binary interface, not API)

Even if it would be possible to add another ref counter without breaking the
rules, I don't like the idea. At least we should first ask to have GSource
changed in such a way that it's possible to make better language bindings.


I hope for a reply from someone who has commented on this bug before I became
interested: Has anyone filed a glib bug, requesting a change in GSource?
I find it surprising if no one has, but I can't find such a bug, neither open
nor closed.
Comment 25 Phillip Susi 2013-03-06 14:37:38 UTC
You can add a new public data member at the end without breaking the ABI.  Heck, if your private data members are all after the public ones, you can add or remove private members without breaking ABI.

It seems to me there's nothing wrong with GSource, but the problem is that the C++ wrapper tried to take a shortcut in its implementation based on a misunderstanding of the way GSource works.  Specifically it assumed that it did not need to keep its own reference counter because it thought that the destroy notify callback would only be called after the GSource ref count dropped to zero, and this is not the case.
Comment 26 José Alburquerque 2013-03-07 07:02:31 UTC
In reviewing this bug and the related C and C++ API (not thoroughly though), another practical solution that could be used to address this bug presently is to have the C API call the GSource's finalize function (when it is time for its finalization) with an additional 'gpointer user_data' parameter passing in the data stored in the GSource's callback data field (that was set with a call to g_source_set_callback()).

Glib::Source could then have a finalize_vfunc() function accepting the data, extracting the wrapper from it and then correctly freeing it.  The following two patches (one for glib and the other for glibmm) show how it would work.

The weak referencing idea was a quick idea that came to me as I quickly thought about how to ensure that wrappers are correctly deleted, but in reviewing the bug and the C/C++ API a little more, it seems like it would be supplying functionality that can already be achieved with the finalization notification.

I admit, in reviewing the patches they may look a little hackish but they do work.

I don't really want to spend more time on this bug, though, unless we really need a fix and we have to do it right away.  That's why I've been slow to respond.  However, if these patches seem good and they need to be made better for filing a bug against glib, I can fix them accordingly.
Comment 27 José Alburquerque 2013-03-07 07:03:50 UTC
Created attachment 238273 [details] [review]
glib patch calling the GSource's finalize vfunc with the additional user_data parameter
Comment 28 José Alburquerque 2013-03-07 07:05:31 UTC
Created attachment 238274 [details] [review]
glibmm patch implementing the finalize vfunc to correctly free the wrapper
Comment 29 Kjell Ahlstedt 2013-03-08 15:49:36 UTC
(In reply to comment #26)

Thanks for the patches, but I didn't really ask for patches. I just wanted to
know if my suspicion is right that no one has asked for a change in glib. This
bug was filed more than 4 years ago, and several persons have commented on it.

> I admit, in reviewing the patches they may look a little hackish but they do
> work.

I agree they look hackish. Glib::Source::destroy_notify_callback() writes to
GSource::callback_data, which is marked <private>. It does not write the value
that was stored there before destroy_notify_callback() was called.
g_source_set_callback() stores a GSourceCallback* in GSource::callback_data,
and the pointer to the Glib::SourceCallbackData is stored in
GSourceCallback::data.

If I were a glib developer, I'd be very reluctant to accept your glib patch.
From the glib point of view it looks as if the finalize function shall include
a new parameter which is either 0 (if the destroy notify function has been
called) or a pointer to a GSourceCallback instance, a class defined in gmain.c
(if the destroy notify function has not been called before the call to
finalize).
Looking at g_source_unref_internal() I get the impression that
SourceCallbackData::destroy_notify_callback() can be called either before or
_after_ the finalize function (the presently nonexistent
Source::finalize_vfunc()). If I'm right, your patch will not work in all
situations.

I also wonder if it's an API break to add a new parameter to the finalize
function. I don't really know, I'm just wondering.

I will probably file a glib bug suggesting a new user_data element, as
described in comment 22.


(In reply to comment #25)
> You can add a new public data member at the end without breaking the ABI. 
> Heck, if your private data members are all after the public ones, you can add
> or remove private members without breaking ABI.

I'm confused and sceptical. If you add a data member to a class, even it is
added after all other data members, won't the offsets of all data members
declared in subclasses be changed?

You might reply that in this case it doesn't matter, because the only
subclasses of Source with data members are TimeoutSource and IOSource. Those
data members are private, and the classes contain no inline methods. All
accesses to data members with changed offsets will be from the glibmm library,
and they will be updated when the library is rebuilt.
But can you be sure that no application program defines its own subclass with
data members? That's possible, and (I suppose) permitted.
Comment 30 José Alburquerque 2013-03-08 16:28:28 UTC
I was just writing the following comment to the solution I was proposing when there was a collision with the most recent comment (above I guess).  Here's what I was writing:

Actually, I think that having the C API provide functions such as g_source_set_bindings_data() and g_source_get_bindings_data() (as was sort of described in comment 22 by Kjell) would probably make the most easy and clear solution.  In other words (similar to what was described and assuming, of course, that there's no ABI break by the code):

  struct _GSourcePrivate
  {
    ...
    void* bindings_data;
  };


  void* g_source_get_bindings_data(const GSource* source)
  {
    return source->priv->bindings_data;
  }

  void g_source_set_bindings_data(GSource* source, void* data)
  {
    source->priv->bindings_data = data;
  }

The GSource internals make it difficult to keep the user data set with a call to g_source_set_callback() accessible until the finalization process takes place.  The g_source_destroy_internal() function (which is called by g_source_destroy()) NULLs out the user data before finalization takes place, thus making difficult for the data to be accessed then:

static void
g_source_destroy_internal (GSource      *source,
			   GMainContext *context,
			   gboolean      have_lock)
{
 ...
 if (!SOURCE_DESTROYED (source))
    {
      ...
      source->callback_data = NULL;
      source->callback_funcs = NULL;
      ...
    }
    ...
}

I thought that removing the NULLing lines would make it possible to keep the data until finalization but that causes unpredictable memory access problems causing the most simple programs (that somehow rely on glib?) to crash.  Even a program such as kdbg (which I wouldn't think uses glib) crashes.

If the data were accessible at finalization, it would not be necessary to use a finalize_vfunc() with an additional 'gpointer user_data' parameter because the data would be accessible through the GSource.  Thus the C API's method of invoking the finalize function would not have to change for the accommodation of bindings.  But that does not look easily achievable.

The weak referencing is still a possibility but easier than that, I think, are the bindings data functions already described.

As far as having the wrapper keep its own reference count, I still remain dubious that adding a new 'ref_count' member to Glib::Source would not be an ABI break because other experienced glibmm developers have never mentioned such a possibility as adding new data members to classes under any circumstances without breaking ABI when trying to fix things that could easily be fixed that way.
Comment 31 Phillip Susi 2013-03-08 16:36:45 UTC
*kicks gnome bugzilla for refusing to accept replies via email*

On 3/8/2013 10:49 AM, glibmm (bugzilla.gnome.org) wrote:
> I also wonder if it's an API break to add a new parameter to the finalize
> function. I don't really know, I'm just wondering.

Yes, that's definitely an API break, unless you can stash a flag so you 
know whether you should call the finalize function with, or without an 
argument.

> I'm confused and sceptical. If you add a data member to a class, even it is
> added after all other data members, won't the offsets of all data members
> declared in subclasses be changed?

Ugh, yea... any classes derived from this one would break.  I'm really 
starting to get annoyed at C++.

The only other way I can see to fix this is a little hack I recall being 
used frequently in Windows 3.x.  You allocate a few bytes of writable, 
executable memory, and create a binder there.  You write some code there 
that pushes the address of the glibmm::Source onto the stack ( or shoves 
it into the correct register ) then jumps to Source::finalize_func, and 
you pass the address of the binder to GSource to invoke on finalization.
Comment 32 José Alburquerque 2013-03-08 16:57:40 UTC
(In reply to comment #31)
> *kicks gnome bugzilla for refusing to accept replies via email*
> 
> On 3/8/2013 10:49 AM, glibmm (bugzilla.gnome.org) wrote:
> > I also wonder if it's an API break to add a new parameter to the finalize
> > function. I don't really know, I'm just wondering.
> 
> Yes, that's definitely an API break, unless you can stash a flag so you 
> know whether you should call the finalize function with, or without an 
> argument.

As commented, if it were possible to keep the callback user data until finalization the call to the finalize function would not have to be changed.  It would be an API break if the call required that the second argument actually be declared by the callee.  But I think it's possible to not have the callees declare the new argument and still call the function that way.  I tested that and there were no problems.  (I think it has to do with how the call stack, heap, etc. are configured when calling C functions.)

Suffice it to say though that I still think that the bindings data solution seems the easiest and quickest to solve this issue.
Comment 33 José Alburquerque 2013-03-11 05:59:05 UTC
For the sake of completeness, I wanted to see if the finalization method would be a feasible solution and I think I found that it could be.

The g_source_destroy_internal() function not only NULLs out the callback data and the callback functions, but it also unreferences the callback data:

static void
g_source_destroy_internal (GSource      *source,
               GMainContext *context,
               gboolean      have_lock)
{
 ...
 if (!SOURCE_DESTROYED (source))
    {
      ...
      source->callback_data = NULL;
      source->callback_funcs = NULL;

      if (old_cb_funcs)
	{
	  UNLOCK_CONTEXT (context);
	  old_cb_funcs->unref (old_cb_data);
	  LOCK_CONTEXT (context);
	}
      ...
    }
    ...
}

Since this process already takes place in the g_source_unref_internal() function which g_source_destroy_internal() calls, the NULLing out and unreffing can (and should) be removed from g_source_destroy_internal().  The same NULLing and unreffing lines in the g_source_unref_internal() function should be moved after finalization so that the data can be accessed during finalization.

Once that's done, glibmm's Source can include a finalize_vfunc() that correctly frees the wrappers.  The following two updated patches show what the initial changes would be like and verifies that the concept works.  I know that I said that adding setter/getter functions for bindings data would easier and clearer, but I wanted to fully explore the finalization idea.
Comment 34 José Alburquerque 2013-03-11 06:01:22 UTC
Created attachment 238554 [details] [review]
glib: Allow bindings to correctly free their GSource wrappers.
Comment 35 José Alburquerque 2013-03-11 06:02:46 UTC
Created attachment 238555 [details] [review]
Glib::Source: Partially correct the deletion of the wrappers.
Comment 36 Kjell Ahlstedt 2013-03-11 09:10:27 UTC
I came to the opposite conclusion when I studied Glib::Source once more:
We must not depend on a finalize callback function in Source::vfunc_table_.
If the Glib::Source instance has been constructed by the default constructor,
it would be possible to depend on such a finalize callback, after some change
of GSource in glib. The patch in comment 16 shows that it's possible even
without changing GSource.

Unfortunately there is also the constructor
Source::Source(GSource* cast_item, GSourceFunc callback_func).
When that constructor is used, we can't set our own GSourceFuncs callbacks
(prepare, check, dispatch, finalize). Comment 17 shows that at least one
application uses this constructor.

I now intend to propose an added qdata API in GSource, similar to
g_object_get_qdata(), g_object_set_qdata(), g_object_set_qdata_full(),
g_object_steal_qdata(). That's what glibmm uses for classes derived from
GObject. Weak references would also be acceptable, but a qdata API is better.
Comment 37 Kjell Ahlstedt 2013-03-11 14:29:25 UTC
Created attachment 238581 [details] [review]
patch: Glib::Source: Fix the destruction and deletion.

I have filed glib bug 695631.
This is a glibmm patch that can be committed if the glib patch is accepted.

In the glib bug I suggest a qdata API, like g_object_set_qdata_full() and
friends. I also suggest a different qdata API and weak references as
alternatives. I prefer one of the qdata APIs.
Comment 38 Phillip Susi 2013-03-11 14:52:01 UTC
So what was wrong with using a weak reference again?  That should not interfere with the other constructor.
Comment 39 Kjell Ahlstedt 2013-03-11 15:30:58 UTC
There's nothing wrong with weak references. A weak reference can fix the
problem discussed in this bug. That's why I've mentioned it in the glib bug
as an alternative to the qdata API, in case the glib developers prefer it.

Copied from the glib bug 695631:
  Weak references, like g_object_weak_ref() and g_object_weak_unref(), would
  be acceptable as an alternative, but a qdata API is better. Data stored with
  g_object_set_qdata() or g_object_set_qdata_full() is available in all
  callback functions, not only when the object is being finalized.

In the patch in comment 37, I've used g_source_get_qdata() in Source::
get_wrapper(). It's not necessary, but it's a tiny simplification.

Do weak references have an advantage over the qdata API, an advantage that I'm
not aware of?
Comment 40 José Alburquerque 2013-03-11 18:02:29 UTC
(In reply to comment #39)
> Do weak references have an advantage over the qdata API, an advantage that I'm
> not aware of?

Probably none.  Just two ways of addressing this issue, I think.
Comment 41 José Alburquerque 2013-03-12 02:18:06 UTC
(In reply to comment #36)
> I came to the opposite conclusion when I studied Glib::Source once more:
> We must not depend on a finalize callback function in Source::vfunc_table_.
> If the Glib::Source instance has been constructed by the default constructor,
> it would be possible to depend on such a finalize callback, after some change
> of GSource in glib. The patch in comment 16 shows that it's possible even
> without changing GSource.
> 
> Unfortunately there is also the constructor
> Source::Source(GSource* cast_item, GSourceFunc callback_func).
> When that constructor is used, we can't set our own GSourceFuncs callbacks
> (prepare, check, dispatch, finalize). Comment 17 shows that at least one
> application uses this constructor.

One way to work around this is that if a finalize function of the cast_item's GSourceFuncs exists, store it in a new member variable of the SourceCallbackData struct replacing the function with our custom finalize_vfunc().  Our finalize_vfunc(), once called, can check for the existence of the replaced finalize vfunc, call it, complete the destruction process and return.

This would require almost no change to the C API except the correction of the deletion of the callback data from when destruction of the source takes place to when the last reference of it is lost (which I think would be the proper place do that).

Of course, it requires more work on our part and it's probably not as easy to understand as using the additions requested in the glib bug, but I think it would work.
Comment 42 Kjell Ahlstedt 2013-03-12 08:43:44 UTC
Let's wait for a while and see if we get a reaction to bug 695631.

There are alternatives that don't require a qdata API or weak references in
GSource, but they all seem far-fetched and ugly. One of the goals of glib and
its relatives (gtk+, etc.) is that it shall be easy to make language bindings.
That goal has been met for most of glib, but there are a few difficult classes.
GSource is one difficult-to-wrap class.

With just a little bit of help from glib, it will be _easy_ to wrap GSource
without causing memory leaks or illegal memory accesses.
Comment 43 Kjell Ahlstedt 2013-03-20 11:59:08 UTC
Created attachment 239348 [details] [review]
patch: Glib::Source: Fix the destruction and deletion.

The glib bug 695631 has been closed with WONTFIX. We must find a solution
without modification of glib.

The patch shows my suggested solution. It's a variation of Phillip's suggestion
in comment 21. Glibmm keeps its own reference counter, and unreferences the
GSource instance just once, when glibmm's ref count has reached 0.
Glib::IOChannel uses this strategy in some situations.

As discussed previously we can't add a ref_count member to Glib::Source now,
because it would break ABI. It complicates the solution. I've added a
std::map<const Glib::Source*, ExtraSourceData> that stores the ref_count.

If you think that the mutex, protecting the map, slows down execution too much,
you should know that g_source_ref() and g_source_unref() also use a mutex.

If you think that searching the map slows down execution too much, you may be
right if there are 1000s of sources.
Comment 44 Phillip Susi 2013-03-20 13:24:54 UTC
I'd prefer to avoid using a map all the time if it isn't too difficult.  Is there a reason you don't think the weak ref idea is better?
Comment 45 Kjell Ahlstedt 2013-03-20 15:00:51 UTC
I think both weak references and the qdata API I proposed are better than a
ref_count in a map. But both solutions require changes in gmain.c, and the glib
developers don't want it changed. It's true I only mentioned weak references
in passing in bug 695631. I'm not sure the glib developers would oppose it as
much as they oppose the patches I and José proposed.

Both José and I have spent much more time on this bug than it's worth. I'm not
going to push more for a change in gmain.c. You're welcome to continue the
discussion yourself, for example by reopening bug 695631 and discussing the
matter there. I'd be glad if you manage to get either weak references or a
qdata API into GSource.
Comment 46 Phillip Susi 2013-03-20 15:42:15 UTC
Huh?  Neither the ref_count in a map, nor weak references require any changes to glib.
Comment 47 Kjell Ahlstedt 2013-03-20 17:27:45 UTC
Have I overlooked something? The only glib functions I have found, dealing with
weak references, are g_object_weak_* and g_weak_ref_*. They all work only with
a GObject. A GSource is not a GObject. g_weak_ref_* don't even offer a chance
to register a callback function, which is what we want.

What else is there? How can we set a weak reference on a GSource instance, and
have a callback function called when the instance is finalize? Or do you have
anything else in mind?
Comment 48 Phillip Susi 2013-03-20 18:46:45 UTC
What the hell?  I thought everything in glib was supposed to be derived from GObject?!

Well, the refcount in a map at least won't require any changes to glib.
Comment 49 Kjell Ahlstedt 2013-04-02 08:05:29 UTC
José, do you plan an addition to this bug? You say in bug 695631 comment 12
that you see an easy way to fix this bug. Do you see a better way than my patch
in comment 43? If not, I suggest I push that patch.


Why don't we use the pimpl idiom for private data in glibmm classes?
(pimpl = pointer to implementation, http://en.wikipedia.org/wiki/Pimpl)
Then we could add private data without breaking ABI.
(Unfortunately it's not possible to misuse Glib::Source::gobject_ as a general
pimpl pointer without breaking ABI, because Glib::Source::gobj() is inlined.)
Comment 50 José Alburquerque 2013-04-03 04:18:59 UTC
Created attachment 240459 [details] [review]
Glib::Source: Preliminary patch to correct the wrapper destruction.

I did have a slightly different idea.  It too has to do with referencing/unreferencing but I did not think of keeping a separate reference count.  What I was thinking is to sort of take note of when the GSource is about to lose its last references and then delete the wrapper then.  This patch exemplifies the idea more or less.  The patch is not complete because the case for when an existing GSource is wrapped is not included but I believe that the case that is included (the one when the Glib::Source is created) does work.

The places that I modified to take note of when the GSource is about to lose its last references and then free the wrapper were in the Glib::Source::unreference() method and in the Glib::Source::dispatch_vfunc().

In the Glib::Source::unreference() method, if there is a final reference, the wrapper is destroyed.

As far as the Glib::Source::dispatch_vfunc(), it is called from g_main_dispatch():

static void
g_main_dispatch (GMainContext *context)
{
  ...

  for (i = 0; i < context->pending_dispatches->len; i++)
    {
      GSource *source = context->pending_dispatches->pdata[i];

      ...

      if (!SOURCE_DESTROYED (source))
	{
          ...

	  gboolean (*dispatch) (GSource *,
				GSourceFunc,
				gpointer);
          ...

	  dispatch = source->source_funcs->dispatch;

          ...

	  need_destroy = ! dispatch (source,
				     callback,
				     user_data);

          ...

	  if (need_destroy && !SOURCE_DESTROYED (source))
	    {
	      g_assert (source->context == context);
	      g_source_destroy_internal (source, context, TRUE);
	    }
	}
      
      SOURCE_UNREF (source, context);
    }

  ...
}

When the Glib::Source::dispatch_vfunc() returns false, this signals that the GSource should be destroyed.  If that is the case it will lose one reference count from the call to g_source_destroy_internal() and then one other reference from the SOURCE_UNREF() macro call in g_main_dispatch().  Thus before returning, the Glib::Source::dispatch_vfunc() checks for these conditions (whether the GSource will be destroyed and if so whether it has two final references).  If that is the case the wrapper is also freed.

As far as dealing with the wrapping of an existing GSource, that would be a little more complicated.  I think it would be necessary to do something like what I described in comment 41 with respect to the finalize_vfunc().  I've not explored this completely.  I stopped because I thought it was more complicated than what an easy solution might be.  However, of course, if it's necessary and it works I can complete the patch.

As far as pushing your patch, I'm not sure that would be a good idea.  I modified the timeout example in gtkmm-documentation with the following patch and it crashed.  I'm not sure if I did something wrong.
Comment 51 José Alburquerque 2013-04-03 04:20:45 UTC
Created attachment 240460 [details] [review]
gtkmm-documentation: timeout: Modified to test Gllib::Source.
Comment 52 Kjell Ahlstedt 2013-04-03 15:40:47 UTC
Created attachment 240493 [details] [review]
patch: Glib::Source: Fix the destruction and deletion.

Many thanks for detecting the bug in my patch in comment 43.
I had not realized that SourceCallbackData::destroy_notify_callback() can be
called _after_ all the RefPtr<Source> refs are gone. The C++ wrapper must be
kept until both all RefPtr<Source> have been deleted _and_
SourceCallbackData::destroy_notify_callback() has been called.

This new patch fixes that problem.

Your patch in comment 50 does not cause crashes, as far as I can see, but it
can cause memory leaks. There are situations when the C++ wrapper is not
deleted.
Run your modified gtkmm-documentation/book/timeout program. Manually remove a
running timer and/or quit the program while a timer is running. The timer's
wrapper is not deleted. The timer is not stopped by result == 0 in Source::
dispatch_vfunc() but by the call to g_source_destroy() in
SourceConnectionNode::notify() when the slot is disconnected.

(In reply to comment #50)
> As far as dealing with the wrapping of an existing GSource, that would be a
> little more complicated.

Your fix works also in this case, provided there is only one reference to the
existing GSource when it's wrapped, and no extra refs are created later except
from Source::reference().
E.g. the call to Source::Source(GSource*, GSourceFunc) from
IOSource::IOSource(const Glib::RefPtr<IOChannel>&, IOCondition) would be ok.
I don't think we need to bother with other cases.

Which solution shall we choose?

The main drawback with my solution is the std::map that stores the extra ref
counter. It can easily be removed when we can break ABI. If we could do that,
I would recommend my solution. Now it's more or less a tie.

Your solution needs some more fixing to avoid the memory leak. It's a drawback
that you must know how many refs GSource holds internally when Source::
dispatch_vfunc() is called. I suppose the glib developers can change that
without notice. Possible, but not probable.
Comment 53 José Alburquerque 2013-04-03 16:53:04 UTC
Of course, I posted the patch to exemplify what I was thinking of.  It was an initial proof of concept.  Yes, there would be other things that would have to be fixed.  I pretty much knew that; I just wanted to have what I was thinking of posted so it could be seen more or less how it would work.

When I noticed the crash with your patch I thought there could be a problem with keeping a separate reference count (I really never liked that idea).  I also don't like the addition of the std::map<> because if there were many sources (though I couldn't conceive why right now) it could cause execution delays when the sources are referenced/unreferenced.

From the discussion in the glib bug, there may in fact be a much simpler solution than both of the ones we're discussing.  That's why I had not posted my idea.  I still think we should wait a little and see if that other solution mentioned in the glib bug exists and would in fact fix the problem.
Comment 54 Kjell Ahlstedt 2013-04-04 07:09:34 UTC
Ok, I'll wait until you present a better solution, or say you give up.
I doubt that there is a good solution that does not require changes in GSource.

(In reply to comment #52)
>> As far as dealing with the wrapping of an existing GSource, that would be a
>> little more complicated.
>
> Your fix works also in this case, ...

I was wrong again. I forgot that Source::dispatch_vfunc() is not called for
those GSource instances.
Comment 55 José Alburquerque 2013-04-04 13:29:24 UTC
If it is I that has to provide a better solution, I doubt that I will be able to come up with something better than either the most recent solution I have proposed or the ones that have already been proposed before.  I don't vehemently object to the separate reference count idea, I just didn't like the idea much from the beginning, but that's just preference, nothing more.

However, even if I liked the idea and thought that the patch could be pushed, it is not up to me to decide that, the glibmm maintainers decide that, in which case I think you would still have to wait for their approval.  Personally, I don't mind which solution is used.  I myself have just provided possible solutions as well which the glibmm maintainers can approve or not themselves also.  I'm not bothered either way.

Again, I think waiting for the glibmm maintainers'  decision would be best.  If you feel you don't need their approval and want to push the patch yourself then you feel free to do that if you like.  It won't bother me at all.
Comment 56 Murray Cumming 2013-04-04 13:40:40 UTC
>  the glibmm maintainers decide that,

I trust your and Kjell's judgement, particularly on this. I have no plan to wade through this and choose between competing solutions.

Thanks for all the hard work as usual.
Comment 57 José Alburquerque 2013-04-04 13:54:35 UTC
In that case then let's wait a little, consider this a little more and see what would best before rushing to commit anything.  We may be able to improve on the ideas presented before a final commit.
Comment 58 Kjell Ahlstedt 2013-04-07 16:40:40 UTC
Created attachment 240893 [details] [review]
patch: Glib::Source: Fix the destruction and deletion.

My patch in comment 52 has a flaw. I assumed that the destroy notify function
is unconditionally called when g_source_destroy() is called on a source that
is not already destroyed. That is not true. If the source has not been attached
to a main context, the destroy notify function is not called until the source
is finalized. This new patch does not make such an assumption.

My patch is ugly, but it's the best I can do. It's ridiculously difficult to
avoid both memory leaks and accesses to deallocated memory without modifying
GSource. With a suitable modification of GSource, it would be easy.
I gave up too easily when bug 695631 was closed.

(In reply to comment #55)
> If it is I that has to provide a better solution, I doubt that I will be able
> to come up with something better than ...

I misunderstood bug 695631 comment 12. I thought it meant that you had a great
idea how to easily fix this bug without modifying GSource. Hope you don't feel
forced to do more on this. We've both done more than what's reasonable.
Comment 59 Murray Cumming 2013-04-07 16:53:19 UTC
The change in glib was not accepted because they were told that it could be done in glibmm instead. If that's no longer true, you can try asking for the change in glib again, explaining the new situation.
Comment 60 Kjell Ahlstedt 2013-04-07 17:29:30 UTC
It's not impossible to do it in glibmm alone, but it would be easier and
cleaner if glibmm could register a callback function with user_data (a pointer
to the C++ wrapper), a function that is called when the GSource instance is
finalized. Just like it's done with all classes derived from GObject.

Right now I'm convinced that my patch in comment 58 works. But it's the third
version. I can't be sure I still have confidence in it in a week.
I think it's wise to wait for a while, like José has suggested.
Comment 61 Phillip Susi 2013-04-15 13:57:45 UTC
FYI, this has been causing gparted to crash randomly due to heap corruption.  This basically makes the glibmm GSource wrappers unusable.

Please change the priority of this to high/critical, since it causes crashes and makes the component seriously broken.
Comment 62 Kjell Ahlstedt 2013-04-15 17:21:15 UTC
The segfaults reported in gparted bug 697727 have been tracked down to this bug
in Glib::Source.

My patch in comment 58 is not pretty, but I still think it works. It can be
improved when we can break ABI. Then we can get rid of the std::map.

How about this way forward?
I push the patch very soon. If anyone finds a better non-ABI-breaking solution
later, that solution can replace my present solution in a later version of
glibmm.

But first: Phillip, is it possible for you (or someone else) to check if my
patch fixes the segfault problems in gparted?
If you try to apply my patch to the latest version of glibmm from the git
repository, you will find that the changes to ChangeLog don't apply. That's
unimportant. You can skip those changes (git apply --exclude=ChangeLog ...).
Comment 63 José Alburquerque 2013-04-15 17:37:22 UTC
(In reply to comment #62)
> The segfaults reported in gparted bug 697727 have been tracked down to this bug
> in Glib::Source.
> 
> My patch in comment 58 is not pretty, but I still think it works. It can be
> improved when we can break ABI. Then we can get rid of the std::map.
> 
> How about this way forward?
> I push the patch very soon. If anyone finds a better non-ABI-breaking solution
> later, that solution can replace my present solution in a later version of
> glibmm.

That's fine with me if your patch works.

> 
> But first: Phillip, is it possible for you (or someone else) to check if my
> patch fixes the segfault problems in gparted?
> If you try to apply my patch to the latest version of glibmm from the git
> repository, you will find that the changes to ChangeLog don't apply. That's
> unimportant. You can skip those changes (git apply --exclude=ChangeLog ...).
Comment 64 Phillip Susi 2013-04-16 15:08:44 UTC
Due to the nature of race conditions and non deterministic effects of writing to freed memory, it isn't very reproducible.  I myself haven't been effected by it.
Comment 65 Curtis Gedak 2013-04-16 15:50:58 UTC
The problem with the segfault identified with GParted 0.15.0 in bug #697727 seems to occur most frequently with Debian distributions.

If you can instruct me on how to compile and test gparted with this patch, then I would be happy to help out.  Please note that I have never compiled any of the glibmm libraries from scratch.

Also as Phillip points out correctly, it is difficult to know for sure if the patch actually fixes the problem, mainly because there is no sequence of steps that is guaranteed to reproduce the problem.
Comment 66 Mike Fleetwood 2013-04-17 12:46:34 UTC
Provided that the proposed patch can be applied to glibmm 2.24.2 in
Debian 6; it would arguably be better from the point of view of
verifying that the patch fixes bug #697727 with GParted 0.15.0, to just
rebuild Debian's libglibmm package with the patch applied.  That might
be an easier task with the Debian package maintainers having already
taken care of how to build glibmm from source.

(Though I don't know how to rebuild dpkg's myself as I'm an rpm boy).
Comment 67 Kjell Ahlstedt 2013-04-18 13:57:58 UTC
I have pushed the patch in comment 58.
https://git.gnome.org/browse/glibmm/commit/?id=746f8ba0259cb88c0c6b667f4a663766fcfba359

It's not meaningful to test it with gparted. The crashes in gparted are not
caused by the problem discussed in this bug. They are caused by
Glib::signal_idle().connect() and Glib::signal_child_watch().connect(),
which are not thread-safe. See bug 396958 and 697727.

My patch fixes the problem in this bug, but it's quite ugly. If no one finds a
better solution, it shall at least be improved when we can break ABI.
I keep this bug open, but I change its title and lower its severity.
Comment 68 Phillip Susi 2013-04-18 14:23:32 UTC
If the reference counting is fixed, then it should solve the gparted crash because the object won't be deleted until the reference is dropped, even if the GSource is destroyed.
Comment 69 Kjell Ahlstedt 2013-04-18 18:32:44 UTC
The patch that I pushed changes the methods
  SourceCallbackData::destroy_notify_callback()
  Glib::Source::reference()
  Glib::Source::unreference()

The code causing crashes in gparted is, according to your patch

  Glib::signal_idle().connect( sigc::mem_fun(*this,
    &copy_blocks::set_progress_info) );

  Glib::signal_child_watch().connect( sigc::mem_fun(
    status, &utils_execute_command_status::store_exit_status ), pid );

These calls create neither a SourceCallbackData nor a Glib::Source.
None of the changed functions are called.
Take Glib::signal_idle().connect() as an example. It creates a Glib::SignalIdle
(temporarily) and a SourceConnectionNode.

It is possible to replace

  Glib::signal_idle().connect( sigc::mem_fun(*this,
    &copy_blocks::set_progress_info) );
by
  Glib::RefPtr<Glib::IdleSource> idle_source = Glib::IdleSource::create();
  idle_source->connect(sigc::mem_fun(*this, &copy_blocks::set_progress_info));
  idle_source->attach(Glib::MainContext::get_default());

With such replacements, I believe the code would actually become thread-safe,
provided
1. The thread-unsafe sigc::connection, returned by Glib::IdleSource::connect()
   is discarded, and hopefully deleted, before Glib::IdleSource::attach() is
   called, attaching the source to a MainContext running in another thread.
   (As in the code above.)
2. copy_blocks is not derived from the thread-unsafe sigc::trackable.

I suppose this is not an option for gparted now, but probably in the future.
You want to be able to use gparted together with any version of glibmm that has
been released in the last 3 years or so, if I understand the situation
correctly. I very much doubt that the comment 58 patch will ever be applied to
all those versions.
Comment 70 Kjell Ahlstedt 2016-11-21 15:52:24 UTC
If Glib::RefPtr is replaced by std::shared_ptr (bug 755037), it may be possible
to simplify Glib::Source. At least I hope so.
Comment 71 Kjell Ahlstedt 2016-12-03 09:33:44 UTC
In the ABI-breaking master branch: I have removed the extra_source_data map
and added instance data in Glib::Source. I don't think it's possible to make
Source simpler without modification of glib.

If std::shared_ptr replaces Glib::RefPtr, Source::ref_count_ will be replaced by
shared_ptr's use_count. But I think Source will still need the keep_wrapper_
counter and a specialized deleter.

Closing this lengthy bug. Future problems connected to shared_ptr can be
discussed in bug 755037.