Bug 512348 – Glib::Thread::create() does not appear to be thread safe

After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.

Bug 512348 - Glib::Thread::create() does not appear to be thread safe


Summary:	Glib::Thread::create() does not appear to be thread safe


Status:	RESOLVED FIXED

Product:	glibmm
Classification:	Bindings
Component:	threads
Version:	2.14.x
Hardware:	Other All

Importance:	Normal normal
Target Milestone:	---
Assigned To:	gtkmm-forge
QA Contact:	gtkmm-forge

URL:
Whiteboard:

Depends on:
Blocks:

Reported:	2008-01-27 11:08 UTC by Chris Vine
Modified:	2015-09-16 13:41 UTC

See Also:
GNOME target:	---
GNOME version:	2.19/2.20

Attachments
Example of one solution (9.85 KB, text/plain) 2008-01-27 11:09 UTC, Chris Vine		Details
Documentation patch for Glib::Thread::create() (1.14 KB, patch) 2008-06-17 10:12 UTC, Chris Vine	none	Details \| Review
Corrected patch (1.73 KB, patch) 2008-06-17 10:26 UTC, Chris Vine	committed	Details \| Review
Documentation patch (10.58 KB, patch) 2011-03-18 18:08 UTC, Chris Vine	needs-work	Details \| Review
ChangeLog diff (398 bytes, patch) 2011-03-18 18:19 UTC, Chris Vine	none	Details \| Review
Supplementary documentation patch (2.05 KB, patch) 2011-03-29 18:21 UTC, Chris Vine	none	Details \| Review
Revised documentation patch (9.94 KB, patch) 2011-09-14 09:06 UTC, Chris Vine	needs-work	Details \| Review
Second revise of documentation patch (10.77 KB, patch) 2011-10-25 12:47 UTC, Chris Vine	none	Details \| Review
Third revise of documentation (10.78 KB, patch) 2011-10-25 12:53 UTC, Chris Vine	needs-work	Details \| Review
Fourth revise of documentation (10.84 KB, patch) 2011-11-22 10:35 UTC, Chris Vine	none	Details \| Review
Test case for std::bind and sigc::trackable (806 bytes, text/plain) 2013-02-17 19:49 UTC, Chris Vine		Details
Test case for std::bind and Glib::Threads::Thread::create() (712 bytes, text/plain) 2013-02-17 19:55 UTC, Chris Vine		Details
Modified test cases for std::bind and lambda expressions. (882 bytes, application/x-compressed-tar) 2013-02-18 14:10 UTC, Kjell Ahlstedt		Details
Tutorial patch for std::bind/boost::bind (3.46 KB, patch) 2013-02-18 19:58 UTC, Chris Vine	none	Details \| Review
patch: Amend the "Multi-threaded programs" chapter. (5.27 KB, patch) 2013-02-19 17:47 UTC, Kjell Ahlstedt	none	Details \| Review

Description Chris Vine 2008-01-27 11:08:36 UTC

Please describe the problem:
Glib::Thread::create() constructs a new slot and passes it by pointer to the new thread through the thread function call_thread_entry_slot().  Where the method bound to the slot belongs to a class object derived from sigc::trackable, its creation appears to manipulate a list of notification callbacks (a std::list<> object) maintained by the sigc::trackable object.  The same slot is disposed of in the new thread (a different thread), which also invokes the same list.

That is not of itself problematic since clearly the deregistration of
the slot in call_thread_entry_slot() cannot occur concurrently with its
registration.  However, it could occur at the same time that the
original thread is creating new slots for the same target object whose
method was bound to the slot: in that case, there would be undefined
behaviour.

See http://mail.gnome.org/archives/gtkmm-list/2008-January/msg00261.html for a more detailed explanation.

Steps to reproduce:
1. 
2. 
3. 


Actual results:


Expected results:


Does this happen every time?


Other information:
It is difficult to see a way around this with the current libsigc++ implementation - it would probably require a new argument on slot creation allowing sigc::trackable to be ignored.

In the meantime I have knocked up an implementation of a callback class which avoids this and which I will post and is trivial to use (but it is probably not the way glibmm will want to go).  It also disposes of bug 396958

Comment 1 Chris Vine 2008-01-27 11:09:38 UTC

Created attachment 103813 [details]
Example of one solution

Comment 2 Chris Vine 2008-01-27 22:00:42 UTC

"That is not of itself problematic ...".

Actually, quite apart from what further slots the original thread may create, the temporary slot bound to the const reference argument of Glib::Thread::create() in that thread has to die at some stage, and thus apparently through its destructor invoke sigc::trackable functions if bound to a method of an object deriving from trackable.

The standard says the bound temporary must last until the call to that function has returned, but does not specify when after that the destruction of the object is to occur.  Even if on a particular compiler it occurs immediately after that return, it is at least possible that it might occur around the time that function call_thread_entry_slot() deletes the slot it has received if the thread call is short; and in particular on multi-processor machines before caches have synchronised to enable memory visibility.

Comment 3 Jonathon Jongsma 2008-04-20 17:53:16 UTC

Just to help me understand this bug report, have you ever actually seen this issue manifest itself in a program, or is this mostly theoretical at the moment?

Comment 4 Chris Vine 2008-04-20 20:12:38 UTC

Short explanation:

No, it's not theoretical.  It involves undefined behaviour (according to POSIX) and like most thread unsafe code the occasions of failure will be infrequent, random and difficult to attribute to any particular code path.

It arises because on slot construction the list object maintained by the underlying sigc::trackable object is manipulated, and the same object is manipulated in a different thread on slot destruction without memory synchronisation.

If the thread function is to continue to be passed as a sigc::slot object, the best course is to document that the slot shouldn't represent a non-static method of a class object derived from sigc::trackable.  In the long term, the solution is to pass the callback by something like boost::function (or the callback object referred to in bug #396958).

Long explanation:

Libsigc++ advertises that if a slot created via sigc::mem_fun represents a non-static method of a class object derived from sigc::trackable, the slot will automatically be invalidated and disconnected if that object is destroyed.

I have verified empirically that that is the case.  It would be achieved on slot creation by a call to sigc::trackable::add_destroy_notify_callback() to register the slot with the relevant sigc::trackable target object belonging to the object whose method is bound to the slot.  That function manipulates the std::list<sigc::trackable_callback> object held by the sigc::trackable object's sigc::trackable_callback_list member (it adds the relevant trackable_callback to the list using std::list<sigc::trackable_callback>::push_back()).  On slot destruction the reverse would happen - a call would be made to sigc::trackable::remove_destroy_notify_callback(), which removes the relevant trackable_callback from the list by calling std::list<sigc::trackable_callback>::erase().

The "may" in the title arises from the fact that I have not been able to penetrate the sigc++ code to identify the code path through which the calls to sigc::trackable::add_destroy_notify_callback()/remove_destroy_notify_callback() are made.  However, as I say, I have established empirically that automatic slot disconnection does occur.  According to POSIX (and also Windows for those using that OS), modifying a memory location in two different threads, or modifying it in one and reading it in another, without synchronisation gives rise to undefined behaviour.

Chris

Comment 5 Jonathon Jongsma 2008-04-22 03:07:51 UTC

My use of the word 'theoretical' was not intended to be pejorative.  I was just curious about whether you started this analysis because you noticed a program behaving strangely, or whether you discovered this issue just by examining code.  

I understand most of what you've said here fairly well.  The thing that's not completely clear to me yet is what situations are vulnerable and what situations are not vulnerable.  

I had originally understood the problem to only affect classes who have a bound member used as a thread creation slot and then also have other members bound to other slots.  But your comment #2 above suggests that this may actually be a problem for any thread creation slot that is bound to a member function of a sigc::trackable object regardless of whether additional slots are bound to other member functions or not.  And if I understand your analysis, it seems correct to me, though it seems unlikely to affect anything other than *very* short thread functions.

So it appears that the 'safe' situations are the following:
- using static and standalone functions
- member functions of objects that do not derive from sigc::trackable
- member functions of sigc::trackable objects that are relatively long-running and you never bind another member to a slot

Does this sound correct to you?

Comment 6 Chris Vine 2008-04-22 20:46:31 UTC

Yes, I think that is a fair summary, although the third point hides a little complexity.

For a fair non-summary on your third point, we both accept that if a std::list object can be manipulated (pushed onto, popped from or examined by eg a call to std::list<>::front()) by more than one thread without synchronisation then in due course it will reach an invalid state (empirical evidence suggests that the first time it seriously blows up will be when you are demonstrating your program to a prospective client).  Each std::trackable object contains a std::list object containing destroy notify callbacks.  New destroy notify callbacks appear to be pushed, and old ones popped, whenever a new slot referencing a non-static method of the object deriving from sigc::trackable is created or deleted.

Therefore, we would be OK in the case of a non-static thread method of an object deriving from sigc::trackable if:

(a) when Glib::Thread::create() is called and until the temporary bound to the Glib::Thread::create() function by the call to sigc::mem_fun has been destroyed, no other thread could create a new slot referencing a non-static method (any non-static method, not just the thread function) of that object until memory has re-synchronised, the object containing the std::list object remains in existence until that time, and no other slot previously bound to a non-static method of that object is destroyed in another thread until that time[1], which is an unknown time in the absence of specific synchronisation to deal with the point, but your "relatively long-running" addresses this; and

(b) there is no unsynchronised memory affecting the std::list object (by the creation or deletion of any other slot objects or the going out of existence of the object containing the std::list object) when the slot copied in Glib::Thread::create() is later deleted in a different thread in call_thread_entry_slot().

What this really shows is that because libsigc++ is not thread safe (nor in my opinion should it be), it is not the right tool for this particular job.  It is however the right tool for many other jobs.

[1] Including the slot object deleted in call_thread_entry_slot() in the new thread.

Comment 7 Chris Vine 2008-06-17 10:12:40 UTC

Created attachment 112897 [details] [review]
Documentation patch for Glib::Thread::create()

Here is a documentation patch.

Comment 8 Chris Vine 2008-06-17 10:26:22 UTC

Created attachment 112898 [details] [review]
Corrected patch

Sorry the last patch failed to patch the second Glib::Thread::create() function.

Comment 9 Murray Cumming 2008-06-18 07:36:47 UTC

Committed. Thanks.

Are there other things to do for this bug, or can it be closed?

Comment 10 Chris Vine 2008-06-18 19:49:24 UTC

It would be nice to overload Glib::Thread::create() with a version taking a new callback class as an argument, such as in the example posted in earlier comments in this bug report, or to do so when c++0x comes out by using the std::function callback wrapper, and deprecate the version taking a sigc::slot object.  (This would also provide a route for dealing with bug #396958.)

Whether you want to keep the bug alive for that purpose I am not sure.  If not I should close it.

Comment 11 Jonathon Jongsma 2008-06-18 20:10:08 UTC

I think it would probably be good to actually fix this at some point, though I'm still not sure which way we should do it.  So i guess I'd recommend leaving it open.

Comment 12 Murray Cumming 2008-06-19 05:56:54 UTC

> when c++0x comes out by using the std::function callback wrapper, and deprecate 
> the version taking a sigc::slot object.

So std::function is safer than sigc::slot? Can't we make sigc::slot as safe as std::function?

Comment 13 Chris Vine 2008-06-19 10:10:02 UTC

The sigc::slot object itself is thread safe, but sigc::trackable, with which the slot is registered, is not.

At the sigc level, the best thing would be to add functionality to be able to "switch off" trackability for particular slots.  I remember some time ago looking at whether passing the address of the slot to sigc::trackable::remove_destroy_notify_callback() might do it (I think I found it would not, but I remember also thinking that the sigc code was rather opaque so that might be wrong).  If it does do it, if the trackable object itself could be identified in Glib::Thread::create(), sigc::trackable::remove_destroy_notify_callback() could be called in Glib::Thread::create() on slot_copy immediately after:

  sigc::slot_base *const slot_copy = new sigc::slot<void>(slot);

However I think I also concluded that information about the trackable object had been lost at that stage, but I am afraid it is some time since I looked at this.

In any event, if this is to be dealt with at the sigc level, a better solution would be either (i) to be able to pass an additional argument to sigc::mem_fun() on slot creation which could, if changed from the default, prevent slot registration/deregistration in the first case, (ii) to have a compile time option which includes code which protects operations on the std::list object maintained by each sigc::trackable object with a mutex.

The latter gives rise to dependency issues.  It is the best solution however as it would allow different slots representing methods of the same object to be created in different threads.  (Possibly there ought to be a more general health warning about this in the Glib::Thread documentation, as I suspect there might be a quantity of thread-unsafe code around because of sigc::trackable.)

Chris

Comment 14 Chris Vine 2008-06-19 10:17:09 UTC

Even better, a third solution would be to be able to pass a custom container object to the sigc::trackable constructor as a constructor argument, or to have the container type as a template type (as with std::queue) - it would then be possible to pass in a thread-safe container (I would choose std::set rather than std::list anyway).

That would avoid any dependency issues.

Comment 15 Daniel Elstner 2009-06-22 13:43:18 UTC

Wait a moment -- can't you just *not derive* your class from sigc::trackable and that's it?  There used to be a class_slot()  in older versions of sigc++, but I think since the switch to the boost syntax it is now implicitly untracked if the object does not derive from trackable.

Comment 16 Chris Vine 2009-06-22 14:38:27 UTC

Comment #15:  Of course, and that is what the documentation patch I submitted for Glib::Thread::create() (now in glibmm-2.20) says.

However, though now documented, this is a trap for the unwary - trackability is, or should be, irrelevant to the passing of a slot to start a new thread.  Rather than document the problem, it would be better to prevent it by "switching off" trackability for the slot copied to call_thread_entry_slot().  This cannot be done at present.

Comment 17 Murray Cumming 2011-02-21 11:34:54 UTC

Would someone like to summarize the current state of this bug, to make my life easier?

Comment 18 Chris Vine 2011-02-21 17:10:28 UTC

Summary:

1.  This bug is a subset of the lack of thread safety of libsigc++ in general, and sigc::trackable in particular.  In this instance, starting a thread via Glib::Thread::create() by calling a non-static member function of a class object deriving from sigc::trackable will cause undefined behaviour (and I believe I have come across a case where this has caused program malfunction).

2.  The temporary fix, adopted since glibmm-2.20, is to document that no thread should be started in respect of such a function.

3.  The permanent fix is either (a) to provide a new custom functor object other than sigc::slot to pass to Glib::Thread::create(), (b) to adopt tr1 and use std::tr1::function to pass to Glib::Thread::create() as an optional overload in those cases where std::tr1 is supported, or (c) to wait for glibmm to move to C++11/12 and either use std::function or abandon glib threads entirely and use std::thread.

I suspect that none of the solutions in 3 is likely to seem attractive to you, so I suggest we stick with 2.

A propos, I think glibmm/gtkmm would benefit greatly from some explanation about how to use gtkmm and libsigc++ in as safe a way as is practicable, because that is not obvious (even Glib::Dispatcher has significant hidden gotchas).  I am willing to provide documentation patches with that in mind if there is a prospect of them being merged.  There would be an issue about where to put this: probably as an addendum to the documentation on Glib::Thread and Glib::Dispatcher.

Comment 19 Murray Cumming 2011-02-21 20:09:49 UTC

(In reply to comment #18)
>  I am willing to provide documentation patches with that in mind if
> there is a prospect of them being merged.  There would be an issue about where
> to put this: probably as an addendum to the documentation on Glib::Thread and
> Glib::Dispatcher.

Or how about a chapter or appendix in the gtkmm book, with a link to it from the API documentation wherever it's relevant? You are the best judge of what's most appropriate.

Comment 20 Murray Cumming 2011-03-15 10:07:18 UTC

Chris?

Comment 21 Chris Vine 2011-03-15 19:55:54 UTC

I'll do something this week.

A new section 27.3 (recommended techniques) seems the best spot.  I don't think it deserves a chapter to itself.

Where is the source for the tutorial available?  (It doesn't seem to be in the tarball.)

Comment 22 Murray Cumming 2011-03-15 21:32:10 UTC

(In reply to comment #21)
> Where is the source for the tutorial available?  (It doesn't seem to be in the
> tarball.)

It's in the gtkmm-documentation module, in git, and released as a tarball.

Comment 23 Chris Vine 2011-03-18 18:08:49 UTC

Created attachment 183736 [details] [review]
Documentation patch

As I had no comments following my request for comments in the gtkmm mailing list, here is a patch to the gtkmm tutorial.

This patch is against gtkmm-documentation-2.23.0.  In the end, because of its length I have incorporated it as a separate chapter, but if you want to put it somewhere else that's fine.

It is an adjusted (slightly expanded) version of the one I posted to the mailing list.

It might be worth making a link to this from the documentation for Glib::Thread::create().  Is it OK for the glibmm documentation to link to other things in this way, and if so do you have any examples of how to do so?  (If it is too complex it is probably not worthwhile: a textual cross-reference could be made instead.)

Comment 24 Chris Vine 2011-03-18 18:19:05 UTC

Created attachment 183741 [details] [review]
ChangeLog diff

Here is a separate patch for the ChangeLog.  You will probably want to redate it with whatever date you apply the main patch and merge it into git.

Comment 25 Chris Vine 2011-03-29 18:21:44 UTC

Created attachment 184615 [details] [review]
Supplementary documentation patch

This is a supplementary patch (it applies over the one of two weeks ago), the purpose of which is to make point 1 under "The rules" a little more explicit and helpful.

Comment 26 Murray Cumming 2011-09-13 13:55:01 UTC

Review of attachment 183736 [details] [review]:

I have not finished reviewing this, but here is a start.

::: gtkmm-documentation-2.23.0/docs/tutorial/C/gtkmm-tutorial-in.xml.orig
@@ +7378,3 @@
+provides the normal set of thread launching functions, mutexes,
+condition variables and scoped locking classes required for writing
+multi-threaded programs using C++. Those using recent versions of

We generally try not to talk about API by first mentioning that it wraps glib API. Ideally, people don't need to think about glib. And anyway it's a general issue for everything, not just to be mentioned for this part of the API.

@@ +7381,3 @@
+g++'s C++0x implementation can also make use of the thread facilities
+to be provided by the forthcoming C++11/12 standard in combination
+with <application>glibmm</application> and &gtkmm;.

This should mention specific API, ideally with a link.

Also, are you saying that people could/should use it instead, or in combination?

@@ +7388,3 @@
+multiple threads of execution, arising from the fact that
+<application>libsigc++</application> is not thread-safe.  Amongst
+other things, a class inheriting from

Should the libsigc++ reference documentation have a comment about this?

@@ +7405,3 @@
+connection.  None of these complex interactions is protected by a
+mutex or other means of synchronization.
+</para>

Yikes. That's a long bug report that's incomprehensible to the casual reader. I'd skip it and just link to a libsigc++ bug report or documentation.

@@ +7421,3 @@
+Although code written for the X11 backend of GTK+ can use the GDK
+global lock, as invoked by <function>gdk_threads_enter()</function>
+and <function>gdk_threads_leave()</function>, so as to address GTK+ in

What is X11-specific about gdk_threads_enter/leave()? Their documentation doesn't mention any platform specifics.

@@ +7423,3 @@
+and <function>gdk_threads_leave()</function>, so as to address GTK+ in
+more than one thread, because of <application>libsigc++</application>
+this will not usually work with &gtkmm;.  Instead, if a worker thread

I would clarify "work".

@@ +7424,3 @@
+more than one thread, because of <application>libsigc++</application>
+this will not usually work with &gtkmm;.  Instead, if a worker thread
+needs to invoke a &gtkmm; function, it should do so via

I would first mention that we discourage calling GTK+ from multiple threads anyway, though gdk_threads_enter/leave() is a possible solution when you absolutely must. Only then would I mention that there's a problem with that in gtkmm.

@@ +7440,3 @@
+<classname>sigc::signal</classname> object should be regarded as owned
+by the thread which created it.  Only that thread should connect slots
+with respect to the signal object, and only that thread should

Can "with respect" be removed here without loss of meaning?

@@ +7444,3 @@
+<methodname>operator()()</methodname> on it.  It follows (amongst
+other things) that any signal object provided by a &gtkmm; widget
+should only be operated on in the main GUI thread.

Replace "operated on" with "connected to"?

Comment 27 Chris Vine 2011-09-14 09:06:21 UTC

Created attachment 196472 [details] [review]
Revised documentation patch

"We generally try not to talk about API by first mentioning that it wraps glib API. Ideally, people don't need to think about glib. And anyway it's a general issue for everything, not just to be mentioned for this part of the API."

Done.

"[Comments about C++11 threads] should mention specific API, ideally with a link.  Also, are you saying that people could/should use it instead, or in combination?"

There is no linkable documentation that I know of about C++11 threads.  They can be used in combination or instead of glibmm threads (and because Glib::Thread::create() is not thread safe on non-static methods of classes deriving from trackable possibly should be).  However I can see that this could be seen as off-topic and if there must be documentation on C++11 threads if they are to be referred to, the best thing is just to omit the text.

"Should the libsigc++ reference documentation have a comment about [its lack of thread safety]?"]

I think so, but I think it is best to do one thing at a time.  Once this text is agreed it can be carried across.

"Yikes. That's a long bug report that's incomprehensible to the casual reader.  I'd skip it and just link to a libsigc++ bug report or documentation."

It's more of an architectural report than a bug report: it explains why the basic design of libsigc++ is inimicable to thread safety.  That is why boost didn't try to make its first, quite similar, signal library thread safe, but instead adopted a completely different approach in its signal2 module.  A libsigc++ bug report saying "please rewrite this library" is probably not going to be helpful or well received.  (As an aside, if the casual reader cannot understand the text in question then I doubt they should be programming with threads using C++ because they are the things anyone programming with shared resources must think about.)

Instead I have shortened it in the attached redraft.  The more advanced information could go in a footnote, but I won't do that unless it is likely to be accepted (and footnotes are possible using docbook, I have no idea).

"What is X11-specific about gdk_threads_enter/leave()? Their documentation doesn't mention any platform specifics."

The GDK global lock is specific to the X11 implementation - see http://developer.gnome.org/gdk/stable/gdk-Threads.html#gdk-Threads.description , fourth paragraph ("Unfortunately ...").

"I would clarify 'work'"

Done.  I have had to refer to "relevant signal object" and "relevant trackable object" without explaining what "relevant" means, which would take a significant amount of text - in effect a relevant signal object is any gtkmm widget's signal proxy whose methods might be invoked by a worker thread or which holds a slot of a relevant tracked object, and a relevant trackable object is any object derived from sigc::trackable which has any of its non-static methods referenced by a slot held by any signal object.  That risks bringing back in all the gruesome detail that you would rather see omitted.  If we are going to have footnotes, that could perhaps go in a footnote.

"I would first mention that we discourage calling GTK+ from multiple threads anyway, though gdk_threads_enter/leave() is a possible solution when you absolutely must. Only then would I mention that there's a problem with that in gtkmm."

Done.

"Can 'with respect' be removed here without loss of meaning?"

Done.

"Replace "operated on" with "connected to"?"

"Operated on" is correct here - the issue is wider than just connecting to it.  I have also added to this paragraph to explain what is meant by "nulling" a slot, as this originally referred back to the lengthy explanation (now omitted) of libsigc++ thread interactions.

Comment 28 Murray Cumming 2011-10-21 10:40:58 UTC

Review of attachment 196472 [details] [review]:

Here is a partial review of the first part. Sorry for nitpicking, but I think it's really worthwhile to make this very clear and, if possible, provide some clear rules for people to follow, without requiring them to understand all the details.

::: gtkmm-documentation-3.0.3/docs/tutorial/C/gtkmm-tutorial-in.xml.orig
@@ -7132,0 +7137,17 @@
+<chapter id="chapter-multi-threaded-programs">
+<title>Multi-threaded programs</title>
+
... 14 more ...

I would change "None of" to "That's because none of"

is not thread safe -> are not thread safe.

@@ -7132,0 +7137,19 @@
+<chapter id="chapter-multi-threaded-programs">
+<title>Multi-threaded programs</title>
+
... 16 more ...

is protected -> are protected

@@ +7171,3 @@
+Use <classname>Glib::Dispatcher</classname> to invoke gtkmm functions
+from worker threads (this is dealt with in more detail in the next
+section).  Although code written for the X11 backend of GTK+ can use

This sentence is too long and therefore confusing. Also, it might be best as a <note>.

I think that any mention of trackable or slot needs to make clearer that it's a libsigc++ thing. Using sigc:: would probably be the least-awkward way to do that.

@@ +7178,3 @@
+usually be sufficient to achieve thread safety unless every operation
+which might affect a relevant signal object, or might affect any
+relevant trackable object's lifetime or slots representing any

"relevant" is a bit odd here. It might be fine without it.

@@ +7180,3 @@
+relevant trackable object's lifetime or slots representing any
+relevant trackable object's non-static methods, is both within the
+global lock and avoids any out-of-order locking with respect to any

"sequence" is less ambiguous than "order"

@@ +7181,3 @@
+relevant trackable object's non-static methods, is both within the
+global lock and avoids any out-of-order locking with respect to any
+other locks employed in the program.  (As it happens, using

I'd remove the "As it happens", and I'd take this out of brackets.

@@ +7190,3 @@
+<listitem>
+<para>
+Unless special additional synchronization is employed, any particular

We can probably remove "Unless ... is employed" with no great loss of meaning. And "any particular" can be replaced with "A"

@@ +7197,3 @@
+<methodname>operator()()</methodname> on it, or null any of the
+connected slots.  It follows (amongst other things) that without
+additional synchronization any signal object provided by a &gtkmm;

I'd remove the "without additional synchronization" here too.

@@ +7198,3 @@
+connected slots.  It follows (amongst other things) that without
+additional synchronization any signal object provided by a &gtkmm;
+widget should only be operated on in the main GUI thread and any

I'd restructure this, to put "Any Gtk::Widget signal should only be used in the main GUI thread. And any sigc::trackable used by a Gtk::Widget signal should..." at the start of the paragraph, moving the explanation after that.

@@ +7207,3 @@
+<listitem>
+<para>
+Unless special additional synchronization is employed, any

Again, I'd remove the "unless .. is employed".

@@ +7209,3 @@
+Unless special additional synchronization is employed, any
+<classname>sigc::connection</classname> object should be regarded as
+owned by the thread which created the slot and called the method which

"called the method" here is not gramatically correct, but I don't understand what it should be instead.

@@ +7218,3 @@
+<listitem>
+<para>
+A slot which references a non-static method of a class deriving from

Is this "non-static" distinction really useful, even if correct?

@@ +7221,3 @@
+<classname>sigc::trackable</classname> should never be copied to
+another thread, nor destroyed by a different thread than the one which
+created it (one consequence of this is that

I would break the sentence instead of putting part of it in brackets.

@@ +7230,3 @@
+<listitem>
+<para>
+If a particular class object derives from

I'd make that sentence shorter, focusing on the use of mem_fun().

@@ +7243,3 @@
+<listitem>
+<para>
+Although <application>glib</application> is itself thread-safe

thread_init() is now deprecated and unnecessary, in glib 3.1/3.2.. Let's mention it only in a <note> if at all.

@@ +7247,3 @@
+any <application>glibmm</application> wrappers which use
+<application>libsigc++</application> will not be.  So for example, the
+only thread which should call

I'd move the the list to the end of the sentence instead of putting a list in the middle of a sentence.

@@ +7257,3 @@
+and manipulate any <classname>sigc::connection</classname> object
+returned by them, is the thread in which the main loop to which the
+connection is made runs.  (The connect*_once() variants could be made

I would remove this connect_*_once() issue and just leave a <!-- TODO --> comment pointing to a separate bug report about it.

@@ +7267,3 @@
+</orderedlist>
+
+<para>

This overall message belongs near the top.

Comment 29 Chris Vine 2011-10-25 12:47:30 UTC

Created attachment 199924 [details] [review]
Second revise of documentation patch

"I would change "None of" to "That's because none of""

Done

"is not thread safe -> are not thread safe."

Done.

"is protected -> are protected"

Done

"This sentence is too long and therefore confusing. Also, it might be best as a
<note>."

I have made this a footnote rather than a note, and shortened it a little but I think if any more is removed it is impossible to make the point.  Your previous comments invited me to explain what was meant by my previous "will not usually work" and I can think of no way of explaining it in any meaningful way which is shorter.  The alternative is to go back to my former shorter but unexplanatory version.  I think we should stay with it as it is now: I think your previous suggestion was right.

Since there is now a footnote I have also included a footnote for the second paragraph incorporating the text of the first draft.  Those who want to know more can read the footnote (it is actually the core of the problem and does help show to the knowledgeable that we know what we are talking about); those that are content just to read the advice in the tutorial needn't look at the footnote.

"I think that any mention of trackable or slot needs to make clearer that it's a
libsigc++ thing. Using sigc:: would probably be the least-awkward way to do
that."

Done, unless "the/a slot" or "slots" refer back to a reference to "sigc::slot" in the immediately preceding text.  I am not completely convinced that it improves clarity but four eyes are better than two.

""relevant" is a bit odd here. It might be fine without it."

Done.  A competent programmer will implicitly insert "relevant".  A less competent one perhaps should have a more absolute rule.

""sequence" is less ambiguous than "order""

Done.

"I'd remove the "As it happens", and I'd take this out of brackets."

I have taken all the words in brackets out as part of the shortening exercise.

"We can probably remove "Unless ... is employed" with no great loss of meaning.
And "any particular" can be replaced with "A""

Done.  Given that we are deprecating the use of the GDK global lock, and the global lock is the only way to apply external synchronisation to the signal of a gtkmm widget, I think we can lose it.

"I'd remove the "without additional synchronization" here too."

As above.

"I'd restructure this, to put "Any Gtk::Widget signal should only be used in the
main GUI thread. And any sigc::trackable used by a Gtk::Widget signal
should..." at the start of the paragraph, moving the explanation after that."

I don't think "Gtk::Widget signal" is synonymous with "any signal object provided by a gtkmm widget", because the latter (my version) covers the additional signals of derived objects.  Aside from that, I think your version really does obscure the meaning: what does "any ... signal should only be used" mean?  It will only be "used" in the main GUI thead in one sense, because that is the one which emits.  It is all the other things that can be done with a signal object which is the problem.  I do think "operated on" is a better description and ties in with the second sentence of this paragraph.

"Again, I'd remove the "unless .. is employed"."

As above.

"called the method" here is not gramatically correct, but I don't understand
what it should be instead.

I have reworded this.

"Is this "non-static" distinction really useful, even if correct?"

It is correct and I think it is useful.  Trackability does not apply to static methods, because there is no object associated with them.  I would not want to discourage starting threads with static methods: that may often be the best solution given sigc::trackable's lack of thread safety.  That is, start the thread with a static method and if you need to pass in the object concerned to the thread function do so by binding in a pointer to it.

"I would break the sentence instead of putting part of it in brackets."

Done.

"I'd make that sentence shorter, focusing on the use of mem_fun()."

I think doing it that way risks the user failing to understand the point at issue.  If it needs shortening, I would prefer to omit the reference to sigc::mem_fun().

"thread_init() is now deprecated and unnecessary, in glib 3.1/3.2.. Let's
mention it only in a <note> if at all."

I have omitted it.

"I'd move the the list [of connection methods] to the end of the sentence instead of putting a list in
the middle of a sentence."

Done

"I would remove this connect_*_once() issue and just leave a <!-- TODO -->
comment pointing to a separate bug report about it."

Done.

"This overall message belongs near the top."

I think I preferred the original, but not by much, so done.

Comment 30 Chris Vine 2011-10-25 12:53:06 UTC

Created attachment 199926 [details] [review]
Third revise of documentation

Sorry, I attached an old version for revision 2 which did not move the "overall message" to the start of "The Rules" section.  (I was mulling it over in my mind and then failed to regenerate the patch.)  Apart from that it is identical.

Please look at this revision (revision 3) and not revision 2.

Comment 31 Murray Cumming 2011-11-18 10:09:17 UTC

Review of attachment 199926 [details] [review]:

It's looking pretty good.

::: gtkmm-documentation-3.0.3/docs/tutorial/C/gtkmm-tutorial-in.xml.orig
@@ +7150,3 @@
+However, care is required when writing programs based on &gtkmm; using
+multiple threads of execution, arising from the fact that
+<application>libsigc++</application>, and in particular

Doesn't GTK+ itself strongly discourage use of GTK+'s API from from non-GUI threads? Not mentioning that makes it looks like it's all our fault.

Also, is there some bug or discussion somewhere about why libsigc++ can't be made to be thread-safe, so we can link to that?

@@ +7162,3 @@
+keeping track of slots representing any of its non-static methods
+(more particularly it keeps a list of callbacks which will null the
+connected slots on its destruction).  Each

I've just noticed that you use double spaces after the period. That's unnecessary. Please fix them.

@@ +7284,3 @@
+case where the slot does not relate to a non-static method of a class
+deriving from <classname>sigc::trackable</classname>, but that has not
+yet been implemented.  Update this when that has happened. -->

That should have a link to a bug report, so that it could happen, and so that you have a way to know later if it has happened, when reading that comment later.

Comment 32 Chris Vine 2011-11-22 10:35:06 UTC

Created attachment 201915 [details] [review]
Fourth revise of documentation

"Doesn't GTK+ itself strongly discourage use of GTK+'s API from from non-GUI threads? Not mentioning that makes it looks like it's all our fault.  Also, is there some bug or discussion somewhere about why libsigc++ can't be made to be thread-safe, so we can link to that?"

I have not seen use of the GDK global lock actively discouraged with GTK+, and http://developer.gnome.org/gdk/stable/gdk-Threads.html does not do so, nor is any special mention given to using g_idle_add() and cognates as the alternative.  Employing the global lock can be useful under X11 with, for example, some graphics heavy uses.  Some high performance users definitely invoke the global lock and there are a few cases where (if I happened not to be using gtkmm) I would do so also.

The problem with thread safety is not gtkmm's "fault" but it does arise from libsigc++'s lack of thread safety which in turn springs from libsigc++ automating a lot of stuff behind the user's back.  Design simplicity and minimisation of side effects are important for writing thread-safe code, and libsigc++ has, for its own reasons (mainly because it was conceived at a time when use of threading in desktop applications was not widespread) gone in the opposite direction.  It has in my opinion become the single biggest blemish with gtkmm.

libsigc++ could be made thread safe but it would involve work throughout that library.  The trackability model adopted by libsigc++ makes it difficult and, because of the locking that would be required even within slot objects, somewhat less than ideally efficient: but it is certainly not impossible to make it adequately thread-safe while maintaining the API.  There was some discussion of this on the message thread beginning with http://mail.gnome.org/archives/gtkmm-list/2010-July/msg00087.html but I doubt the value in linking to it here.  The conclusion I reached, reported at http://mail.gnome.org/archives/gtkmm-list/2010-July/msg00119.html in that message thread, was that because of its design the whole libsigc++ library would need attention rather than just sigc::trackable, which I think was also the conclusion of Paul Davis and Krzysztof Kosiński.

"I've just noticed that you use double spaces after the period. That's unnecessary. Please fix them."

Done.

"That should have a link to a bug report, so that it could happen, and so that you have a way to know later if it has happened, when reading that comment later."

Done.

Comment 33 Murray Cumming 2012-03-13 09:00:17 UTC

(In reply to comment #32)
> Created an attachment (id=201915) [details] [review]
> Fourth revise of documentation
> 
> "Doesn't GTK+ itself strongly discourage use of GTK+'s API from from non-GUI
> threads? Not mentioning that makes it looks like it's all our fault.  Also, is
> there some bug or discussion somewhere about why libsigc++ can't be made to be
> thread-safe, so we can link to that?"
> 
> I have not seen use of the GDK global lock actively discouraged with GTK+, and
> http://developer.gnome.org/gdk/stable/gdk-Threads.html does not do so, nor is
> any special mention given to using g_idle_add() and cognates as the
> alternative.  Employing the global lock can be useful under X11 with, for
> example, some graphics heavy uses.  Some high performance users definitely
> invoke the global lock and there are a few cases where (if I happened not to be
> using gtkmm) I would do so also.

Doesn't this confirm this?
http://mail.gnome.org/archives/gtk-devel-list/2012-March/msg00005.html

Comment 34 Chris Vine 2012-03-13 19:39:05 UTC

Comment #33: Presumably so, when gtk+-3.4 is released (if deprecation is to arise then).

Where do you want to go with all this?  The question of the gdk global lock is somewhat tangential to the thread-safety issues dealt with in the tutorial text, which is concerned principally with libsigc++'s thread safety.  The text with respect to which you raise this point - the second paragraph - has no relevance to the gdk global lock.  (The deprecation of the gdk global lock might call for the omission of footnote 2, but that is not the text in relation to which you raised the point: by all means cut it if you wish.)

Comment 35 Murray Cumming 2012-07-16 08:54:15 UTC

(In reply to comment #34)
> Comment #33: Presumably so, when gtk+-3.4 is released (if deprecation is to
> arise then).

gdk_threads_enter() isn't deprecated yet:
http://developer.gnome.org/gdk/unstable/gdk-Threads.html#gdk-threads-enter
though it still seems to be planned.


> Where do you want to go with all this?  The question of the gdk global lock is
> somewhat tangential to the thread-safety issues dealt with in the tutorial
> text, which is concerned principally with libsigc++'s thread safety.  The text
> with respect to which you raise this point - the second paragraph - has no
> relevance to the gdk global lock.  (The deprecation of the gdk global lock
> might call for the omission of footnote 2, but that is not the text in relation
> to which you raised the point: by all means cut it if you wish.)

So maybe you want to omit mention of that API, at least for now.

Comment 36 Kjell Ahlstedt 2013-02-14 15:46:44 UTC

What a pity if this discussion of a new "Multi-threaded programs" chapter dies
here, when Chris has made such an effort!

I can push this new chapter to the gtkmm tutorial, if no one objects. But some
details have changed since the discussion started.

1. gdk_threads_enter() and gdk_threads_leave() have been deprecated in
   gtk+ 3.6. See code http://git.gnome.org/browse/gtk+/tree/gdk/gdk.c.
   (They are not yet marked deprecated at
   http://developer.gnome.org/gdk/unstable/gdk-Threads.html#gdk-threads-enter
   which is a bit odd.)

   These deprecated functions should not be mentioned in the tutorial, i.e.
   remove footnote 2 in the "The rules" section, as suggested in comment 35.

2. Change Glib::Thread to Glib::Threads::Thread.

3. "The connect*_once() variants could be made thread-safe for any case where
   the slot does not relate to a non-static method of a class deriving from
   sigc::trackable, but that has not yet been implemented: Bug 396958."

   I suppose the reference should have been to bug 396963. That bug has been
   fixed. The only remaining thread-unsafety in the connect*_once() methods
   relates to non-static methods of a class deriving from sigc::trackable.

4. "One other point to note is that the user code must ensure that any
   Glib::Dispatcher object remains in existence from the time it is emitted on
   by a worker thread to the time that the receiver thread completes executing
   its connected slots for that emission."

   This sentence can be removed, because bug 651942 has been fixed.

I can make these changes, if Chris does not want to work with this bug any
more.

Comment 37 Kjell Ahlstedt 2013-02-14 15:49:40 UTC

The original problem in this bug, the thread-unsafety of Glib::Thread::
create(), can most easily be fixed by adding

  static Thread* create(const std::function<void()>& func,
    const std::string& name = std::string());

to Glib::Threads::Thread, as suggested in comment 10 and comment 18.

It requires a C++11 capable compiler. If must be protected by some kind of
#ifdef. This was discussed on the gtkmm-list in connection with move
constructors, starting at
https://mail.gnome.org/archives/gtkmm-list/2012-November/msg00072.html

We need to decide how to handle new code in glibmm and gtkmm that uses C++11
features. I can make a patch for the Thread::create() case, so we have
something to comment on.

Comment 38 Kjell Ahlstedt 2013-02-14 17:32:44 UTC

(In reply to comment #36)
> 1. gdk_threads_enter() and gdk_threads_leave() have been deprecated in
>    gtk+ 3.6. See code http://git.gnome.org/browse/gtk+/tree/gdk/gdk.c.
>    (They are not yet marked deprecated at
>    http://developer.gnome.org/gdk/unstable/gdk-Threads.html#gdk-threads-enter
>    which is a bit odd.)

Oops, I didn't notice that it's a link to gdk2. Here's a better link:
http://developer.gnome.org/gdk3/stable/gdk3-Threads.html#gdk-threads-enter

Comment 39 Chris Vine 2013-02-14 19:46:30 UTC

Comment #36 and #38: I agree with all your changes.  If you have commit access I should make them (I had rather assumed this was destined for the great bit bucket in the sky and that I was wasting my time), and take joint credit.  To be honest you take more interest in glibmm than I do nowadays.

Comment #37: Using std::function requires either C++03-TR1, or C++11.  I suggest having the functor argument as a templatized type which will work (after a fashion) with any version of C++, as an overload of the existing create functions.  The nice thing in C++11 is that you can use collapsible references to provide perfect forwarding.  In C++98/03 you can at least take it by reference to const of the template type, and it would take any callable object.  I cannot see that this could be less efficient than taking a sigc::slot argument.  It would probably be necessary to keep the existing thread create functions as sigc::slot specializations.  The overload resolution rules (which prefer matching non-template functions to a templated version) will ensure everything works correctly.

However this point also applies elsewhere.  I have done some work on task composition using thread pools with asynchronous result handling of late, and I noticed that glibmm's thread pool suffers from the same problem as do the thread creation functions.  (So I ended up not using glibmm's thread pool.)

Comment 40 Kjell Ahlstedt 2013-02-15 10:24:49 UTC

(In reply to comment #39)
> Comment #36 and #38: I agree with all your changes.  If you have commit
> access I should make them.

Yes, I have commit access. I can push it to the git repository, when you have
made the necessary changes.

From my comment 36:
"The only remaining thread-unsafety in the connect*_once() methods
relates to non-static methods of a class deriving from sigc::trackable."

This is slightly misleading. It's thread-unsafe if you use sigc::mem_fun(),
whether the method is static or non-static. If it's static, you can use
sigc::ptr_fun() instead. Then it will be thread-safe, because no object is
involved.

> Comment #37: Using std::function requires either C++03-TR1, or C++11.  I
> suggest having the functor argument as a templatized type which will work
> (after a fashion) with any version of C++, as an overload of the existing
> create functions.  ......

Can you show with some code what you mean? Either a patch, or a sketch where
I can fill in the details.

One possibility that does not require much work would be to take a copy of all
libsigc++ code that relates to sigc::slot, call it something else, and remove
everything that has to do with sigc::trackable. I just don't like the idea of
adding new code that mimics something that C++11 provides (in this case
std::function).

Comment 41 Chris Vine 2013-02-15 22:00:50 UTC

On comment #37 I suggest the functor is passed in a similar way to how the standard algorithms take their operators or predicates.  Similar, but not the same, because the standard algorithms keep their copy of the functor as an argument and so take it by value (if they took it by reference to const they could only execute operator()() const).  glibmm does its own copying (on the heap) of the slot argument and passes the heap allocated slot as the data argument to g_thread_new().  This means the functor can be taken by reference to const.  In C++11 it could be taken by collapsing reference, as does the equivalent std::thread constructor.

(I use the expression "functor" in lieu of "function object" to avoid confusion with std::function, a particular type of function object.  As our thread functor has void return type technically it is not a functor in the strict sense).

So I would envisage an overload for Threads::Thread::create() as follows:

 template <class F>
 static Thread* Glib::Threads::Thread::create(const F& func);

documenting that F is a callable object which takes no argument (which could of course be a std::function<void()> object, but could also be any function object not taking an argument including, in C++11, the result of std::bind or a lambda expression).

In C++11 this could comprise (but doesn't have to):

 template <class F>
 static Thread* Glib::Threads::Thread::create(F&& func);

which will take both lvalues and rvalues and if an rvalue carry out a move operation if a move operator is available, assuming appropriate use of std::forward.  (The C++11 standard states that the functor object created by a lambda expression must have a copy constructor and may have a move constructor: I think it probably says the same about the return value of std::bind although I have not looked that up.)

The overload for sigc::slot must still be retained, even though the function template could take it, as otherwise ABI compatibility would be broken - the template function is not compiled into the library, so a sigc::slot version is still required which is compiled in.

Taking a template argument is more efficient than using std::function.  std::function requires an additional allocation and an additional virtual dispatch.  That is why the equivalent C++11 library functions always take their callable objects as a template parameter not as a std::function object, and that is very much the recommended approach.  std::function is really for cases where you cannot for some reason or other use static polymorphism via templates.

On comment #36 I would much prefer it if you were to make the changes concerned.  We both agree what they are.

Comment 42 Chris Vine 2013-02-15 22:20:31 UTC

Actually I now see the fundamental problem with my suggestion in comment #41: you need to do something with the object pointed to by the data argument of g_thread_new(), and you cannot if you do not know what it is.  So use std::function, as you suggest.

Comment 43 Chris Vine 2013-02-16 17:16:42 UTC

I suppose I ought to add for completeness that, as I am sure you know, you can get around the g_thread_new() problem using type erasure, which is almost certainly what std::thread does if handing off to pthread_create(), and what std::function does internally, viz:

struct Eraser {
  virtual void exec() = 0; // not const so we can execute mutable lambdas
};

template <class F>
class Wrapper: public Eraser {
public:
  void exec() {func();}
  template <class Arg> Wrapper(Arg&& arg): func(std::forward<Arg>(arg)) {}
private:
  F func;
};

template <class F>
Eraser* make_eraser(F&& func) {return new Wrapper<F>{std::forward<F>(func)};}

Glib::Threads::Thread::create() can then call make_eraser() and pass the return value to g_thread_new()'s data argument.  g_thread_new()'s thread function can then call static_cast<Eraser*>(data)->exec().

This will save one allocation and one level of indirection compared with using std::function to do the type erasure for you, and is probably not worth the effort.

E&OE on the example code: it is based on some of my own classes but not copy-and-pasted, so may contain typos.

Comment 44 Kjell Ahlstedt 2013-02-17 13:08:02 UTC

I have pushed your new chapter to the gtkmm tutorial with the changes we
agreed on.
http://git.gnome.org/browse/gtkmm-documentation/commit/?id=3ba65250fb2313ba2662b55f0e23b136ee73c0ff

The second sentence in comment 39 (If you have commit access I should make
them) made me believe that you wanted to do the changes yourself. Perhaps I
misunderstood.

When I add 

  static Thread* create(const std::function<void()>& func,
    const std::string& name = std::string());

normal calls to Glib::Threads::Thread::create() become ambiguous. This method
must probably have a different name.

Comment 45 Chris Vine 2013-02-17 18:41:21 UTC

Ah, the ambiguities of modal auxiliaries used to construct subjunctive expressions in English.  My apologies.

On the second ambiguity (overload resolution), in glibmm-2.34.1, Glib::Threads::Thread::create() has no second argument.  I guess you are using the development version in git.

However, I do not think that affects your issue.  I cannot see how an ambiguity arises if you actually pass a std::function object since there is then a perfect match, but perhaps you are passing a lambda expression or the result of std::bind expecting implicit conversion, in which case there would be an ambiguity if sigc::slot has a conversion operator for these also.  The latest libsigc++ documentation says that it "supports arbitrary functor types", and inspecting the source shows that sigc::slot does indeed now have an overload for its constructor which is templated and takes any callable object (which is not dissimilar to the approach I mentioned above for callable objects).  Hence the ambiguity.

In which case a change of name may be inevitable, since it would be desirable to permit implicit conversion from a lambda or std::bind-produced function object to the std::function argument.  What a bummer.  However, the template and type erasure approach may provide the answer: on overload resolution, I think a template match takes precedence over a template match with implicit conversion.  If that is right, then the constructor taking any arbitrary function object would be preferred except where a sigc::slot object is passed (for a sigc::slot object the non-template match is preferred over a template match), so that would appear to solve the problem without having to rename the function.

By the way, if you do do it that way and so use type erasure, my Eraser class should have a virtual destructor: my example omitted that by mistake.  It would also be trivial to convert the type eraser to using reference to const for argument passing instead of collapsible references in order to cover C++98/03, albeit at the expense of losing the possibility of move construction of rvalues in C++11 to avoid the copy operation.  If you would like to test this approach and need help in implementing this, let me know.

Comment 46 Chris Vine 2013-02-17 18:57:58 UTC

A further point arises here.  Since sigc::slot now takes any function object for its argument, it means presumably that in C++11 any slot can be constructed safely for a non-static member function of an object deriving from sigc::trackable by using std::bind instead of sigc::mem_fun, so as to eliminate the tracking.

It may be that that is what users should be recommended to do.  This means changing the new text in the gtkmm tutorial so that:

* instead of referring to slots representing non-static member functions of classes deriving from sigc::trackable, it refers to slots constructed with sigc::mem_fun representing non-static member functions of classes deriving from sigc::trackable.

* it points out that the problems with the thread safety of sigc::trackable can be avoided in C++11 by constructing the slot by using std::bind or a lambda expression.

This probably means it is not necessary to add further overloads for Glib::Threads::Thread::create().  But this would need to be tested first.

Comment 47 Chris Vine 2013-02-17 19:49:33 UTC

Created attachment 236487 [details]
Test case for std::bind and sigc::trackable

Here is a test case which shows that std::bind appears to do all that is necessary to work around sigc::trackable.  In this test case, the third printed call to Test::non_static_func() is technically undefined behaviour, because the Test object has been destroyed at that point, but that doesn't in practice matter because the method does not operate on any data members of Test.  It demonstrates that trackability is ignored where std::bind is employed.

Comment 48 Chris Vine 2013-02-17 19:55:37 UTC

Created attachment 236488 [details]
Test case for std::bind and Glib::Threads::Thread::create()

And here for completeness is a test case for std::bind used with Glib::Threads::Thread::create().

On comment #40, I was surprised when you mentioned that slots for static member functions could be created with sigc::mem_fun, and when I tried that in the previous test case it won't compile (so it is commented out).  Are you sure you are right about that?  I think your added paragraph on this should perhaps be omitted.

If you would like me to amend the tutorial text now in git to cover std::bind, let me know.  If so, can you give me the tutorial's git repository address for cloning?

Comment 49 Kjell Ahlstedt 2013-02-18 14:10:44 UTC

Created attachment 236589 [details]
Modified test cases for std::bind and lambda expressions.

(In reply to comment #45)
> Glib::Threads::Thread::create() has no second argument. I guess you are using
> the development version in git.

Yes, I'm using the development version in git. It has a second create() method
with two arguments. (Bug 689863)

> I cannot see how an ambiguity
> arises if you actually pass a std::function object

I did not pass a std::function object. I compiled the example program
http://git.gnome.org/browse/glibmm/tree/examples/hread/thread.cc,
with the calls

Glib::Threads::Thread *const producer = Glib::Threads::Thread::create(
      sigc::mem_fun(queue, &MessageQueue::producer));
Glib::Threads::Thread *const consumer = Glib::Threads::Thread::create(
      sigc::mem_fun(queue, &MessageQueue::consumer));

We must not make changes that break such programs.

(In reply to comment #46)
> Since sigc::slot now takes any function object for
> its argument, it means presumably that in C++11 any slot can be constructed
> safely for a non-static member function of an object deriving from
> sigc::trackable by using std::bind instead of sigc::mem_fun, so as to
> eliminate the tracking.
>
> It may be that that is what users should be recommended to do.

Yes, I think this is a good idea. Even if your type erasure in comment 43
should work, it's too much job for this problem. Next time ABI can be broken in
glibmm, all supported compilers probably understand std::function, and then the
present create() methods can be replaced with 

  static Thread* create(const std::function<void()>& func,
    const std::string& name = std::string());

However I've experimented with your sigc.cpp program, and I noticed a problem
with std::bind() when combined with the
SIGC_FUNCTORS_DEDUCE_RESULT_TYPE_WITH_DECLTYPE macro, which is necessary if the
program connects std::bind() expressions or C++11 lambda expressions to a
sigc::slot that does not return void. The program did not compile. I don't know
why. I've seen similar problems in the test program
http://git.gnome.org/browse/libsigc++2/tree/tests/test_cpp11_lambda.cc.
Some calls with std::bind() are commented out there, because they did not
compile. C++11 lambda expressions can be used in such cases. Both possibilities
(std::bind() and lambda expressions) should be mentioned in the gtkmm tutorial
and in the documentation of Glib::Threads::Thread::create().

I've attached modified versions of your sigc.cpp and thread.cpp files.

(In reply to comment #48)
> I was surprised when you mentioned that slots for static member
> functions could be created with sigc::mem_fun.

I was wrong. I had not tested it. I assumed that because static and non-static
methods can be called as test->any_method(), sigc::mem_fun() would not notice
any difference between a static and a non-static method. But that's wrong.
Even if the syntax can be the same, the calls are not identical.

> I think your added paragraph on this should perhaps be omitted.

Yes, it shall be omitted.

> If you would like me to amend the tutorial text now in git to cover
> std::bind, let me know.  If so, can you give me the tutorial's git repository
> address for cloning?

git clone git://git.gnome.org/gtkmm-documentation

If you have a git account:
git clone ssh://[login@]git.gnome.org/git/gtkmm-documentation

Comment 50 Chris Vine 2013-02-18 19:58:31 UTC

Created attachment 236660 [details] [review]
Tutorial patch for std::bind/boost::bind

Here is a patch to cover the use of std::bind() and boost::bind() to safely call Glib::Threads::Thread::create().

I notice that the example file you refer to (thread.cc) displays the very incorrect code that this documentation warns about.  It starts threads for the MessageQueue::producer() and MessageQueue::consumer() methods from slots constructed with sigc::mem_fun(), despite MessageQueue deriving from sigc::trackable.  It shows how insidious this problem is: it has no doubt passed the inspection of several experienced programmers, yet is not thread safe.

Comment 51 Chris Vine 2013-02-18 20:07:18 UTC

I should say that I could not get git's gtkmm-documentation to build.  It seems to produce a defective configure file (perhaps my installed autoconf is insufficiently recent).  It would be helpful if you could do a test build to make sure it is syntactically correct.

Comment 52 Kjell Ahlstedt 2013-02-19 17:47:07 UTC

Created attachment 236809 [details] [review]
patch: Amend the "Multi-threaded programs" chapter.

No problem with your patch. Still, I've made two small changes:
1. Mention C++11 lambda expressions as a third alternative.
2. Modified the paragraph on the connect*_once() methods. If I'm not mistaken,
   they can also be made thread-safe by avoiding sigc::mem_fun().

If you have no objections, I'll push this patch.

I'll also change the documentation of Glib::Threads::Thread::create() and the
connect*_once() methods, and remove the inheritance from sigc::trackable in
http://git.gnome.org/browse/glibmm/tree/examples/thread/thread.cc.

Then, at last, I can close this bug.

Comment 53 Chris Vine 2013-02-19 22:51:02 UTC

Yes that is excellent: it looks fine.  Thanks for your help on that.  You might want to add yourself to the authors, but I leave that to you.

You could leave the bug open to remind yourself to move to std::function on an API/ABI break, but again that is a matter for you.

Comment 54 Kjell Ahlstedt 2013-02-20 09:42:53 UTC

I've pushed the patch in comment 52, and a patch that updates the description
of Glib::Threads::Thread::create().
The glibmm patch also adds a TODO comment in threads.hg, recommending a change
to std::function at the next ABI break. I think that's enough of a reminder,
so I close this bug.
The descriptions of Glib::SignalTimeout::connect(), connect_once(), etc. should
also be updated, but that's more closely related to bug 396958 and bug 396963.

Comment 55 Chris Vine 2013-02-20 11:26:38 UTC

Great.

The description of Glib::ThreadPool::push() also needs to be updated.  The same rule applies to that function as to Glib::Threads::Thread::create() and *connect_once() and so forth.

Comment 56 Kjell Ahlstedt 2013-02-20 17:59:08 UTC

I've updated the description of Glib::ThreadPool::push().
http://git.gnome.org/browse/glibmm/commit/?id=52009d77cc4247ce40f0bb4caaeab69d8bb983f4

Comment 57 Murray Cumming 2015-09-16 06:56:54 UTC

Please see https://bugzilla.gnome.org/show_bug.cgi?id=755091 about updating this for C++11 instead of using Glib::Threads::*.

Comment 58 Murray Cumming 2015-09-16 07:01:43 UTC

Also, now that C++11 makes it easier, maybe we should start an ABI-breaking libsigc++ that is thread-safe, so people don't need to create their own. I forget the name(s) of the thread-safe versions of libsigc++ that are out there.

Comment 59 Murray Cumming 2015-09-16 07:03:57 UTC

For reference: https://mail.gnome.org/archives/libsigc-list/2014-September/msg00001.html "[sigc] remarks on using (or not using) sigc++"

Comment 60 Chris Vine 2015-09-16 13:41:29 UTC

Comment #58:

There is no thread safe version of libsigc++ out there.  (There is an old now unmaintained library called libSigCX, aka libSigC++ extras, based I believe on libsigc++-1.2, but it was not really thread safe - it suffered from the same deficiencies as libsigc++.  What it in effect did was to combine libsigc++ with a Glib::Dispatcher type interface to execute connected slots in a program's main loop.)

The only well known thread-safe signal/slot implementation with trackability which is out there is boost::signals2.  The documentation claims that boost::signals2::signal and boost::signals2::connection and its associated trackability are thread safe, although Paul Davis said in the attachment to comment #59 that he found that some aspects were not thread safe after all (note: boost::signals is not thread safe, but boost::signals2 says it is). Leaving that aside boost::signals2 has a reputation for being somewhat heavyweight, and its major deficiency in my view is that, where a slot represents an object method, then for a connection to be trackable it requires the object concerned to be constructed on free store.

There are various mini-implementations out there.  Paul Davis said he has written one for ardour.  I have a minimal one with SafeEmitter and Releaser ( http://cxx-gtk-utils.sourceforge.net/2.2/emitter_8h.html ).

When I last looked at libsigc++ to think about how to resolve this issue, I came to the conclusion that making it thread safe would require a complete redesign, and in particular trackability should be a property of a particular connection and not of the object from which a slot happens to be created.