GNOME Bugzilla – Bug 302327
Crash with multiple inheritance and trackable argument bound by reference
Last modified: 2005-05-11 07:04:12 UTC
Distribution/Version: Debian unstable, x86 This is the corollary of bug 169225 (Crash with multiple inheritance and trackable signal handler). It demonstrates a similar issue for trackable arguments that are bound by reference.
Created attachment 45792 [details] The test case This test case is very similar to the one in bug 169225 (only 2 lines have changed).
Created attachment 45793 [details] The backtrace
Thanks. I'm going to take a break and let you play with this for a few days, using the ideas from the other bug. Do tell your bosses that I work freelance if they need priority.
Given that we have workarounds in our code to avoid bug 169225 (that we frequently hit) and bug 302327 (that we rarely hit), we are fine if fixing them has a low priority. In fact, we hesitated to even report these bugs. But we took the time to report them, produce a test case, analyze the issue and offer solutions, because we want to be nice players of the open source game. We benefit from using the library, and we want to give back anything we can, so other users of the library can benefit as well. That being said, at a meeting yesterday, we evoked to our manager the possibility of paying contractors to fix bugs we hit in 3rd-party software. The proposal was welcomed, and will be dealt with on a per-case basis in the future.
I've added a simpler test case in cvs. If we use the same fix as for mem_fun(), we need extra template parameters for each bound parameter. So this template <class T_action, class T_functor, class T_type1,class T_type2,class T_type3,class T_type4,class T_type5> void visit_each(const T_action& _A_action, const bind_functor<-1, T_functor, T_type1,T_type2,T_type3,T_type4,T_type5>& _A_target) would need a is_base_and_derived<> template parameter for each T_type* tempate parameter, and we'd need template specializations for every possible combination of true/false values for that paremeter. That doesn't feel like a good idea.
> I've added a simpler test case in cvs. What is the name of the file? Apparently it hasn't propagated to the anon cvs server yet. > and we'd need template specializations for every possible combination > of true/false values for that paremeter. That doesn't feel like a good idea. Agreed. The last thing we need is exponential complexity, especially as the max number of bound params can trivially be increased. So let's try to think about a different way of handling this. You know the old saying: "any problem in computer science can be solved by adding an extra layer of indirection". So here an idea: let's create a new class in the sigc namespace, tentatively called a LimitPointer<T_type>, whose constructor takes a T_type *. o If T_type does not inherit from trackable, then a LimitPointer<T_type> instance would behave _exactly_ like a T_type * instance. o If T_type inherits from trackable, then a LimitPointer<T_type> instance would behave _exactly_ like a trackable * instance. You see where I'm going: the LimitPointer class would be the only one to use the "is_base_and_derived" template specialization magic, and would of course implement visit_each() properly for both possibilities. Essentially, this involves moving the bulk of your fix to mem_ptr into this LimitPointer class, then simply making mem_ptr store a LimitPointer<T_obj> instead of a T_obj */trackable * pair. Similarly for bound arguments, instead of having the bind_functorX::boundX_ be of type T_typeX *, they would become of type LimitPointer<T_typeX>. This way, the visit_each code for bind_functorX does not have to change (which avoids the exponential specialization): instead, calling visit_each on each individual LimitPointer will do the right thing. I haven't tried to implement any of this, and I could be completely off, but I thought I could share this idea. I hope it can be made to work, and solve the whole 'cast to a trackable *' class of issues once for all, in a way that scales. What do you think?
Yes, that's a good idea.
I started to play with the idea last night. It works fine for mem_ptr. Then I tried to tackle the bind_functor. It is a bit harder there: instead of a LimitPointer, I use a BoundArgument type there. The BoundArgument has 3 facets: bound by value, boiund by ref and does not derive from trackable, bound by ref and derives from trackable. To implement the last 2, I use a LImitPointer. The problem is to know whether an arg is bound by ref or not: that information (the reference_wrapper type) is lost in the T_bound types. So I have to make that info available one level deeper, and for now I'm fighting with type conversions issue. I hopeful I will be able to solve them tonight.
I spent half of the night on this, but I now have a patch that: o Is easy to understand (which helps to verify its correctness) o Is not intrusive (very few lines of existing code are modified) o Fixes the bug (and passes 'make check') I'm planning to clean it up/comment it tomorrow (wednesday) night, and then I will attach it to this bug for you to review.
Fantastic. Well done. I look forward to it.
> I've added a simpler test case in cvs. You didn't reply to my question above. What is the name of that new file in the CVS tree? Can you attach it to this bug, so I can use it as my "official" test case, instead of using the test case from comment #1? Also I realized that we need to create another test case for return values that are bound by reference (bind_return). I need to extend my patch a bit to handle this case as well. I will do this tonight.
Created attachment 46035 [details] test_virtualbase_delete_ref_param.cc Sorry, I had forgotten to add it to cvs. Among other things, I renamed the classes, because "Dialog" would cause confusion.
Tonight I have polished my fix (I tried to match your coding style), I have made the necessary additions to it to handle the "bind_return by reference" case as well. I'm having some trouble generating a diff against the latest CVS snapshot: the instructions at http://libsigc.sourceforge.net/devel.shtml#cvs only allow me to retrieve an outdated tree (about a week old). When I browse the tree from the web at http://cvs.gnome.org/viewcvs/, I can see the latest and greatest files, but I have no way to mass-download them all. I will try to find a solution. Oh, and I came up with a better test case. o No need for a signal whatsoever o Test for the 3 cases (handler, param, return value) in 1 file: there is no point having 3 separate files for that.
Created attachment 46046 [details] The latest test case
Yes, anoncvs has an annoying delay. But, by now, I'd expect that you can get everything except the test case that I added yesterday. Here is a snapshot in case it helps: http://www.murrayc.com/temp/libsigc++-2.0.11cvssnapshot.tar.gz
Thanks for providing the snapshot. I grabbed it from your web site, so you can delete it. I will generate the diff tonight.
Created attachment 46124 [details] [review] Proposed patch For all tests that passed before the change, the output of 'make check' is exactly the same. All tests pass after the change. The change is explained in details in ChangeLog, please read that first. The patch applies cleanly to the CVS snapshot you provided in comment #15. After you apply the patch, don't forget to manually 'cvs remove' the files "tests/test_virtualbase_delete*.cc".
This is wonderful work. I've been using it for a couple of days now, without problems, though I did have to fix a build file to install a new header. I'm very pleased that this does not seem to break API/ABI, and I plan to commit this soon. Could you explain your reasons for choosing some of the names? I don't have better suggestions yet, and I might be happy with these names with some explanation: - Why "limit_reference<>" for the class that holds the bound parameter? - Why "invoke()" to get the pointer to be used by visit_each()? By the way, your documentation suggests that it was previously called visit(). > Put correct but unused code under #if 0. I'm not sure why we would want to keep this. Is it replaced by code elsewhere? > tests/test_bind_ref.cc: Slots must use 'Foo &' types. We were lucky > this broken usage worked before this change. The change in > type_traits.h made this bug obvious, by preventing the code to compile. So, this just means that a slot declaration may not use sigc::reference_wrapper<> in it's list of template types? That does not seem too bad. I doubt that anybody is doing that, and it seems easy to fix. Are you using this internally now?
> though I did have to fix a build file to install a new header I just tested things by using configure/make/'make check', so it would not surprise me at all if 'make install' and a bunch of other targets might need a few little adjustments. > - Why "limit_reference<>" for the class that holds the bound parameter? There seems to be some confusion on your side here. It was the first time I was using Doxygen, and I didn't know of a good way to put global comments in each of the new files I created, to explain this: There are really 2 classes: o limit_reference<Foo>: this is an object whose basic job is to store a reference Foo &, but make sure that if Foo inherits from trackable, then visit_each will "limit" itself to the trackable reference instead of the derived reference. Hence the name limit_reference. If I could give it a long name, it would be something like "when possible, limit reference to trackable while visiting". It comes in 2 flavors: a) Foo inherits from trackable: in this case, not only the derived reference is stored in limit_reference, but also the trackable reference. This way, we can later retrieve the trackable reference w/o asking the compiler to do an implicit conversion. In order to retrieve the derived reference (so that you can invoke methods or members of it), you use invoke(). In order to retrieve the trackable reference (so that you can call visit_each() on it), you use visit(). b) Foo does not inherit from trackable. That is the degenerated case, and both invoke() and visit() just return the stored reference. Now we use a m4 macro to build limit_reference.h, because there are actually 4 types of references that are needed by mem_fun, depending on the qualifiers that are associated with the method passed in mem_fun: Foo &, const Foo &, volatile Foo &, const volatile Foo &. These lead to 4 types: limit_reference<Foo>, const_limit_reference<Foo>, volatile_limit_reference<Foo>, const_volatile_limit_reference<Foo>. This is no different than what the mem_fun code does. So now all instead of having the mem_fun code store references directly, it uses the limit_reference class instead. So for handler classes that don't inherit from trackable, the extra memory cost is 0. For handler classes that inherit form trackable, the extra memory cost is one machine word (the trackable reference the limit_reference object stores). As you can see, mem_fun does not hold a bound parameter, and that is why it just uses a limit_reference. o bound_argument<Foo>: this is an object whose basic job is to store a bound argument (hence the name). a) Foo is a wrapped reference to a class Bar (reference_wrapper<Bar>): then this object is implemented on top of a limit_reference. At the time the slot is invoked, the invoke() method of this object allows to retrieve the argument (a Bar &). At the time the slot is visited, we simply visit the limit_reference, which will do the right thing wrt trackability. b) Foo is a wrapped const reference to a class Bar (const_reference_wrapper<Bar>): then this object is implemented on top of a const_limit_reference. At the time the slot is invoked, the invoke() method of this object allows to retrieve the argument (a const Bar &). At the time the slot is visited, we simply visit the const_limit_reference, which will do the right thing wrt trackability. c) Finally, there is the degenerated case, where Foo is something else, i.e. an argument that is bound by value. In that case, bound_argument just stores a copy of that value, and both invoke() and visit() simply return it. This object is used by the bind_functor<> and bind_return_functor<> objects, depending on whether the argument is bound as a parameter or as a return value. Again, because of the limit_reference, the extra overhead is a machine word for objects bound by reference that inherit from trackable, and 0 for everything else. > By the way, your documentation suggests that it was previously called visit(). As explained above, invoke() and visit() are really 2 different things in both of my new classes: invoke() will always return the true contained object, whereas visit() will always return a "limited to trackable" object that is suitable for visit_each(). I you think I used one instead of the other in the doc, please tell me were precisely in the patch. > I'm not sure why we would want to keep this. Is it replaced by code elsewhere? No, it is not replaced. It is just not needed anymore. I just thought that because it is correct code, we might want to keep it around in case we start to need it again in the future. But really, I don't have a strong opinion on this, I also hesitated to simply delete it. Do whatever you want :) > So, this just means that a slot declaration may not use > sigc::reference_wrapper<> in it's list of template types? That does not seem > too bad. Absolutely. o Slot declaration should always be using the real Foo & type in its list of template types. o and bind/bind_return declarations should always be using the wrapped reference_wrapper<Foo> type. The reference_wrapper<> stuff is an internal trick to allow to bind arguments by reference, so its use should be limited to just that. My change strictly enforces that, and I consider it a good thing. > I doubt that anybody is doing that, and it seems easy to fix. That is exactly what I think. > Are you using this internally now? Not yet: in order to use a new version of libsigc++ in our product, we have to rebuild libsigc++ and _everything_ that uses it, including the whole glibmm/gtkmm stuff, and then we need to check these new binaries in our repository, and then we need to QA the entire product to check for always possible regressions. Doing so takes quite a bit of time, so we are hesitant to do it twice: once now, and once you release the official 2.0.12. So we will probably wait until you release the official 2.0.12.
Thanks. I am happy enough with the slightly obscure names with that much explanation (which I've added to the doxygen comments) and I can't think of better names. I have removed the #ifdef 0 stuff completely. Expect a new tarball soon.