GNOME Bugzilla – Bug 419462
Inline short functions used in every vfunc invocation
Last modified: 2018-05-22 12:07:34 UTC
ObjectBase::_get_current_wrapper() and ObjectBase::is_derived_() are used in every vfunc and default signal handler invocation. Inlining them speeds up these calls slightly.
Created attachment 84777 [details] [review] Inlines _get_current_wrapper() and is_derived_()
Created attachment 84799 [details] [review] Replaced g_object_get_qdata by g_datalist_id_get_data This patch additionally replaces the g_object_get_qdata() call in ObjectBase::_get_current_wrapper() by a direct call to g_datalist_id_get_data(). This is what g_object_get_qdata() does anyway, but the G_IS_OBJECT typecheck in the latter function needs more time than the actual lookup.
Created attachment 84832 [details] [review] fastvfuncs_combined.patch This patch combines the above patch and the patch from #419461.
I mean, in bug #418571 .
The use of inline seems to remove the function from the library:

regexxer: symbol lookup error: /opt/gnome218/lib/libglademm-2.4.so.1: undefined symbol: _ZN4Glib10ObjectBase20_get_current_wrapperEP8_GObject

So we should probably _add_ a new inlined function and use it. I'll wait for the results of tests with those test tarballs though, I think, before making new tarballs.
gcc also has a -fkeep-inline-functions option that might help.
I think it would be safer just to restore the function and add a renamed one. I've done that locally and I'm testing it now.
Created attachment 85174 [details] [review] fastvfuncs_combined_noabibreak.patch This one does not break ABI.
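The idea of keeping the exported function and adding a renamed inline variant can be sketched roughly as follows. This is a minimal stand-alone illustration, not the glibmm code: the class Base, its member wrapper_, and the names get_wrapper(), get_wrapper_inline() and accessors_agree() are hypothetical stand-ins chosen for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for a library class; not the real Glib::ObjectBase.
class Base
{
public:
  explicit Base(void* wrapper) : wrapper_(wrapper) {}

  // Existing exported function. In a real library its definition lives in
  // the .cc file, so the symbol stays in the shared object and binaries
  // built against the old headers can still resolve it.
  void* get_wrapper() const;

  // New, differently named inline variant. Code compiled against the new
  // header can use it without a call into the library.
  void* get_wrapper_inline() const { return wrapper_; }

private:
  void* wrapper_;
};

// Out-of-line definition (this would sit in the library's .cc file).
void* Base::get_wrapper() const
{
  return get_wrapper_inline();
}

// Returns true if both accessors return the payload the object was built with.
bool accessors_agree(void* payload)
{
  Base b(payload);
  return b.get_wrapper() == payload && b.get_wrapper_inline() == payload;
}
```

Because the old symbol is still defined out of line, a dependent library built earlier would not hit the undefined symbol error, while newly compiled callers can opt in to the inline variant.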
I committed commented-out code for the inline functions, so that it's easy for people to try. I have committed the changes in #418571.
Let's look at this again if/when we do a glibmm ABI break.
Created attachment 293066 [details] Test program with time measurement

I've made some measurements with the attached test program. If you want to test the program without modifications, you must add Glib::ObjectBase::is_derived_inline_() in glibmm/glib/glibmm/objectbase.h.

The figures below are averages of some measurements with the test program compiled with g++ at optimization level -O2 (which is used when glibmm is built) and glib version 2.43.2. The glib version is important because G_IS_OBJECT() was optimized in glib 2.41.1. See bug 730984 and bug 731335.

1000000 repetitions
CPU time with Glib::ObjectBase::_get_current_wrapper(): 64 ms
CPU time with get_current_wrapper(): 61 ms
CPU time with get_current_wrapper_no_get_qdata(): 56 ms
CPU time with get_current_wrapper_inline(): 60 ms
CPU time with get_current_wrapper_no_get_qdata_inline(): 54 ms

If glib/gobject/gobject.c is compiled with the G_IS_OBJECT() version used before glib 2.42.1, the result is

1000000 repetitions
CPU time with Glib::ObjectBase::_get_current_wrapper(): 69 ms
CPU time with get_current_wrapper(): 64 ms
CPU time with get_current_wrapper_no_get_qdata(): 55 ms
CPU time with get_current_wrapper_inline(): 62 ms
CPU time with get_current_wrapper_no_get_qdata_inline(): 53 ms

Conclusions: The differences between all these versions of the functions are small. Inlining saves very little execution time. Bypassing g_object_get_qdata() and calling g_datalist_id_get_data() directly saves more time, but still not very much, about 10% with the present (from glib 2.41.1) version of G_IS_OBJECT(). And it requires access to the private data member GObject.qdata.
Created attachment 293151 [details] Test program 2 with time measurement

The measurements in comment 11 show that calls to Glib::ObjectBase::_get_current_wrapper() take more CPU time than the calls to the local get_current_wrapper(). That made me suspect that a call to a function in an object library file takes more time than a call to a function in the executable file, ./example. Comparisons between inline and non-inline functions would then be misleading.

I've now made a second test program, requiring a glibmm patch that adds all compared functions to Glib::ObjectBase. The surprising result is that it doesn't matter much if the called function is in a separate lib file or part of the executable program. What matters is in which order the loops with the function calls are executed. The first loop takes more time than a following identical loop! I don't know why. Perhaps it just shows how difficult it can be to make time measurements when the compiler is allowed to optimize the code.

Here are averages of some measurements with the second test program and the present version of G_IS_OBJECT().

1000000 repetitions
CPU time with Glib::ObjectBase::_get_current_wrapper(): 64 ms
CPU time with Glib::ObjectBase::_get_current_wrapper(): 62 ms
CPU time with Glib::ObjectBase::_get_current_wrapper_no_get_qdata(): 56 ms
CPU time with Glib::ObjectBase::_get_current_wrapper_inline(): 60 ms
CPU time with Glib::ObjectBase::_get_current_wrapper_no_get_qdata_inline(): 54 ms

The conclusions in comment 11 still hold.
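For anyone who wants to check the loop-order effect without the attachments, here is a minimal sketch of such a measurement. It is not the attached test program: lookup() is a hypothetical stand-in for the function under test, and cpu_time_ms(), OrderCheck and time_twice() are names invented for this example. It times the same loop twice in a row with std::clock(), so a difference between the two results points at ordering or warm-up rather than at the function itself.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>

// Hypothetical function under test; a stand-in for the wrapper lookup.
unsigned lookup(unsigned key)
{
  return key * 2u + 1u;
}

// Returns the CPU time in milliseconds used by `repetitions` calls of `func`.
// The volatile sink keeps the compiler from discarding the calls' results.
template <typename Func>
double cpu_time_ms(Func func, std::size_t repetitions)
{
  volatile unsigned sink = 0;
  const std::clock_t start = std::clock();
  for (std::size_t i = 0; i < repetitions; ++i)
    sink = func(static_cast<unsigned>(i));
  const std::clock_t stop = std::clock();
  (void)sink;
  return 1000.0 * static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}

// Timings of two identical loops executed back to back.
struct OrderCheck
{
  double first_ms;
  double second_ms;
};

// Runs the same measurement twice so the two results can be compared.
OrderCheck time_twice(std::size_t repetitions)
{
  OrderCheck result;
  result.first_ms = cpu_time_ms(lookup, repetitions);
  result.second_ms = cpu_time_ms(lookup, repetitions);
  return result;
}
```

Calling time_twice(1000000) and printing both fields gives a pair of numbers comparable in spirit to the two _get_current_wrapper() lines above. Repeating the run several times and averaging, as done for the figures in this bug, helps to smooth out noise.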
-- GitLab Migration Automatic Message -- This bug has been migrated to GNOME's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glibmm/issues/4.