GNOME Bugzilla – Bug 726999
Unify and refactor GValue marshaling code paths
Last modified: 2014-08-18 08:33:54 UTC
There are a handful of different GValue marshaling code paths that are similar enough to unify. Some of these were initially copies of each other which diverged over time. This leads to an extra maintenance burden and in some cases a loss of performance due to the chaining of some of the functions in different contexts. An example of this performance loss can be observed when comparing accessing properties with GObject.props vs. GObject.get_property: >>> from gi.repository import Gtk >>> b = Gtk.Button() >>> %timeit b.props.sensitive 100000 loops, best of 3: 11.2 us per loop >>> %timeit b.get_property('sensitive') 1000000 loops, best of 3: 1.79 us per loop This is a fairly large performance gap. The reason is the GObject.props accessor looks at introspection information to marshal the GValue to Python. Whereas GObject.get_property is basically a thin wrapper around g_object_get_property. Beyond performance, this also means the two different techniques can potentially return different information. The GObject.props accessor essentially goes through a few additional switch statements to marshal the GValue: GObject.props (PyGProps_getattro): pygi_get_property_value_real pygi_lookup_property_from_g_type (potentially slow access to repository) g_object_get_property Conversion of GValue to GIArgument (duplication of _pygi_argument_from_g_value) pygi_argument_to_object (full conversion to PyObject using GI info) vs. GObject.get_property (pygobject_get_property): g_object_get_property pyg_param_gvalue_as_pyobject (completely marshaling path than above) As a first part it would be good to unify the switch statements from _pygi_argument_from_g_value and pygi_get_property_value_real, which look like direct duplications with divergence over time: https://git.gnome.org/browse/pygobject/tree/gi/pygi-value.c?id=3.12.0#n26 https://git.gnome.org/browse/pygobject/tree/gi/pygi-property.c?id=3.12.0#n108 Further work should be done to clarify ownership semantics. Marshaling from a GValue should always dup the contents regardless of transfer annotation in the context of get_property and signal to-py arg marshaling. Currently accessing an annotated property attempts to use transfer annotations which isn't entirely correct. Notably bug 692747 describes the problems caused by using transfer with get_property. Code example: https://git.gnome.org/browse/pygobject/tree/gi/pygi-property.c?id=3.12.0#n225 This can be broken down as follows: * get_property - Always dup GValue contents and marshal, then call GValue.unset. * signal to-py marshaling - Always dup GValue and don't call unset. * function call return and out args - Unclear, may need to follow transfer annotations * callback input args - Unclear, may need to follow transfer annotations The following is a list of functions all very much related that need to be analyzed for similarities (use grep to find their definitions): pygi_argument_from_g_value pygi_argument_to_object pygi_get_property_value pygi_set_property_value pyg_value_as_pyobject pyg_value_from_pyobject pygi_argument_to_array The following is a list of related commits can be used for historical information to ensure we have tests for the related functionality they introduced: https://git.gnome.org/browse/pygobject/commit/?id=3606eb20 https://git.gnome.org/browse/pygobject/commit/?id=ee62df4d https://git.gnome.org/browse/pygobject/commit/?id=e14ebab6 https://git.gnome.org/browse/pygobject/commit/?id=8cfd596c https://git.gnome.org/browse/pygobject/commit/?id=9f50fd21 https://git.gnome.org/browse/pygobject/commit/?id=0d099bdb https://git.gnome.org/browse/pygobject/commit/?id=b6fefd62 https://git.gnome.org/browse/pygobject/commit/?id=216caf59 https://bugzilla.gnome.org/show_bug.cgi?id=684062 https://bugzilla.gnome.org/show_bug.cgi?id=696143
Marking bug 727004 as a dependency since it could completely get rid of GValue marshaling for GI signal closures.
Pushed a bunch of unittest refactoring to aid this effort: https://git.gnome.org/browse/pygobject/commit/?id=bf0a5c3 https://git.gnome.org/browse/pygobject/commit/?id=115bc88 https://git.gnome.org/browse/pygobject/commit/?id=c691d86 https://git.gnome.org/browse/pygobject/commit/?id=15b7953 https://git.gnome.org/browse/pygobject/commit/?id=36caa74
Created attachment 282863 [details] [review] Refactor boxed wrapper memory management strategy Change pygi_boxed_new() to accept "copy_boxed" instead of "free_on_dealloc". This changes memory management so the PyGIBoxed wrapper owns the boxed pointer given to it. Use __del__ instead of dealloc for freeing the boxed memory. This is needed for edge cases where objects like GSource can trigger the finalized callback during de-alloc, resulting in the PyObjects references counts being manipulated and triggering a re-entrant de-alloc. Add hack to keep Gtk.TreeIter.do_iter_next/previous implementations working which rely on pass-by-reference. See also: https://bugzilla.gnome.org/show_bug.cgi?id=734465 https://bugzilla.gnome.org/show_bug.cgi?id=722899
Created attachment 282864 [details] [review] Never dup data structures when marshaling from g_object_get_property() Always use transfer-none with the results of g_object_get_property() and assume g_value_unset() will cleanup the results. This gives us control over memory of properties and limits property anotations to value typing.
Created attachment 282865 [details] [review] Never dup data structures when marshaling signal in arguments Always assume transfer-none of GValue arguments to signal handlers. A signal handler with arguments marked as transfer-full does not make any sense, so assume they are always transfer-none.
Created attachment 282866 [details] [review] Merge pygi_get_property_value and _pygi_argument_from_g_value Merge duplicated GValue marshaling code which has diverged over time (commits 3606eb20, ee62df4d, e14ebab6, 8cfd596c, 9f50fd21, 0d099bdb, and 216caf59). Use _pygi_argument_to_array within pygi_get_property_value. This is needed in the new code for supporting GI_TYPE_TAG_ARRAY and also fixes bug 669496. Side effects of this change also include support for properties holding G_TYPE_FLAGS and G_TYPE_PARAM.
Patches so far fix bug 722899, bug 657442, and bug 669496 along with some other un-reported leaks: 61 bytes in 8 blocks are definitely lost in loss record 11,008 of 26,691 at 0x4C2745D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) by 0xF5E9D9E: g_malloc (gmem.c:97) by 0xF5EA0C6: g_malloc_n (gmem.c:338) by 0xF605431: g_strdup (gstrfuncs.c:356) by 0xF376E00: g_value_dup_string (gvaluetypes.c:1136) by 0xE7096B4: pygi_get_property_value (pygi-property.c:179) by 0xE6E9151: PyGProps_getattro (pygobject.c:296) by 0x4F3C14C: PyEval_EvalFrameEx (in /usr/lib64/libpython3.3m.so.1.0) by 0x4F40869: PyEval_EvalFrameEx (in /usr/lib64/libpython3.3m.so.1.0) by 0x4F42174: PyEval_EvalCodeEx (in /usr/lib64/libpython3.3m.so.1.0) by 0x4F40573: PyEval_EvalFrameEx (in /usr/lib64/libpython3.3m.so.1.0) by 0x4F42174: PyEval_EvalCodeEx (in /usr/lib64/libpython3.3m.so. 216 (120 direct, 96 indirect) bytes in 3 blocks are definitely lost in loss record 8,326 of 10,730 at 0x4C2745D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) by 0xF5E9D9E: g_malloc (gmem.c:97) by 0xF6029EA: g_slice_alloc (gslice.c:1007) by 0xF5AB8F2: g_array_sized_new (garray.c:193) by 0xE70998C: pygi_get_property_value (pygi-property.c:245) by 0xE6E9151: PyGProps_getattro (pygobject.c:296) by 0x4F3660A: ??? (in /usr/lib64/libpython3.3m.so.1.0) by 0x4F40B06: PyEval_EvalFrameEx (in /usr/lib64/libpython3.3m.so.1.0) by 0x4F40869: PyEval_EvalFrameEx (in /usr/lib64/libpython3.3m.so.1.0) by 0x4F40869: PyEval_EvalFrameEx (in /usr/lib64/libpython3.3m.so.1.0) by 0x4F42174: PyEval_EvalCodeEx (in /usr/lib64/libpython3.3m.so.1.0) by 0x4F40573: PyEval_EvalFrameEx (in /usr/lib64/libpython3.3m.so.1.0) 106 bytes in 9 blocks are definitely lost in loss record 19,183 of 26,691 at 0x4C2745D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) by 0xF5E9D9E: g_malloc (gmem.c:97) by 0xF5EA0C6: g_malloc_n (gmem.c:338) by 0xF605431: g_strdup (gstrfuncs.c:356) by 0xF376E00: g_value_dup_string (gvaluetypes.c:1136) by 0xE7136D1: _pygi_argument_from_g_value (pygi-value.c:83) by 0xE70A751: pygi_signal_closure_marshal (pygi-signal-closure.c:121) by 0xF349142: g_closure_invoke (gclosure.c:768) by 0xF365EFF: signal_emit_unlocked_R (gsignal.c:3553) by 0xF365202: g_signal_emit_valist (gsignal.c:3309) by 0xF365755: g_signal_emit (gsignal.c:3365) by 0xF065087: g_settings_real_change_event (gsettings.c:277) 88 (80 direct, 8 indirect) bytes in 2 blocks are definitely lost in loss record 17,121 of 26,691 at 0x4C2745D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) by 0xF5E9D9E: g_malloc (gmem.c:97) by 0xF6029EA: g_slice_alloc (gslice.c:1007) by 0xF602A2A: g_slice_alloc0 (gslice.c:1032) by 0x115745EB: regress_test_boxed_new (regress.c:1794) by 0x11574724: regress_test_boxed_copy (regress.c:1857) by 0xF36F38C: _g_type_boxed_copy (gtype.c:4249) by 0xF346E59: g_boxed_copy (gboxed.c:352) by 0xF3471C0: g_value_dup_boxed (gboxed.c:456) by 0xE709792: pygi_get_property_value (pygi-property.c:206) by 0xE6E9151: PyGProps_getattro (pygobject.c:296) by 0x4F3C14C: PyEval_EvalFrameEx (in /usr/lib64/libpython3.3m.so.1.0) 48 bytes in 2 blocks are definitely lost in loss record 2,947 of 10,730 at 0x4C2745D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) by 0xF5E9D9E: g_malloc (gmem.c:97) by 0xF6029EA: g_slice_alloc (gslice.c:1007) by 0x1135AFD7: gi_marshalling_tests_boxed_struct_copy (gimarshallingtests.c:3700) by 0xF36F38C: _g_type_boxed_copy (gtype.c:4249) by 0xF346E59: g_boxed_copy (gboxed.c:352) by 0xF3471C0: g_value_dup_boxed (gboxed.c:456) by 0xE709792: pygi_get_property_value (pygi-property.c:206) by 0xE6E9151: PyGProps_getattro (pygobject.c:296) by 0x4F3660A: ??? (in /usr/lib64/libpython3.3m.so.1.0) by 0x4F40B06: PyEval_EvalFrameEx (in /usr/lib64/libpython3.3m.so.1.0) by 0x4F40869: PyEval_EvalFrameEx (in /usr/lib64/libpython3.3m.so.1.0)
(In reply to comment #7) > Patches so far fix bug 722899, bug 657442, and bug 669496 along with some other > un-reported leaks: Correction, this doesn't fix bug 669496 but rather adds better array support for properties (bug 688232).
Created attachment 282961 [details] [review] Unify property getters Consolidate duplicate logic into pygi_get_property_value(). Use the function for GObject.get_property(), GObject.get_properties(), and GObject.props. Remove overridden expected failures in TestCGetPropertyMethod which now work due to the unification. https://bugzilla.gnome.org/show_bug.cgi?id=733893
Created attachment 282963 [details] [review] Break pyg_value_as_pyobject into two functions Add pygi_value_to_py_basic_type() which is limited to handling basic types that don't need introspection information when marshalling to Python. Add pygi_value_to_py_structured_type() for marshalling of structured data which can eventually accept GI type hints.
Created attachment 282964 [details] [review] Fast path property access for basic types Attempt marshalling with pygi_value_to_py_basic_type() prior to looking at GI info. This gives a quick conversion for basic types like bools, ints, and strings without having to go through GIArgument and GI conversions. This gives approximately a 3x performance boost for accessing these types with the unified GValue marshaller.
Review of attachment 282961 [details] [review]: Note that while this patch unifies all property access, it also slows down get_property() due to its code path now attempting to use introspection information. This is fixed in later patches which optimize the unified marshaller.
Review of attachment 282964 [details] [review]: Some micro benchmarks for this patch showing a 3x speedup for basic types (no difference for structured types). pygobject was compiled with -O3: # Setup from gi.repository import Gtk button = Gtk.Button(name='spam') box = Gtk.Box() box.add(button) # Before patch: %timeit button.props.name 100000 loops, best of 3: 6.94 us per loop %timeit button.props.margin_start 100000 loops, best of 3: 7.19 us per loop %timeit button.props.parent 100000 loops, best of 3: 10.6 us per loop # After patch: %timeit button.props.name 100000 loops, best of 3: 2 us per loop %timeit button.props.margin_start 100000 loops, best of 3: 2.3 us per loop %timeit button.props.parent 100000 loops, best of 3: 10.2 us per loop
Created attachment 282969 [details] [review] tests: Add failing tests which verify exceptions raised in property getters https://bugzilla.gnome.org/show_bug.cgi?id=575652
Comment on attachment 282969 [details] [review] tests: Add failing tests which verify exceptions raised in property getters wrong ticket
Attachment 282863 [details] pushed as 8517504 - Refactor boxed wrapper memory management strategy Attachment 282864 [details] pushed as 04816f7 - Never dup data structures when marshaling from g_object_get_property() Attachment 282865 [details] pushed as 142ff19 - Never dup data structures when marshaling signal in arguments Attachment 282866 [details] pushed as abdfb0f - Merge pygi_get_property_value and _pygi_argument_from_g_value Attachment 282961 [details] pushed as b0236d6 - Unify property getters Attachment 282963 [details] pushed as 8f4b06f - Break pyg_value_as_pyobject into two functions Attachment 282964 [details] pushed as 2601011 - Fast path property access for basic types