GNOME Bugzilla – Bug 678046
GObject.threads_init() provokes a segmentation fault after the python interpreter exits
Last modified: 2012-06-25 13:17:25 UTC
Created attachment 216336 [details] [review]
Tested patch, this fixes the crash.

The crash happens when pygobject_data_free() is called after the Python interpreter has shut down; we cannot make any Python calls after that point. Benzea tracked this down while debugging a bug in Sugar and commented on it in this SugarLabs ticket: http://bugs.sugarlabs.org/ticket/3670
Benzea also came up with a patch that fixes the bug, which I attach here too.

Test case (requires WebKitGTK+):

{{{
#! /usr/bin/env python
from gi.repository import GObject
GObject.threads_init()

from gi.repository import WebKit
session = WebKit.get_default_session()

import sys
sys.exit(0)
}}}
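To make "we can't do python calls after that" concrete, the guard has to look roughly like the sketch below. This is only an illustration with made-up names (ExampleData, example_data_free), not the attached patch, and it assumes Py_IsInitialized() is an adequate test for "the interpreter has already shut down":

{{{
/* Sketch only, not the attached patch. */
#include <Python.h>
#include <glib.h>

typedef struct {
    PyObject *callback;   /* hypothetical per-object data holding a Python reference */
} ExampleData;

/* Used as a GDestroyNotify on data attached to a GObject. */
static void
example_data_free (gpointer data)
{
    ExampleData *ed = data;

    /* If the wrapped GObject outlives the interpreter (as the WebKit default
     * session does in the test case above), this runs after Py_Finalize();
     * touching the interpreter then would crash, so only free the C side. */
    if (!Py_IsInitialized ()) {
        g_free (ed);
        return;
    }

    PyGILState_STATE state = PyGILState_Ensure ();
    Py_XDECREF (ed->callback);
    PyGILState_Release (state);

    g_free (ed);
}
}}}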
So, what is happening in more detail:

1. Python initializes WebKit.
2. The WebKit default session is wrapped by Python.
3. The Python interpreter quits (I am not sure whether the session *Python* object stays alive).
4. WebKit does some final cleaning up, see http://trac.webkit.org/browser/trunk/Source/WebKit/gtk/webkit/webkitglobals.cpp#L280
5. pygobject needs to clean up some data (the closures) and tries to use the interpreter.
6. CRASH

There are other ways to reproduce this; for example, creating a weakref results in a crash (even without threading, because pygobject actually tries to call Python code):

{{{
#! /usr/bin/env python
from gi.repository import WebKit
session = WebKit.get_default_session()

def test(*args):
    print("sadf")

wref = session.weak_ref(test)
}}}

Well, fun little bug. It seems to me that pygobject needs to guard against these things. Another case where this could happen in a more extreme way is an embedded Python interpreter.
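For the embedded-interpreter case, the same failure mode can be shown with a small stand-alone host program. This is just an illustration (the key name and payload are arbitrary, and none of this is pygobject code): the GDestroyNotify fires only after Py_Finalize(), at which point Py_DECREF() is undefined behaviour.

{{{
/* Illustration only: a host program that finalizes Python while a GObject
 * still owns a Python reference. */
#include <Python.h>
#include <glib-object.h>

/* Unguarded cleanup callback that still touches the interpreter. */
static void
drop_python_ref (gpointer data)
{
    Py_DECREF ((PyObject *) data);   /* undefined behaviour after Py_Finalize() */
}

int
main (void)
{
#if !GLIB_CHECK_VERSION (2, 36, 0)
    g_type_init ();                  /* still required on the GLib of that era */
#endif
    Py_Initialize ();

    /* Attach a Python object to a GObject, roughly the way pygobject
     * attaches wrapper data and closures to wrapped instances. */
    GObject *obj = g_object_new (G_TYPE_OBJECT, NULL);
    g_object_set_data_full (obj, "py-payload",
                            PyLong_FromLong (42), drop_python_ref);

    /* The host shuts Python down first... */
    Py_Finalize ();

    /* ...and only then drops its last reference: drop_python_ref() now runs
     * against a dead interpreter, which is the crash described above. */
    g_object_unref (obj);
    return 0;
}
}}}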
Looks like this is basically a duplicate of bug #677091. Though that bug goes one step further in saying that pygobject needs to survive reinitialization.
Created attachment 216518 [details] [review]
Do not do any python calls when GObjects are destroyed after the python interpreter has been finalized

Patch against master.
Thanks for the patch, Benjamin; I rebased it on master. It works fine for me and handles the issue we were seeing.
Thanks Simon. Can you please do a before/after "make check.valgrind" comparison to check that it does not introduce leaks?
Ok, good that you used the term 'introduce new leaks' :) The leak and heap summary is the same before and after applying the patch. For reference, in case I have not done my testing correctly:

{{{
==13711== HEAP SUMMARY:
==13711==     in use at exit: 1,570,633 bytes in 16,846 blocks
==13711==   total heap usage: 91,543 allocs, 74,697 frees, 12,895,745 bytes allocated
==13711==
==13711== 20 bytes in 1 blocks are definitely lost in loss record 211 of 2,667
==13711==    at 0x4009059: malloc (vg_replace_malloc.c:263)
==13711==    by 0x4911377A: PyObject_Malloc (obmalloc.c:939)
==13711==    by 0x490E354D: PyCObject_FromVoidPtr (cobject.c:28)
==13711==    by 0x4EADAA2: init_glib (glibmodule.c:802)
==13711==    by 0x49181AAF: _PyImport_LoadDynamicModule (importdl.c:53)
==13711==    by 0x4917F887: load_module (import.c:1831)
==13711==    by 0x4917FF18: import_submodule (import.c:2595)
==13711==    by 0x4918044F: ensure_fromlist (import.c:2506)
==13711==    by 0x491807E7: import_module_level.isra.9 (import.c:2174)
==13711==    by 0x49180C67: PyImport_ImportModuleLevel (import.c:2188)
==13711==    by 0x49165ED3: builtin___import__ (bltinmodule.c:49)
==13711==    by 0x4910F078: PyCFunction_Call (methodobject.c:85)
==13711==
==13711== 20 bytes in 1 blocks are definitely lost in loss record 212 of 2,667
==13711==    at 0x4009059: malloc (vg_replace_malloc.c:263)
==13711==    by 0x4911377A: PyObject_Malloc (obmalloc.c:939)
==13711==    by 0x490E354D: PyCObject_FromVoidPtr (cobject.c:28)
==13711==    by 0x55CB4DD: init_gobject (gobjectmodule.c:2408)
==13711==    by 0x49181AAF: _PyImport_LoadDynamicModule (importdl.c:53)
==13711==    by 0x4917F887: load_module (import.c:1831)
==13711==    by 0x4917FF18: import_submodule (import.c:2595)
==13711==    by 0x4918044F: ensure_fromlist (import.c:2506)
==13711==    by 0x491807E7: import_module_level.isra.9 (import.c:2174)
==13711==    by 0x49180C67: PyImport_ImportModuleLevel (import.c:2188)
==13711==    by 0x49165ED3: builtin___import__ (bltinmodule.c:49)
==13711==    by 0x4910F078: PyCFunction_Call (methodobject.c:85)
==13711==
==13711== 28 bytes in 1 blocks are definitely lost in loss record 555 of 2,667
==13711==    at 0x4009059: malloc (vg_replace_malloc.c:263)
==13711==    by 0x4911377A: PyObject_Malloc (obmalloc.c:939)
==13711==    by 0x4919D958: _PyObject_GC_Malloc (gcmodule.c:1445)
==13711==    by 0x4919DABB: _PyObject_GC_NewVar (gcmodule.c:1477)
==13711==    by 0x49125D59: PyTuple_New (tupleobject.c:106)
==13711==    by 0x490CE244: objargs_mktuple (abstract.c:2706)
==13711==    by 0x490D2DEB: PyObject_CallFunctionObjArgs (abstract.c:2756)
==13711==    by 0x490D3653: PyObject_IsSubclass (abstract.c:3041)
==13711==    by 0x49178775: PyErr_GivenExceptionMatches (errors.c:119)
==13711==    by 0x49178804: PyErr_ExceptionMatches (errors.c:137)
==13711==    by 0x4918056F: ensure_fromlist (import.c:2463)
==13711==    by 0x491807E7: import_module_level.isra.9 (import.c:2174)
==13711==
==13711== 28 bytes in 1 blocks are definitely lost in loss record 556 of 2,667
==13711==    at 0x4009059: malloc (vg_replace_malloc.c:263)
==13711==    by 0x4911377A: PyObject_Malloc (obmalloc.c:939)
==13711==    by 0x4919D958: _PyObject_GC_Malloc (gcmodule.c:1445)
==13711==    by 0x4919DABB: _PyObject_GC_NewVar (gcmodule.c:1477)
==13711==    by 0x49125D59: PyTuple_New (tupleobject.c:106)
==13711==    by 0x49184F2A: do_mktuple (modsupport.c:263)
==13711==    by 0x49184A37: do_mkvalue (modsupport.c:298)
==13711==    by 0x4918509B: va_build_value (modsupport.c:537)
==13711==    by 0x49185377: Py_BuildValue (modsupport.c:485)
==13711==    by 0x5A94334: inittesthelper (testhelpermodule.c:558)
==13711==    by 0x49181AAF: _PyImport_LoadDynamicModule (importdl.c:53)
==13711==    by 0x4917F887: load_module (import.c:1831)
==13711==
==13711== 28 bytes in 1 blocks are definitely lost in loss record 557 of 2,667
==13711==    at 0x4009059: malloc (vg_replace_malloc.c:263)
==13711==    by 0x4911377A: PyObject_Malloc (obmalloc.c:939)
==13711==    by 0x4919D958: _PyObject_GC_Malloc (gcmodule.c:1445)
==13711==    by 0x4919DABB: _PyObject_GC_NewVar (gcmodule.c:1477)
==13711==    by 0x49125D59: PyTuple_New (tupleobject.c:106)
==13711==    by 0x49184F2A: do_mktuple (modsupport.c:263)
==13711==    by 0x49184A37: do_mkvalue (modsupport.c:298)
==13711==    by 0x4918509B: va_build_value (modsupport.c:537)
==13711==    by 0x49185377: Py_BuildValue (modsupport.c:485)
==13711==    by 0x5A9439A: inittesthelper (testhelpermodule.c:568)
==13711==    by 0x49181AAF: _PyImport_LoadDynamicModule (importdl.c:53)
==13711==    by 0x4917F887: load_module (import.c:1831)
==13711==
==13711== 36 bytes in 1 blocks are definitely lost in loss record 1,125 of 2,667
==13711==    at 0x4009059: malloc (vg_replace_malloc.c:263)
==13711==    by 0x4911377A: PyObject_Malloc (obmalloc.c:939)
==13711==    by 0x51C6A57: ???
==13711==
==13711== LEAK SUMMARY:
==13711==    definitely lost: 160 bytes in 6 blocks
==13711==    indirectly lost: 0 bytes in 0 blocks
==13711==      possibly lost: 592,648 bytes in 4,455 blocks
==13711==    still reachable: 977,825 bytes in 12,385 blocks
==13711==         suppressed: 0 bytes in 0 blocks
==13711== Reachable blocks (those to which a pointer was found) are not shown.
==13711== To see them, rerun with: --leak-check=full --show-reachable=yes
==13711==
==13711== For counts of detected and suppressed errors, rerun with: -v
==13711== ERROR SUMMARY: 688 errors from 688 contexts (suppressed: 0 from 0)
}}}

@Benjamin: you were saying there will likely be a few other places that need guarding. If you already have some in mind, can you point them out? Thanks.
Well, any point where GObject could call into Python is a candidate, so signal emission and any callback that frees memory; and there are a lot of those. I'll have a look at which functions I think are candidates for changes. One is pygobject_weak_ref_notify, which would fix the example code given in comment #1. I'll post a list tomorrow.
OK, the rather long list of functions that I came up with yesterday is below. Considering that, at least in this bug, we can assume that no GLib main loop is running at that point, most of these can be ignored (and left for bug #677091). I am mostly concerned about the callbacks that are responsible for cleaning things up, which should be the following subset (a guarded example of one such callback is sketched below):

pyg_iowatch_data_free
pyg_source_finalize
pyg_destroy_notify
pyg_toggle_notify
pygobject_data_free
pygobject_weak_ref_notify
pyg_closure_invalidate
pygbinding_closure_invalidate

And the rest of the potential candidates that I found:

iowatch_marshal
pyg_iowatch_marshal
child_watch_func
child_watch_dnotify
_pyglib_destroy_notify
_pyglib_handler_marshal
pyg_signal_watch_prepare
pyg_signal_watch_check
pyg_signal_watch_dispatch
destroy_g_group
pyg_source_prepare
pyg_source_check
pyg_source_dispatch
_log_func (already guarded)
pygobject_unwatch_closure
pygbinding_marshal
pyg_closure_marshal
pyg_signal_class_closure_marshal
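For that cleanup subset, the guard would presumably follow the same pattern in each function. As a sketch of what it could look like for a weak-ref notification (the function name and the Py_IsInitialized() check are assumptions, not the final patch):

{{{
#include <Python.h>
#include <glib-object.h>

/* Sketch of a guarded weak-ref notify: data is the Python callable that was
 * registered via weak_ref(), as in the example from comment #1. */
static void
example_weak_ref_notify (gpointer data, GObject *where_the_object_was)
{
    PyObject *callback = data;

    /* After interpreter finalization there is nothing left to call back
     * into; only non-Python cleanup may happen here. */
    if (!Py_IsInitialized ())
        return;

    PyGILState_STATE state = PyGILState_Ensure ();

    PyObject *result = PyObject_CallFunction (callback, NULL);
    if (result == NULL)
        PyErr_Print ();
    Py_XDECREF (result);
    Py_DECREF (callback);

    PyGILState_Release (state);
}
}}}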
Simon, are you using a Python interpreter with --enable-valgrind support? (e.g. Debian's and Ubuntu's do not have this; it is on my list to get it added for the -dbg variants.) TBH I do not fully understand Simon's patch, e.g. why it can just drop the pyg_{begin,end}_allow_threads calls. In principle it looks OK to me, though, so Simon, if you feel confident about it please push.
pyg_{begin,end}_allow_threads cannot be used if the interpreter has been finalized. So instead of using them directly, the patch manually expands them to:

{{{
+    PyThreadState *_save = NULL;
[SNIP]
+    if (pyg_threads_enabled)
+        _save = PyEval_SaveThread();
[SNIP]
+    if (pyg_threads_enabled)
+        PyEval_RestoreThread(_save);
}}}
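Assembled into a compilable shape (this is a reconstruction, not the actual hunk; the extern declaration of pyg_threads_enabled stands in for whatever pygobject's private headers provide), the expansion looks like:

{{{
#include <Python.h>
#include <glib.h>

extern gboolean pyg_threads_enabled;   /* assumed declaration; set once threads_init() ran */

static void
example_cleanup (gpointer data)
{
    PyThreadState *_save = NULL;

    /* Expanded pyg_begin_allow_threads: release the GIL around the
     * non-Python work, but only when threading was initialized. */
    if (pyg_threads_enabled)
        _save = PyEval_SaveThread ();

    /* ... non-Python cleanup work goes here ... */

    /* Expanded pyg_end_allow_threads: take the GIL back. */
    if (pyg_threads_enabled)
        PyEval_RestoreThread (_save);
}
}}}

Because the expanded calls are plain statements rather than a paired macro, the function can skip them entirely, together with its other Python calls, when the interpreter has already been finalized.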
Thanks Benjamin and Simon! This clears it up, pushed to master now.