GNOME Bugzilla – Bug 794360
(SIGSEGV) Deleting an account that contains subaccounts causes a segmentation fault
Last modified: 2018-06-30 00:05:32 UTC
archlinux 64bit 4.15.6-1-ARCH, gnucash 2.6.19-2 and gnucash-git 2.7.4-1
Segmentation fault when deleting an account that contains multiple accounts. With the default accounts, deleting "Auto" triggers it. This error happened on both of my Arch machines, with 2.6 and 2.7.
Stacktrace:
Thread 1 "gnucash" received signal SIGSEGV, Segmentation fault.
0x00007ffff650c0d6 in gtk_widget_is_sensitive () from /usr/lib/libgtk-x11-2.0.so.0
(gdb) bt full
+ Trace 238475
I'm not able to replicate this, though admittedly the bleedingest-edge distro I have is Debian Unstable. Please debug it enough to figure out exactly what is segfaulting and why.
Created attachment 369896 [details] output of gnucash-valgrind G_DEBUG=gc-friendly was commented out as it prevents the crash
(In reply to mxovd from comment #2) > Created attachment 369896 [details] > output of gnucash-valgrind > > G_DEBUG=gc-friendly was commented out as it prevents the crash I'm not able to accurately interpret the valgrind/callgrind outputs. There seem to be a lot of errors caused by guile and libgc. Since using G_DEBUG=gc-friendly prevents the crash, the problem might be with the gc package. I tried to downgrade it to the same version as Debian Unstable but the issue persists. Do you have any pointers on where I should start looking to debug this issue?
I also tried to replicate this but I too could not get it to fail. Could you explain the structure of the accounts involved? Maybe there is a placeholder or different currencies involved. Do any accounts have transactions in them, which account do you delete, and which options do you pick on the dialog? It may be easier with a screenshot. I had a look at your valgrind output and if you search for "delete-account" you see this one that may be the cause... gnc_plugin_page_account_tree_cmd_delete_account (gnc-plugin-page-account-tree.c:1467) If you have gdb, try 'gdb gnucash' and then at the prompt 'run --g-fatal-warnings' and bt for a backtrace when it happens. Bob
I'm using the default accounts (common) on a fresh install, with a fresh database. I just press next on every prompt. Any account containing subaccounts will trigger the crash. Once I press delete, I have to choose between deleting the subaccounts or moving them. Whatever the choice I make, once I press delete, it throws a segfault. I ran gdb with the --g-fatal-warnings, here's the first output at the sigtrap and then at the sigsegv Thread 1 "gnucash" received signal SIGTRAP, Trace/breakpoint trap. 0x00007ffff74f1982 in ?? () from /usr/lib/libglib-2.0.so.0 (gdb) bt
+ Trace 238488
What version of gtk are you using? Does this happen for both the xml and sqlite backends? I still cannot reproduce it on 2.7.7.
I'm using the gtk3 3.22.28-1 and gtk2 2.24.32-1 Arch packages. It happens for both sqlite3 and xml. This bug is affecting all my Arch installs (two on i3wm and one on gnome3), so I assume it might be distro-specific since no one else seems to be affected. Do you think that this issue could be related? https://trac.sagemath.org/ticket/24575 Note that Arch doesn't have a package for gnucash as it is using webgtk2, which is deprecated. I assume that once gnucash3 is released it will be reintegrated into the Arch packages.
OK, I am seeing that issue from line 1467 in the gnucash.trace file but it does not crash for me so will investigate and try to come up with a fix.
I think I have found what was causing the error for me: the sensitivity of the account selectors was being retrieved from possibly invalid widgets. I have created PR319 to fix it. Hopefully these changes will be in the next release and will fix your problem. If you can, you might like to make the changes locally to confirm. Bob
I have some issues with the PR, comments there. What problem did you see?
That was for Bob. For mxovd, valgrind and guile don't get along well at all. I once tried to make an exceptions file for it but gave up after 4 hours. Install the debug symbols for glib and gtk and build GnuCash with -DCMAKE_BUILD_TYPE=Debug. Set a breakpoint on gnc_plugin_page_account_tree_cmd_delete_account. When you get there, set a watchpoint on &sa_trans_mas. It should get changed only once, at line 1431 in 2.7.x or line 1278 in 2.6.x. After it's assigned, do `p *sa_trans_mas` to make sure that it looks like a well-formed GtkWidget and set another watch on sa_trans_mas. Set a breakpoint on line 1463 (2.7) or 1310 (2.6) and continue. If either watch fires before it stops that will be the problem. Otherwise when it stops do `p *sa_trans_mas` again to see if it's still valid. Continue and see if it crashes. If it doesn't then we may be looking at an optimization bug.
Created attachment 369953 [details] GDB output with debug symbols for GTK3 - GLIB2 and GNUCASH 2.7.7 I built glib2, gtk3 and gnucash with debug flags and ran gdb. The value is changed at line 1431. I then set a watch on sa_trans_mas and a break at 1463. I let it run for 1.5-2h but it never made it to the breakpoint. I'm not sure if something is wrong or if I should have let it run longer. I did a run with only line breakpoints, printing locals at the breaks and bt full at the end. I attached the output to this post. Tell me if you need more info.
I just realized that sa_trans_mas changed at 1463.
output at 1432:
sa_trans_mas = 0x555555bb7f40 [GNCAccountSel]
$1 = {parent_instance = {g_type_instance = {g_class = 0x555555a4d400}, ref_count = 1, qdata = 0x2}, priv = 0x555555bb7e50}
output at 1462:
sa_trans_mas = 0x555555bb7f40
$2 = {parent_instance = {g_type_instance = {g_class = 0xaaaaaaaaaaaaaaaa}, ref_count = 2863311530, qdata = 0xaaaaaaaaaaaaaaaa},
Comment on attachment 369953 [details] GDB output with debug symbols for GTK3 - GLIB2 and GNUCASH 2.7.7 You didn't let it run for 1.5-2 hours. It *crashed* and you didn't notice for 2 hours. It also hit all three breakpoints, but since they all have line numbers it seems you didn't set a watchpoint. Doesn't matter, it wasn't the sa_trans_mas that changed, it was what sa_trans_mas pointed to that changed; I think that's probably from being packed into the freed sa_trans_mas_hbox at line 1433. A watch wouldn't have caught that. That matches my code analysis and what I explained on Bob's PR.
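To illustrate the distinction drawn above between the pointer changing and the thing it points to changing, here is a minimal standalone sketch in plain C. It is not GnuCash or GDB code: `Snapshot`, `take_snapshot`, `pointer_changed` and `pointee_changed` are hypothetical names made up for this example, and the "widget" is just a byte buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical snapshot of a pointer variable and of the bytes it points at. */
typedef struct {
    const void *address;      /* value held by the pointer variable */
    unsigned char bytes[16];  /* copy of the first bytes of the pointee */
    size_t len;               /* number of bytes actually copied */
} Snapshot;

/* Record both the address and (up to 16 bytes of) the contents behind it. */
static Snapshot take_snapshot(const void *ptr, size_t len)
{
    Snapshot s;
    assert(ptr != NULL);
    memset(&s, 0, sizeof s);
    s.address = ptr;
    s.len = len > sizeof s.bytes ? sizeof s.bytes : len;
    memcpy(s.bytes, ptr, s.len);
    return s;
}

/* True only if the pointer variable itself holds a different address. */
static bool pointer_changed(const Snapshot *a, const Snapshot *b)
{
    return a->address != b->address;
}

/* True if the memory behind the pointer has different contents. */
static bool pointee_changed(const Snapshot *a, const Snapshot *b)
{
    return a->len != b->len || memcmp(a->bytes, b->bytes, a->len) != 0;
}
```

Taking one snapshot before and one after the buffer is overwritten gives `pointer_changed` false and `pointee_changed` true. That is the same shape as the two GDB prints in this bug, where sa_trans_mas stays 0x555555bb7f40 while the struct behind it turns into 0xaa bytes.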
The run I posted wasn't the one that did not finish. Since setting a watch on sa_trans_mas was taking too long, I ran GDB with only line breakpoints to get to the crash. Do you need the output of the run that I interrupted? (I'm on a different machine but I can reproduce it again if needed.)
No, as I said the watchpoint wouldn't trigger because it isn't sa_trans_mas that gets deallocated, it's sa_trans_mas_hbox. It would be worthwhile to check my hypothesis by looking at sa_trans_mas before and immediately after it's packed into sa_trans_mas_hbox to make sure that that is indeed when it shows that the parent has been freed.
Created attachment 369991 [details] GDB account_tree_cmd_delete_account - line 1433-1435 Ran GDB with prints before and after line 1433. Tell me if you need something else, thanks.
Created attachment 369992 [details] GDB account_tree_cmd_delete_account - line 1433-1435
So it's not there. It's actually getting freed at line 1447, g_object_unref(G_OBJECT(builder)); Since it's in the hierarchy of a toplevel created by GtkBuilder that's contrary to the documentation. It's late and I don't want to go digging through the sources right now, so I'll speculate instead that since it's never made sensitive builder treats it specially, though I suppose it's also possible that since it's created in code and added builder expects that we should have reffed it. With the Gtk and Glib I'm using at the moment (on a Mac) the type information is NULLed at free and that's sufficient to prevent the crash in GTK_IS_WIDGET. On your system where the freed memory is set to 0xaaaaaaaaaaaaaaaa it crashes. ISTR that was a debugging behavior in old versions of GLib, but it seems to have been removed at some point; glib now NULLs memory to help garbage-collecting tools.
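A minimal standalone sketch of why the same dangling pointer can be harmless on one build and fatal on another, assuming a type check that gives up on a NULL class pointer and otherwise dereferences it. This is not GLib or GTK code: `FakeInstance`, `fake_free` and `fake_is_widget` are hypothetical stand-ins, the class pointer is modelled as an integer, and the memory is only overwritten, never released, so inspecting it afterwards is well-defined.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an object header: class pointer plus refcount. */
typedef struct {
    uintptr_t g_class;   /* modelled as an integer so garbage is safe to read */
    uint32_t ref_count;
} FakeInstance;

typedef enum { FILL_ZERO, FILL_POISON } FillMode;

/* Simulate what an allocator leaves behind after a free: either zeroed
 * memory or a 0xaa poison pattern. The storage itself stays alive. */
static void fake_free(FakeInstance *obj, FillMode mode)
{
    assert(obj != NULL);
    memset(obj, mode == FILL_ZERO ? 0x00 : 0xaa, sizeof *obj);
}

typedef enum { CHECK_VALID, CHECK_REJECTED, CHECK_WOULD_CRASH } CheckResult;

/* Model of a type check. `known_class` is the only class address that
 * exists in this demo; any other non-zero value stands for a pointer
 * that a real check would dereference and fault on. */
static CheckResult fake_is_widget(const FakeInstance *obj, uintptr_t known_class)
{
    if (obj->g_class == 0)
        return CHECK_REJECTED;      /* zeroed memory: check fails safely */
    if (obj->g_class != known_class)
        return CHECK_WOULD_CRASH;   /* poisoned memory: garbage class pointer */
    return CHECK_VALID;
}
```

With FILL_ZERO the check is rejected quietly, which matches the no-crash behavior described above. With FILL_POISON the 32-bit ref_count reads back as 0xaaaaaaaa, which is the 2863311530 seen in the GDB output earlier in this bug, and the class pointer is garbage.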
That would explain why export G_DEBUG=gc-friendly prevents the crash. It seems that the reason I have this problem is the --enable-debug flag. According to the glib docs, the default is --enable-debug=minimum on stable releases, as --enable-debug=no is not safe. I compiled GLib with "minimum" and "no", and neither crashes. The PKGBUILD for glib in Arch has used --enable-debug=yes since 2.54.3-1 (Jan 8, 2018, the current version), but it was changed back to minimum for 2.56.0 (8 days ago, not released yet). This seems to be the reason why Arch is crashing but other distros aren't.
PR319 solved the issue. https://github.com/Gnucash/gnucash/pull/319 Thank you very much!
GnuCash bug tracking has moved to a new Bugzilla host. This bug has been copied to https://bugs.gnucash.org/show_bug.cgi?id=794360. Please update any external references or bookmarks.