Bug 779792 - SIGSEGV on autosave_timeout_cb
Status: RESOLVED OBSOLETE
Product: GnuCash
Classification: Other
Component: Engine
Version: 2.6.15
Hardware: Other Mac OS
Importance: Normal normal
Target Milestone: ---
Assigned To: gnucash-core-maint
QA Contact: gnucash-core-maint
Depends on:
Blocks:
 
 
Reported: 2017-03-09 08:33 UTC by andy.apcs
Modified: 2018-06-29 23:55 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Diagnostic report (61.74 KB, text/plain)
2017-03-09 08:33 UTC, andy.apcs

Description andy.apcs 2017-03-09 08:33:36 UTC
Created attachment 347530 [details]
Diagnostic report

Seg fault occurred twice in one data-entry session.

Workflow:

- Business->Vendor->New Bill...
- Choose vendor and date
- Enter line items
- Post Invoice (whilst in View Bill)
- Pay Invoice (whilst in View Bill)
- Close View Bill
- Repeat

The payment account view (MasterCard) tab was open during data entry.

Here's an excerpt from the crash reports:


~/Library/Logs/DiagnosticReports/Gnucash-bin_2017-03-08-150547_ZakBook.crash
----------------------------------------------------------------------------

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x00000000000001d5
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Segmentation fault: 11
Termination Reason:    Namespace SIGNAL, Code 0xb
Terminating Process:   exc handler [0]

VM Regions Near 0x1d5:
--> 
    __TEXT                 0000000000001000-0000000000016000 [   84K] r-x/rwx SM=COW  /Applications/Gnucash 2.6.15-1/Gnucash.app/Contents/MacOS/Gnucash-bin

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libgncmod-gnome-utils.dylib   	0x002d2931 gnc_set_busy_cursor + 97
1   libgncmod-gnome-utils.dylib   	0x0030896a gnc_file_save + 186
2   libgncmod-gnome-utils.dylib   	0x002f6c21 autosave_timeout_cb + 1009


~/Library/Logs/DiagnosticReports/Gnucash-bin_2017-03-08-150547_ZakBook.crash
----------------------------------------------------------------------------

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x0000000000000041
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Segmentation fault: 11
Termination Reason:    Namespace SIGNAL, Code 0xb
Terminating Process:   exc handler [0]

VM Regions Near 0x41:
--> 
    __TEXT                 0000000000001000-0000000000016000 [   84K] r-x/rwx SM=COW  /var/folders/*/Gnucash.app/Contents/MacOS/Gnucash-bin

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libgncmod-gnome-utils.dylib   	0x002d7931 gnc_set_busy_cursor + 97
1   libgncmod-gnome-utils.dylib   	0x0030d96a gnc_file_save + 186
2   libgncmod-gnome-utils.dylib   	0x002fbc21 autosave_timeout_cb + 1009
:

----------------------------------------------------------------------------
Speculation:

Both appear to be null-pointer issues - dereferencing sizeable structure types with offsets:

  0x1d5
  0x041
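
For readers who want to pull the fault addresses out of several crash reports at once, here is a minimal Python sketch. It only reports facts about each address (whether it falls in the unmapped first page, and whether it is pointer-aligned); it does not decide between a null base plus offset and a corrupted pointer. The 4 KiB page size and the 4-byte pointer size (a 32-bit build) are assumptions, and the function names are hypothetical.

```python
import re

# Matches the "Exception Codes" line of a macOS crash report, as quoted above.
FAULT_RE = re.compile(r"KERN_INVALID_ADDRESS at (0x[0-9a-fA-F]+)")

PAGE_SIZE = 0x1000  # assumption: 4 KiB pages, with page zero left unmapped


def fault_addresses(report_text):
    """Return every faulting address found in the report text, as integers."""
    return [int(match.group(1), 16) for match in FAULT_RE.finditer(report_text)]


def describe(addr, pointer_size=4):
    """Summarise one fault address; pointer_size=4 assumes a 32-bit build."""
    return {
        "address": hex(addr),
        "in_page_zero": addr < PAGE_SIZE,
        "pointer_aligned": addr % pointer_size == 0,
    }
```

Calling fault_addresses() on the text of a .crash file and passing each result to describe() gives one summary per crash.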
Comment 1 John Ralls 2017-03-09 15:00:28 UTC
Not null, invalid. The only dereference in gnc_set_busy_cursor is of a node pointer in a linked list of top-level windows, so it would seem that that list is getting corrupted somewhere.

Were there any windows other than the main one open at the time? Tabs don't count, just top-level windows.
Comment 2 andy.apcs 2017-03-09 15:54:53 UTC
From memory, no.

Do these count:
  - New Bill: Business->Vendor->New Bill...
  - Process Payment: View Bill->Pay Invoice

The only other windows I remember using are:
  - Find Bill: Business->Vendor->Find Bill...
  - Find Vendor: Business->Vendor->Find Vendor
    - Find Bill: Find Vendor->Vendor's Bills

I'm pretty sure I closed these windows once I'd located the bill.

At the time of the crashes, I was bashing the workflow in the original post.

There were outstanding payments which I have noticed causing a "reminder" window to appear (but usually get obscured by the main window).

One other (probably unrelated) point: after the first crash I could not restart the app from the Dock, so I used the command line instead.  I just had to do the same to flesh out this post.
Comment 3 John Ralls 2017-03-09 17:04:31 UTC
It's more likely that the invalid node pointer would be caused by closing a window. The mechanism I have in mind is that the memory for the node is freed but the previous node's next pointer isn't NULLed, so the list iteration still walks to the now-invalid node. By then that node's memory has been reused for something else, so *its* next pointer is invalid, leading to the crash.
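
The mechanism described above can be sketched with a toy model. This is Python standing in for C: slot indices play the role of pointers, NIL plays the role of NULL, and an IndexError plays the role of the segfault. ToyHeap, walk and demo are hypothetical names, and none of this is GnuCash or GLib code; the value 0x1D5 is borrowed from the first crash report purely as an example of garbage.

```python
NIL = -1  # stands in for NULL


class ToyHeap:
    """Pool of slots standing in for malloc'd memory (illustrative only)."""

    def __init__(self):
        self.slots = []      # a list node is stored as a (label, next_index) tuple
        self.free_list = []  # indices released by free() and available for reuse

    def alloc(self, value):
        if self.free_list:
            index = self.free_list.pop()   # reuse a freed slot first
            self.slots[index] = value
        else:
            index = len(self.slots)
            self.slots.append(value)
        return index

    def free(self, index):
        self.free_list.append(index)       # old contents are left in place


def walk(heap, head):
    """Follow next-indices from head and return the labels seen."""
    labels = []
    index = head
    while index != NIL:
        label, index = heap.slots[index]
        labels.append(label)
    return labels


def demo(unlink_first):
    heap = ToyHeap()
    tail = heap.alloc(("dialog", NIL))
    head = heap.alloc(("main window", tail))
    if unlink_first:
        heap.slots[head] = ("main window", NIL)  # correct: detach before freeing
    heap.free(tail)
    heap.alloc(("unrelated data", 0x1D5))        # the freed slot gets reused
    try:
        return walk(heap, head)
    except IndexError:
        return "invalid access"
```

With demo(True) the walk stops cleanly at the main window. With demo(False) the stale link leads into the reused slot, whose second field is then followed as if it were a next pointer.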

I'm not familiar enough with the business code to know which of your candidates are dialog boxes and which are windows. I'll ask Geert to have a look, he's far more familiar with that code.

If something pops up behind it's almost surely a dialog box, but it's odd that it would go behind the main window. Usually they go between the main window and the window that they apply to because they haven't had "transient_for" set on them so they get the default, which is the main window.

What happens when you try to restart from the dock? Does it also fail if you launch it from a Finder window?

Oh, and what's with the /var/folders/*/Gnucash.app paths in the second crash report? That's a really strange place to put an application bundle.
Comment 4 andy.apcs 2017-03-09 17:36:37 UTC
Ahh, the Dock issue isn't relevant!  When I upgraded last (a month or more ago) I cleared out older versions and forgot to update the Dock link - so the link was pointing to a deleted version (DOH).

I'm not sure where the /var/folders... come from.  When I add to the dock, the file ~/Library/Preferences/com.apple.dock.plist is updated with a link to:

/private/var/folders/6n/chgbg9gx1nz9_vzr7gjlzc0c0000gn/T/AppTranslocation/3949052A-523B-4888-AFEE-429B6450046A/d/Gnucash.app

I guess OSX generates symlinks to the binaries - for whatever reason.

(The plist is in binary format so needs:
> plutil -convert xml1  ~/Library/Preferences/com.apple.dock.plist
to render it readable).

Anyhow, that's all off-topic now!

I'm guessing all the "windows" I proposed are actually dialog boxes - but I'm sure Geert will know the facts.
Comment 5 John Ralls 2017-03-09 17:42:00 UTC
> I'm guessing all the "windows" I proposed are actually dialog boxes - but I'm sure 
> Geert will know the facts.

He didn't offhand so I'm grepping through them now.
Comment 6 John Ralls 2017-03-09 18:15:12 UTC
Nope. All dialogs. But I'm mistaken about dialogs not being toplevels, so that list could get corrupted by the incorrect destruction of a dialog as well as of a window.

Do you remember which step in the workflow you were in when it crashed?

Is it crashing regularly so that you can easily attach a post-crash gnucash.trace (`sudo find /var/folders -name gnucash.trace` is the easiest way to find it on a Mac)?
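
If running find under sudo is not convenient, a rough Python equivalent of that search is sketched below. The function name is hypothetical. Directories the current user cannot read are skipped silently, so without elevated privileges it may miss files that the sudo find command would locate.

```python
import os


def find_traces(root="/var/folders", name="gnucash.trace"):
    """Return paths of files called `name` under `root`, newest first."""
    matches = []
    # os.walk ignores unreadable directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        if name in filenames:
            matches.append(os.path.join(dirpath, name))
    matches.sort(key=os.path.getmtime, reverse=True)
    return matches
```

The first entry of the returned list is the most recently modified trace, which is the one to copy aside before restarting GnuCash.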
Comment 7 andy.apcs 2017-03-11 07:59:32 UTC
I don't remember it being a specific step - but then it's triggered by a timeout so is asynchronous to what I was doing.  During the second occurrence, I was redoing transactions I'd lost due to the first crash (3-4 cycles of the workflow) so that narrows it down to "New Bill" and "Process Payments".

My original post has identical .crash filenames.  That's incorrect, the two files are:

~/Library/Logs/DiagnosticReports/Gnucash-bin_2017-03-08-143401_ZakBook.crash
~/Library/Logs/DiagnosticReports/Gnucash-bin_2017-03-08-150547_ZakBook.crash

From the .log files:
- 14:33:05 posted a bill invoice
- 14:33:18 posted a bill payment
  crash
- 14:42:12 posted a bill invoice
- 14:42:29 posted a bill payment

I still have the checkpoint and log files so can try to recreate the crash using the data:

*.gnucash.20170308142826.gnucash
*.gnucash.20170308142827.log
*.gnucash.20170308143810.log
*.gnucash.20170308144548.gnucash

I'll find some time over the weekend to give this a shot.
Comment 8 John Ralls 2017-03-11 15:47:00 UTC
The trigger is asynchronous but the state of the toplevel list isn't. It's a list of the toplevels that you have open at the moment the function is called, plus apparently one invalid one.

But I was thinking backwards. The convention for adding items to a list is to *prepend*, so if you're creating and destroying dialogs in a cycle the behavior will be like pushing and popping a stack, and the other end of the list--where the corruption is most likely because of an un-nulled next pointer--isn't going to be affected. That window should be the splash screen or the tip-of-the-day dialog, both of which are created and usually destroyed before the main window. That's not a likely scenario for an intermittent crash.
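
A small Python sketch of that stack-like behaviour, assuming the list really is maintained by prepending (the convention referred to above, not something verified against the GTK source here). The labels and the function name are hypothetical.

```python
from collections import deque


def run_cycles(cycles):
    """Create and destroy one dialog per cycle on a prepend-only list."""
    toplevels = deque()
    toplevels.appendleft("first toplevel")  # created first, so it sits at the tail
    toplevels.appendleft("main window")
    touched = set()                         # list positions where removals happened
    for i in range(cycles):
        name = f"dialog {i}"
        toplevels.appendleft(name)          # a new dialog is prepended
        touched.add(toplevels.index(name))
        toplevels.remove(name)              # and comes off the front when closed
    return list(toplevels), touched
```

However many cycles are run, every removal happens at position 0 and the tail entry is never touched, which is why a repeated open-and-close workflow is an unlikely way to damage the far end of the list.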

That suggests that it's some other heap corruption that happens to be stomping on the list. When you try to recreate the crash, please set the following environment variables (most easily done by running from the command line and prepending them, and yes, they're really camel case):
MallocLogFile=$HOME/crashmalloc.log MallocStackLoggingNoCompact=1 MallocScribble=1 MallocCorruptionAbort=1

After the crash there will be at least one stack log file in /tmp. Please attach it here along with crashmalloc.log.
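
For anyone who would rather script the launch than type the variables each time, here is a minimal Python sketch that builds the environment. The four variable names are the ones given above. The function name is hypothetical, and the launch line in the comment uses a placeholder path that must be replaced with the real location of the GnuCash launcher on your system.

```python
import os


def malloc_debug_env(home, base_env=None):
    """Return a copy of the environment with the malloc debugging variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "MallocLogFile": os.path.join(home, "crashmalloc.log"),
        "MallocStackLoggingNoCompact": "1",
        "MallocScribble": "1",
        "MallocCorruptionAbort": "1",
    })
    return env


# Example launch (placeholder path, an assumption, not taken from this report):
#   import subprocess
#   subprocess.run(["/path/to/Gnucash"],
#                  env=malloc_debug_env(os.path.expanduser("~")))
```

The variable names are case-sensitive, so they are written here exactly as quoted above.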
Comment 9 andy.apcs 2017-03-11 18:29:12 UTC
No luck recreating the crash.

- performed 32 cycles of the workflow
- 7 checkpoint files created

The GUI response was slugged by the additional instrumentation - not sure if that had an effect.

Will have another stab over the weekend if time permits.
Comment 10 andy.apcs 2017-03-13 08:02:23 UTC
Had another crash - but not under instrumented conditions unfortunately.

Crashed Thread:        0  Dispatch queue: com.apple.main-thread

Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
Exception Codes:       KERN_INVALID_ADDRESS at 0x000000000000017d
Exception Note:        EXC_CORPSE_NOTIFY

Termination Signal:    Segmentation fault: 11
Termination Reason:    Namespace SIGNAL, Code 0xb
Terminating Process:   exc handler [0]

VM Regions Near 0x17d:
-->
    __TEXT                 0000000000001000-0000000000016000 [   84K] r-x/rwx SM=COW  /var/folders/*/Gnucash.app/Contents/MacOS/Gnucash-bin

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libgncmod-gnome-utils.dylib         0x002d4931 gnc_set_busy_cursor + 97
1   libgncmod-gnome-utils.dylib         0x0030a96a gnc_file_save + 186
2   libgncmod-gnome-utils.dylib         0x002f8c21 autosave_timeout_cb + 1009
3   libglib-2.0.0.dylib                 0x02eaaebc g_timeout_dispatch + 28
4   libglib-2.0.0.dylib                 0x02eaa4f1 g_main_context_dispatch + 209
5   libglib-2.0.0.dylib                 0x02eac72a g_main_context_iterate + 410
6   libglib-2.0.0.dylib                 0x02ead7b7 g_main_loop_run + 263
7   libgtk-quartz-2.0.0.dylib           0x02674e91 gtk_main + 177
8   libgncmod-gnome-utils.dylib         0x0030fe51 gnc_ui_start_event_loop + 81
9   Gnucash-bin                         0x000145b5 inner_main + 693
10  libguile-2.0.22.dylib               0x023a9cb9 invoke_main_func + 57
11  libguile-2.0.22.dylib               0x0237ddb2 c_body + 18
12  libguile-2.0.22.dylib               0x0241b164 vm_regular_engine + 27060
13  libguile-2.0.22.dylib               0x024131a2 scm_c_vm_run + 114
14  libguile-2.0.22.dylib               0x023872e1 scm_call_4 + 65
15  libguile-2.0.22.dylib               0x0240ea4b scm_catch_with_pre_unwind_handler + 91
16  libguile-2.0.22.dylib               0x0237dd7a scm_i_with_continuation_barrier + 138
17  libguile-2.0.22.dylib               0x0237de5f scm_c_with_continuation_barrier + 79
18  libguile-2.0.22.dylib               0x0240bb4b with_guile_and_parent + 203
19  libgc.1.dylib                       0x0254921f GC_call_with_stack_base + 31
20  libguile-2.0.22.dylib               0x0240bbbb scm_i_with_guile_and_parent + 43
21  libguile-2.0.22.dylib               0x023a9c5a scm_boot_guile + 58
22  Gnucash-bin                         0x0001419d main + 1037
23  Gnucash-bin                         0x000131e6 start + 54


I had just requested a "Reports->Income & Expense->Equity Statement".  No bill invoices or payments.  I do leave the app running - on a laptop that sleeps when not in use - but that wouldn't have influenced the second reported crash (2017-03-08-150547).

Unlikely but...could the open menu dropdown be the culprit?  I was just about to click, or had just clicked, on "Equity Statement" in the drop-down (can't say for sure) when the crash happened.

Will bite the bullet and use instrumented configuration for future use!
Comment 11 John Ralls 2017-03-13 16:23:25 UTC
I leave GnuCash running for weeks at a time with no ill effects, but I don't use the business features and I seldom run reports so I'm not likely to encounter the problem you're having.

On MacOS the menus are delegated to the OS, so not only are they not toplevel windows (which they wouldn't be anyway) they're not even Gtk widgets.

Remember, though, that I don't think that the problem is an incorrect list operation or improper destruction of a window. Rather I think that something else is stomping on the memory allocated for one of the list nodes, corrupting its pointers.

Did you remember to preserve the gnucash.trace from the crash before restarting GnuCash?
Comment 12 andy.apcs 2017-03-14 10:00:48 UTC
DOH!
Comment 13 John Ralls 2018-06-29 23:55:10 UTC
GnuCash bug tracking has moved to a new Bugzilla host. The new URL for this bug is https://bugs.gnucash.org/show_bug.cgi?id=779792. Please continue processing the bug there and please update any external references or bookmarks.