GNOME Bugzilla – Bug 632166
sql backend silently fails and loses data
Last modified: 2018-06-29 22:45:51 UTC
Created attachment 172382 [details] Missing currency in txn 3945ba14b29720dd408c583a31447bbd While testing svn gnucash I tried converting my large and old data file from xml to sql (sqlite, specifically). When reopening the resulting database I noticed that a large number of transactions were completely missing and no error was indicated. Rooting through everything, I discovered that my source xml file had a single transaction (out of >8000 total transactions) that was missing a currency field. This small omission caused the sql backend to fail during gnc_sql_save_commodity() (in save_transaction()). This error aborts the save of all following transactions and returns an error upwards. This error eventually ends up in gnc_sql_sync_all() where it aborts the saving of the rest of the file and then the error was promptly dropped and not reported to the user. This caused my conversion to appear like it worked, but to fail very spectacularly. Attached is a small sample xml file with a missing currency in a transaction. Open it up, run check & repair (it won't fix it), resave it as xml (it won't fix it, but it'll save it), save it as sql and watch it drop the transaction (guid 3945ba14b29720dd408c583a31447bbd) from the resulting sql database without any comment. So, there are two bugs here: First (and most importantly), the loss of the error condition and the apparent success when there was actually significant data loss. Second, better handling of wrong-but-not-wrong input data. I'll open a separate bug for this if desired. Thanks.
Well, hmm. Using the latest SVN on both OSX and Debian Lenny, your test file saves correctly to sqlite -- both transactions are present. I also modified your test by adding by hand another transaction after the supposedly defective one, and that file saved correctly as well. Please update your svn sandbox and test again.
Hmm.. I was able to reproduce this with both an updated svn (trunk@19668) build under Linux and the official Windows 2.3.15 build from gnucash.org with sample file I attached here. What else can we try?
Can you do a debug build and make a stack trace of the failure showing the exact lines in gnc_sql_save_commodity() and gnc_sql_sync_all() (and everything else, of course). With that I might be able to figure out how to force an error so that I can at least make the failure be not silent, and maybe to just skip the errant transaction and keep going.
Yep, I'll start on it right now, but I can tell you exactly what's wrong too: save_transaction() (gnc-transaction-sql.c:690) calls gnc_sql_save_commodity( be, xaccTransGetCurrency( pTx ) ) on line 719. xaccTransGetCurrency(pTx) will return NULL because of the missing currency in the transaction (trans->common_currency is NULL). This'll make the call to gnc_sql_save_commodity() have pCommodity == NULL and that'll trigger the second g_return_val_if_fail(). Returning back to save_transaction(), is_ok is set to FALSE and that should abort any further attempts to save any other data. I'll get you a nice backtrace right now.
Here you go: This is the second transaction (the one missing the currency). Breakpoint 3, save_transaction (be=0xbfa270, pTx=0xa97360, do_save_splits=1) at ../../../../sqlite-convert/src/backend/sql/gnc-transaction-sql.c:719 719 is_ok = gnc_sql_save_commodity( be, xaccTransGetCurrency( pTx ) ); (gdb) s xaccTransGetCurrency (trans=0xa97360) at ../../../sqlite-convert/src/engine/Transaction.c:1086 1086 return trans ? trans->common_currency : NULL; (gdb) p trans->common_currency $6 = (gnc_commodity *) 0x0 (gdb) s 1087 } (gdb) s [Thread 0x7fffd7fff710 (LWP 27646) exited] gnc_sql_save_commodity (be=0xbfa270, pCommodity=0x0) at ../../../../sqlite-convert/src/backend/sql/gnc-commodity-sql.c:248 248 gboolean is_ok = TRUE; (gdb) n 250 g_return_val_if_fail( be != NULL, FALSE ); (gdb) n 251 g_return_val_if_fail( pCommodity != NULL, FALSE ); (gdb) n 259 } (gdb) n save_transaction (be=0xbfa270, pTx=0xa97360, do_save_splits=1) at ../../../../sqlite-convert/src/backend/sql/gnc-transaction-sql.c:722 722 if ( is_ok ) (gdb) p is_ok $7 = 0 (gdb) bt
+ Trace 224192
*** Bug 609583 has been marked as a duplicate of this bug. ***
OK, I set breakpoints and watched it happen. The transaction was in the register, though, until I closed the db and reopened it. That step I hadn't done before, which was why I thought that I couldn't reproduce the problem. I've checked in r19673, which adds the notification part. Contrary to what Don Allen reported in 609583, Check & Repair in 2.2.9 doesn't fix this, so that's not a regression. Next, I'll try to repair the null commodity.
(In reply to comment #7) > OK, I set breakpoints and watched it happen. The transaction was in the > register, though, until I closed the db and reopened it. That step I hadn't > done before, which was why I thought that I couldn't reproduce the problem. > > I've checked in r19673, which adds the notification part. I can confirm that I get a nice new error now instead of silent corruption. Thanks! :-) I'm a little wary of just the error message, however. As you pointed out, the data still appears to be good while it's still in the register. It's only on closing/reopening the DB that you discover the corruption. Until you do that, however, you're attached to the newly corrupted database. If a user were to ignore the error (I know, unthinkable :-) and then proceed to look over the data, find it good (since it's still reporting correctly in the register), enter 100 new transactions and then exit, then they'll be in for a bit of a shock (hopefully), when they next open their database. Maybe I'm being overly paranoid, but .... Regardless, thanks for the error dialog as any notification is better than no notification.
After I checked in r19673, Geert pointed out on the mailing list that it introduces a GUI dependency in the back end. In trying to fix the actual problem, I discovered qof_backend_set_error that takes care of that and doesn't break the string freeze. I've rewritten the alerting to use it in r19680. You do have to find the error in gnucash.trace to know what transaction broke, and it aborts the save. That should handle your paranoia. ;-) It turns out that trying to change a transaction in the middle of a save starts another save, which aborts both saves, so fixing the problem on the fly won't work. That being the case, it must be left up to the user to fix the problem and try again. The right place to fix this is in scrub, so I've created a new bug, https://bugzilla.gnome.org/show_bug.cgi?id=632588, to track fixing scrub.
(In reply to comment #9) > You > do have to find the error in gnucash.trace to know what transaction broke, and > it aborts the save. That should handle your paranoia. ;-) Hmm.. Are you sure about that? I added a new transaction and then attempted to save as a sqlite db. The save failed (as expected) and I got "The file/URL sqlite3://path/to/file does not contain GnuCash data or the data is corrupt". Clicking this away opens up a second "Save" dialog (that's noticeably different than the first (there's no "Data Format" option, for one) while leaving the original "Save As..." dialog still open (but unresponsive) in the background. At this point I have two options: 1. Enter a new filename into the "Save" dialog and I get a new XML file. This is a good thing, but it was very non-obvious. 2. Cancel the "Save" dialog. The "Save As..." dialog also exits and I'm returned to the main window, but the window title indicates both the attempted new filename and a lack of a "*". If you exit gnucash at this point, the application will exit cleanly (no prompting about unsaved data), but the new transaction is truly lost. > It turns out that trying to change a transaction in the middle of a save > starts another save, which aborts both saves, so fixing the problem on the > fly won't work. That being the case, it must be left up to the user to fix > the problem and try again. > > The right place to fix this is in scrub, so I've created a new bug, > https://bugzilla.gnome.org/show_bug.cgi?id=632588, to track fixing scrub. I agree, that's completely reasonable.
Hmm. I didn't test that far, no. When I run the test you described I find that the db is created after all, and that the new transaction is in it, but the one with the missing currency isn't. That won't do.
Well! Several bugs in one. I think it's working in a more or less sane way now, except for still needing a better explanation of what went wrong in the message dialog instead of forcing the user to dig through the logs. (r19682, r19687, and r19688 all apply to this ticket in addition to the ones noted above.)
Looks good. I was unable to trick it into losing data, lying to me or doing anything non-intuitive. :-) Thanks a lot for the fix!
GnuCash bug tracking has moved to a new Bugzilla host. This bug has been copied to https://bugs.gnucash.org/show_bug.cgi?id=632166. Please update any external references or bookmarks.