After an evaluation, GNOME has moved from Bugzilla to GitLab. Learn more about GitLab.
No new issues can be reported in GNOME Bugzilla anymore.
To report an issue in a GNOME project, go to GNOME GitLab.
Do not go to GNOME Gitlab for: Bluefish, Doxygen, GnuCash, GStreamer, java-gnome, LDTP, NetworkManager, Tomboy.
Bug 770303 - Incorrect commodity match during import OFX and possible solution
Incorrect commodity match during import OFX and possible solution
Status: RESOLVED FIXED
Product: GnuCash
Classification: Other
Component: Import - OFX
2.6.13
Other Windows
: Normal normal
: ---
Assigned To: gnucash-import-maint
gnucash-import-maint
Depends on:
Blocks:
 
 
Reported: 2016-08-23 22:29 UTC by Geoff
Modified: 2018-06-29 23:50 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
Much abbreviated .gnucash with example commodity and account definitions (4.80 KB, application/x-gnucash)
2016-08-23 22:29 UTC, Geoff
Details
OFX file with SELLSTOCK investment transaction which triggers the problem (4.23 KB, text/ofx)
2016-08-23 22:34 UTC, Geoff
Details
Gnucash trace file relating to the problem (2.34 MB, text/plain)
2016-08-23 22:35 UTC, Geoff
Details
Snipped Gnucash tracefile, focussing on the lines most relevant to the problem (6.92 KB, text/plain)
2016-08-23 22:37 UTC, Geoff
Details
Attempt to code a replacement import-commodity-matcher (untested as yet) (5.90 KB, text/plain)
2016-08-23 22:42 UTC, Geoff
Details

Description Geoff 2016-08-23 22:29:26 UTC
Created attachment 334035 [details]
Much abbreviated .gnucash with example commodity and account definitions

Hi,

First, many thanks to all who created and maintain Gnucash - **much** appreciated.

I am several months into a project to convert 10+ years of history from MSMoney into Gnucash, and have reached the stage where I am importing (relatively) recent investment transactions via OFX. I am encountering a problem with the wrong commodity being "matched" by the import matcher.  As a consequence Gnucash can't find the corresponding stock account, and so asks me to nominate or create an account.  If I select the correct stock account, it is rejected (because it has a different commodity! aka the correct commodity not the wrongly matched one) and the only "solution" from within the Gnucash gui I've found is to create a new stock account, then clean everything up later.  Painful and error prone since it can happen to several different stocks in a single import, and I still have several years worth of imports to do.

I do however *think* I have tracked down the line of C code causing the problem with the help of the debug trace files.  (I say think because I haven't yet managed to compile Gnucash from source on Windows yet, despite several days of reading and following the Gnucash-on-Windows instructions as closely as I can - I will get back to that after posting this report - in the meantime, I am hoping that an "official" patch and compiled version? might become available).

The scenario I am encountering has been reported before, see Bug - 648523 - "OFX import account match failure", but as far as I can ascertain, it is still open.

In more detail.
==============
My gnucash file holds two commodities, one with CUSIP AGK and one with CUSIP AGKRA.  There are two different STOCK accounts, corresponding to each of these commodities. See attached much abbreviated .gnucash file).  

I have an OFX investment transaction (a SELLSTOCK) that pertains to the shorter of these two CUSIPS (namely to AGK). (I will attach the OFX file when I figure out how supply more than one attachment).  

The trace file (again will attach as soon as I can) shows that this is being matched not to the AGK commodity, but to the that AGKRA commodity.

* 11:33:08 DEBUG <gnc.import> [gnc_file_ofx_import] Filename found: C:\Users\Geoff\Documents\MSMoney data files\MSMoney_Data_Programs\Downloads for OFX\Encircle\Encircle OFX\20160819_20160819_160720 TS 141001_141031_pdf2txt__statement_33001594.OFX
* 11:33:08 DEBUG <gnc.import> [gnc_file_ofx_import] Opening selected file
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Default fullname received: AGL ENERGY - ORDINARY - (AGK)
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Default mnemonic received: AGK
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Looking for commodity with exchange_code: AGK

....

* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Looking at namespace ASX
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Looking at commodity ANZ TD 4.500 3.0 Mths - (ANZTD450030M)
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Looking at commodity CWLTH BANK - CN 3M PER Q RD T-24 - (CBAPD)
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Looking at commodity Anz Bank - Pref Share Cps1 - (ANZPB)
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Looking at commodity ANZ BANK - CTG PRE 3M PER T - (ANZPA)
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Looking at commodity Commonwealth Bank - Perp.ExcPERLS V - (CBAPA)
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Looking at commodity Westpac Coupon Select Deposit - (WCSD)
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Looking at commodity Austrust cash deposit fund (Tower) - ()
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Looking at commodity RIO TINTO - ORDINARY - (RIO)
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Looking at commodity Westfield Grp Stapled Securities - (WDC)
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Looking at commodity Westfield America $3 Cap Notes - ()
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Looking at commodity Westfield Holding Ltd Ord Fully Paid - (WSF)
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Looking at commodity AGL ENERGY - RTS 08-SEP-14 FORUS - (AGKRA)
* 11:33:08 DEBUG <gnc.import> [gnc_import_select_commodity] Commodity AGL ENERGY - RTS 08-SEP-14 FORUS - (AGKRA) matches.

The remainder of the problematic behaviour arises from this incorrect match.

The match is done by code in src/import-export/import-commodity-matcher.c

The segment of code which reads:

            if (gnc_commodity_get_cusip(tmp_commodity) != NULL &&
                    cusip != NULL &&
                    strncmp(gnc_commodity_get_cusip(tmp_commodity), cusip, strlen(cusip)) == 0)
            {
                retval = tmp_commodity;
                DEBUG("Commodity %s%s", gnc_commodity_get_fullname(retval), " matches.");
            }

is what determines whether or not a match has been found.  In this code "cusip" is the string from the OFX file (in my example AGK), and (I presume) "gnc-commodity_get_cusip(tmp_commodity)" is the CUSIP of the candidate commodity in the current iteration of the search loop.  The bug occurs if this loop reaches the AGKRA commodity *before* it reaches the correct AGK commodity, because the strncmp test only looks at "strlen(cusip)" characters, ie only at 3 characters in my scenario.  If instead it looked at all the characters (ie at "max(strlen(cusip), strlen(gnc_commodity_get_cusip(tmp_commodity))" , the incorrect match would not occur. 

It seems this problem could occur whenever the cusip of one commodity is a leading substring of the cusip of another commodity - but apparently intermittently, depending on which one is encountered first in the search loop. I have not been able to determine the ordering of the internal table through which Gnucash is searching.

(PS From reading I believe that C does not have a "max" function, so please consider that last as pseudo-code).

I hope that explanation is clear (and enough to help a patch be developed).  I'm not very familiar at all with C but have tried to code a corrected version of import-commodity-matcher.c to check my reading of the code.  However, as mentioned earlier, I am still struggling to get the 'out-of-the-GitHub-box' Gnucash code to compile on my Windows machine, so as yet have not been able to try out my patched version.

Once again, thanks to all who look after Gnucash.

Regards

Geoff
Comment 1 Geoff 2016-08-23 22:34:13 UTC
Created attachment 334036 [details]
OFX file with SELLSTOCK investment transaction which triggers the problem

I have attached the (close to minimal) OFX file which triggers the bug.
Comment 2 Geoff 2016-08-23 22:35:58 UTC
Created attachment 334037 [details]
Gnucash trace file relating to the problem

The full Gnucash tracefile from when the problem occurred
Comment 3 Geoff 2016-08-23 22:37:35 UTC
Created attachment 334038 [details]
Snipped Gnucash tracefile, focussing on the lines most relevant to the problem

Just a cutdown version of the tracefile to help me focus on what was going wrong more easily
Comment 4 Geoff 2016-08-23 22:42:13 UTC
Created attachment 334040 [details]
Attempt to code a replacement import-commodity-matcher (untested as yet)

This C code is untested as yet (still working on getting compile on Windows working).  I'm not a C coder, so there may be both trivial syntax errors, and very awkward and unstylistic code included.  But it's not very long, and may help clarify anything I've explained to poorly (or too verbosely) in the bug report.
Comment 5 Geoff 2016-08-23 22:46:34 UTC
Comment on attachment 334035 [details]
Much abbreviated .gnucash with example commodity and account definitions

I have not attached the full (unzipped).gnucash file since it is quite large, and also contains much personal financial data.  I have seen the Obfuscatescript, but (as far as I know) I do not have Perl installed on my Windows machine, so cannot (easily) use it.  IF a full copy of the .gnucash file is needed to explore the bug, let me know and I will try and provide one.
Comment 6 John Ralls 2016-08-23 23:20:26 UTC
Thanks for the detailed bug report. You've provided more than enough information to work with (as long as the .gnucash file isn't mangled), so there's no need for you to keep banging your head against the C wall.

I'm setting up to test your example files now and if your diagnosis is correct I can fix it with a few minutes work.
Comment 7 John Ralls 2016-08-24 00:31:28 UTC
Unfortunately the much.abbreviated.unzipped.gnucash was a bit too abbreviated; it won't load. Please email me your zipped file from before the example import at jralls at ceridwen dot us. I will treat it as confidential and destroy it as soon as I'm finished verifying that the bug is fixed.
Comment 8 Geoff 2016-08-24 01:37:36 UTC
Hi John

Just emailed you a zipped gnucash file as requested.  Hope the fix works

Geoff

(In reply to John Ralls from comment #7)
> Unfortunately the much.abbreviated.unzipped.gnucash was a bit too
> abbreviated; it won't load. Please email me your zipped file from before the
> example import at jralls at ceridwen dot us. I will treat it as confidential
> and destroy it as soon as I'm finished verifying that the bug is fixed.
Comment 9 John Ralls 2016-08-25 18:05:21 UTC
That worked, your patch--or at least the idea of your patch--fixed the problem, pushed as https://github.com/Gnucash/gnucash/commit/ea3862499b992efed5f24d0ff1be9881a81aa00d. I credited you as the author because you nailed the problem and solution even if you couldn't quite express it in C.

The change will be in the next release, less than a month away.

Thanks!
Comment 10 Geoff 2016-08-25 22:04:57 UTC
Thanks John,

Much appreciate your willingness to look into this so quickly, and fix it.

As you probably deduced, I don't really speak C (in fact I can barely read it - can just manage to piece my way through small segments of code with the help of a "google" dictionary, and some very helpful debug trace messages that told me where to look).  

For the record, I did make progress on trying to compile from source on Windows.  I needed to :

a)untangle and synchronise the various git clones on my local disk, so that I was actually compiling code from the clone and branch I thought I was (I used custom.sh to point at the required branch, once I worked my way through what install.sh was doing), and 

b)crucially, discovered about "make distclean" and how to run it to clean up earlier failed attempts to compile gnucash from source in a windows environment.  Not really surprising that, after my first failed attempts, later attempts to compile my branch off the maint branch were still not happy given they were picking up parts of compiled code from the master branch!

I had got as far as discovering that my "solution" has embarrassingly basic C errors in almost every line I wrote :-( - and was still reading up on how to resolve them. Good for my soul no doubt in the long run.

Your code is much nicer and more elegant.

So ... once again, many thanks indeed.

Looking forward eagerly to the next release

Geoff

PS I will use what I have learnt about compiling on windows to download your patched version, and see if I can get that working as an interim way forward - should be fun now I have an inkling about what I'm attempting to do :-)
Comment 11 John Ralls 2016-08-25 22:12:04 UTC
There should also be a nightly build tomorrow at http://code.gnucash.org/builds/win32/maint/ with the change.
Comment 12 Geoff 2016-08-25 22:26:17 UTC
Brilliant!

Muchly appreciated :-))
Comment 13 John Ralls 2018-06-29 23:50:45 UTC
GnuCash bug tracking has moved to a new Bugzilla host. This bug has been copied to https://bugs.gnucash.org/show_bug.cgi?id=770303. Please update any external references or bookmarks.