GNOME Bugzilla – Bug 363648
COMM_FAILURE error on Windows2000
Last modified: 2008-12-16 19:16:58 UTC
This was encountered during our GnuCash port to Windows, http://wiki.gnucash.org/wiki/Windows . GnuCash uses GConf for its application settings, which in turn uses (AFAIK) ORBit2 for its client-server communication. In GnuCash, when GConf is being used on Windows2000/SP4, contacting the GConf server *always* fails with error messages like the following:

Failed to save key /apps/gnucash/window/pages/account_tree/name_visible: Adding client to server's list failed, CORBA error: IDL:omg.org/CORBA/COMM_FAILURE:1.0

This occurs *only* on some Windows systems. The problem has been reported on Windows2000/SP4, but no such problems have been reported from tests on Windows XP. I thought maybe the ORBit2 team has some ideas on how to track this down?

For testing, I took this Windows2000 machine, compiled ORBit2-2.14.3 from source, and ran "make check" there, which failed in test/everything right after

...
Testing DerivedServer ...
Testing CORBA_Object_non_existent ...
Testing Async invocations ...

with the failed assertion "client.c:1982 (testAsync): assertion failed: (ev->_major == CORBA_NO_EXCEPTION)". The error message tells me "Test failed with params: --ORBIIOPIPv4=1 --ORBIIOPUSock=0 --ORBCorbaloc=1; if this is an IPv4 test, can you ping stimmfix?" where "stimmfix" is my hostname, but running "ping stimmfix" shows the replies just fine.

Attached you will find the screenshot of the failed assertion, the stdout/stderr log of "make check", and my config.log. The results of make check for ORBit2-2.14.3 and ORBit2-2.14.2 were absolutely identical, so it seems to me this is unrelated to bug#354950. Don't hesitate to ask for any further information that I should give. Thanks a lot.
Created attachment 75079 [details] Screenshot of failed assertion during make check
Created attachment 75081 [details] stdout/stderr log of make check
Created attachment 75082 [details] config.log of compiling ORBit2-2.14.3 from source
According to http://lists.gnucash.org/pipermail/gnucash-devel/2006-October/018699.html this problem occurs on some WinXP systems too (please correct me if these are two separate issues). It does not occur on mine, though.
I can only guess that this has something to do with the mapping of names, name service, and whatnot. That part of the ORBit2 code is quite confusing. 2.14.3 is known not to work on Windows, so don't bother with that. Use 2.14.2 or CVS HEAD. Your best way forward is to use gdb and step through the relevant code, checking what actually happens.
The error occurs with 2.14.2 just as well; in fact, the original GnuCash COMM_FAILURE has been tested with the 2.14.2 binary from ftp://ftp.gnome.org/pub/gnome/platform/2.16/2.16.0/win32/ . The above "make check" failure, as already mentioned, is identical for both 2.14.2 and 2.14.3. When you say "use gdb", do you mean to use gdb on one of the test programs in the ORBit2 package? If yes, which one?
Created attachment 75090 [details] Gdb output when stepping into the function that fails during make check
This is the detailed gdb step-by-step execution of the test case that fails during "make check", ORBit2-2.14.2. I've omitted the introduction part and went straight to the interesting function call, link_connection_do_initiate() from linc2/src/linc-connection.c. The hostname seems to be looked up just fine, but eventually the select() call at linc-connection.c:593 returns because some exception has occurred in the except_fds. Then, errno = WSAECONNREFUSED and everything returns with the COMM_FAILURE exception. Does that help anyone?
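Editorial note on the pattern seen in the gdb trace above: a non-blocking connect() followed by select() reports a refused connection differently per platform. Winsock signals the failure through the exception set, which matches the except_fds observation, while POSIX sockets mark the descriptor writable and expose the error through SO_ERROR. The following is a minimal POSIX sketch of that pattern, not the linc2 code; try_connect_ipv4 and listen_on_loopback are hypothetical helper names introduced only for illustration.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of linc2): start a non-blocking IPv4
 * connect, wait with select(), and return 0 on success or an errno
 * value such as ECONNREFUSED on failure. */
int try_connect_ipv4(const char *ip, unsigned short port, int timeout_ms)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return EINVAL;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;
    int flags = fcntl(fd, F_GETFL, 0);
    fcntl(fd, F_SETFL, flags | O_NONBLOCK);

    int err = 0;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) != 0) {
        if (errno != EINPROGRESS) {
            err = errno;              /* failed immediately */
        } else {
            fd_set wfds, efds;
            FD_ZERO(&wfds);
            FD_SET(fd, &wfds);
            FD_ZERO(&efds);           /* Winsock would report failure here */
            FD_SET(fd, &efds);
            struct timeval tv;
            tv.tv_sec = timeout_ms / 1000;
            tv.tv_usec = (timeout_ms % 1000) * 1000;
            int n = select(fd + 1, NULL, &wfds, &efds, &tv);
            if (n == 0) {
                err = ETIMEDOUT;
            } else if (n < 0) {
                err = errno;
            } else {
                /* POSIX: the outcome of the connect is stored in SO_ERROR */
                socklen_t len = sizeof err;
                if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) != 0)
                    err = errno;
            }
        }
    }
    close(fd);
    return err;
}

/* Hypothetical helper: listen on 127.0.0.1 with a kernel-chosen port.
 * Returns the listening descriptor, or -1 on failure. */
int listen_on_loopback(unsigned short *port_out)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    socklen_t len = sizeof addr;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        listen(fd, 1) != 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) != 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return fd;
}
```

Calling try_connect_ipv4 against a port with no listener on the target address should yield ECONNREFUSED, the POSIX counterpart of the WSAECONNREFUSED seen in the trace.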
Have you verified that the port you are trying to connect to really is in a listening state? (Using netstat -a, for instance.) Could some firewall software be blocking the connection?
netstat -a says that all these ports are listening just fine, so that doesn't seem to be the problem. As another data point, someone gave gnucash/gconf a try with ORBit2-2.13.3 and to our surprise it doesn't show this error. I'll attach the stdout/stderr from make check with 2.13.3, where the above error (of 2.14.2) does *not* show up; instead, all tests in test/everything/ succeed just fine.
Created attachment 75656 [details] stdout/stderr log of "make check" of ORBit2-2.13.3
I was finally able to see this bug on my PC a few times, and it turned out that applying http://cvs.gnome.org/viewcvs/ORBit2/linc2/src/linc-connection.c?r1=1.115&r2=1.117 seems to solve the issue for me. So maybe this bug is really a duplicate of #354950. I cannot explain why this happened to Christian with 2.14.2, though. Tor, could you build ORBit2 CVS HEAD zip files for us and all others experiencing this problem?
I really don't want to build and distribute CVS snapshot binaries unless absolutely necessary. A new official ORBit2 release (2.14.4) would be great, I think. Kjartan or somebody...? If 2.14.2 doesn't work for Christian, how can we be sure that HEAD would work any better?
Applying the patch from comment#11 to ORBit2-2.14.3 doesn't change the failure; the "make check" still doesn't work for me.
I received this error this week and had a chance to interact with the system. The problem, when it happens, affects all applications that use GConf to communicate their configuration. On systems on which it does not happen, the applications communicate their configuration flawlessly. Affected applications include GnuCash, all of gnome-games, gedit, planner, gnumeric. All of them work fine when this error does not take place.

CORBA error: IDL:omg.org/CORBA/COMM_FAILURE:1.0

While I was there, knowing that there was a problem with 2.14.0, I turned on GCONF_DEBUG_OUTPUT to see what was amiss. The trace is shown below. My interpretation of this trace is that gconfd (the server) could not establish a lock on the IOR file and exited more or less gracefully. It is, therefore, no surprise that the client (gnome-games/mahjongg in this case) gconf code could not establish a connection to the server. Reference: gconf/gconfd.c:794 and gconf/gconf-internals.c:2492

Downgrading to ORBit2-2.13.3 caused the symptom to go away, although, having reviewed the differences between 2.13.3 and 2.14.0, I cannot understand why. It is possibly a coincidence, pointing to a race condition and not a problem in ORBit2 at all.
starting (version 2.14.0), pid 3992 user 'XXXX YYYY'
Adding source `xml:readonly:/mingw/etc/gconf/gconf.xml.mandatory'
Adding source `xml:readwrite:C:/msys/1.0/home/XXXX YYYY/.gconf'
Adding source `xml:readonly:/mingw/etc/gconf/gconf.xml.defaults'
Initializing Markup backend module
Directory/file permissions for XML source at root /mingw/etc/gconf/gconf.xml.mandatory are: 777/666
Directory/file permissions for XML source at root C:/msys/1.0/home/XXXX YYYY/.gconf are: 700/600
Directory/file permissions for XML source at root /mingw/etc/gconf/gconf.xml.defaults are: 777/666
Resolved address "xml:readonly:/mingw/etc/gconf/gconf.xml.mandatory" to a read-only configuration source at position 0
Resolved address "xml:readwrite:C:/msys/1.0/home/XXXX YYYY/.gconf" to a writable configuration source at position 1
Resolved address "xml:readonly:/mingw/etc/gconf/gconf.xml.defaults" to a read-only configuration source at position 2
starting (version 2.14.0), pid 3172 user 'XXXX YYYY'
Failed to get lock for daemon, exiting: Failed to remove 'C:/DOCUME~1/XXXX~1/LOCALS~1/Temp\gconfd-XXXX YYYY/lock/ior': Invalid argument
Performing periodic cleanup, expiring cache cruft
GConf server is not in use, shutting down.
Unloading text markup backend module.
Error releasing lockfile: Failed to remove lock directory `C:/DOCUME~1/XXXX~1/LOCALS~1/Temp\gconfd-XXXX YYYY/lock': Directory not empty
Exiting
Good news: With ORBit2-2.14.4 and 2.14.5 this error does *not* occur anymore. "make check" runs successfully. Congrats on this solution.
Oops. Bad news again. comment#15 was wrong; with ORBit2-2.14.4 this error does still occur unchanged when running "make check" (and running gnucash also).
Does that mean 2.14.5 also has the same problem?
It's still there in 2.14.7. I first came across this error using the Evolution 2.8.2-2 .msi from http://shellter.sourceforge.net/evolution/ If I replace the libraries with the 2.13.3 ones it works fine, if I replace them with the 2.14.7 ones it fails again.
FYI, I see all those failures if I make the ior file read-only.
*** Bug 472578 has been marked as a duplicate of this bug. ***
Perhaps an external IP requirement I've noticed is related to this error. My Windows firewall (ZoneAlarm Pro) alerts me that both the GnuCash 2.1.2 and 2.2.1 installer packages for Windows try to contact an external IP as one of the last steps of their installation on systems without previous GnuCash installations. If the connection fails, as when I tell the firewall to disallow the connection, the installation almost immediately generates the message: "Adding client to server's list failed, CORBA error: IDL:omg.org/CORBA/COMM_FAILURE:1.0"

On the other hand, if I allow the external IP connection, the installation proceeds fine, the error goes away, and GnuCash works great. I assume the error would also occur if an Internet connection was missing for other reasons, as perhaps in a lab and/or when virus protection is turned off. The IPs that have been trapped during external connection attempts so far are 68.142.91.87, 221.239.63.33, and 72.14.207.99.

I have not been able to duplicate the error on a machine that has already had GnuCash installed - perhaps the config files are already set. Perhaps a more thorough deinstall would work, but so far I've had to go to new machines. I now doubt it is a Windows "feature" or virus, since that would be unlikely to target something as specific as GConf settings.

Proposition: GConf and/or ORBit2 do not have self-contained installations, and are configured to contact an external server, without which default configuration setting files are not set and the application becomes unusable?
Created attachment 94929 [details] ZoneAlarm firewall traps GnuCash external IP access
Created attachment 94930 [details] GnuCash config error "Adding client to server's list failed"
Just another data point: Windows XP SP2. I gave up on Gnucash 2.2.1 -- it always froze on startup. But with Gnucash 2.2.0 I can't run it unless I'm actually online -- it dies early in the startup process (before the splash screen). If I'm online it generally works fine, but will often crash out if I lose my Internet connection during a session.
> Proposition: GConf and/or ORBit2 do not have self-contained installations,
> and are configured to contact an external server, without which
> default configuration setting files are not set and the application
> becomes unusable?
With 99.9% certainty there is nothing in the (official) GConf or ORBit2 source code that would do anything like this. And even if there was, it's hard to see why it would contact the addresses you mention (see below). It might be some unexpected side-effect of something.

Are the connections that ZoneAlarm traps always to the DNS server port (53) of random (?) IP addresses? Have you checked in more detail what these DNS requests are, and what information they attempt to look up? Are they truly sensible looking DNS queries? Or just random junk? (If ZoneAlarm itself doesn't provide such decoding functionality, use Ethereal. It rocks.)

The 72.14.207.99 address seems to be a machine called eh-in-f99.google.com (and this isn't a fake, as looking up that name also returns that address). 221.239.63.33 doesn't have any reverse entry in DNS, but based on traceroute it seems to be located in China. 68.142.91.87 is cds192.lga.llnw.net; llnw.net seems to be some internet hosting company or ISP. Could the addresses just truly be random?
I agree this is unusual, and don't know the answers to your questions. However, the key point, and the reason the GnuCash team should care, is this: the GnuCash installation does not succeed without this external IP access - configuration settings become inoperable.
Who says we don't care? We do. But it's not necessarily something that is under our control.
I'm sure you care. However, so far the general reaction to my bug report has been that it is sufficiently strange it either must not be happening or cannot be serious. This is unlikely to result in any useful resolution. I've documented the problem well enough with screen shots of the error messages attached below to clearly indicate a real problem that should not be happening. Specifically my concern is that no-one has figured out why the external call occurs, and whether it is an innocuous coding mistake or a very, very bad thing. I respectfully suggest that someone on the team with more skill than this user should narrow the call down to a line of code, conduct the appropriate analysis, and provide the appropriate fix or feedback to those in charge of GConf or ORBit2 if that is where the problem is.
> I've documented the problem well enough with screen shots
Sure, but what about the questions I asked in comment #25? Please note that I am not part of a "GnuCash team".
My reply in comment 26 says that this user doesn't know the answers to your questions. However, I do know that GConf or ORBit2 definitely looks implicated in this bug. If the external IP contact is not allowed, the subsequent GnuCash error states that "An error occurred while loading or saving configuration information for GnuCash. Some of your configuration settings may not work properly." And indeed, GnuCash is installed but cannot be configured and therefore is unusable. With 99.99% certainty there is something in GConf or ORBit2 that is causing this.
Well, it seems like we're stalling then. I certainly am not *that* deeply interested in this issue that I would start poking around by random without more information, and no further useful information seems to be forthcoming. Over and out.
This is not my issue either - I'm just trying to do the right thing as a user of open source, and report it. If there is somewhere else better to report this, please let me know. Your comment about no useful information is kind of needless. I've diagnosed the issue three different times, provided the exact error messages, identified specific IP addresses, and uploaded screenshots. This is more than enough data that any competent software engineer should be able to track this down. It isn't exactly a subtle, intermittent buffer overflow - an IP access during software installation is kind of in your face, and obviously something that should not be happening. However, at the end of the day, I can see that further banging of forehead against wall is not likely to be productive. I tried, and now give up. Over and out.
The problem is that of the two developers who do any win32 work (and really, it's only one, and he's a full-time student), neither can reproduce the problem on his system. So having screenshots isn't useful. Now, if you could provide a remote-access account into your box so that the developer in question (not me) can see the problem first-hand, now THAT would be useful. Unfortunately screenshots don't help: we KNOW what the error message looks like. We just can't tell WHEN in the process it happens. So unless some developer can instrument gnucash/gconf or run it in a debugger WHILE IT'S HAPPENING, there's not much to do from our end. All the descriptions of the problem don't really help us. Can you provide an environment where it fails for the use of a developer?
Thanks for the info. Feedback:

1. The problem can only be reproduced with (a) a firewall that requests permission for outgoing connections, since most trap only incoming ones, and (b) a Windows box that has not previously had GnuCash installed, since once the outgoing connection is allowed, it does not happen again even if you deinstall and reinstall GnuCash.
2. One possible exception to the above: I encountered it again when I installed a newer version of GnuCash, so version upgrades might repeat the problem on the same machine.
3. With the above specified test environment, the problem should be reproducible every time. I've encountered it on my two Windows boxes, very predictably. As long as you deny the IP access, the problem can be repeated again and again on the same box.
4. I can't provide a remote environment, since I've already installed on my two Windows boxes, so the problem won't repeat. Not much is gained by watching it happen; the screenshots I provided are the visual indicators.
5. The IP access happens at the very end of the GnuCash installation. The error message attached below is all about configuration settings, and that is what does not work when you then fire up the application.
6. It is possible that this is not Windows specific, especially if it is within the ORBit2 or GConf code. Many people do not have firewalls that trap outgoing connections, so would not notice it. Possibly the problem can be recreated on a Linux box, if GnuCash has not previously been installed. Possibly just installing GConf or ORBit2 alone would generate the IP access.

Hope this helps. Please let me know if I can provide any other information or clarification.
A couple of thoughts:

1. Does the error message text provide something specific enough to be searchable to narrow this down? Here is the error text from the screen shot: "An error occurred while loading or saving configuration information for GnuCash. Some of your configuration settings may not work properly." Is it possible to use automated tools to quickly search the installation package and find out where that text is, and why it is displayed?
2. If this is Windows specific, I guess the questions are why the Windows installation:
   (a) requires DNS lookups to random (?) IP addresses?
   (b) operates such that without success the configuration settings are not "initialized"?
I debugged this problem and found a very easy repro on my computer (Windows XP - I have multiple network interfaces with different DHCP IPs on them):

1) Bring all interfaces down (disable them)
2) You should have ONLY 127.0.0.1 available
3) Run tcpview.exe (from sysinternals.com)
4) Run gnucash

Now you can observe gconfd listening on some sockets, and connections being established to this socket using localhost as the destination hostname. All should work fine, and there should be no problem at all!

5) Bring a network interface UP (enable it); wait until it acquires an IP address
6) Run gnucash

Now you can observe gconfd listening on some socket, and connections sending SYN to this socket using the computer's hostname as the destination. I have no firewall enabled! And I was able to telnet directly to the listening port from a terminal (cmd), so the port is listening.

What is happening (the detailed bug description):
- gconfd is listening ONLY on localhost (127.0.0.1)
- the client is connecting to the FIRST IP address of the interface (in my case 192.168.0.109), on which gconfd is NOT listening!!!

I assume that gconfd works correctly, because it listens on localhost (127.0.0.1). And it listens on localhost no matter whether I have another connection UP or DOWN, but... the difference is in the client, which connects to localhost if the interface is DOWN, but tries to connect to the interface's IP address if the interface is UP.

What I propose: Let the GNOME developers take a look at the way in which the destination IP address (or hostname) is acquired. I propose to make it work in exactly the same way in which gconfd picks the IP address to listen on.

Send me emails if you need assistance with this.
Regards,
Michal
Of course I am talking about client implementation. Because it is the client that has different behavior in these situations.
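Editorial note: the mismatch described in the two comments above can be reproduced in miniature with plain sockets. A listener bound to one specific address only accepts connections addressed to that address, so a client that picks any other address for the same port is refused. The sketch below is not ORBit2 or GConf code; listen_on and can_connect are hypothetical helpers, and it assumes a Linux-like system where all of 127.0.0.0/8 is routed to the loopback interface, so that 127.0.0.2 can stand in for "some other local address".

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: listen on exactly one IPv4 address, letting the
 * kernel choose the port. Returns the descriptor, or -1 on failure. */
int listen_on(const char *ip, unsigned short *port_out)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = 0;
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    socklen_t len = sizeof addr;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) != 0 ||
        listen(fd, 4) != 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) != 0) {
        close(fd);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return fd;
}

/* Hypothetical helper: returns 1 if a blocking TCP connect to ip:port
 * succeeds, 0 otherwise (for example when the connection is refused). */
int can_connect(const char *ip, unsigned short port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return 0;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return 0;
    int ok = connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0;
    close(fd);
    return ok;
}
```

With a listener created by listen_on("127.0.0.1", &port), can_connect("127.0.0.1", port) is expected to succeed, while can_connect("127.0.0.2", port) is expected to fail under the stated assumption, mirroring a client that dials the interface address while the server listens on localhost only.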
Probably the bug is here:

corba-orb.c (function ORBit_ORB_start_servers):

because:

    #ifdef G_OS_WIN32
    static gboolean orbit_local_only = TRUE;
    #else
    static gboolean orbit_local_only = FALSE;
    #endif

on WIN32 the 'orbit_local_only' is set to TRUE and then later it gives LINK_NET_ID_IS_LOCAL as argument to the 'link_use_local_hostname' function:

    if (orbit_local_only ||
        (orbit_use_usocks && !(orbit_use_ipv4 || orbit_use_ipv6 || orbit_use_irda || orbit_use_ssl)))
            link_use_local_hostname (LINK_NET_ID_IS_LOCAL);

The function 'link_use_local_hostname' sets 'use_local_host', which is later used as an argument to the function 'get_netid' in the function 'link_get_local_hostname'. Later, 'get_netid' for LINK_NET_ID_IS_LOCAL returns:

    if (LINK_NET_ID_IS_LOCAL == which)
            return strncpy(buf, "localhost", len);

so the server ALWAYS starts to listen on 127.0.0.1 (localhost).

But the difference is here: the client makes its connection while iterating via the 'profile_list' orb member. 'profile_list' is filled in the 'IOP_start_profiles' function (iop-profiles.c), where the default 'host' value is set to:

    iiop->host = g_strdup (serv->local_host_info);

which ends up in the same function 'link_get_local_hostname', which uses 'use_local_host', which is set by default to:

    linc-protocols.c: static LinkNetIdType use_local_host = LINK_NET_ID_IS_FQDN;

so it gets the IP address in a completely different way. And this is probably the cause of the trouble on Win32 machines.

Unfortunately this is only from static code analysis and I do not have a compiler with me to confirm it, but if somebody checks this and corrects it, I can test the binary he/she provides. Let me know if you have any questions.
Thank you,
Michal
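Editorial note: the hypothesis in the comment above boils down to two code paths choosing a host string under different modes. The toy model below only illustrates that reasoning; NetIdMode, pick_host and same_endpoint are hypothetical names, and the real linc2 enum and functions are more involved than this.

```c
#include <string.h>

/* Hypothetical, simplified stand-in for the two modes named in the
 * comment above (LOCAL versus FQDN). */
typedef enum { NET_ID_LOCAL, NET_ID_FQDN } NetIdMode;

/* In LOCAL mode the host is always "localhost"; in FQDN mode it is
 * whatever name the machine reports for itself. */
const char *pick_host(NetIdMode mode, const char *machine_fqdn)
{
    return mode == NET_ID_LOCAL ? "localhost" : machine_fqdn;
}

/* Returns 1 when the listening side and the advertised profile end up
 * with the same host string, 0 when they diverge. */
int same_endpoint(NetIdMode listen_mode, NetIdMode advertise_mode,
                  const char *machine_fqdn)
{
    return strcmp(pick_host(listen_mode, machine_fqdn),
                  pick_host(advertise_mode, machine_fqdn)) == 0;
}
```

Under this model, same_endpoint(NET_ID_LOCAL, NET_ID_FQDN, "stimmfix") reports a divergence, which is the inconsistency the comment suspects; whether the real code behaves this way is exactly what the comment asks someone with a compiler to confirm.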
(In reply to comment #36) sorry, mistake... it was not XP it was Windows Vista, but the rest is correct
(In reply to comment #38)
> Probably the bug is here:
>
> corba-orb.c (function ORBit_ORB_start_servers):
>
> because:
> #ifdef G_OS_WIN32
> static gboolean orbit_local_only = TRUE;
> #else
> static gboolean orbit_local_only = FALSE;
> #endif
>
> on WIN32 the 'orbit_local_only' is set to TRUE

See this ChangeLog entry:

2006-06-23 Tor Lillqvist <tml@novell.com>

    * src/orb/orb-core/corba-orb.c: Set orbit_local_only to TRUE on Win32, to go with the use of IPv4 on Win32. We don't want to create world-contactable sockets by default. Note that for orbit_local_only to actually work, a small fix to the linc2 code was also needed, see the ChangeLog over there.

You can change its value by using "--ORBLocalOnly=0" as an ORB command line option.
Hi Tor, I agree with this step to listen on localhost by default on Win32. What I am talking about is the default client behavior when making a connection to the ORB (by default it gets the server IP address in a different way, ending up with a different address than 127.0.0.1, and on Windows Vista this breaks and it cannot connect to CORBA). To GnuCash developers: Can you check and maybe "hardcode" the destination hostname to be 127.0.0.1? I think that will also make it work (just like changing the ORBit client to connect to localhost by default on Windows machines). Some discussion needed...
I'm not Tor ;-) I just wanted to point out that it would be easy to check if orbit_local_only is the culprit as it can be set to FALSE by a simple command line parameter.
Unfortunately I am using gconfd-2.exe from GnuCash and it seems that this option given as an argument has no effect and I cannot connect to other than 127.0.0.1 IP addresses. Any other suggestions how to make it listen on other IPs?
I have just downloaded gconfd-2 with other stuff from http://ftp.gnome.org/pub/gnome/platform/2.14/2.14.2/win32/ and this option does not work! gconfd-2.exe is still listening ONLY on localhost. Any suggestions?
(In reply to comment #42)
> I'm not Tor ;-)
Oops... Sorry Jules, I was replying too fast :)
Listening only on localhost is a good approach for Windows applications. I think that the problem is with the client, which on Windows tries to connect to an IP address other than 127.0.0.1 (taken from the network interface), and the connection cannot be made because the server is listening on localhost only.
Basically some update on this bug and the way it crashed on Vista:

I attached to my WiFi network and got IP 192.168.1.101:

    Wireless LAN adapter Wireless Network Connection:
       Connection-specific DNS Suffix  . : naramowice.com
       Link-local IPv6 Address . . . . . : fe80::d0a3:d6db:cbae:cf68%10
       IPv4 Address. . . . . . . . . . . : 192.168.1.101
       Subnet Mask . . . . . . . . . . . : 255.255.255.0
       Default Gateway . . . . . . . . . : 192.168.1.1

After starting gnucash I saw that gconfd started to listen on 192.168.1.101:<some port> and ONLY on this IP (from netstat -an):

    TCP 192.168.1.101:52140 0.0.0.0:0 LISTENING

But... GnuCash was starting to connect to:

    TCP 192.168.1.101:52141 naramowicka.com:52140 SYN_SENT

which was the address from "Wireless LAN adapter Wireless Network Connection". When I changed c:\windows\system32\drivers\etc\hosts and added:

    192.168.1.101 naramowice.com

everything started to work :)

So the problem is in the way in which ORBit2 gets the hostname to connect to. It gets it from (linc-protocols.c):

    static LinkNetIdType use_local_host = LINK_NET_ID_IS_FQDN;

whereas the server on Win32 gets the IP to listen on from LINK_NET_ID_IS_LOCAL, because orbit_local_only set to TRUE changes this 'use_local_host' to 'LINK_NET_ID_IS_LOCAL'.

So the real bug is the difference between the way in which the IP of the server is established by the server and the way in which the IP of the server (in order to make a connection) is established by the client. It is inconsistent! So it crashes. And this is probably the reason why it crashes "randomly" on different Windows versions (I assume that on some versions the machine is assigned a hostname that reflects the local IP, and on others it gets a hostname that points to a different IP, like the problem from the ZoneAlarm screenshot).

The solution is simple: just make it acquire the IP address in a different way (or at least a way that is consistent for both server and client).

Regards,
Michal Zygmunt
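Editorial note: the hosts-file workaround above works because it changes what the hostname resolves to. A quick way to check for this kind of mismatch is to resolve the name the client would use and compare the results against the address the server is listening on. The sketch below uses the standard getaddrinfo() call on a POSIX system; host_resolves_to is a hypothetical helper name, and this is not how ORBit2 itself performs the lookup.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 1 if any IPv4 address that 'host'
 * resolves to equals 'listen_ip', 0 if none does, and -1 if
 * 'listen_ip' cannot be parsed or 'host' cannot be resolved. */
int host_resolves_to(const char *host, const char *listen_ip)
{
    struct in_addr want;
    if (inet_pton(AF_INET, listen_ip, &want) != 1)
        return -1;

    struct addrinfo hints;
    struct addrinfo *res = NULL;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* IPv4 only, as in the report */
    hints.ai_socktype = SOCK_STREAM;
    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;

    int match = 0;
    for (struct addrinfo *p = res; p != NULL; p = p->ai_next) {
        const struct sockaddr_in *sin =
            (const struct sockaddr_in *)p->ai_addr;
        if (sin->sin_addr.s_addr == want.s_addr) {
            match = 1;
            break;
        }
    }
    freeaddrinfo(res);
    return match;
}
```

For the situation in the comment, one would call host_resolves_to with the name shown in the netstat output and the address gconfd is listening on; a result of 0 indicates the kind of inconsistency that the hosts-file entry papers over.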
Is there anybody who can create the appropriate patch and prepare bits for test?
In ORBit2 svn trunk it is possible to do something like "--ORBNetID=192.168.1.101" to force a server to expose its objects on a specific IP/NIC. I've tried to CC Tor at Novell as he does the win32 ORBit2 stuff, but bugzilla won't let me. I suspect your best "fix" is to use the hosts file as you did. But nothing should crash just because it can't get a reference to an object. I'd say this belongs with the gnucash folks, not in ORBit2 land.
I'm already on the Cc list and following this discussion... (as tml@iki.fi, the address I have registered as in b.g.o). Once I have time I will look into this again and see if I can find out anything useful.
Aha, OK - I suspected it was you :-)
Hi Jules, It does not 'crash'; it just cannot connect, and we get this COMM_FAILURE error. The "--ORBNetID=192.168.1.101" in the server does not make any difference as long as the client acquires the address that it uses to connect in a different way (using the hostname, which on Windows machines does not have to point to localhost because of the /very often/ fake hostnames assigned by DHCP servers). If the server is listening ONLY on localhost on Windows, then the client should (in my opinion) also try to connect to localhost instead of acquiring an IP address based on the hostname. On Windows we can easily get all the IP addresses assigned to interfaces instead of doing a DNS query to find the IP based on a (very often bad) hostname.
*** Bug 484930 has been marked as a duplicate of this bug. ***
fwiw just to clarify the problem I had (comment 21) was during the windows installation, asking for access to an IP at the last step of installation. When blocked by my firewall the error message is displayed, although the installation then proceeds to finish and does not actually "crash". After you start the application, the next sign of real trouble is not until you change configuration settings (currency etc.) and find that changes cannot be saved, making the application unusable for long term use, but not an actual "crash" either. Wrt IP accesses at application start-up, after installation, at least on my system whether you allow or block the requests the results don't interfere with use of the application, the way blocked access during installation does as mentioned above.
WMS: can you check if the IPs that the application is trying to connect to can be acquired based on your computer's hostname or hostname it received from DHCP server?
I think if the next revision is installed on my other Windows machine the problem will repeat, so will try. Specifically, I can duplicate the error under a variety of conditions, such as after reboot and various Internet activity, and obtain the IP's. Can you then tell me how to determine if the IP's found "can be acquired based on your computer's hostname or hostname it received from DHCP server?" I don't know if it makes a difference, but note my computers don't use DHCP but rather have hard-coded IP's because they are "on" the Internet, not behind external firewalls - on board software firewall only :-O
(In reply to comment #33) Would it help to give you a VMWare VM image for a Win2K SP4 system that exhibits the problem?
(In reply to comment #57) I've got a Win2K SP4 system, where the app works with the stock ORBit2-2.14.0, and the app fails with the stock ORBit2-2.14.2. I built ORBit2-2.14.0, and the app continues to work with the DLLs from my 2.14.0 build. I built 2.14.2, and the build succeeded, but "make test" gets as far as "Testing Async invocations .." and then it shows a dialog box that says, "** ERROR **: file client.c line 1982 (testAsync): assertion failed: (ev->_major == CORBA_NO_EXCEPTION) aborting..." Is this expected with Win2K SP4? If it is not expected, could it be at the root of the problem?
> Is this expected with Win2K SP4? If it is not expected, could it be at the > root of the problem? No, it is not expected that ORBit2 on Win2K would behave significantly differently than on XP. As I have tried to say previously, until somebody who actually sees the problem(s) being discussed here is also able to debug them, just guessing isn't going to help much. About the VMware image, yes, that might be helpful. I can promise I will at least have a look. No promise *when*, though.
Found it. As I mentioned above, I'm running Win2K SP4. I did a local build of 2.14.0, and it passed its tests. I dropped the 3 libORBit DLLs into GnuCash, and GnuCash started without problems. I then did a local build of 2.14.2, and it failed in client.c on line 1982. I then changed line 52 in "ORBit2-2.14.2\src\orb\orb-core\corba-orb.c" from "static gboolean orbit_local_only = TRUE;" to "static gboolean orbit_local_only = FALSE;" and rebuilt and retested. The test completed without error. I dropped the 3 libORBit DLLs into GnuCash, and GnuCash started without problems. I then repeated the 2.14.2 process with the source for 2.14.10, with precisely the same results. The line I changed is surrounded with a "#ifdef G_OS_WIN32". In 2.14.2, this #ifdef was added. The comment says that a corresponding change to linc2 was necessary. I don't find a change in the linc2 ChangeLog that mentions orbit_local_only, but there is a note from Tor Lillqvist about changing the processing of localhost. At any rate, changing to orbit_local_only for Win32 clearly breaks the tests and breaks GnuCash on Win2K SP4. (I like Win2K because when run in a VM, and carried on a USB drive from machine to machine, Win2K doesn't require re-activation.)
Hi Kevin, 2 requests: 1) Can you (instead of changing orbit_local_only) change use_local_host to LINK_NET_ID_IS_LOCAL (linc-protocols.c) and check whether that also works? It would be definitive proof of my previous research: that the problem lies in the different IP/host acquisition procedures used by client and server, which should be corrected and made consistent. Please look below: >And it gets it from (linc-protocols.c): > >static LinkNetIdType use_local_host = LINK_NET_ID_IS_FQDN; > >When server on Win32 is getting IP to listen to from LINK_NET_ID_IS_LOCAL, >because orbit_local_only set to TRUE changes this 'use_local_host' to >'LINK_NET_ID_IS_LOCAL'. 2) Can you send me the bits that work for you? I would like to test them with my gnucash on Vista. Thanks, Michal
Tor, Jules, If Kevin's tests will pass will you be able to make appropriate changes in the code to the main branch of the orbit and schedule them for the next release?
(In reply to comment #62) > Tor, Jules, > > If Kevin's tests will pass will you be able to make appropriate changes in the > code to the main branch of the orbit and schedule them for the next release? Tor is the Windows guy here. I would rather not step on his toes ;-)
*** Bug 490692 has been marked as a duplicate of this bug. ***
*** Bug 476435 has been marked as a duplicate of this bug. ***
*** Bug 477502 has been marked as a duplicate of this bug. ***
*** Bug 483004 has been marked as a duplicate of this bug. ***
I have just tested the bits that Kevin sent me, and they seem to solve both the Win Server and the Win Vista problems with orbit on gnucash. So it looks like we have managed to locate the problem and have at least one solution that makes it work. Now that we know what the problem was, it will be much easier to fix it in the main orbit release. Thanks, Michal
*** Bug 491053 has been marked as a duplicate of this bug. ***
Created attachment 98030 [details] Fix created by Kevin Orbit bits that according to Comments: #38, #46, #47, #61
If you mean the change in comment #60 (setting orbit_local_only to FALSE on Win32, too), what we need is an *analysis* of *why* this change solves the problem. (And exactly *which* problem, the discussion here in this bug report is so confusing...) And anyway, if all you want is to toggle that local-only option, surely it is simpler to just set the ORBLocalOnly option to 0 in an orbitrc file, or pass it on the command line, than to have private patches to the upstream ORBit2 and build and distribute "competing" builds of the DLLs? Search the web for exact syntax of the orbitrc file or the corresponding command-line options... Is all the stuff mentioned in this bug even the one same problem at all? My suspicion is that the stuff ZoneAlarm reports mentioned in bug #21 is something completely different. Setting orbit_local_only to FALSE means that ORBit2 listens on not only the loopback interface (localhost, i.e. 127.0.0.1) but also on the externally visible IP address of a machine, doesn't it? Isn't this precisely something one wants to avoid for security reasons? Won't this change then mean that ZoneAlarm will start warning about the program listening on random sockets? (At least the XP firewall will warn about it, won't it?) After all, security reasons is exactly why I set orbit_local_only to TRUE on Windows in the first place. (On Unix ORBit2 by default doesn't even listen on any TCP/IP socket, but only on a Unix domain socket, which is local by definition. Note that orbit_use_ipv4 is set to FALSE on Unix.) My strong belief is that we do *not* want to set orbit_local_only to FALSE. If using local sockets indeed makes ORBit2's "make check" fail on some machines (it works for me...), the root cause for that should be found, and a proper fix done instead.
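For readers unfamiliar with the trade-off described above, here is a minimal sketch of what "local only" means at the socket level. It is plain POSIX C, not ORBit2 code; `make_listen_addr` is a hypothetical helper, and the wildcard branch is a generic illustration rather than a claim about how ORBit2 chooses its non-local address.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper, not part of ORBit2: build the address a TCP
 * server would hand to bind(). With local_only set, only clients on
 * the same machine can reach the socket; with the wildcard address,
 * the socket is reachable on every interface, which is what personal
 * firewalls tend to warn about. */
static struct sockaddr_in
make_listen_addr (int local_only, unsigned short port)
{
	struct sockaddr_in addr;

	memset (&addr, 0, sizeof addr);
	addr.sin_family = AF_INET;
	addr.sin_port = htons (port);
	addr.sin_addr.s_addr = htonl (local_only ? INADDR_LOOPBACK
	                                         : INADDR_ANY);
	return addr;
}
```

Passing port 0 asks the operating system to pick a free port, which is why the port numbers in firewall prompts vary from run to run.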
> the stuff ZoneAlarm reports mentioned in bug #21 Er, comment #21.
Hi Tor, As I mentioned in comments #38 and #47, the problem is not orbit_local_only itself but the DIFFERENT WAYS in which the IP address is acquired on Windows by the client and by the server. Having orbit_local_only set to TRUE by default on Windows triggers it (not directly, but it changes the method by which one of the components, no matter whether client or server, acquires its IP, so that it sometimes ends up with a completely different address than the other component). That breaks connection establishment and causes this bug. So the problem is the inconsistent behavior of client and server in the way they acquire the IP. The problem with ZoneAlarm is probably the same: the client is still getting the IP via FQDN, which can differ and may even point to a hostname that, when resolved via DNS, gives a completely different IP than the local one (on which the server, because of orbit_local_only, is listening). The client then behaves as if it were trying to connect to some "remote" site, because it acquired the server's IP in a different way (inconsistent with the way the server chose the IP to listen on)! Changing orbit_local_only is only one way of making it CONSISTENT, and the inconsistency is the real problem! I completely agree that orbit should not listen on a global IP address. I just want to say that the IP address MUST BE ACQUIRED in a consistent way, or at least the FQDN method must be changed, because it is NOT A FULLY VALID way of acquiring the LOCAL IP on Windows! So the few sentences above should be the REAL SOLUTION to the problem! P.S. I published Kevin's patch because it is one way of making things consistent until a further decision is taken on what should be changed to make it work consistently (probably FQDN should be the main thing of interest!). I do understand developers who want to get their job done properly (because I am one of them), but I also understand ordinary people who are STILL submitting bugs that get marked as duplicates of this one. As you can see, making it work is IMPORTANT! That is why I published a "temporary" solution to this problem, NOT a patch with the source change. P.S.2. Most people who use orbit2 on Windows use it with the gnucash application, and this app does not accept arguments directed at orbit; it uses the orbit2 library directly. That is also why the rc file does not work (I have already tested it, with no success). To summarize: I agree that this is not a solution, but I do not agree with what you said, that "we do not understand the problem". The fact is that the problem is FULLY understood, and I have already debugged it. So now that we finally have a detailed description of why (after almost a YEAR!!!), can we not just fix it and prepare the next release of orbit2 for Windows?
> So now when we finally have a detailed description why > (after almost a YEAR!!!) we cannot just fix it With "just fix it", you mean set orbit_local_only to FALSE? Is this really the best we can do based on "FULL understanding" (your words) of what happens?
Should this code snippet in linc-protocols.c: if (LINK_NET_ID_IS_LOCAL == which) return strncpy(buf, "localhost", len); be changed to use "127.0.0.1" on Windows instead of "localhost"?
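As an illustration of the change being asked about, here is a minimal, self-contained sketch. It is not the actual linc-protocols.c code: `NetIdType` and `sketch_get_netid` are hypothetical stand-ins, and only the local case is modelled. It is a sketch of the proposal, not a verified fix.

```c
#include <string.h>

/* Stand-in for the LinkNetIdType values named in this report. */
typedef enum {
	NET_ID_IS_LOCAL,
	NET_ID_IS_FQDN
} NetIdType;

/* Hypothetical, simplified variant of the quoted snippet: the local
 * case yields a numeric address, so nothing has to be resolved later.
 * Unlike a bare strncpy(), the result is always NUL-terminated. */
static const char *
sketch_get_netid (NetIdType which, char *buf, size_t len)
{
	if (len == 0)
		return NULL;

	if (which == NET_ID_IS_LOCAL) {
		strncpy (buf, "127.0.0.1", len - 1);
		buf[len - 1] = '\0';
		return buf;
	}

	/* Hostname and FQDN cases are out of scope for this sketch. */
	return NULL;
}
```

The explicit terminator matters because strncpy() leaves the buffer unterminated when the source string does not fit.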
(In reply to comment #74) > > So now when we finally have a detailed description why > > (after almost a YEAR!!!) we cannot just fix it > With "just fix it", you mean set orbit_local_only to FALSE? Is this really the > best we can do based on "FULL understanding" (your words) of what happens? Fixing it is not about setting that to FALSE, but about changing the rules by which the FQDN method acquires the IP. Its expected behavior seems to have been to return the local IP based on the hostname. That works very well on Linux, where the hostname is very often also listed in /etc/hosts and resolves to 127.0.0.1. On Windows machines, however, the hostname is not directly associated with 127.0.0.1, so the computer does a DNS query to get the IP and gets some address (very often different from the local one!), and this is what caused the trouble here. So the trouble is in using FQDN (as currently implemented) to resolve the local IP, which is NOT a correct way of getting the local IP on Windows! So this is the bug! The second bug is this question: when the binaries are compiled for Windows (both server and client), why does the server listen on the IP it gets from LINK_NET_ID_IS_LOCAL while the client, trying to connect to that same server, uses LINK_NET_ID_IS_FQDN? From my point of view this should also be corrected, because as we can now see, using inconsistent "algorithms" (in general) for server and clients may lead to unpredictable behavior. It is just like sending data encrypted with AES and trying to decrypt it with Blowfish. Maybe for some specific combinations of keys and data it works, but for most it does not! So I can see two bugs here:
- FQDN should be repaired (or at least NOT used to request the local IP: if you want to connect to localhost, just get the IP the same way LINK_NET_ID_IS_LOCAL does). What I mean is that we cannot assume that on Windows the hostname "MUST" be associated with the local IP just like on Linux. There is a difference between making things compatible and merely thinking that they are compatible.
- The second: the inconsistency in how the IP address is acquired (not as important a bug, and it can be considered "by design" if we wanted the client to be able to connect to a remote host using this method, but to connect to localhost it SHOULD NOT take the hostname but take the IP in the same way as LINK_NET_ID_IS_LOCAL).
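The mismatch argued above can be modelled in a few lines. This is a toy simulation, not ORBit2 code: `IdMode`, `HostInfo`, `acquire_ip` and `can_connect` are hypothetical names, and the resolver is replaced by a fixed table so the sketch has no network dependency.

```c
#include <string.h>

typedef enum { ID_LOCAL, ID_FQDN } IdMode;

/* Hypothetical stand-in for the system resolver: what the machine's
 * own hostname resolves to on this particular system. */
typedef struct {
	const char *hostname;
	const char *hostname_ip;
} HostInfo;

/* The local mode always yields loopback; the FQDN mode yields whatever
 * the hostname happens to resolve to. */
static const char *
acquire_ip (IdMode mode, const HostInfo *host)
{
	return mode == ID_LOCAL ? "127.0.0.1" : host->hostname_ip;
}

/* In this model a connection succeeds only if the client targets the
 * very address the server is listening on. */
static int
can_connect (IdMode server_mode, IdMode client_mode, const HostInfo *host)
{
	return strcmp (acquire_ip (server_mode, host),
	               acquire_ip (client_mode, host)) == 0;
}
```

With a host table where the hostname maps to 127.0.0.1 (the /etc/hosts situation described for Linux), a local-mode server and an FQDN-mode client still meet. With a table where the hostname maps to an interface address such as 192.0.2.10 (an address chosen purely for illustration), the same pairing fails, while local/local succeeds on both.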
One intention in my Win32 changes to ORBit2 has been that no host names, fully qualified or not, should be used in the local-only case, but only 127.0.0.1. If there is still something in the ORBit2 code that causes it to use a name for the local host (even "localhost" should be avoided in my opinion), that should be fixed. (See for instance comment #75.) If there is something in GnuCash or GConfd that prevents using and/or understanding "127.0.0.1" in place of a hostname, that should be fixed too.
And now the interesting thing: The question really is WHO is calling connect with arguments that cause the call to the FQDN IP query? Is it directly from gnucash? Then it is a gnucash bug, not an orbit one! If gnucash only calls some "high level" function of orbit, and orbit is the place where this call takes place, then orbit is "guilty of a crime" :) "The game is simple, but who is rolling the dice?" Because how the bug works (or I should say why it crashes) is clear.
I'm pretty sure that gnucash just calls gconf APIs. From reading the logs it sounds like it's the orbit client (or gconf) that's using FQDN on Windows to connect to the server. I believe this is "linc.c"? See comment #38 which seems to imply that it's using FQDN on the client, which doesn't match the server.
I don't have an opinion on "the solution," but I can add a little more info. 1. Making the change suggested in comment 75 (127.0.0.1 instead of localhost) does not fix the issue on my machine. 2. Removing all network cards, so that my machine knows ONLY the loopback adapter DOES make the issue go away. Note that simply unplugging the Ethernet cable is not enough to vanish the problem. That lends some credence to the argument that it is a problem of address assignment. I've been approaching the problem from a black-box perspective. I found out about disabling orbit_local_only by applying half of the updates between .0 and .2, testing, and progressively halving the code base until I found one line that would make my problem go away. (The power of the binary search.) I'll start working to learn enough of the code to start white-box troubleshooting. Given the observation that removing the NIC removes the symptom, can anyone suggest specific .c files I should start with?
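The halving procedure described above is a binary search over an ordered list of changes. A minimal sketch follows, assuming that zero changes applied works, all changes applied fails, and the failure persists once introduced. `first_bad_change` is a hypothetical helper; `breaks_from` is a demonstration predicate standing in for "apply the first n changes, rebuild, and run the test".

```c
#include <stddef.h>

/* Returns nonzero if the tree with the first n changes applied fails. */
typedef int (*BrokenFn) (size_t n_applied, void *user_data);

/* Find the 1-based index of the first change that introduces the
 * failure, using O(log total) rebuild-and-test cycles. */
static size_t
first_bad_change (size_t total, BrokenFn is_broken, void *user_data)
{
	size_t good = 0;      /* known good: this many changes applied */
	size_t bad = total;   /* known bad:  this many changes applied */

	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (is_broken (mid, user_data))
			bad = mid;
		else
			good = mid;
	}
	return bad;
}

/* Demonstration predicate: user_data points at the culprit's index. */
static int
breaks_from (size_t n_applied, void *user_data)
{
	return n_applied >= *(size_t *) user_data;
}
```

For 100 candidate changes this needs at most seven test cycles, which is what makes the approach practical even when each cycle is a full rebuild.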
Doesn't it have to be in ORBit (and not GConf or GnuCash) if the ORBit "make check" fails?
> if the ORBit "make check" fails? It doesn't in all cases. Not for me, for instance.
I wrote: > One intention in my Win32 changes to ORBit2 has been that no host names, > fully qualified or not, should be used in the local-only case, > but only 127.0.0.1. I added debugging printout before all calls to gethostby*() and gethostname() in linc2, and it turns out that ORBit2 still does some rather unnecessary gethostbyname("localhost") and gethostbyaddr("127.0.0.1") calls. I will try to trace down all these (using ORBit2's "make check" and remove them.) Jules, BTW, do you recall why timeout-server passes an explicit "--ORBLocalOnly=0" to CORBA_ORB_init()? Is it just so that the non-local-only case also gets some testing?
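The kind of instrumentation mentioned above can be sketched as a thin wrapper that reports the call site before forwarding to the resolver. This is an illustration for POSIX systems, not the debugging code actually added to linc2; `traced_gethostbyname`, `TRACED_GETHOSTBYNAME` and `lookup_count` are hypothetical names.

```c
#include <stdio.h>
#include <netdb.h>

/* Number of lookups seen so far, handy for a summary at exit. */
static unsigned long lookup_count = 0;

/* Hypothetical wrapper: log every name lookup together with the
 * source location that requested it, then forward the call. */
static struct hostent *
traced_gethostbyname (const char *name, const char *file, int line)
{
	lookup_count++;
	fprintf (stderr, "%s:%d: gethostbyname(\"%s\")\n", file, line, name);
	return gethostbyname (name);
}

/* Use in place of gethostbyname() to capture the caller's location. */
#define TRACED_GETHOSTBYNAME(name) \
	traced_gethostbyname ((name), __FILE__, __LINE__)
```

Routing every call through such a macro turns "some code does a lookup somewhere" into a list of file and line numbers on stderr.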
Created attachment 98051 [details] [review] tentative patch This patch probably helps somewhat for the non-local case in that it makes ORBit2 use just IP addresses, no names. It should reduce the number of DNS lookups significantly. The patch changes the default use_local_host on Win32 to be LINK_NET_ID_IS_IPADDR. It also changes get_netid() to return the numeric "127.0.0.1" in the LINK_NET_ID_IS_LOCAL case. Most importantly (I think), it changes link_protocol_get_sockinfo_ipv4() to always return the numeric IP address instead of doing any gethostbyaddr() etc lookups at all on Win32. Still, the basic problem remains that if one starts an ORBit2 server that binds to a non-loopback address, it binds to *one* of the machine's addresses, not INADDR_ANY. This is a cross-platform problem as far as I can see. If that address happens to be a relatively ephemeral one and the address changes or the whole interface goes away while the ORB is still running, the ORB cannot be contacted any longer. I think it would be best if non-local servers would bind to INADDR_ANY (0.0.0.0). There is one problem with using INADDR_ANY, though. The "hostname" of an ORB is used both to determine what address to bind the ORB to, and what to advertise for clients in IOR strings. Ideally, if a server needs to be reachable from other machines, it should bind to INADDR_ANY. The IOR advertised to clients should then be different for local and remote clients: Local clients should connect to INADDR_LOOPBACK which of course is more or less guaranteed to work always, while ideally each remote client should be given a fresh IOR that contains an actual IP address valid at that time, or a hostname that is most likely to stay valid. Or something like that. While pondering some more intrusive redesign like the above, maybe I should add some heuristics to the Win32 code for LINK_NET_ID_IS_IPADDR in get_netid() to prune out addresses on interfaces that are ephemeral or less usable in some way. Hmm.
Not that I have any idea how to do that...
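The idea of reporting a numeric address instead of doing a reverse lookup can be sketched as follows. This is a POSIX illustration, not the patch itself; `numeric_peer_name` is a hypothetical helper, and on older Windows versions a different formatting call such as inet_ntoa() would be needed.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: describe an IPv4 endpoint by its dotted-quad
 * address only. inet_ntop() is pure formatting; unlike gethostbyaddr()
 * it never consults DNS, so it cannot block on a slow resolver or
 * return a name that later resolves to a different address. */
static const char *
numeric_peer_name (const struct sockaddr_in *sa, char *buf, size_t len)
{
	return inet_ntop (AF_INET, &sa->sin_addr, buf, (socklen_t) len);
}
```

A buffer of INET_ADDRSTRLEN bytes is always large enough for an IPv4 address in this form.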
I confirm that the patch applied to ORBit2-2.14.10, passes "make check" on my machine. It also cures the GnuCash startup problem where GC was unable to connect to the config server. This is on a Win2K SP4 machine with 3 NICs plus loopback, where it previously died 100%. Thanks Tor!
(In reply to comment #83) > I wrote: > > One intention in my Win32 changes to ORBit2 has been that no host names, > > fully qualified or not, should be used in the local-only case, > > but only 127.0.0.1. > > I added debugging printout before all calls to gethostby*() and gethostname() > in linc2, and it turns out that ORBit2 still does some rather unnecessary > gethostbyname("localhost") and gethostbyaddr("127.0.0.1") calls. I will try to > trace down all these (using ORBit2's "make check" and remove them.) > > Jules, BTW, do you recall why timeout-server passes an explicit > "--ORBLocalOnly=0" to CORBA_ORB_init()? Is it just so that the non-local-only > case also gets some testing? No. GIOP timeouts is only supported on IPv[4.6] so we need to force its hand so to speak.
(In reply to comment #84) > Ideally, if a server needs to be reachable from > other machines, it should bind to INADDR_ANY. The IOR advertised to clients > should then be different for local and remote clients: Local clients should > connect to INADDR_LOOPBACK which of course is more or less guaranteed to work > always, while ideally each remote client should be given a fresh IOR that > contains an actual IP address valid at that time, or a hostname that is most > likely to stay valid. Or something like that. Just to add to your thoughts... I recently made a checkin to svn that implemented support for something like this: --ORBNetID=192.168.2.45 This is halfway endpoint support. The IP address above is the one on which the local CORBA object will be exposed to non-local clients and it is that one which will be integrated into the IOR. So it is now possible to bind the object to any acceptable NIC by using orbitrc or a command line option.
*** Bug 466260 has been marked as a duplicate of this bug. ***
Tor, in case you are confident enough with your patch, do you think it is reasonable to commit it and have some Tor-built ORBit2 packages within a few weeks? I ask, because the next GnuCash version really should not suffer from this problem and otherwise I will need to sit down and compile my own DLLs :-D
I will have a fresh look at the patch once more, and if I don't see anything very wrong, then commit. I would prefer not to provide Windows binaries of SVN snapshots, though, so it would be nice if Kjartan or somebody could do a source release then after that, and then I will provide a binary of that release.
Just send me a mail when you want a tarball released :-)
*** Bug 495018 has been marked as a duplicate of this bug. ***
*** Bug 469464 has been marked as a duplicate of this bug. ***
*** Bug 449153 has been marked as a duplicate of this bug. ***
I have been experiencing this same error on Windows Vista & Server 2003/TS installations of GnuCash. Changing the version of ORBit that is installed still did not resolve the issue. I used the binaries from the download site.
*** Bug 498375 has been marked as a duplicate of this bug. ***
(In reply to comment #90) @Tor and @Kjartan: Gnucash would really really love to see this committed ASAP and have a tarball released with this fix included, also ASAP. Actually we've been deferring our latest release for months by now solely because we are waiting for this single bugfix. Could you please try to finish this rather soon? Thanks a lot.
Created attachment 101018 [details] 2.4.10, applied tentative patch from comment 84 These are ORBit2 DLLs for the stock ORBit2 sources, patched with attachment 98051 [details] [review], configured without parameters and built against a recent gnome stack. Please tell me whether they work for you and can be used for a new GnuCash release. Thanks.
These worked well for me on my Vista machine, where the ORBit2 DLLs included with the 2.2.1 release did not.
These worked for me as well.
*** Bug 504249 has been marked as a duplicate of this bug. ***
*** Bug 504262 has been marked as a duplicate of this bug. ***
*** Bug 504319 has been marked as a duplicate of this bug. ***
Yes, this is working now for me too. I can finally use Gnucash again after more than three months. Thanks!
gnucash 2.2.2 on windows was released with dlls from comment #98. Unfortunately we are still receiving reports about DNS lookups with some users, maybe there's an extra DNS lookup somewhere (sorry, I know that's not very specific). Here's a report from our mailing list: "...It starts by requesting access to 127.0.0.1:Port 1734 (the port number varies) which I allow. gconfd-2.exe then requests access to 127.0.0.1:Port 1742 (the port number varies) which I also allow but then gnucash-bin.exe requests access to 81.103.221.14:53 (this port seems to stay the same) which I am denying. This ip address belongs to NTL which is my ISP. On further investigation this is the ip address of my ISP's pop email server???"
> gnucash-bin.exe requests access to 81.103.221.14:53 (this port > seems to stay the same) > this is the ip address of my ISP's pop email server???" It might be a POP server, but 53 is the DNS port, so it probably also is a DNS server, and its address is maybe set up by DHCP when the PC boots. Try running ipconfig /all and check what it says about DNS servers. It would be useful to run ethereal on machines where unwanted DNS lookups are taking place to find out what kind of lookups they are, and perhaps from that then be able to deduce what code is doing the lookup. The best would obviously be to get a backtrace when the lookup happens to see exactly what code does it, but that is not likely unless somebody who builds the software him/herself with debugging information goes chasing the issue. Anyway, I am now (finally) going to review the patch once more and then commit it.
Patch committed:

2008-01-21  Tor Lillqvist  <tml@novell.com>

    Rework form of addresses used on Windows. Seems to fix the problems reported in bug #363648. See that bug for extensive even if occasionally misleading discussion.

    * src/linc-protocols.c [Win32]: Change the default value of the use_local_host variable to LINK_NET_ID_IS_IPADDR. This seems to in general be more useful than FQDNs especially for end-user machines with DNS randomness issues. Add some d_printf() calls to print debugging information when DNS lookups are being done. To see them, recompile with CONNECTION_DEBUG defined in src/linc-debug.h and set the LINK_CONNECTION_DEBUG environment variable.
    (get_netid) [Win32]: Return the numeric "127.0.0.1" in the LINK_NET_ID_IS_LOCAL case.
    (link_protocol_get_sockinfo_ipv4) [Win32]: always return the numeric IP address instead of doing any gethostbyaddr() etc lookups.

Resolving this bug now then as fixed... please open up more specific bugs for individual well-defined further issues if necessary.
except that the commit breaks the build on linux at least:

make[3]: Entering directory `/home/kmaraas/cvs/gnome/ORBit2/linc2/src'
/bin/sh ../../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I../.. -I../../linc2/include -I../../linc2/include -pthread -I/opt/gnome2/include/glib-2.0 -I/opt/gnome2/lib/glib-2.0/include -Wall -Wunused -Wmissing-prototypes -Wmissing-declarations -DG_DISABLE_DEPRECATED -D_GNU_SOURCE -g -O0 -D_FORTIFY_SOURCE=2 -g -O0 -D_FORTIFY_SOURCE=2 -Werror-implicit-function-declaration -MT linc-protocols.lo -MD -MP -MF .deps/linc-protocols.Tpo -c -o linc-protocols.lo linc-protocols.c
gcc -DHAVE_CONFIG_H -I. -I../.. -I../../linc2/include -I../../linc2/include -pthread -I/opt/gnome2/include/glib-2.0 -I/opt/gnome2/lib/glib-2.0/include -Wall -Wunused -Wmissing-prototypes -Wmissing-declarations -DG_DISABLE_DEPRECATED -D_GNU_SOURCE -g -O0 -D_FORTIFY_SOURCE=2 -g -O0 -D_FORTIFY_SOURCE=2 -Werror-implicit-function-declaration -MT linc-protocols.lo -MD -MP -MF .deps/linc-protocols.Tpo -c linc-protocols.c -fPIC -DPIC -o .libs/linc-protocols.o
linc-protocols.c: In function 'link_protocol_get_sockinfo_ipv4':
linc-protocols.c:922: error: invalid storage class for function 'link_protocol_get_sockinfo_ipv6'
linc-protocols.c:922: warning: no previous prototype for 'link_protocol_get_sockinfo_ipv6'
linc-protocols.c:980: error: invalid storage class for function 'link_protocol_get_sockinfo_unix'
linc-protocols.c:980: warning: no previous prototype for 'link_protocol_get_sockinfo_unix'
linc-protocols.c:1024: warning: no previous prototype for 'link_protocol_get_sockinfo'
linc-protocols.c:1045: warning: no previous prototype for 'link_protocol_is_local'
linc-protocols.c:1065: error: invalid storage class for function 'link_protocol_unix_destroy'
linc-protocols.c:1065: warning: no previous prototype for 'link_protocol_unix_destroy'
linc-protocols.c:1073: error: invalid storage class for function 'link_protocol_unix_is_local'
linc-protocols.c:1073: warning: no previous prototype for 'link_protocol_unix_is_local'
linc-protocols.c:1091: error: invalid storage class for function 'link_protocol_tcp_setup'
linc-protocols.c:1091: warning: no previous prototype for 'link_protocol_tcp_setup'
linc-protocols.c:1132: error: initializer element is not constant
linc-protocols.c:1132: error: (near initialization for 'static_link_protocols[0].setup')
linc-protocols.c:1146: error: initializer element is not constant
linc-protocols.c:1146: error: (near initialization for 'static_link_protocols[1].setup')
linc-protocols.c:1149: error: initializer element is not constant
linc-protocols.c:1149: error: (near initialization for 'static_link_protocols[1].get_sockinfo')
linc-protocols.c:1161: error: initializer element is not constant
linc-protocols.c:1161: error: (near initialization for 'static_link_protocols[2].destroy')
linc-protocols.c:1163: error: initializer element is not constant
linc-protocols.c:1163: error: (near initialization for 'static_link_protocols[2].get_sockinfo')
linc-protocols.c:1165: error: initializer element is not constant
linc-protocols.c:1165: error: (near initialization for 'static_link_protocols[2].is_local')
linc-protocols.c:1175: warning: no previous prototype for 'link_protocol_destroy_cnx'
linc-protocols.c:1191: warning: no previous prototype for 'link_protocol_destroy_addr'
linc-protocols.c:1222: warning: no previous prototype for 'link_protocol_all'
linc-protocols.c:1237: warning: no previous prototype for 'link_protocol_find'
linc-protocols.c:1259: warning: no previous prototype for 'link_protocol_find_num'
linc-protocols.c:1268: error: expected declaration or statement at end of input
linc-protocols.c:1268: warning: control reaches end of non-void function
make[3]: *** [linc-protocols.lo] Error 1
make[3]: Leaving directory `/home/kmaraas/cvs/gnome/ORBit2/linc2/src'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/home/kmaraas/cvs/gnome/ORBit2/linc2'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/kmaraas/cvs/gnome/ORBit2'
make: *** [all] Error 2
Oops, had removed a line with a closing brace in a Unix ifdef by accident. Fixed.
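For anyone puzzled by how one missing brace produces that many errors: when a closing brace is lost inside a platform-specific preprocessor branch, the compiler on the affected platform treats every following `static` function as nested inside the unfinished one, hence the repeated "invalid storage class" messages and the final "expected declaration or statement at end of input". The sketch below shows the balanced form; `count_lookups_needed` is a hypothetical function, not code from linc-protocols.c.

```c
/* Hypothetical example of platform-specific branches inside a function.
 * Each branch must open and close its own braces, because only one of
 * them is ever seen by a given compiler. */
static int
count_lookups_needed (int have_numeric_addr)
{
	int lookups = 0;

#ifdef _WIN32
	/* Numeric addresses only: the resolver is never consulted. */
	(void) have_numeric_addr;
#else
	if (!have_numeric_addr) {
		lookups++;
	}	/* Deleting this brace while editing the other branch still
		 * compiles on Windows, but on every other platform the
		 * function never ends and the rest of the file breaks. */
#endif

	return lookups;
}
```

Building on both platforms before committing, or at least preprocessing the file with and without the platform macro defined, catches this class of mistake.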
make and 'make check' are happy on my box now.
ORBit2-2.14.11 is out with this fix now.
Great! A big thanks to all who helped to resolve this bug! Personally, I would like to read one confirmation that ftp://ftp.gnome.org/pub/gnome/binaries/win32/ORBit2/2.14/ORBit2-2.14.11.zip can be used with GnuCash, so that GnuCash 2.2.4 does not need to ship patched DLLs. Thanks.
Created attachment 104149 [details] ipconfig.txt
I have applied ORBit2-2.14.11 to my installation of Gnucash 2.2.3 on Windows XP SP2 and it still has the same problem of hanging for about 5 minutes during the startup procedure of Gnucash.

I have two NICs. One is wireless and the other wired. The wired nic has no cable connected. I disabled the wireless network card while it was hung and then Gnucash continued the startup. If I reenable the card after it continues loading, it seems to be fine. The module that it was stuck at was app-init. It does eventually timeout and continue with loading the rest of the program when the card is enabled.

Checked the netstat -ano and netstat -bn info and found the same problem as before where a listening socket is attempted for the local host name (in my case STONE-1) on port 0 instead of localhost or 127.0.0.1. When the NIC is disabled, it eventually shows up as 0.0.0.0:0.

I have included the ipconfig and netstat info:

C:\>ipconfig /all

Windows IP Configuration

        Host Name . . . . . . . . . . . . : STONE-1
        Primary Dns Suffix  . . . . . . . :
        Node Type . . . . . . . . . . . . : Unknown
        IP Routing Enabled. . . . . . . . : Yes
        WINS Proxy Enabled. . . . . . . . : Yes
        DNS Suffix Search List. . . . . . : WorkGroup

Ethernet adapter Wireless Network Connection 5:

        Connection-specific DNS Suffix  . : WorkGroup
        Description . . . . . . . . . . . : Wireless-B Notebook Adapter
        Physical Address. . . . . . . . . : 00-0F-66-3B-64-2B
        Dhcp Enabled. . . . . . . . . . . : Yes
        Autoconfiguration Enabled . . . . : Yes
        IP Address. . . . . . . . . . . . : 192.168.2.9
        Subnet Mask . . . . . . . . . . . : 255.255.255.0
        Default Gateway . . . . . . . . . : 192.168.2.1
        DHCP Server . . . . . . . . . . . : 192.168.2.1
        DNS Servers . . . . . . . . . . . : 192.168.2.1
        Lease Obtained. . . . . . . . . . : Thursday, January 31, 2008 9:03:04 AM
        Lease Expires . . . . . . . . . . : Thursday, February 07, 2008 9:03:04 AM

Ethernet adapter Local Area Connection:

        Media State . . . . . . . . . . . : Media disconnected
        Description . . . . . . . . . . . : Intel(R) PRO/100 S Network Connection
        Physical Address. . . . . . . . . : 00-00-39-77-B2-85

C:\>

-----------------------------------------
In the following netstat print out, anything with STONE-1 is replaced with 0.0.0.0 when the NIC is disabled.
-----------------------------------------

C:\>netstat -a -b

Active Connections

  Proto  Local Address          Foreign Address              State        PID
  TCP    STONE-1:epmap          STONE-1:0                    LISTENING    312
  c:\windows\system32\WS2_32.dll
  C:\WINDOWS\system32\RPCRT4.dll
  c:\windows\system32\rpcss.dll
  C:\WINDOWS\system32\svchost.exe
  C:\WINDOWS\system32\ADVAPI32.dll
  [svchost.exe]

  TCP    STONE-1:microsoft-ds   STONE-1:0                    LISTENING    4
  [System]

  TCP    STONE-1:pptp           STONE-1:0                    LISTENING    4
  [System]

  TCP    STONE-1:1026           STONE-1:0                    LISTENING    872
  [ISafe.exe]

  TCP    STONE-1:1027           STONE-1:0                    LISTENING    872
  [ISafe.exe]

  TCP    STONE-1:1028           STONE-1:0                    LISTENING    872
  [ISafe.exe]

  TCP    STONE-1:1032           STONE-1:0                    LISTENING    116
  [alg.exe]

  TCP    STONE-1:4258           STONE-1:0                    LISTENING    4076
  [gconfd-2.exe]

  TCP    STONE-1:4260           STONE-1:0                    LISTENING    2212
  [gnucash-bin.exe]

  TCP    STONE-1:netbios-ssn    STONE-1:0                    LISTENING    4
  [System]

  TCP    STONE-1:427            STONE-1:0                    LISTENING    4
  [System]

  TCP    STONE-1:1026           localhost:1042               ESTABLISHED  872
  [ISafe.exe]

  TCP    STONE-1:1026           localhost:1029               ESTABLISHED  872
  [ISafe.exe]

  TCP    STONE-1:1028           localhost:1035               ESTABLISHED  872
  [ISafe.exe]

  TCP    STONE-1:1028           localhost:1041               ESTABLISHED  872
  [ISafe.exe]

  TCP    STONE-1:1029           localhost:1026               ESTABLISHED  976
  [VetMsg.exe]

  TCP    STONE-1:1032           STONE-1.WorkGroup:4244       ESTABLISHED  116
  [alg.exe]

  TCP    STONE-1:1035           localhost:1028               ESTABLISHED  976
  [VetMsg.exe]

  TCP    STONE-1:1041           localhost:1028               ESTABLISHED  3088
  [cctray.exe]

  TCP    STONE-1:1042           localhost:1026               ESTABLISHED  3088
  [cctray.exe]

  TCP    STONE-1:4208           localhost:4209               ESTABLISHED  3404
  [FIREFOX.EXE]

  TCP    STONE-1:4209           localhost:4208               ESTABLISHED  3404
  [FIREFOX.EXE]

  TCP    STONE-1:4210           localhost:4211               ESTABLISHED  3404
  [FIREFOX.EXE]

  TCP    STONE-1:4211           localhost:4210               ESTABLISHED  3404
  [FIREFOX.EXE]

  TCP    STONE-1:4252           localhost:4253               ESTABLISHED  2212
  [gnucash-bin.exe]

  TCP    STONE-1:4253           localhost:4252               ESTABLISHED  2212
  [gnucash-bin.exe]

  TCP    STONE-1:4256           localhost:4257               ESTABLISHED  4076
  [gconfd-2.exe]

  TCP    STONE-1:4257           localhost:4256               ESTABLISHED  4076
  [gconfd-2.exe]

  TCP    STONE-1:4258           localhost:4259               ESTABLISHED  4076
  [gconfd-2.exe]

  TCP    STONE-1:4259           localhost:4258               ESTABLISHED  2212
  [gnucash-bin.exe]

  TCP    STONE-1:4260           localhost:4261               ESTABLISHED  2212
  [gnucash-bin.exe]

  TCP    STONE-1:4261           localhost:4260               ESTABLISHED  4076
  [gconfd-2.exe]

  TCP    STONE-1:4244           tutankhamon.acc.umu.se:ftp   ESTABLISHED  3404
  [FIREFOX.EXE]

  TCP    STONE-1:4246           tutankhamon.acc.umu.se:ftp   ESTABLISHED  116
  [alg.exe]

  UDP    STONE-1:1203           *:*                                       1000
  C:\WINDOWS\system32\mswsock.dll
  c:\windows\system32\WS2_32.dll
  c:\windows\system32\DNSAPI.dll
  c:\windows\system32\dnsrslvr.dll
  C:\WINDOWS\system32\RPCRT4.dll
  -- unknown component(s) --
  [svchost.exe]

  UDP    STONE-1:1124           *:*                                       1000
  C:\WINDOWS\system32\mswsock.dll
  c:\windows\system32\WS2_32.dll
  c:\windows\system32\DNSAPI.dll
  c:\windows\system32\dnsrslvr.dll
  C:\WINDOWS\system32\RPCRT4.dll
  [svchost.exe]

  UDP    STONE-1:isakmp         *:*                                       1676
  [lsass.exe]

  UDP    STONE-1:1043           *:*                                       1000
  C:\WINDOWS\system32\mswsock.dll
  c:\windows\system32\WS2_32.dll
  c:\windows\system32\DNSAPI.dll
  c:\windows\system32\dnsrslvr.dll
  C:\WINDOWS\system32\RPCRT4.dll
  [svchost.exe]

  UDP    STONE-1:l2tp           *:*                                       4
  [System]

  UDP    STONE-1:4500           *:*                                       1676
  [lsass.exe]

  UDP    STONE-1:1202           *:*                                       1000
  C:\WINDOWS\system32\mswsock.dll
  c:\windows\system32\WS2_32.dll
  c:\windows\system32\DNSAPI.dll
  c:\windows\system32\dnsrslvr.dll
  C:\WINDOWS\system32\RPCRT4.dll
  [svchost.exe]

  UDP    STONE-1:1030           *:*                                       1960
  [spoolsv.exe]

  UDP    STONE-1:microsoft-ds   *:*                                       4
  [System]

  UDP    STONE-1:1187           *:*                                       1000
  C:\WINDOWS\system32\mswsock.dll
  c:\windows\system32\WS2_32.dll
  c:\windows\system32\DNSAPI.dll
  c:\windows\system32\dnsrslvr.dll
  C:\WINDOWS\system32\RPCRT4.dll
  [svchost.exe]

  UDP    STONE-1:1034           *:*                                       616
  c:\windows\system32\WS2_32.dll
  C:\WINDOWS\System32\iasrad.dll
  C:\WINDOWS\System32\iassdo.dll
  C:\WINDOWS\System32\iashlpr.dll
  C:\WINDOWS\System32\mprddm.dll
  c:\windows\system32\mprdim.dll
  C:\WINDOWS\System32\svchost.exe
  C:\WINDOWS\system32\ADVAPI32.dll
  C:\WINDOWS\system32\kernel32.dll
  [svchost.exe]

  UDP    STONE-1:ntp            *:*                                       616
  c:\windows\system32\WS2_32.dll
  c:\windows\system32\w32time.dll
  ntdll.dll
  C:\WINDOWS\system32\kernel32.dll
  [svchost.exe]

  UDP    STONE-1:1900           *:*                                       1216
  c:\windows\system32\WS2_32.dll
  c:\windows\system32\ssdpsrv.dll
  ntdll.dll
  C:\WINDOWS\system32\kernel32.dll
  [svchost.exe]

  UDP    STONE-1:1033           *:*                                       616
  c:\windows\system32\WS2_32.dll
  C:\WINDOWS\System32\iasrad.dll
  C:\WINDOWS\System32\iassdo.dll
  C:\WINDOWS\System32\iashlpr.dll
  C:\WINDOWS\System32\mprddm.dll
  c:\windows\system32\mprdim.dll
  C:\WINDOWS\System32\svchost.exe
  C:\WINDOWS\system32\ADVAPI32.dll
  C:\WINDOWS\system32\kernel32.dll
  [svchost.exe]

  UDP    STONE-1:1076           *:*                                       3088
  [cctray.exe]

  UDP    STONE-1:netbios-ns     *:*                                       4
  [System]

  UDP    STONE-1:427            *:*                                       4
  [System]

  UDP    STONE-1:1900           *:*                                       1216
  c:\windows\system32\WS2_32.dll
  c:\windows\system32\ssdpsrv.dll
  ntdll.dll
  C:\WINDOWS\system32\kernel32.dll
  [svchost.exe]

  UDP    STONE-1:ntp            *:*                                       616
  c:\windows\system32\WS2_32.dll
  c:\windows\system32\w32time.dll
  ntdll.dll
  C:\WINDOWS\system32\kernel32.dll
  [svchost.exe]

  UDP    STONE-1:4155           *:*                                       4
  [System]

  UDP    STONE-1:netbios-dgm    *:*                                       4
  [System]

C:\>
I just checked, and we're defaulting to orbit_local_only - so no connections should be accepted from the open IP address. Having said this, by far the best solution here would be to use a Windows named pipe and native Win32 file I/O operations. That would require some extensions & work to ORBit2/linc/ but seems quite do-able. It should also improve performance & security.
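For readers unfamiliar with what "local only" means at the socket level, here is a minimal Python sketch of the general idea. It is not ORBit2 or linc code, and it makes no claim about how orbit_local_only is actually enforced (ORBit2 may, for example, listen on the wildcard address and reject non-local peers afterwards). The helper names `open_listener` and `can_connect` are hypothetical and exist only for this illustration.

```python
import socket


def open_listener(local_only: bool) -> socket.socket:
    """Return a listening TCP socket on an OS-chosen port.

    local_only=True binds to the loopback address, so only processes on
    the same machine can reach it. local_only=False binds to the wildcard
    address 0.0.0.0, which accepts connections arriving on any interface.
    """
    host = "127.0.0.1" if local_only else "0.0.0.0"
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))  # port 0 lets the OS pick a free port
    srv.listen(1)
    return srv


def can_connect(srv: socket.socket, timeout: float = 2.0) -> bool:
    """Try a loopback connection to the listener, as a local client would."""
    port = srv.getsockname()[1]
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for local_only in (True, False):
        srv = open_listener(local_only)
        print(srv.getsockname(), "loopback client ok:", can_connect(srv))
        srv.close()
```

In both modes a client on the same machine can connect over loopback; the difference is only whether other hosts can reach the port. A named pipe, as suggested above, would avoid the TCP stack entirely.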
Do you have a suggestion for a temporary work-around? Unplugging the network cable or disabling my wireless card in order to use GnuCash seems a little silly.
*** Bug 506195 has been marked as a duplicate of this bug. ***
Sorry to resurrect this bug from the dead, but it has come back. I am using Vista Home, and I updated from GnuCash 2.2.7 to 2.2.8 using the installer and the exact same bug happened. I re-installed twice and continued to get the same issue. I reverted to 2.2.7 and it opens just fine. I tried to follow the thread above to see if I could find a work-around. I could follow about half of the conversation; the rest was a bit beyond my tech skill. I tried several versions of the ORBit bin files, including 2.13.3 and some of the 2.14 versions. This did not resolve the issue. I tried deactivating my network adapter (with the 2.13.3 bin files) and that also did not help. One additional bit of information: I was able to pause-break while one of the DOS windows was open, and the box showed a message stating that gconfd-2.exe was too large for memory. I am more than willing to do a screen share if a developer needs to view the error. I look forward to any assistance. I LOVE GnuCash.