Bug 324234 - Using g_io_add_watch_full() to wait for connect() to return on a non-blocking socket returns prematurely
Status: RESOLVED OBSOLETE
Product: glib
Classification: Platform
Component: win32
Version: 2.8.x
Hardware: Other All
Importance: Normal normal
Target Milestone: ---
Assigned To: gtk-win32 maintainers
QA Contact: gtk-win32 maintainers
Depends on:
Blocks:
 
 
Reported: 2005-12-16 02:39 UTC by Daniel Atallah
Modified: 2018-05-24 10:37 UTC
See Also:
GNOME target: ---
GNOME version: ---


Attachments
testconnect.c (2.57 KB, text/plain)
2006-01-04 00:27 UTC, Tor Lillqvist
Testcase for the spurious OUT event bug (3.85 KB, application/octet-stream)
2007-03-20 05:38 UTC, Steven Brown
Fix the spurious write event error (976 bytes, application/octet-stream)
2007-03-21 01:36 UTC, Steven Brown

Description Daniel Atallah 2005-12-16 02:39:22 UTC
Please describe the problem:
I'm seeing intermittent problems while trying to use non-blocking I/O to prevent
a connect() call from blocking (using g_io_add_watch_full() to wait for the
connection).

What appears to be happening is that the fd is selected by an event other than
the connect being successful so that when I try to actually write to the socket
from my callback function, it fails and WSAGetLastError() returns WSAENOTCONN.

There doesn't appear to be a way to wait for only a connect to happen
(FD_CONNECT) - the G_IO_OUT condition will trigger on FD_WRITE or FD_CONNECT.

An interesting tidbit is that if I set the G_IO_WIN32_DEBUG env. var. to enable
debugging, my callback doesn't seem to get triggered until after the connect()
actually completes - I'm guessing that this is because the debugging causes
stuff to be delayed enough for the connect to finish.

Another thing that I've noticed is that I'm getting the same fd from my socket()
call that I had just closed - is it possible that I'm getting the HUP event
triggered for the closing of the previous socket, and that this is triggering my
connect callback? (I'm using the (G_IO_OUT | G_IO_HUP | G_IO_ERR | G_IO_NVAL)
conditions; the giochannel stuff is accessed through a wrapper.)

Hopefully this makes some sense, if not let me know and I'll try to clarify.
Thanks.


Comment 1 Daniel Atallah 2006-01-03 00:25:30 UTC
I finally had time and was able to get some debug output:

Here is what is happening:

connect() returns SOCKET_ERROR, WSAGetLastError() returns WSAEWOULDBLOCK

I then create a GIOChannel for the socket and watch it for the conditions: G_IO_OUT | G_IO_HUP | G_IO_ERR | G_IO_NVAL

The following debug information is printed from Glib:

g_io_channel_win32_new_socket: sockfd=1604
g_io_win32_sock_create_watch: sock=1604 handle=0x618 condition={OUT|ERR|HUP|NVAL}
g_io_win32_prepare: WSAEventSelect(1604, 0x618, {WRITE|CONNECT|CLOSE}
g_io_win32_dispatch: pollfd.revents=OUT condition=OUT|ERR|HUP|NVAL result=OUT

My callback function is now invoked, and I try to write to the socket:
send(1604, ..., 24) returns SOCKET_ERROR, WSAGetLastError() returns WSAENOTCONN

I looked through the giowin32 source and I think I've tracked down why this is happening.  This appears to be caused by giowin32.c:717 where the G_IO_OUT is being added to watch->pollfd.revents.  Any idea why this is being done?
Comment 2 Tor Lillqvist 2006-01-03 01:03:51 UTC
Are you sure you don't see any log message "g_io_win32_check: WSAEnumNetworkEvents..." between the g_io_win32_prepare and g_io_win32_dispatch ones?

You wouldn't happen to have a minimal test program...?

Regarding the code at line giowin32.c:717, I don't recall the exact details, but I am quite sure something will break if that would be removed ;-) It has to do with the different semantics of the Windows edge-triggered socket events and the level-triggered semantics of poll() and select() that the GIOChannel polling/watching is designed for. Or something like that... 

Comment 3 Daniel Atallah 2006-01-03 01:50:12 UTC
I'm positive there are no "g_io_win32_check" messages (that had me confused for a while until I noticed what was happening on line 717).

I'm having difficulty getting it to actually happen in a sample program (the timing is a factor - the socket has to actually block for a while during the connect()).  I have a program that I think *should* simulate it, but it doesn't seem to.  Any idea on how to cause the connect to block for longer?
Comment 4 Tor Lillqvist 2006-01-03 08:17:38 UTC
> Any idea on how to cause the connect to block for longer?

Try connecting to a far-away host? To port 25 of some SMTP server on the other side of the planet?

I looked for some code in the GNOME platform libs that has been ported to Win32 and uses a GLib watch to wait for connect(), and libsoup does that, too. As far as I can remember, that works as expected (libsoup is used by Evolution to connect to GroupWise servers, which certainly did work when I had the chance to check it), but I guess I should try libsoup's test programs again... For some reason libsoup's connect_watch() does a getsockopt(sockfd, SOL_SOCKET, SO_ERROR, &error, &len). This probably does affect the timings slightly...
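A callback that may be invoked too early can protect itself by checking the socket state before writing, in the spirit of the SO_ERROR query mentioned above. The following is only a minimal sketch, assuming the POSIX socket API so that it compiles outside Windows; socket_connect_state() is a hypothetical helper, not a GLib or libsoup function. On Windows the same two calls exist in Winsock, but getsockopt() takes a char * buffer and the error code comes from WSAGetLastError() (WSAENOTCONN instead of ENOTCONN).

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not part of GLib or libsoup).
 * Returns  1 if the socket has a peer, i.e. the connect has completed,
 *          0 if the connect is still pending (no peer yet),
 *         -1 if the connect failed or the query itself failed;
 *            errno then holds the pending socket error or the query error. */
int socket_connect_state(int fd)
{
    int so_error = 0;
    socklen_t len = sizeof so_error;
    struct sockaddr_storage peer;
    socklen_t peer_len = sizeof peer;

    assert(fd >= 0);

    /* SO_ERROR reports (and clears) a failed asynchronous connect. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &so_error, &len) != 0)
        return -1;
    if (so_error != 0) {
        errno = so_error;
        return -1;
    }

    /* SO_ERROR == 0 is also what a still-pending connect reports,
     * so ask for the peer address to tell the two cases apart. */
    if (getpeername(fd, (struct sockaddr *)&peer, &peer_len) == 0)
        return 1;
    return errno == ENOTCONN ? 0 : -1;
}
```

A watch callback could return TRUE (keep the watch) when this yields 0, and only start sending when it yields 1. This works around a premature G_IO_OUT on the application side; it does not change what the watch reports.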
Comment 5 Tor Lillqvist 2006-01-04 00:27:32 UTC
Created attachment 56747
testconnect.c

Test program that works fine for me... with both GLib 2.8.0 and 2.8.5. In what way does this differ from what your app is doing?
Comment 6 Daniel Atallah 2006-01-04 03:11:25 UTC
Thanks for writing the test.

It is similar to mine, but simpler.  One important difference that I noticed is that you're not putting the socket into non-blocking mode before the connect(), so the connect was never blocking.

So... I modified the example a little bit, basically adding:
  u_long imode = 1;
  ioctlsocket(s, FIONBIO, &imode);
before the connect call.
This does cause a WSAEWOULDBLOCK on the initial connect() call.

Unfortunately, I still can't get it to break.

The connect is happening successfully by the time the connect_cb is called.

I'm seeing:
g_io_win32_check: WSAEnumNetworkEvents (1940, 0x790) revents={OUT} condition={OUT|ERR|HUP|NVAL} events={WRITE|CONNECT} error={0}
in the output prior to the g_io_win32_dispatch calls (which I don't see in the problematic scenarios in gaim).

Perhaps there is something different enough about the scenarios I'm seeing this in.  At a high level, what happens is:
1- Connect to Server1.
2- Exchange niceties with Server1
3- Server1 provides an IP address for Server2
4- Connect to Server2.
5- Try to write to Server2 unsuccessfully

Perhaps it is because the connection with Server1 is still open or something that it is causing Server2's connect() to block longer?

Comment 7 Tor Lillqvist 2006-01-04 03:31:04 UTC
The call to g_io_add_watch() calls WSAEventSelect() which automatically sets the socket to non-blocking mode.

It's very odd that you don't see the g_io_win32_check() lines. Hmm, are you using the g_io_channel_read() or write() functions on the GIOChannels for the sockets? It might be that the channels are buffered in that case and the call to g_io_channel_get_buffer_condition() in g_io_win32_prepare() already indicates that the channel would be writable, and the check method then isn't called. Or something like that.... Maybe the giowin32.c code should automatically turn encoding and buffering off for channels to sockets?  Or only if they are watched? Dunno....

Anyway, try calling g_io_channel_set_encoding(channel, NULL, NULL) and g_io_channel_set_buffered(channel, FALSE) after creating them.
Comment 8 Tor Lillqvist 2006-01-04 03:50:27 UTC
Aha! I think I could reproduce a situation similar to your problem now. I finally happened to pick a SMTP server that responds rather slowly... kedu.cc.columbia.edu 128.59.59.70. Running testconnect with G_IO_WIN32_DEBUG=y shows:

g_io_channel_win32_new_socket: sockfd=1912
g_io_win32_sock_create_watch: sock=1912 handle=0x774 condition={OUT|ERR|HUP|NVAL}
connect() at 15625
connect() returned at 15625
g_io_win32_prepare: WSAEventSelect(1912, 0x774, {WRITE|CONNECT|CLOSE}
g_io_win32_check: WSAEnumNetworkEvents (1912, 0x774) revents={OUT} condition={OUT|ERR|HUP|NVAL} events={CONNECT}
g_io_win32_dispatch: pollfd.revents=OUT condition=OUT|ERR|HUP|NVAL result=OUT
connect_cb at 1453125: condition=0x4

after which I get the error from send() "A request to send or receive data was disallowed because the socket is not connected" etc (WSAENOTCONN, I assume).

As you see from the output, WSAEnumNetworkEvents returns with just FD_CONNECT set. I.e. the socket is connected but still not writeable (a rather odd state?!). The FD_CONNECT bit is however also mapped to G_IO_OUT. Maybe I shouldn't look at FD_CONNECT at all, and set G_IO_OUT only when I get FD_WRITE? Or maybe don't WSAEventSelect() for FD_CONNECT notification at all?
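The two mappings under discussion can be written down as a small, platform-independent sketch. This is not the giowin32.c source: the constants are local stand-ins whose values mirror the Winsock FD_* bits and GLib's GIOCondition flags, and the IN and HUP rows are assumptions inferred from the WSAEventSelect lines in this report. Only the OUT row is what the log above actually shows.

```c
#include <assert.h>

/* Local stand-ins so the sketch compiles anywhere. */
enum {
    EV_READ    = 0x01,  /* FD_READ    */
    EV_WRITE   = 0x02,  /* FD_WRITE   */
    EV_ACCEPT  = 0x08,  /* FD_ACCEPT  */
    EV_CONNECT = 0x10,  /* FD_CONNECT */
    EV_CLOSE   = 0x20   /* FD_CLOSE   */
};
enum {
    COND_IN  = 1,       /* G_IO_IN  */
    COND_OUT = 4,       /* G_IO_OUT */
    COND_HUP = 16       /* G_IO_HUP */
};

/* Mapping as observed in the log: FD_CONNECT alone is reported as OUT. */
unsigned map_events_observed(unsigned events)
{
    unsigned revents = 0;
    if (events & (EV_READ | EV_ACCEPT))   revents |= COND_IN;
    if (events & (EV_WRITE | EV_CONNECT)) revents |= COND_OUT;
    if (events & EV_CLOSE)                revents |= COND_HUP;
    return revents;
}

/* Alternative raised in this comment: report OUT only for FD_WRITE. */
unsigned map_events_write_only(unsigned events)
{
    unsigned revents = 0;
    if (events & (EV_READ | EV_ACCEPT)) revents |= COND_IN;
    if (events & EV_WRITE)              revents |= COND_OUT;
    if (events & EV_CLOSE)              revents |= COND_HUP;
    return revents;
}
```

With events={CONNECT}, the first function yields OUT (the callback runs and send() can fail with WSAENOTCONN), while the second yields nothing until FD_WRITE arrives.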
Comment 9 Daniel Atallah 2006-01-04 04:01:47 UTC
>The call to g_io_add_watch() calls WSAEventSelect() which automatically sets
the socket to non-blocking mode.

This doesn't seem to be quite right - WSAEventSelect() doesn't actually appear to be called until g_io_win32_prepare() is called (which is after the connect() function call would have completed).

To further clarify... the g_io_win32_check() function *is* being called (I added some debugging at the top), but none of the socket criteria are being met so that the "g_io_win32_check..." debugging isn't printed.  It seems that if somehow channel->write_would_have_blocked was set to TRUE, (as is actually the case here), then the spurious event wouldn't be triggered.

I don't have access to the actual GIOChannel, so I'm not using g_io_channel_[read|write]().

Calling set_encoding() and set_buffered() on the GIOChannel after it is created doesn't seem to have any effect - I still see the same behavior.

Comment 10 Tor Lillqvist 2006-01-04 04:09:06 UTC
> This doesn't seem to be quite right

Ah, sorry, you are right. It's too late for me, I guess...

Anyway, what do you think, would it be cleanest to simply remove all references to FD_CONNECT from giowin32.c? Then there wouldn't be any stray callbacks for a connected but not yet writeable socket.
Comment 11 Daniel Atallah 2006-01-04 04:18:52 UTC
>Anyway, what do you think, would it be cleanest to simply remove all references
to FD_CONNECT from giowin32.c? Then there wouldn't be any stray callbacks for a
connected but not yet writeable socket.

That would probably help the scenario you're seeing in comment #8, but unfortunately that doesn't appear to help the problem that I'm having (I just tried it :( ).

The key difference here is that no actual socket events appear to be legitimately triggered when g_io_win32_check is called in my scenario (whereas in yours, the FD_CONNECT is actually triggered).

I really appreciate the time you're spending to track this down.
Comment 12 Daniel Atallah 2006-01-09 02:51:45 UTC
After trying a number of things, I still can't make a simple test case that recreates this.

It is, however, really easy to recreate using gaim (I can provide steps to do so if you think that'd help shed some light on the situation).
Comment 13 Sebastian Lisken 2006-02-23 18:13:52 UTC
Well, with gaim I almost never get a connection to IRC servers irc.freenode.net and irc.undernet.org - would that help in constructing a test case? What's interesting is that when I sniff the network via Ethernet, I hardly get any activity, only an aborted TCP connect attempt (SYN from gaim to server, SYN+ACK from the server, RST from client). On the rare successful cases, the first two packets look the same, but the third is an ACK and the connection continues.

I would love to see this resolved, it would be bad to be stuck with GTK+ 2.6.10.
Comment 14 Daniel Atallah 2006-02-23 18:45:16 UTC
To avoid confusion (because wingaim people appear to read this):
The inability to use Glib 2.8.x with gaim is not entirely due to this bug (which I believe is the cause of Sebastian's comment #13). Current releases of wingaim do not use non-blocking I/O for the most part, so the changes in how the win32 GIOChannel implementation deals with sockets (leaving them in non-blocking mode) are part of the problem (see bug #147392 for more information about these changes).  The gaim cvs codebase has been fully updated to exclusively use non-blocking I/O to avoid that particular problem (which still leaves the issues described in this bug report).
Comment 15 Sebastian Lisken 2006-02-23 19:02:51 UTC
Daniel is right, it is gaim's incompatibility with GTK+ 2.8.x that I hope to get resolved. http://gaim.sourceforge.net/win32/#bugs links to this bug. If you're interested, the problem does still occur in the latest beta for gaim 2.0.0 - but that's outside this bugzilla's scope, so if at all, this should probably be continued by email. Thanks.
Comment 16 Michael DePaulo 2006-03-26 10:17:58 UTC
I would like to point out that Evolution is now out for Windows, and it requires GTK 2.8. Unless this is fixed soon, casual users will face a choice between being able to use gaim and being able to use Evolution.
Comment 17 Sebastian Lisken 2006-03-26 11:36:36 UTC
Good point - the same is happening with The GIMP. The development versions (2.3.x) depend on 2.8 now, so whenever the 2.4 versions come out (I have no idea when) that will probably apply to them as well. I am aware of comment #14 saying that this particular bug is not (completely) about gaim's incompatibility with GTK 2.8, but still hope that commenting on the bug is a semi-appropriate way of alerting the right developers. Sorry if that's not true.
Comment 18 John Ehresman 2006-03-26 17:30:24 UTC
This bug should be fixed, but if gimp & gaim used their own copies of gtk libraries, installing one wouldn't break the other.  As an end-user, I'm always upset when installing one piece of software randomly breaks another -- even when I know enough to fix the problem, I usually don't have the time to.  On linux, it's the distribution developers that usually deal with this, but that's not the case on win32.
Comment 19 Avi Halachmi 2006-04-04 10:14:35 UTC
(In reply to comment #18)
> This bug should be fixed, but if gimp & gaim used their own copies of gtk
> libraries, installing one wouldn't break the other.  As an end-user, I'm always
> upset when installing one piece of software randomly breaks another -- even
> when I know enough to fix the problem, I usually don't have the time to.  On
> linux, it's the distribution developers that usually deal with this, but that's
> not the case on win32.
>

Hi all, just joined this bugzilla, so, a short Hello World to everyone ;)

Anyway, regarding this bug: I'm facing the same issue of Gimp 2.3.7 and Gaim 2.0 beta GTK incompatibility. I think I'll be able to solve it one way or another (using the portable gaim instructions will be my first attempt, thanks to SimGuy from #wingaim for the suggestion), but "simple" users will definitely not be able to, so GTK compatibility across various win32 GTK applications should be a major concern for everyone involved.

Without going into a massive debate about whether Gaim should move completely to a statically linked GTK (or at least work with a private copy of GTK), I'd like to promote a portable Gaim distribution and a portable GTK package for developers (if there isn't one already). The trend of portable applications is gaining popularity, and with large USB drives these days it's also very usable and useful. I think it would be good if GTK had a distro aimed at easy static linking.

avih



Comment 20 Vladimir Nicolici 2006-04-27 07:58:35 UTC
(In reply to comment #18)
> This bug should be fixed, but if gimp & gaim used their own copies of gtk
> libraries, installing one wouldn't break the other.  As an end-user, I'm always
> upset when installing one piece of software randomly breaks another -- even
> when I know enough to fix the problem, I usually don't have the time to.  On
> linux, it's the distribution developers that usually deal with this, but that's
> not the case on win32.

Gaim is also affected by bugs from version 2.6 that are fixed in 2.8, like bug 107320 (see gaim bugs 1473836, 1465690, 1445290, 1422246, 1396421, 1477409 that depend on it), so having a separate copy of gtk would not fix everything.
Comment 21 Kyndig 2006-07-01 17:34:09 UTC
I'm having this same problem with my software program. It will only work on early releases of 2.6. If I try compiling it against any 2.8 version, the program will receive output from the connected socket but is not able to send data. The program will just hang. Here is the network cvs file:
http://cvs.mudmagic.com/co.php/mudmagic_client/src/network/network.c?r=1.14
It uses write()

Were there any fixes or patches that aren't mentioned in this bug entry?

Thank you.
Comment 22 Kyndig 2006-07-01 17:59:24 UTC
If it's any assistance: sockets work fine on Windows for glib version 2.6. It stopped working at 2.10 (gtk release 2.6.8 worked, gtk release 2.6.10 doesn't). Looking at the CVS changelog for 2.6 to 2.10, I only see these entries which affected Windows sockets:


        Implement watches for GIOChannels for write file descriptors on
	Win32 (#333098).
	
	* glib/giowin32.c (GIOWin32Channel): Add a new direction field.
	(read_thread): Initialize direction.
	(write_thread): New function.
	(buffer_write): New function.
	(g_io_win32_prepare): Handle the G_IO_WIN32_FILE_DESC case for the
	write direction.
	(g_io_win32_fd_write): Call buffer_write() if there is a writer
	thread.
	(g_io_win32_fd_close): Set space_avail_event for writer threads.
	(g_io_win32_fd_create_watch): Create the writer thread if
	condition is G_IO_OUT.
	(g_io_channel_win32_make_pollfd): Likewise here.

http://cvs.gnome.org/viewcvs/glib/glib/giowin32.c?r1=1.66&r2=1.67&diff_format=c

Comment 23 Kyndig 2006-07-01 18:32:58 UTC
Apologies for spamming the bug list. To clarify the last versioning information. 

The last Glib version that worked with my write() socket handling was glib version 2.8.0

I've tried all other glib versions up to glib 2.10.3 ( May 27 release ) and the i/o problem is present in these.

Thank you Tor for the help.
Comment 24 oracel 2006-08-26 19:53:49 UTC
Hi, is this bug going to be fixed any time soon? I'm using Gaim 2.0beta3.1 with GTK+ 2.8 and it does not work at all. Please give this a high priority. It has been open for an awfully long time now, and must be affecting a lot of potential win32 users.
Comment 25 Bob 2006-08-26 23:16:47 UTC
7 months to fix this, ???
Comment 26 Tor Lillqvist 2006-08-26 23:21:57 UTC
Oh no! The sky is falling! The sky is falling!
Comment 27 oracel 2006-08-27 09:52:10 UTC
I am sure everyone appreciates your efforts to fix this bug as much as I do. But as mentioned before (and as you know) both Gimp 2.4 and Evolution will depend on GTK+ 2.8. This will give users (like me) a choice to either stop using new versions of Gaim, or stop using new versions of Gimp and Evolution.

Or I could just install Kubuntu on this dust bucket..
Comment 28 Tor Lillqvist 2006-08-27 10:39:51 UTC
At this point I am not sure what this bug is about any longer. The discussion has drifted off in irrelevant directions. Remember, for me it is utterly irrelevant whether installers for apps like Gaim or GIMP desperately fight each other trying to insist on just one installed copy of GTK+2 (including GLib) on a machine. My view is that each application should ship with versions of GTK+, Pango and GLib known to work with it. Attempting to have just one copy of GTK+, Pango and GLib installed on a Windows machine has proved itself many times over to be futile. Sorry.

I know, in an ideal world that's how it should be. Just like on Linux, one GLib and GTK+2 installation per system. But Windows is just too vague and there aren't enough people resources available for maintaining GLib (and GTK+) for Windows to keep all versions 100% backward compatible. There are just so many ways to use GIOChannels attached to sockets, and the sockets themselves, that there is no way I can notice if some change that fixes some behaviour in one app breaks something else in another app that uses GIOChannels and sockets in a different way. Yes, this screams for another level of abstraction, i.e. another cross-platform library offering a higher-level networking API, that would have a thorough test suite, and prevent the user from even seeing the actual sockets. GNet?

Remember that the stable branch of GTK+ is now 2.10. There will be no more 2.8 source releases as far as I know, and no more 2.8 Win32 binaries either. At least not by me, unless some extremely urgent need should arise. It is quite pointless to compare various GTK+ 2.6 (!) versions against each other here in bugzilla now. (As of now there haven't been any Win32 binaries of GTK+ 2.10, but once 2.10.3 is out, I will distribute binaries.)

But actually, forget the previous paragraph; what this bug talks about is entirely in GLib, not GTK+. Please state which GLib versions you are talking about, not the GTK+ version that some installer bundles with GLib. The stable branch of GLib is 2.12.

To Kyndig: surely you aren't using write() to write to a socket? That doesn't work at all on Windows. Use send(). Also, looking at errno after failed socket API calls is useless on Windows. The error code is available with WSAGetLastError().
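The send()/WSAGetLastError() advice can be wrapped so that the same calling code builds on both platforms. This is only a sketch under stated assumptions: sk_handle, sock_send() and sock_last_error() are hypothetical names introduced here, and on Windows the program is assumed to have called WSAStartup() already.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET sk_handle;   /* Winsock sockets are not C runtime file descriptors */
#else
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
typedef int sk_handle;
#endif

/* Send bytes on a socket with send() rather than write(), which does not
 * work on Winsock sockets. Returns the number of bytes sent, or -1. */
long sock_send(sk_handle s, const void *buf, size_t len)
{
    assert(buf != NULL);
#ifdef _WIN32
    return send(s, (const char *)buf, (int)len, 0);
#else
    return (long)send(s, buf, len, 0);
#endif
}

/* Error code of the last failed socket call: WSAGetLastError() on
 * Windows, errno elsewhere. Call it right after the failing call. */
int sock_last_error(void)
{
#ifdef _WIN32
    return WSAGetLastError();
#else
    return errno;
#endif
}
```

Callers then compare the result of sock_last_error() against WSAEWOULDBLOCK or WSAENOTCONN on Windows and against EWOULDBLOCK or ENOTCONN elsewhere.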
Comment 29 Sebastian Lisken 2006-08-27 11:38:33 UTC
It's good to see a serious comment from a developer. Like I said before, a paragraph within http://gaim.sourceforge.net/win32/index.php#bugs points directly to this bug, making it responsible for gaim's inability to work with GTK+ 2.8 on Windows. That's why you will be getting a stream of comments, some sounding slightly impatient, from users wanting to use gaim with a recent version of GTK+ but not really able or willing to understand the finer detail of this bug. I include myself in this description.

Personally I've seen Windows apps with their own GTK+ copy - Inkscape has it, so does one of the alternative downloads for gaim; so readers coming from the gaim website should try the installers that do include gtk, in other words, move away from the "-no-gtk" installers even though it would seem nice to have one central installation of GTK+, as a workaround to the gaim problem. I have done that now, and I'm satisfied.

Again, if you've come here wanting to use gaim and another program that requires a recent GTK+ on Windows, my message to you is not to use the "-no-gtk" installer of gaim -- problem solved.

Developers, if you are disturbed by the off-topic comments, maybe you should lobby the authors of the gaim website to change the paragraph that points users to this bug. At the same time, I do hope there will be a point in time when it will be possible to use gaim with a recent GTK+ version under Windows. Could that be when the 2.10 binaries come out?
Comment 30 Sebastian Lisken 2006-08-27 11:46:49 UTC
This is very embarrassing - I didn't test properly. Forget most of what I said in my last comment, especially my claimed solution.

Using the "with GTK installer" did not install a private copy of GTK+. It offered to reinstall its GTK to replace the newer 2.8 one. I didn't read that install question properly, that's why I thought the issue was resolved.

So for the moment the problem is still a real one, because there are two apps (gaim and GIMP) that do insist on using a central installation of GTK+. I still wonder if GTK+ 2.10 will not have the gaim problem. Could you tell us?
Comment 31 Avi Halachmi 2006-08-27 12:35:13 UTC
FYI, there are builds of both Gaim and Gimp with a private copy of GTK+. These can be found at http://portableapps.com . This is what I've eventually come to use (following my Comment #19). My suggestions from my previous post still stand.

avih
Comment 32 Kyndig 2006-08-27 23:37:00 UTC
Tor wrote:
"surely you aren't using write() to write to a socket? That doesn't
work at all on Windows. Use send(). Also, looking at errno after failed socket
API calls is useless on Windows. The error code is available with
WSAGetLastError()."

Thanks for the info Tor. I found the error of my ways with write() and did go back with send(). This fixed my problem. Thanks for the details on WSAGetLastError() that is exactly what I needed to get some specific socket functionality ported to windows.

Kyndig
Comment 33 Valerio Messina 2006-08-28 23:08:21 UTC
Tor, can we have different GTK minor version on a Win32 system?
On Linux, in the shared lib directory, I normally keep different versions of every lib, so I can resolve all the dependencies. Is this possible with Windows?
I don't understand what the difference is between:
'gaim-2.0.0beta3.1-no-gtk.exe' (4 MB) and 'gaim-2.0.0beta3.1.exe' (8 MB)
http://sourceforge.net/project/showfiles.php?group_id=235&package_id=253&release_id=440695
I thought the 8 MB version was statically linked with GTK, but it seems that it also depends on the GTK runtime, so I'm confused.
Comment 34 Daniel Atallah 2006-08-28 23:23:30 UTC
I have to apologize for WinGaim users polluting this bug report.  It has never been my intention for people to come and post "me too" and "this needs to be fixed" responses to glib bugs.

In response to Comment #33 that is unrelated to glib - please don't post gaim questions here.  FYI the version "with GTK+" merely includes an installer for a GTK+ runtime, it isn't statically linked.

Gaim users: Please direct any complaints and questions to the gaim forums, mailing lists and bug trackers on Sourceforge.  Only if you actually want to work on this particular bug should you be here.
Comment 35 Kurt Fitzner 2006-09-27 23:34:09 UTC
WRT comment #28:
"Attempting to have just one copy of GTK+,
Pango and GLib installed on a Windows machine has proved itself many times over
to be futile. Sorry."

If this is the case, then GTK+ should be a static library on windows.  

On the one hand we have dll releases and more or less "official" installers that place GTK+ on windows in %PROGRAMFILES%\common, and then on the other hand we are told as developers oh, it's not smart to use a common installation.

If you want to discourage common usage of the library in Windows, then you have a funny way of doing that.

Comment 36 Tor Lillqvist 2006-09-28 07:51:56 UTC
> If this is the case, then GTK+ should be a static library on windows.

Feel free to set up a distribution of a static GTK+ (and dependencies?).

> If you want to discourage common usage of the library in Windows, 
> then you have a funny way of doing that.

I don't want either to discourage or encourage common usage of GTK+ on Windows. Why would I care as long as I have fun myself?
Comment 37 Tor Lillqvist 2006-09-28 08:08:49 UTC
Sebastian Lisken wrote:

> I still wonder if GTK+ 2.10 will not have the gaim problem.

As I said a couple of comments earlier, the problem (if you mean the problem I am thinking of, i.e. the changes in semantics for GIOChannels to sockets) is not in GTK+ but in GLib. And there hasn't been any newer changes in GLib that would have retracted the changes in GLib 2.8 that initially started causing the problems in GAIM.

Valerio Messina wrote:

> Tor, can we have different GTK minor version on a Win32 system?

As I said a couple of comments earlier, it's some installers that attempt to prevent having several versions of GTK+ (or Pango, GLib, etc) on the same machine. There is nothing in the code itself that would prevent it. I run without problems simultaneously applications using GTK+ 2.6, 2.8 and 2.10. (And correspondingly varying Pango and GLib versions.)

All you need to do is know what you are doing, and set up PATH so that the version you want is found for each application in question. (Either interactively using shell commands (in a shell function, typically) before starting the app from a shell, with a command file (.bat/.cmd), an explicit non-GTK-using small starter .exe that modifies PATH and starts the main app, having the GTK+ etc DLLs in the same folder as the app's .exe, the App Paths trick in the Registry, etc. Many ways to choose from.)
Comment 38 Steven Brown 2007-02-25 09:29:30 UTC
I think I'm running into this bug - the way to reproduce is to run a timer alongside an asynchronous connect (point the connect at a host that's eating packets so it stays pending).  Every time the timer fires, you'll get one single spurious OUT event.  So, the event loop is being woken up to handle the timer, and incorrectly decides each time that happens that activity happened on the socket as well.

Here's what it looks like with G_IO_WIN32_DEBUG defined (ignore the double WSAEventSelect, that's another bug: bug #338943):

g_io_channel_win32_new_socket: sockfd=1916
g_io_win32_sock_create_watch: sock=1916 handle=0x768 condition={IN|HUP}
g_io_win32_sock_create_watch: sock=1916 handle=0x768 condition={OUT}
g_io_win32_prepare: WSAEventSelect(1916, 0x768, {READ|ACCEPT|CLOSE}
g_io_win32_prepare: WSAEventSelect(1916, 0x768, {WRITE|CONNECT}

And then the unrelated timer goes off, causing a single spurious OUT event on the socket as well:

g_io_win32_check: WSAEnumNetworkEvents (1916, 0x768) revents={} condition={IN|HUP} events={}
g_io_win32_check: WSAEnumNetworkEvents (1916, 0x768) revents={} condition={OUT} events={}
g_io_win32_dispatch: pollfd.revents=OUT condition=OUT result=OUT
Comment 39 Steven Brown 2007-02-25 09:38:24 UTC
Forgot to mention, the previous comment is using glib 2.12.9.
Comment 40 Steven Brown 2007-03-20 05:38:21 UTC
Created attachment 84945
Testcase for the spurious OUT event bug

Attached is a test case that will reliably detect the spurious OUT event bug, as of 2.12.11, on win32.  This bug report seems to contain two bugs: one is an issue with FD_CONNECT, the other is the kind of spurious event my test case shows.
Comment 41 Steven Brown 2007-03-21 01:36:17 UTC
Created attachment 85016
Fix the spurious write event error

The attached patch disables the line mentioned earlier (the one in giowin32.c that adds G_IO_OUT to the revents), which causes the spurious write events. Disabling it fixes that problem.

Despite the line being obviously broken, there was surely something it was intended to fix, but I can't find any info on why it was added.  The change it came in on was a bulk change to the iochannel code and the comment on the code itself is less than helpful.  write_would_have_blocked is only used for this case which is pretty much the only hint to go on.  The only thing I can think of that it might have been used for would be if hitting WSAEWOULDBLOCK would prevent the OUT event from being flagged next pass even if drained (it'd not work in that case either AFAIK as there's nothing to cause the event loop to wake up), however, I tested this[1] and it's not the case.  I also tested to see if the OUT event wouldn't be normally re-asserted[2] without it, and it's not the case.

So, seeing as it's quite broken and causing known problems, and no one seems to know if removing it would cause problems, or can show that removing it does cause problems, I think it should be removed.  It'd be better to get a fresh bug report about what bug, if any, this line was actually preventing and to fix that than to leave this sitting in the code unfixed any longer.


[1] https://svn.wiisard.org/WIISARD/trunk/wiisard3/doc/glibmmtest/src/survive_WSAEWOULDBLOCK.c

[2] https://svn.wiisard.org/WIISARD/trunk/wiisard3/doc/glibmmtest/src/reassert_out.c
Comment 42 Tor Lillqvist 2008-08-19 00:50:56 UTC
A similar patch as Steven's also helps bug #548278. I will probably commit it before GLib 2.18 is released.
Comment 43 Tor Lillqvist 2008-08-19 23:55:07 UTC
At least the "Testcase for the spurious OUT event bug" program is fixed by the fix just committed to trunk:

2008-08-20  Tor Lillqvist  <tml@novell.com>

	Bug 324234 - Using g_io_add_watch_full() to wait for connect() to
	return on a non-blocking socket returns prematurely

	Bug 548278 - Async GETs connections are always terminated
	unexpectedly on Windows

	* glib/giowin32.c: Add one more state variable to the
	GIOWin32Channel struct, ever_writable. Initialise it to FALSE, set
	to TRUE when the WSAEventSelect() indicates FD_WRITE, and never
	reset to FALSE.

	Don't do the WSASetEvent() in g_io_win32_prepare() unless
	ever_writable is TRUE. Don't automatically indicate G_IO_OUT in
	g_io_win32_check() unless ever_writable is TRUE.

	This fixes the behaviour of the test case program in bug #548278,
	and the "Testcase for the spurious OUT event bug" in bug
	#324234. It also doesn't seem to break anything. Not that there is
	any exhaustive test suite...

	Add a comment with a list of bugs that are related to the code in
	this file.

Whether it's worth it to keep this bug still open I don't know. A fresh bug report for each specific remaining issue, hopefully with a test case, would be great...
Comment 44 GNOME Infrastructure Team 2018-05-24 10:37:50 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to GNOME's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.gnome.org/GNOME/glib/issues/36.